Learning from recent disaster events: Revisiting your redundancy plans
As unpleasant as they are, disasters happen—from the recent derecho that struck the Midwest to this year’s active hurricane season. However, disasters are not just limited to weather: they include fires, power outages, server failures, cybersecurity events or even human error. No industry or geography is immune to disaster, making frequent evaluation of your redundancy and disaster recovery plans a must.
In the aftermath of the recent Midwest derecho, we observed gaps in many companies’ redundancy plans that forced productivity to grind to a halt. Adjusting your disaster response strategies now can help you avoid similar issues within your organization when disaster strikes.
Considering the distance between primary servers and backups
Within a disaster recovery plan, companies typically establish a backup data site, where information is replicated and stored away from its primary location. A traditional plan geographically separates the two locations, but in this disaster event, a significant number of organizations found both their primary and backup sites damaged—or in some cases, completely destroyed.
This event demonstrated the real risk of lacking a geographically diverse backup data center. Operating a backup site within a company-owned facility at a safe distance is not always realistic. For example, a community bank may operate several branches, with the primary data center in one and the backup in another, yet the two remain in close proximity to each other. Fortunately, colocation centers, available across the country, provide a safe data storage alternative at a reasonable cost.
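One way to sanity-check site separation is to compute the straight-line distance between your primary and backup locations and compare it with a minimum you choose. The sketch below is illustrative only: the coordinates are approximate placeholders for hypothetical sites, and the 150 km threshold is an assumption for the example, not a recommendation from this article or any standard.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_geographically_diverse(primary, backup, min_km):
    """True if the two (lat, lon) sites are at least min_km apart."""
    return haversine_km(*primary, *backup) >= min_km


# Hypothetical sites with approximate coordinates (assumptions for illustration).
BRANCH_A = (41.98, -91.67)   # primary data center in one branch
BRANCH_B = (42.03, -91.60)   # backup in a nearby branch
COLO_SITE = (41.59, -93.63)  # colocation center in another metro area
MIN_SEPARATION_KM = 150      # illustrative threshold; choose your own

if __name__ == "__main__":
    for name, site in [("nearby branch", BRANCH_B), ("colocation", COLO_SITE)]:
        km = haversine_km(*BRANCH_A, *site)
        ok = is_geographically_diverse(BRANCH_A, site, MIN_SEPARATION_KM)
        print(f"{name}: {km:.0f} km apart, diverse enough: {ok}")
```

Distance alone does not guarantee safety, since a single storm system can span a wide area, so treat a check like this as a first filter rather than a complete risk assessment.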
The downside to battery backups and generators
Many companies in the path of the storm protected data center power with battery backups or generators, a fairly common strategy. Unfortunately, we found that a large percentage of these systems failed to stay operational until utility power was restored. Some actually contributed to damage of the systems they were designed to protect: repeated power outages and dips in voltage, caused by failing generators and battery backup systems, destroyed server and storage hardware.
In addition, companies that relied on generators found that fuel was in short supply, and delivery trucks were diverted to hospitals and other critical infrastructure.
In these scenarios, a backup data center that replicates and stores critical data at a safe distance (namely, outside the storm’s path) is a sounder solution moving forward.
Redundancy for telecom and internet services
Telecommunication and internet services were among the first things to fail during the storm. When a carrier went down, companies that depended on cloud or other remote resources and lacked carrier redundancy became vulnerable.
You can help avoid these scenarios by building as much redundancy as possible into your wide-area network design. Developing a design can be a complex process, and some options may not be available in certain areas, but a well-thought-out design focused on redundancy is a major factor in ensuring business continuity in a disaster scenario.
For example, it may be wise to buy phone and internet service from separate carriers. During the Midwest derecho, overhead lines were completely destroyed, but buried lines and cellular networks remained viable. You may not always know whether your specific communication methods rely on overhead or buried lines, but diversification will increase your chances of staying connected during a disaster.
In addition, connection type can make a critical difference in whether communications remain available. For instance, companies with voice services relying on Primary Rate Interface (PRI) connections remained down for quite some time, while Session Initiation Protocol (SIP)-based services, where calls can be rerouted over whichever connection is still available, continued to operate. As an added benefit, SIP connections are often more affordable due to their virtual nature.
Implementing carrier diversification does not necessarily mean doubling your cost, however. Even a residential-grade backup connection can be a key enhancement in maintaining communications during a disaster.
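To make the idea of carrier redundancy concrete, here is a minimal sketch of a failover check that tries each connection in priority order and reports the first one that still works. Everything named here is an assumption for illustration: the link names, the source IP addresses (taken from documentation-only ranges), and the probe target. In practice this logic usually lives in a router or SD-WAN appliance, and binding to a source address only exercises a specific carrier if your operating system routes that address out the matching interface.

```python
import socket


def probe_link(source_ip, target=("example.com", 443), timeout=3.0):
    """Return True if a TCP connection to target succeeds from source_ip.

    source_ip is the local address assigned to one carrier's interface
    (hypothetical here). Any connection or lookup failure counts as down.
    """
    try:
        with socket.create_connection(
            target, timeout=timeout, source_address=(source_ip, 0)
        ):
            return True
    except OSError:
        return False


def select_active_link(links, probe=probe_link):
    """Return the name of the first link whose probe succeeds, else None.

    links is an ordered list of (name, source_ip) pairs, highest priority first.
    """
    for name, source_ip in links:
        if probe(source_ip):
            return name
    return None


# Hypothetical links: a business fiber line and a residential-grade backup.
LINKS = [
    ("primary-fiber", "192.0.2.10"),
    ("backup-cable", "198.51.100.20"),
]

if __name__ == "__main__":
    active = select_active_link(LINKS)
    print("Active link:", active if active else "none reachable")
```

Passing the probe in as a parameter keeps the selection logic easy to test without touching the network, and lets you swap in a different health check later.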
Should some companies consider bringing servers back in house?
In many areas, companies manufacture products within a single building while storing company data and applications in the cloud. In the midst of the storm, some of these companies lost their lines of communication to the cloud, stifling both productivity and efficiency. Although their production facilities remained capable of operating, the software that directs production was out of reach, and both elements must work together to keep the line moving.
This situation has led some companies to reconsider the benefit of the cloud when losing connectivity means losing the ability to produce goods. Potential solutions include bringing servers back in house or developing more reliable and diverse connections to cloud infrastructure.
Events that once seemed unthinkable are now reality, and your plans should adjust accordingly. Your company will inevitably face a disaster scenario, and proactive planning can go a long way toward ensuring your critical operations continue, even under the most adverse conditions.