When a disaster strikes and takes down the IT systems that are essential to operations, the IT team is often called on to enable a quick recovery. A disaster recovery plan (DRP) can help get systems back online quickly and efficiently. This plan documents the procedures required to recover critical data and IT infrastructure after an outage.
But is a “disaster” plan different from a normal recovery plan? Surprisingly, many IT teams rely on a normal backup plan and treat it as a “disaster recovery plan”, which is the wrong approach. In this blog, I want to explain the differences and the types of disaster recovery approaches your team can take to prepare for everything from power failures to broader-scale issues.
Disaster Recovery Planning vs. Backup Planning
An event is categorized as a “disaster” when the facility where your infrastructure is hosted is no longer operational. The cause could be a local issue like a fire or a power or utility failure. However, disasters also include broader-scale problems like floods, tornadoes, storms, hurricanes, and civil disturbances, which can have an impact at a regional level. A disaster recovery plan (DRP) includes the procedures required to recover data, system functionality and IT infrastructure after an outage with minimum resources.
A disaster completely shuts down operations in an area, rendering any backup plan that depends on that area non-functional. The events mentioned above can leave buildings, equipment and IT systems unusable. Without public utilities like water, electricity, heating and cooling, life halts and no system can work at full capacity. Communication channels, particularly in the IT sector, are the backbone of getting work done. Disasters often cause widespread communication outages, either because of direct damage to infrastructure or because of sudden spikes in usage related to the disaster.
In some cases, compliance regulations make a disaster recovery plan mandatory, although it is good to have one in place for every system where possible. When starting to create your plan, begin by building a “Disaster Recovery Project Team”. This team should consist of experts from both the business and technical teams. They should decide which business processes have a critical impact on the organization and what losses may occur if they go down. In addition to resource and critical failure planning, a separate plan should define how teams will communicate during a disaster when the usual infrastructure is unavailable.
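As a rough illustration of how the project team might rank business processes by impact, here is a minimal Python sketch. The process names, the loss-per-hour figures and the outage lengths are all hypothetical placeholders, not data from any real organization; the only idea taken from the text is that processes are compared by the losses they would cause if they went down.

```python
from dataclasses import dataclass


@dataclass
class BusinessProcess:
    name: str
    loss_per_hour: float          # hypothetical estimate of loss per hour of downtime
    expected_outage_hours: float  # hypothetical outage length used for comparison

    @property
    def estimated_loss(self) -> float:
        # Simple linear estimate: hourly loss multiplied by outage duration.
        return self.loss_per_hour * self.expected_outage_hours


def rank_by_impact(processes):
    """Return the processes ordered from highest to lowest estimated loss."""
    return sorted(processes, key=lambda p: p.estimated_loss, reverse=True)


# Illustrative inputs only; a real team would supply its own estimates.
processes = [
    BusinessProcess("order processing", 5000.0, 8),
    BusinessProcess("internal wiki", 50.0, 24),
    BusinessProcess("payroll", 1200.0, 8),
]

ranked = rank_by_impact(processes)
for process in ranked:
    print(f"{process.name}: estimated loss {process.estimated_loss:,.0f}")
```

A linear loss model is a simplification. In practice, losses may grow faster than linearly as an outage drags on, so treat the ranking as a starting point for discussion rather than a result.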
Now that you’ve got your team and resources planned, let’s consider some recovery strategies.
3 Recovery Strategies to Consider
There are different strategies that can be adopted in any disaster recovery plan. In this blog, I won’t go into full detail on all of them. However, I want to touch on a few strategies for IT infrastructure recovery in disaster situations.
“Cold Backup” is an IT infrastructure recovery strategy where all necessary data is kept safe at different locations according to the disaster recovery plan. The system cannot simply be started with the flip of a switch; it needs to be recovered piece by piece. Everything from installation to data recovery has to be done to bring services into an operating state. Normally, this doesn’t require any license, as no components are in operation. Cold backup is the least expensive recovery strategy but requires the most time to get systems up, running and serving.
“Warm Backup” represents a setup where a reasonable hardware infrastructure and software installation are already available. The environment simply needs data from the latest backup to start serving. This setup does require a license (usually a non-production license), as the system is ready to serve but is not actively participating. It’s more expensive than a cold backup setup, but requires far less time to get up and start serving.
“Hot Backup” is the most expensive disaster recovery setup. It matches the hardware and software modules of your original system and is kept as up to date as the original setup. It may have access to the same data, which is replicated to the disaster recovery site, or it may receive that data on a regular basis. At some organizations, this setup is also used as a geographical load balancer. It does require a production license if it is serving, but that varies from vendor to vendor.
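The trade-off between the three strategies (cold is cheapest and slowest, hot is most expensive and fastest) can be sketched as a small selection helper. This is only an illustration: the recovery times below are made-up placeholders, the cost values are a simple ranking rather than real prices, and the `cheapest_strategy` function is a hypothetical name, not part of any product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    relative_cost: int     # 1 = least expensive; ordering only, not a price
    recovery_hours: float  # hypothetical time to start serving again


# Ordering follows the descriptions above; the hour values are assumptions.
STRATEGIES = [
    Strategy("cold", relative_cost=1, recovery_hours=72.0),
    Strategy("warm", relative_cost=2, recovery_hours=8.0),
    Strategy("hot", relative_cost=3, recovery_hours=0.5),
]


def cheapest_strategy(max_downtime_hours: float) -> Optional[Strategy]:
    """Pick the least expensive strategy that recovers within the allowed downtime.

    Returns None when no strategy is fast enough.
    """
    candidates = [s for s in STRATEGIES if s.recovery_hours <= max_downtime_hours]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


for limit in (96, 12, 1):
    choice = cheapest_strategy(limit)
    print(f"max downtime {limit}h -> {choice.name if choice else 'none'}")
```

The helper deliberately ignores licensing, which the descriptions above note differs per strategy and per vendor, so a real decision would need that input as well.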
Anyone can reasonably say, “there is no chance of an earthquake in my area”. I hope that a disaster never happens, but IT teams need to be ready. Major disaster events may never happen at all, but smaller events like fires, power outages and bad weather can occur quite frequently. If you do not have the right plan in place, believe me, it will cost a fortune in business losses due to downtime. At Talend, we recommend taking disaster recovery seriously. Talend products support different disaster recovery strategies like the ones mentioned earlier (hot, warm and cold). My recommendation is to act before it’s too late. The right plan will improve business processes, minimize disruption and give you an edge over competitors.
Resources: IT Disaster Recovery Planning For Dummies® by Peter Gregory; Philip Jan Rothstein