The Five Basics for Preparing Your Response to an Emergency Scenario in the Data Center
What is involved in a disaster recovery plan? In this article, you will learn:
- What is disaster recovery?
- When should you start a disaster plan?
- How do you prepare for a disaster?
- Who should be involved in the planning process?
- What is the business impact of a disaster on your organization?
- What is your recovery time objective and recovery point objective?
- How detailed should your disaster recovery plan be?
- Why is a disaster recovery plan important?
From Service Express CTO, Jake Blough
“Disaster recovery is no longer a nice-to-have for most companies. More frequently the choice is to house DR or high availability systems in colocation facilities that are built to handle natural and other disasters. DRaaS is also more common with companies utilizing cloud providers to house their secondary systems. Best practice is to have this infrastructure housed in a purpose-built redundant colocation center with multiple power and communication connections located within a 4-hour drive from the primary data center.”
Your business operations and productivity are intricately linked to your servers, workstations and web apps. How do you prepare to recover from a natural disaster? Does hurricane season threaten your productivity? Where does your IT disaster recovery plan stand? Have you performed a risk analysis? Who and what should the disaster recovery planning process include?
What is disaster recovery? A disaster can be anything that puts an organization’s operations at risk. Your disaster recovery plan should be coordinated with each business area’s continuity planning process. Recovery point and recovery time objectives need to be defined in order to establish overall process, technology and application readiness.
Start now. With a business continuity and disaster recovery plan in place, you are able to mitigate the impact of severe weather events and meet the challenge should your data center be hit by water, wind, fire or structural damage.
Be prepared: disaster readiness and risk reduction. The goal of a disaster recovery plan is to maintain technical operations and quickly restore your company’s ability to operate. To effectively prepare your IT infrastructure in the face of hurricanes, tornadoes, floods, fires, earthquakes, or other adverse conditions, consider the following five steps of disaster preparation for your data center(s).
Disaster Recovery and Business Continuity are the top deciding factors to deploy off-prem.
Read more in our Data Center & Infrastructure Report.
Figure: Top Off-Prem Drivers
Who Are Your Disaster Recovery (DR) Team Members?
A Disaster Recovery Plan (DRP) needs multiple points of view and operational perspectives to be comprehensive and effective. Planning, action and communication will take place on multiple levels before, during and after a disaster. Involving the appropriate team members makes for a smoother process.
Give key business leaders a significant role in your disaster recovery planning process. Disaster recovery is a company-wide challenge and responsibility, not just an IT problem. Facilities and customer support should also have a place at the planning table.
The DR plan conductor
Assign a Disaster Recovery plan owner/project manager who understands the scope and impact of the project, can clearly articulate what is needed from the various team members, and can hold all parties accountable.
Your disaster recovery team members should include those with IT infrastructure expertise, as well as collective problem-solving skills, confidence in making decisions, attention to detail and the ability to communicate effectively.
Do not overlook your vendors and partners. These professional contacts can generate helpful input, experience and tactics to strengthen your final disaster recovery plan.
What Is the Business Impact of a Disaster on Your Organization?
Your disaster recovery plan should identify and rank your key services, supported by a risk analysis, so you can focus on your most critical services during an emergency outage. You need this clarity for both your disaster recovery plan and your business continuity plan (BCP) to be able to respond more effectively in the event of a disaster. What steps can you take to determine where you should invest resources into protecting and restoring vital functions? What risks are involved?
Start with a business impact analysis (BIA). A business impact analysis helps you analyze the effect of a business disruption on your critical business functions and perform a risk analysis.
Key questions in the BIA help uncover gaps within your BCP. Require each department to identify its top three to five critical processes, with a risk analysis and documentation for each process. These requirements build resiliency into your departments and overall organization.
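One way to turn the department submissions into a ranked list is a simple scoring pass. The sketch below is illustrative only: the `CriticalProcess` fields, the 1-to-5 impact scale and the hourly loss figures are hypothetical and do not come from any standard BIA template.

```python
from dataclasses import dataclass


@dataclass
class CriticalProcess:
    department: str
    name: str
    impact: int              # hypothetical rating: 1 (minor) to 5 (severe)
    hourly_loss_usd: float   # hypothetical estimated cost of one hour of outage
    documented: bool         # has the department supplied process documentation?


def rank_processes(processes):
    """Order processes from most to least critical: impact first, then hourly loss."""
    return sorted(processes, key=lambda p: (p.impact, p.hourly_loss_usd), reverse=True)


def missing_documentation(processes):
    """Return the names of processes that still lack documentation."""
    return [p.name for p in processes if not p.documented]


# Made-up sample data for illustration only.
sample = [
    CriticalProcess("Finance", "Payroll run", 4, 2000.0, True),
    CriticalProcess("Support", "Ticket intake", 3, 500.0, False),
    CriticalProcess("Sales", "Order entry", 5, 8000.0, True),
]

ranked = rank_processes(sample)
undocumented = missing_documentation(sample)
```

The ranking tells you where to invest first, and the list of undocumented processes shows which departments still owe documentation before the plan can rely on their input.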
What’s the difference between disaster recovery and business continuity?
The disaster recovery planning process targets IT infrastructure, uptime and data, and how they support business operations. A business continuity plan is a larger strategy that focuses on maintaining regular business functions during a disaster.
What Is Your Recovery Time Objective and Your Recovery Point Objective?
You should take your business impact analysis a step further by defining your Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Each involves a detailed risk analysis.
Techopedia defines Recovery Time Objective as “the maximum desired length of time allowed between an unexpected failure or disaster and the resumption of normal operations and service levels.” Your RTO designates the point in time after a failure or disaster at which the consequences of the interruption become unacceptable. Beyond downtime, RTO also includes the steps IT must take to restore the application and its data.
RPO stands for Recovery Point Objective and addresses your organization’s definition of the maximum amount of data that can be lost before the impact to the business is unacceptable.
When thinking of RPO, focus on the point in time you need to be able to restore to. Historically, an organization’s RPO was tied to its most recent nightly data backup. This is changing with replication becoming more widespread, allowing for Recovery Point Objectives to be as short as 5-10 seconds.
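The relationship between a backup or replication schedule and RPO can be checked with a little arithmetic. The sketch below assumes a simplified model in which the worst-case data loss equals the interval between recovery points, ignoring the time a backup or replication cycle takes to complete. The function name and the 15-minute target are hypothetical.

```python
from datetime import timedelta


def meets_rpo(recovery_point_interval: timedelta, rpo: timedelta) -> bool:
    """Check a schedule against an RPO.

    Worst case, a failure strikes just before the next recovery point is
    taken, so up to one full interval of data is lost.
    """
    return recovery_point_interval <= rpo


nightly_backup = timedelta(hours=24)   # one recovery point per night
replication = timedelta(seconds=10)    # near-continuous replication
rpo_target = timedelta(minutes=15)     # hypothetical business target

nightly_ok = meets_rpo(nightly_backup, rpo_target)
replication_ok = meets_rpo(replication, rpo_target)
```

Under these assumptions the nightly backup cannot satisfy a 15-minute RPO, while 10-second replication can.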
Recovery Point Objective and Recovery Time Objective should be defined for each critical business function and application. Times will differ based on business priorities and objectives determined through risk analysis. Each of those defined goals will then help drive appropriate technology and data backup investments.
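Per-application objectives become useful once they are compared against what a recovery test actually achieved. The following is a minimal sketch of that comparison. The application names, the targets and the measured values are all made up for illustration, and `find_gaps` is a hypothetical helper rather than part of any tool.

```python
from datetime import timedelta

# Hypothetical objectives defined per application.
objectives = {
    "order-entry": {"rto": timedelta(hours=1), "rpo": timedelta(minutes=5)},
    "payroll": {"rto": timedelta(hours=24), "rpo": timedelta(hours=24)},
}

# Hypothetical results measured during a recovery test.
measured = {
    "order-entry": {"recovery_time": timedelta(hours=2), "data_loss": timedelta(minutes=1)},
    "payroll": {"recovery_time": timedelta(hours=6), "data_loss": timedelta(hours=12)},
}


def find_gaps(objectives, measured):
    """Return (application, problem) pairs where a test missed its objective."""
    gaps = []
    for app, target in objectives.items():
        result = measured.get(app)
        if result is None:
            gaps.append((app, "not tested"))
            continue
        if result["recovery_time"] > target["rto"]:
            gaps.append((app, "RTO missed"))
        if result["data_loss"] > target["rpo"]:
            gaps.append((app, "RPO missed"))
    return gaps


gaps = find_gaps(objectives, measured)
```

Each gap points to a place where technology or backup investment may need to change, or where the objective itself should be revisited with the business.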
How Detailed Is Your Disaster Recovery Plan?
Your plan should include a host of business continuity details including staffing, technology (hardware, software, systems), power options, data backups, relocation sites, and internal/external communication.
Before the planning process begins, ensure that all key business systems are backed up and test the backups. This is the most important thing you can proactively do to combat the adverse effects of a disaster. As you start or continue down the planning path, at a minimum, your plan should include the disaster recovery team member contact information and roles, essential vendor/partner contact information, a list of essential/prioritized services, clear expectations as to what events will engage which specific actions, a communication chart of who is involved in the response actions, status checks, reporting and where process documentation can be accessed.
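The minimum contents listed above can be tracked as a simple checklist. The sketch below is one possible reading of that list. The section keys and the `missing_sections` helper are hypothetical names chosen for illustration, and the draft plan is made-up sample data.

```python
# Hypothetical section keys, one per minimum item named in the paragraph above.
REQUIRED_SECTIONS = [
    "team_contacts_and_roles",
    "vendor_partner_contacts",
    "prioritized_services",
    "event_triggers_and_actions",
    "communication_chart",
    "status_checks",
    "reporting",
    "documentation_location",
]


def missing_sections(plan: dict) -> list:
    """Return required sections that are absent or empty in the plan."""
    return [section for section in REQUIRED_SECTIONS if not plan.get(section)]


# Made-up draft plan for illustration only.
draft_plan = {
    "team_contacts_and_roles": [{"name": "A. Example", "role": "DR plan owner"}],
    "prioritized_services": ["order-entry", "payroll"],
    "documentation_location": "",  # present but not yet filled in
}

todo = missing_sections(draft_plan)
```

Running a check like this at each plan review gives the plan owner a concrete list of sections to chase.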
You can also create a Disaster Recovery plan for additional environments such as your smaller edge sites and remote offices. Depending on the size of your organization and level of expertise, this could be a recommended strategy in addressing the multiple steps needed for each focus area.
Why Is It Important to Test a Disaster Recovery Plan?
As the wind speed increases or the floodwaters rise, your goal is protecting and preserving uptime, not discovering weaknesses in your disaster recovery response. One popular method for testing your plan is a tabletop exercise. A tabletop exercise is a meeting to discuss a simulated emergency situation. The goal of a tabletop exercise is to uncover the strengths and weaknesses in your disaster recovery plan. Testing your plan for the first time during an actual disaster will not serve you well.
To set up for a successful tabletop exercise, plan for the following:
- Set goals. Determine what you want to achieve and how you will evaluate the state of your plan. Which issues do you want to address? Examples include the availability and accessibility of fuel for generators, slow restoration of city services due to skeleton crews, transportation and power challenges, and communication strategies with employees, partners and customers.
- Select participants. For initial testing, it can be best to start small. Retest with a larger group as your disaster plan evolves.
- Establish ground rules. Don’t throw in the kitchen sink. Stick to the goals. And maintain a no-fault/no-blame attitude.
- Develop a disaster scenario. Make it realistic. Use real-world situations and, if possible, use an emergency event that has happened within your organization.
- Conduct the entire exercise. Run through the exercise from start to finish. Document uncertainties for follow-up.
- Include key vendors. Do you rely on any third-party vendors? Consider asking them to participate as well.
- Document results. Take detailed notes throughout. Discuss what went well and what didn’t. Update the plan!
Even if your business operations are located far from a hurricane-prone coastline, you could still find yourself dealing with the business impact of a natural disaster. Your disaster recovery strategy needs to ensure the security, performance and post-disaster operations of your data center(s).
Use the five essential questions above to help direct what your disaster readiness plan should include or to evaluate your current disaster recovery solution. With time, clarity, input and careful preparation, you can build an effective strategy for your IT team to respond successfully to any disaster that may threaten your data center.