The True Value of Comprehensive Disaster Recovery

Or, “Why You Really Need to be More Serious About Your Disaster Recovery Program”

I believe that most people would agree with the following statement: “No business, school, or non-profit wants their organisation to collapse due to unrecoverable technical issues.”

Seems like a no-brainer, right? Well, as true as that statement might be, what I have discovered is that very few organisations – especially small ones – actually put serious effort into ALL the following activities:

  • Developing a robust Disaster Recovery (DR) program
  • Implementing their robust Disaster Recovery program
  • Testing their Disaster Recovery program

Yes, many organisations have considered a disaster recovery program.

Many of those organisations have even had discussions to understand what such a program would entail and look like.

A smaller percentage of those organizations have taken the time to document an appropriate disaster recovery plan.

An even smaller percentage of those organizations end up testing their DR plans on a regular schedule, and in a consistent way.

Don’t be one of the organisations that stops short of regular testing!  Here are some reasons why you really need to be more serious about your disaster recovery program:

1. Downtime is more costly than you think

Unplanned downtime is expensive.  Some estimates have put the cost at over $5,000 per minute!  But this is only the tip of the iceberg.

For businesses, not only are your revenue stream and day-to-day operations impacted while your business is offline, but you may also have to pay urgent/emergency rates to get your equipment back online – especially if you are not well prepared for the event in question.

For schools, the sudden interruption of services can affect your school district’s ability to educate students in the classroom, cut off access to important student records and financial data, and potentially raise safety concerns (loss of cameras, PA systems, etc.) within your school.

Rush shipping, overtime pay, weekend maintenance, and last-minute consulting costs are all ways in which downtime costs can exceed expectations.  And that doesn’t even touch the issue of refunds to cover service level agreement (SLA) violations, lawsuits for breach of contract, or outright customer defections.

Reputations can be very difficult to repair, and that can hurt the bottom line for months or years to come.
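To make the arithmetic concrete, here is a minimal sketch of how the cost components described in this section might be added up. The class name, field names, and every figure other than the $5,000-per-minute rate are hypothetical, chosen only for illustration; harder-to-measure costs such as reputational damage are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class DowntimeEstimate:
    """Hypothetical inputs for a rough downtime cost estimate."""
    minutes_down: float
    cost_per_minute: float        # lost revenue and productivity per minute
    emergency_costs: float = 0.0  # rush shipping, overtime, last-minute consulting
    sla_refunds: float = 0.0      # refunds owed for SLA violations

    def direct_cost(self) -> float:
        """Cost of the outage itself: duration times per-minute rate."""
        return self.minutes_down * self.cost_per_minute

    def total_cost(self) -> float:
        """Direct cost plus the follow-on expenses listed in this section."""
        return self.direct_cost() + self.emergency_costs + self.sla_refunds


# Illustrative numbers only: a four-hour outage at $5,000 per minute.
estimate = DowntimeEstimate(
    minutes_down=240,
    cost_per_minute=5000,
    emergency_costs=20000,
    sla_refunds=15000,
)
print(f"Direct cost: ${estimate.direct_cost():,.0f}")
print(f"Total cost:  ${estimate.total_cost():,.0f}")
```

Even a rough estimate like this, filled in with your own organisation's numbers, can help justify the budget for a DR program.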

2. There are numerous potential failure points

The reason why your organization is down, or why your service is interrupted, often doesn’t matter to your customers – at least not while your business is down.

Some of the more likely “disasters” include:

  • hardware failure
  • software failure
  • sabotage from a disgruntled insider
  • denial of service from the outside
  • security breach
  • natural disaster
  • geopolitical event
  • pandemic (Zika, Swine Flu, etc.)
  • sudden death/illness/injury of key personnel
  • terrorism

Once (if) your business is back online, your customers – or former customers! – may be concerned about the reason for the incident, your ability to prevent subsequent incidents, and your ability to recover more successfully in the future.  (You hope that they care about those last two items, because that would mean that they are still your customers.)

If your DR program does not take all of the possible “disasters” which could apply to your specific business into consideration, then it is likely that your recovery will not be timely or successful.  Needless to say, that would not be a good thing.

3. Staffing Changes

There are two main reasons to test your disaster recovery plan regularly.  One reason is to verify its actual effectiveness.  The other reason is to keep the plan firmly entrenched in the minds of all your relevant staff.  This second reason takes on even more significance if your organization is like most, in that it encounters turnover occasionally – if not more frequently.

Very few things are worse than being hit with a disaster and then realizing that your DR plan’s previous appearance of effectiveness was due to the presence of a now-former key employee who held everything together by sheer force of will, and by virtue of his or her vast institutional knowledge of your organization, its technology, and its practices.

Your disaster recovery plan should be documented well enough to still be effective even if totally new – but appropriately skilled – employees were thrown into the mix in the middle of an incident.  If your plan is heavily dependent on specific people (not roles, but people) being present, then it is a DR prayer, not a DR plan.
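One simple way to spot a plan that depends on people rather than roles is to list, for each recovery role, who is currently trained to perform it, and then flag any role with too little coverage. The sketch below assumes a hypothetical mapping of role names to staff names; the function name, the roles, and the threshold of two people are all illustrative assumptions.

```python
from typing import Dict, List


def single_points_of_failure(role_coverage: Dict[str, List[str]],
                             minimum: int = 2) -> List[str]:
    """Return the roles that fewer than `minimum` distinct people can perform."""
    return sorted(
        role
        for role, people in role_coverage.items()
        if len(set(people)) < minimum
    )


# Hypothetical roles and staff, for illustration only.
coverage = {
    "restore database backups": ["alice", "bob"],
    "fail over network": ["alice"],
    "notify customers": [],
}

for role in single_points_of_failure(coverage):
    print(f"Needs more trained staff: {role}")
```

Any role this flags is a candidate for cross-training or better documentation before the next drill.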

4. Compliance and Regulations

Every year, the rise in security incidents and corporate malfeasance leads to more industry and government oversight, and thus more and more regulations are coming into play for organizations of all sizes.  Disaster Recovery and Business Continuity are often key elements of any regulatory program.  These days, even schools are subject to state compliance requirements relating to open records requests.

This means that the failure to have a well-documented disaster recovery program can undermine your business even before any disaster has struck.  Each day, it grows more likely that your own customers will want you to prove that you have a good DR program in place, if only so that they can satisfy their own compliance efforts.  Not having this could result in the loss of existing customers, or the inability to acquire new ones.

5. You may not get a second chance

For years, I have observed that the reason why many organisations fail to implement key security or operational practices is that the penalty for failing to do so has not been that great.  Finally, this may be changing…

In 2014, the company Code Spaces suffered a security breach that was quite epic.  They had backups, and an apparently competent staff, but what they did not appear to have was a comprehensive Disaster Recovery program that took into account the many ways that their service could be interrupted, and which tested appropriate recovery options on a regular basis.

As a result, what initially seemed like an embarrassing outage that would take days or a couple of weeks to recover from became a classic business-ending event.

Don’t let this happen to you.

  • Develop a Disaster Recovery program.
    • Build and implement a Disaster Recovery plan.
    • Train your employees in the purpose and execution of the plan.
    • Test your DR plan monthly or quarterly.
    • Report on the effectiveness of the plan to senior management and the board.
    • Evaluate the plan annually to ensure that it mitigates current business risks.
    • Adjust the plan to current business needs/risks.
    • Update the documentation for the plan.

Take some time at the highest level of your organization and build a disaster recovery program which includes a DR plan that takes into account the most likely types of business interruptions you will face, and provides reasonably quick mitigation and recovery options.

Next, implement this DR plan and test it regularly: monthly or quarterly is highly recommended.  Each test should evaluate the efficiency and effectiveness of the recovery operations, and ensure that changes are made to the process to remediate any deficiencies.
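Keeping to a monthly or quarterly schedule is easier when something tells you a test is overdue. Here is a minimal sketch, assuming you record the date of the last DR test somewhere; the function names and the 90-day default (roughly quarterly) are illustrative assumptions, not part of any standard.

```python
from datetime import date, timedelta


def next_test_due(last_test: date, interval_days: int = 90) -> date:
    """Return the date by which the next DR test should have been run."""
    return last_test + timedelta(days=interval_days)


def is_overdue(last_test: date, today: date, interval_days: int = 90) -> bool:
    """True if more than `interval_days` have passed since the last test."""
    return today > next_test_due(last_test, interval_days)


# Illustrative dates only.
last_test = date(2024, 1, 15)
print("Next test due:", next_test_due(last_test))
print("Overdue on 2024-05-01?", is_overdue(last_test, date(2024, 5, 1)))
```

A check like this could run from a scheduled job and feed the reports that go to senior management and the board.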

Disaster Recovery Plans in Summary

Lastly, make sure the plan is well documented and updated on a regular basis, and that the test results are reported to senior management and the board.  It is a good idea to have all relevant staff participate in the recovery drills/tests each time, since you can never be sure that the employees who have the most expertise and experience with your infrastructure and applications will be the ones present and available for a real event.

“No business owner wants their business to collapse due to unrecoverable technical issues.”

Now is the time to prepare to make this a true statement for your business.