Five Important Pillars for a Successful Disaster Recovery Strategy

Joe Tavin
May 27, 2023

Consider these five essential pillars when developing your Disaster Recovery Plan and Strategy.

Photo by Daniel Tausis on Unsplash

Disaster is Inevitable

In the digital world, the question isn’t if disaster will strike but when. Many events can disrupt digital operations, from power outages and server failures to ransomware attacks and natural disasters. The first step towards a good Disaster Recovery strategy is acknowledging this inevitability. Accepting it puts you in the proactive zone and prepares you for scenarios others would rather not consider. As they say, hope is not a strategy. Below, I lay out five important pillars for developing a good Disaster Recovery strategy for your software system.

1. RPO

Your Recovery Point Objective (RPO) is the maximum amount of data, measured as a span of time, that you can afford to lose before business operations are impacted. In other words, it defines how far back from the moment of disaster (T) your most recent recoverable backup is allowed to be. This metric determines how frequently you need to back up your data.

T = Time of Disaster
Max Backup Δ = Longest interval between two consecutive backups
Max Backup Δ ≤ RPO
Time of Last Backup ≥ T - RPO

For instance, if you set an RPO of 24 hours, you're effectively saying your business can survive with up to 24 hours of data loss. Determining this metric depends on fully understanding your business goals and how they affect the system's need for different types of data. In practice, however, RPO tends to be defined by contractual obligations that companies promise their customers.
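
One way to read the RPO as a concrete check on a backup schedule: the longest gap between consecutive backups should not exceed the RPO, and after an incident the data-loss window is the time between the last backup and the disaster. The following is a minimal sketch, assuming backups are tracked as a simple list of timestamps; the function names are hypothetical and not part of any backup tool.

```python
from datetime import datetime, timedelta


def max_backup_gap(backup_times):
    """Return the longest interval between consecutive backups."""
    ordered = sorted(backup_times)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    # With fewer than two backups there is no gap to measure.
    return max(gaps, default=timedelta(0))


def meets_rpo(backup_times, rpo):
    """True if no gap between consecutive backups exceeds the RPO."""
    return max_backup_gap(backup_times) <= rpo


def data_loss_window(backup_times, disaster_time):
    """Time between the last backup taken before the disaster and the disaster.

    Returns None when no backup predates the disaster.
    """
    earlier = [t for t in backup_times if t <= disaster_time]
    if not earlier:
        return None
    return disaster_time - max(earlier)
```

With backups every six hours and an RPO of 24 hours, `meets_rpo` passes with room to spare; tightening the RPO to four hours would make the same schedule fail.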

2. RTO

Your Recovery Time Objective (RTO) is the maximum duration your systems can remain down after a disaster before they are fully recovered. In other words, it is the time allowed for your Disaster Recovery process to kick in and restore your software to full business continuity.

T = Time of Disaster
T' = Time of Recovery
T' - T ≤ RTO

This timeframe can range from a couple of minutes for high-availability systems to hours or days. Remember, the shorter the RTO, the more costly the recovery process becomes. The cost shows up in many ways, such as the difficulty of the recovery process, the expense of data storage, and the amount of resources it takes to be ready for a short recovery at any given time.
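
The RTO can be checked the same way after an incident or a drill: measure the downtime from the moment of disaster to the moment of recovery and compare it with the objective. This is a minimal sketch with hypothetical function names, assuming both moments are recorded as timestamps.

```python
from datetime import datetime, timedelta


def downtime(disaster_time, recovery_time):
    """Elapsed time from the disaster until full recovery."""
    if recovery_time < disaster_time:
        raise ValueError("recovery cannot precede the disaster")
    return recovery_time - disaster_time


def within_rto(disaster_time, recovery_time, rto):
    """True if the measured downtime does not exceed the RTO."""
    return downtime(disaster_time, recovery_time) <= rto
```

For example, an outage that starts at 10:00 and is resolved at 10:45 meets an RTO of one hour but misses an RTO of thirty minutes.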

3. Data Mapping

Backing up your data is an obvious and critical part of being able to recover successfully from a disaster. However, backing up all your data is not an efficient use of storage and can make the recovery process longer and more laborious. To ensure you back up only the data you need for business continuity, map out the different types of data in your application and eliminate any unnecessary data from the backup process.
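
A data map can be as simple as a list of data categories with a note on whether each one is needed for business continuity. The sketch below is illustrative only: the category names are made up, and the rule of skipping data that can be rebuilt from other sources is an assumption about what "unnecessary" might mean in a given system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCategory:
    name: str
    needed_for_continuity: bool  # would the business stall without it?
    reproducible: bool           # can it be rebuilt from other sources?


def backup_scope(categories):
    """Names of categories worth backing up: needed and not rebuildable."""
    return [
        c.name
        for c in categories
        if c.needed_for_continuity and not c.reproducible
    ]


# Hypothetical data map for an imaginary application.
DATA_MAP = [
    DataCategory("customer_orders", needed_for_continuity=True, reproducible=False),
    DataCategory("search_index", needed_for_continuity=True, reproducible=True),
    DataCategory("debug_logs", needed_for_continuity=False, reproducible=False),
]
```

Here only `customer_orders` ends up in the backup scope: the search index can be regenerated after recovery, and the debug logs are not needed to resume operations.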

4. Data Retention

Backing up is essential, but do not save everything for the rest of eternity! Data tends to quickly accumulate over time, sometimes even exponentially, as our datasets grow. Retaining all data indefinitely doesn't only cost money; it also increases the complexity of the recovery process. Setting a proper Data Retention policy ensures your Disaster Recovery Strategy will also be sustainable over time.
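
A retention policy can be sketched as a rule that marks backups older than a retention window for deletion. The version below is a minimal illustration with hypothetical names; keeping the newest `keep_min` backups regardless of age is an added safety assumption, so that a stalled backup job never leaves you with nothing to restore from.

```python
from datetime import datetime, timedelta


def select_expired(backup_times, now, retention, keep_min=1):
    """Return backups older than the retention window, newest first.

    The newest `keep_min` backups are never selected, whatever their age.
    """
    ordered = sorted(backup_times, reverse=True)
    candidates = ordered[keep_min:]
    return [t for t in candidates if now - t > retention]
```

With a 30-day retention window, backups taken 40 and 100 days ago are selected for deletion while those taken 1 and 10 days ago are kept.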

5. Recovery Process

The recovery process itself must be well-defined and tested, whether it is manual or automated. It ties all the pillars above together: it uses the backups of the data you mapped to restore the system to a point no older than the RPO, within the RTO timeframe. Since the process is vital to the Disaster Recovery strategy, and our software and systems tend to change, test and update the plan regularly to ensure it still works when disaster strikes and you need it the most.
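
Regular testing can take the form of a recovery drill that restores from the latest backup, times the restore, and compares the results against both objectives. The following is a rough sketch under stated assumptions: `restore` stands in for whatever manual or automated procedure you actually use, and all names are hypothetical.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    restore_duration: timedelta
    backup_age: timedelta
    met_rto: bool
    met_rpo: bool


def run_recovery_drill(restore, backup_time, drill_time, rpo, rto):
    """Run `restore`, time it, and compare the outcome with the RPO and RTO.

    `restore` is any callable that performs the recovery procedure.
    `backup_time` is when the restored backup was taken; `drill_time` is the
    simulated moment of disaster.
    """
    started = time.monotonic()
    restore()
    duration = timedelta(seconds=time.monotonic() - started)
    age = drill_time - backup_time
    return DrillResult(
        restore_duration=duration,
        backup_age=age,
        met_rto=duration <= rto,
        met_rpo=age <= rpo,
    )
```

Recording the result of each drill over time gives an early warning when system changes push the restore duration or backup age past the agreed objectives.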

Summary

When facing a disaster, being prepared is already half the victory. Using the pillars above to define your strategy will put you well on the way to full business continuity.
