What Is Recovery Time Objective, and Why Is It a Key Metric for Disaster Recovery Planning?

Recovery Time Objective (RTO) is the maximum amount of downtime your business can tolerate without incurring a significant financial loss. RTO goes together with Recovery Point Objective (RPO), the maximum amount of data, measured as a span of time before the outage, that your business can afford to lose. Reasonable RTO and RPO targets make for a smoother return to normal operations after a disaster and should form part of disaster recovery planning.

Definition of Recovery Time Objective (RTO)

RTO is the target time needed to recover your business and IT infrastructure after a disaster. For example, a two-hour RTO means that you give responsible personnel two hours to bring your services back up again. Data recovery falls within the scope of RTO.

When setting your RTO, you should ensure that it reflects the nature and state of your business. For example, if your business relies on online transactions, overly long downtimes may affect your viability in serious ways. Thus, your RTO should be short enough to minimize the impact. A good RTO, in this case, would be to have your operations up again in an hour or two (at most).

In contrast, an organization that can afford to operate using paper orders and manual invoicing for a day or two can afford to have a 1- or 2-day RTO, or even a one-week RTO, in extreme scenarios.

In some cases, long downtimes may be unavoidable because a natural disaster takes down your infrastructure, your service provider's infrastructure, and that of those around you. If your business cannot afford such downtimes, you may have to spend more to prepare your IT infrastructure for these types of disasters.

One option is to outsource your IT services to more reputable providers. Do your due diligence to make sure you are getting a capable provider. Negotiate for the best possible terms, and make sure that support availability, response times, and resolution times are stipulated in your service level agreements.

You may also set different RTOs depending on the severity of the outage. For example, if a server crashes, a 1-hour RTO may be enough. Make allowance for longer RTOs in worst-case scenarios such as natural disasters.
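A tiered set of targets like this can be written down as a simple lookup. The sketch below is illustrative only: the severity names, the 8-hour and 2-day values, and the `rto_for` helper are assumptions, with just the 1-hour server-crash target taken from the example above.

```python
from datetime import timedelta

# Hypothetical severity tiers. Only the 1-hour server-crash target comes
# from the text; the other values are placeholders to adapt.
RTO_BY_SEVERITY = {
    "server_crash": timedelta(hours=1),
    "site_outage": timedelta(hours=8),
    "natural_disaster": timedelta(days=2),
}


def rto_for(severity: str) -> timedelta:
    """Return the RTO for an outage type.

    Unknown outage types fall back to the strictest (shortest) target,
    so an unclassified incident is treated as urgent.
    """
    strictest = min(RTO_BY_SEVERITY.values())
    return RTO_BY_SEVERITY.get(severity, strictest)
```

Falling back to the strictest target is a design choice; an organization could just as reasonably require every outage to be classified before a target is assigned.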

Ideally, your RTO should not exceed the longest outage your business can absorb, in lost revenue and other costs, without substantial operational impact. You may need mitigation procedures to avoid missing your RTO. Testing these procedures should be part of your disaster recovery planning.

When disaster strikes, the performance of your IT team in meeting your RTO may depend on your recovery procedures. If your team plans and rehearses well, your Recovery Time Actual (RTA), the time it actually takes your team to recover from downtime, may be shorter than or equal to your RTO. In this case, congratulate your team for a job well done.
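A rehearsal or real incident can be scored by comparing the measured recovery time against the target. This is a minimal sketch that assumes you record when the outage started and when service was restored; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_time_actual(outage_start: datetime, service_restored: datetime) -> timedelta:
    """Recovery Time Actual: elapsed time from outage start to restored service."""
    if service_restored < outage_start:
        raise ValueError("service_restored must not be earlier than outage_start")
    return service_restored - outage_start


def met_rto(rta: timedelta, rto: timedelta) -> bool:
    """The target is met when the actual recovery time does not exceed the RTO."""
    return rta <= rto


# Example: an outage at 09:00 resolved at 10:40, measured against a two-hour RTO.
rta = recovery_time_actual(datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 40))
within_target = met_rto(rta, timedelta(hours=2))
```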

Definition of Recovery Point Objective (RPO)

RPO is the maximum amount of data, measured in time, that you can afford to lose without impacting your business significantly. Data loss beyond RPO may prove harmful for your business. For example, a two-hour RPO means that you should schedule your backups at least every two hours, so that no more than two hours of data is lost if an outage occurs. Defining an RPO helps keep data loss within acceptable limits if any of your applications go down.
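The relationship between a backup schedule and an RPO can be checked with a little arithmetic. The sketch below assumes that each backup captures the state at the moment it starts and only becomes usable once it finishes, so the worst-case loss is the interval between backups plus the time a backup takes to complete. The function names are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta = timedelta(0)) -> timedelta:
    """Worst-case age of the newest usable backup at the moment of failure.

    Assumes a backup reflects the state at its start time and cannot be
    restored from until it has completed.
    """
    return backup_interval + backup_duration


def schedule_meets_rpo(backup_interval: timedelta,
                       rpo: timedelta,
                       backup_duration: timedelta = timedelta(0)) -> bool:
    """True when the worst-case data loss stays within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo
```

Under these assumptions, hourly backups comfortably satisfy a two-hour RPO, while backups every two hours that each take 15 minutes to finish would not.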

When computing RPO, account for data loss that is acceptable to your organization. Different applications may have different RPOs, depending on how crucial they are to your operations. Data backups are covered in RPO.

Like RTO, RPO depends on the nature of your business. RPO may be anywhere from near-zero to 24 hours. Near-zero is ideal for large enterprises required to maintain data integrity for regulatory purposes. Longer RPOs may be ideal for small businesses that can operate for up to a day without the need for records. Other organizations can operate with an RPO in between these extremes.

RPO is an important consideration when setting up data backup procedures in your disaster recovery plan. For example, if your business cannot afford to lose any data when disaster strikes, you can include cloud storage solutions for data backup and replication. In this case, data loss, if it occurs at all, is kept to a minimum because failover starts automatically.

For organizations with less stringent data requirements, data backups can consist of regular and ongoing production snapshots. For those that can exist without records for up to a day, external storage backups or traditional tape backups may be enough.

In any case, a shorter RPO calls for more expensive data backup options. Seamless failover and failback cost more than production snapshots and storage backups.
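The three tiers described above can be expressed as a simple decision rule. This is only a sketch: the five-minute cutoff for "near-zero" and the one-day boundary are assumed thresholds, and `suggest_backup_method` is a hypothetical helper, not part of any product.

```python
from datetime import timedelta

# Assumed thresholds; tune them to your own tolerance for data loss.
NEAR_ZERO = timedelta(minutes=5)
ONE_DAY = timedelta(hours=24)


def suggest_backup_method(rpo: timedelta) -> str:
    """Map an RPO to one of the backup approaches named in the text."""
    if rpo < timedelta(0):
        raise ValueError("rpo must not be negative")
    if rpo <= NEAR_ZERO:
        return "replication"        # cloud backup with replication and failover
    if rpo < ONE_DAY:
        return "snapshots"          # regular production snapshots
    return "external_or_tape"       # external storage or traditional tape backups
```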

When setting up your disaster recovery plan, make sure your RPO mirrors your tolerance for data loss. Both RTO and RPO should be considered when planning for disaster recovery.

Data recovery procedures should be part of your disaster recovery rehearsals. Strive to have a Recovery Point Actual (RPA), the amount of data actually lost in a recovery measured as time, that is shorter than or equal to your RPO. You may need to revise your disaster recovery plan if RPA proves longer than your RPO.
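During a rehearsal, the recovery point can be measured from backup timestamps. The sketch below treats RPA as the gap between the moment of failure and the newest backup completed before it, which is an assumption about how your team measures it; the function name and inputs are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_point_actual(backup_times: list[datetime], failure_time: datetime) -> timedelta:
    """Age of the newest backup taken at or before the failure."""
    usable = [t for t in backup_times if t <= failure_time]
    if not usable:
        raise ValueError("no backup exists at or before the failure time")
    return failure_time - max(usable)


# Example: backups at 08:00 and 10:00, failure at 11:30, two-hour RPO.
backups = [datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 10, 0)]
rpa = recovery_point_actual(backups, datetime(2024, 1, 1, 11, 30))
within_rpo = rpa <= timedelta(hours=2)
```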

Similarities and Differences Between RTO and RPO

There may be some confusion as to what RTO and RPO mean, and what comprises them. Let us clear this up by showing the similarities and differences between the two concepts.

First, let us discuss how the two are similar.

RTO and RPO also have their differences.

How to Calculate RTO and RPO

RTO involves the totality of your applications and systems. Generally, RTO considers RPO since data recovery is part of RTO. A substantial portion of the cost of achieving RTO may be devoted to RPO.

Consider the following factors when calculating RTO:

Since RPO deals with data only, it is easier to calculate than RTO. As mentioned previously, a shorter RPO may require more time and money. While a longer RPO may be cheaper, you run the risk of losing more data.

When calculating RPO, consider these factors:

Your organization's IT staff, financial resources, and reputation are other considerations when calculating RTO and RPO. Survey your users, applications, and systems to establish how crucial each system, and the data residing in it, is to your operations.

After the survey, calculate the costs of downtime versus its impact on revenues and other financials, then run the worst-case scenario for what happens when disaster strikes. Assess your disaster recovery plan regularly for overall effectiveness. As part of each assessment, review whether your RTO and RPO are still appropriate and revise them accordingly.
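The cost comparison can start from a very rough linear model. This sketch assumes that losses accrue at a constant hourly rate, which real outages rarely do, and every parameter name and figure is a hypothetical input you would replace with your own estimates.

```python
def downtime_cost(hours_down: float,
                  revenue_per_hour: float,
                  recovery_cost_per_hour: float = 0.0) -> float:
    """Estimated loss for an outage, assuming a constant hourly loss rate."""
    return hours_down * (revenue_per_hour + recovery_cost_per_hour)


def max_tolerable_downtime(loss_budget: float,
                           revenue_per_hour: float,
                           recovery_cost_per_hour: float = 0.0) -> float:
    """Longest outage, in hours, whose estimated loss stays within the budget."""
    hourly_rate = revenue_per_hour + recovery_cost_per_hour
    if hourly_rate <= 0:
        raise ValueError("combined hourly loss rate must be positive")
    return loss_budget / hourly_rate
```

An upper bound computed this way is only a starting point for an RTO; the worst-case scenarios mentioned above may justify a tighter target.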

How Parallels RAS Monitoring Will Help You Manage Your RTO and RPO

Parallels® Remote Application Server (RAS), a streamlined solution for secure access to virtual desktops and applications, can help you manage your RTO and RPO with a suite of monitoring tools ideal for use in multi-cloud environments.

Parallels RAS facilitates centralized delivery of server-based desktops and applications, and backup of endpoints. It has built-in multi-factor authentication and smart card authentication for application and desktop access. Its Secure Client Gateway supports Secure Sockets Layer (SSL) or Federal Information Processing Standard (FIPS) 140-2 encryption in adherence to the Payment Card Industry Data Security Standard (PCI DSS), Health Insurance Portability and Accountability Act (HIPAA) and General Data Protection Regulation (GDPR) policies.

Parallels RAS Reporting Engine enables real-time reporting about your server farm, providing insights about users, devices, servers and applications. Using these reports, administrators can adjust the availability of devices to match traffic and group requirements, track most-used applications, reallocate resources where needed and plan for future expansion.

Parallels RAS Performance Monitor also allows administrators to analyze bottlenecks and resources, and resolve misconfigurations, if any turn up.

Get started with Parallels RAS by downloading the trial.
