Many enterprises strive to differentiate themselves from their competitors by providing jazzy new features and functionality. But it’s the businesses that stay on 24/7/365 that get (and retain) the most customers, and they do this by having data centers and databases that can support always-on, mission-critical applications.
To put it simply, there’s no point in giving your customers fancy, multi-colored lights if you can’t reliably keep them on. Resiliency against failure is THE competitive differentiator today for any digital enterprise.
In a new paper titled “IT Resilience — 7 Tips for Improving Reliability, Tolerability and Disaster Recovery,” Gartner drives this point home and lays out seven tips on what to focus on when kicking off an IT resilience initiative. One of these tips, “Evolve Beyond Recovery Theater,” discusses the importance of doing more than just disaster recovery drills.
Disaster Recovery in Public Cloud IaaS
As part of its recommendation to be truly prepared for disaster recovery (as opposed to just going through the motions, only to find out that you’re not prepared when disaster really strikes), Gartner recommends addressing disaster recovery at the public cloud IaaS level.
Per Gartner: “Ignoring disaster recovery or applying a one-size-fits-all approach for cloud IaaS is not advised. The challenge most organizations face is getting stakeholders to understand why disaster recovery is still needed and then sifting through the hundreds of possible architectural approaches.”
Instead, Gartner recommends applying the same tiered approach companies have used for on-premises workloads for decades and then creating the associated deployment models.
“Each organization’s examples will be different. But they will share a common purpose — to enable conversations around trade-offs toward balancing cost, risk and architectural decisions.”
Under the tiered approach, Gartner recommends that enterprises gradually scale their disaster recovery infrastructure, moving from running within a single availability zone with backups in another availability zone or region to using multiple availability zones and multiregional applications.
Gartner: “Then practice what you build, the whole gamut: detect, respond, fail over, operate testing in DR, rebuild production from scratch, cut back over and verify.”
Cross Data Center Replication to Achieve Resiliency for Increasingly Complex Applications
The tiers Gartner describes above track the increasing complexity of applications. The lower the tier, the less complex the application you can run on your system. You can think of low-complexity applications as read-only applications based on caching technologies, where the data can be restored without any loss from point-in-time snapshots. But as you progress up the tiers and into supporting increasingly complex applications running on microservices-based architectures, the requirements around resiliency, availability, and disaster recovery begin to demand newer technologies, such as cross data center replication (XDCR).
To put the importance of XDCR into context, we need to understand that in a cloud-native, digital world, individual component failures are a fact of daily life and need to be handled before they cause service unavailability. Resiliency then becomes essential for avoiding data loss and ensuring business continuity.
But to achieve true resiliency, enterprises need to factor in both planned and unplanned downtime. That’s where XDCR comes in.
XDCR replicates data across two data centers to account for the scenario of one of them going down. While this works, traditional XDCR runs into problems when you add the possibility of unplanned downtime during planned downtime (i.e., your backup data center going down while the other one is offline for maintenance), which is when you quickly realize that two data centers aren’t enough.
To ensure Gartner’s highest tier of resilience, which, as stated above, is now table stakes for enterprises needing to stay “on” in the face of infrastructure failures, our customers have been telling us that they need to replicate data across three or more data centers.
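The reasoning above can be checked with a short enumeration. The sketch below assumes a deliberately simplified model that does not come from the Gartner paper or from VoltDB: every data center holds a full replica, one data center is taken down for planned maintenance, a different one then fails unexpectedly, and the service stays up as long as at least one data center remains. The function names are hypothetical.

```python
from itertools import permutations


def survives(num_data_centers: int, planned_down: int, unplanned_down: int) -> bool:
    """Return True if at least one data center is still up.

    Simplifying assumption: every data center holds a full replica,
    so a single surviving site is enough to keep serving.
    """
    down = {planned_down, unplanned_down}
    return len(down) < num_data_centers


def worst_case_survives(num_data_centers: int) -> bool:
    """Check every (planned, unplanned) pair of distinct data centers."""
    if num_data_centers < 2:
        # A single site is already gone during its own planned downtime.
        return False
    sites = range(num_data_centers)
    return all(
        survives(num_data_centers, planned, unplanned)
        for planned, unplanned in permutations(sites, 2)
    )


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n} data centers survive planned + unplanned downtime: "
              f"{worst_case_survives(n)}")
```

Under these assumptions, two data centers fail the check and three or more pass it, which is the gap that replication across three-plus data centers is meant to close.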
VoltDB’s Active(N)™ Lossless Data Center Replication provides this level of resilience by allowing three or more data centers to be in an XDCR relationship in a cloud-native manner. This means that data can be changed in any of the data centers without operator intervention, with the capability to fix conflicts at the application level. When combined with intra-cluster high availability, which tolerates individual node failures, and full durability with snapshots and command logs, VoltDB provides complete protection against failures ranging from individual nodes in a cluster to an entire cluster to multiple clusters. By deploying VoltDB across multiple availability zones in multiple regions, all connected with our Active(N) replication, our customers set the benchmark for resiliency among their competitors.
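To make “fixing conflicts at the application level” more concrete, here is a minimal, generic sketch of one common policy: last writer wins, with a deterministic tie-break. This is not VoltDB’s API or a description of its built-in conflict handling; the `Version` structure, the `resolve` function, and the tie-break rule are all assumptions made purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One data center's view of a row (hypothetical structure)."""
    data_center: str
    timestamp_us: int  # time of the last write, in microseconds
    value: str


def resolve(versions: list[Version]) -> Version:
    """Pick a single winner among conflicting versions of the same row.

    Illustrative rule (an assumption): the latest write wins, and ties are
    broken by data center name so every site picks the same winner.
    """
    if not versions:
        raise ValueError("no versions to resolve")
    return max(versions, key=lambda v: (v.timestamp_us, v.data_center))


if __name__ == "__main__":
    conflict = [
        Version("us-east", 100, "shipped"),
        Version("us-west", 105, "cancelled"),
        Version("eu-central", 105, "refunded"),
    ]
    print(resolve(conflict))
```

The point of the tie-break is that the result does not depend on the order in which a data center happens to see the conflicting writes, so three or more sites applying the same rule converge on the same value.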
To read the full Gartner report, click here.