At base2Services, we live and breathe cloud infrastructure, but even we know how crucial it is to step back from daily operations every now and then and dive deep into the tools we use every day. That's exactly what we did for our latest Hackday: a focused research session on AWS Backup.
Our goal was twofold: to better understand its current features and limitations, and to explore how we can leverage the service to solve our customers' most complex Disaster Recovery (DR) and Business Continuity Planning (BCP) requirements.
Putting Features to the Test: EBS Item-Level Restoration
You can read all the documentation you want, but there's no substitute for hands-on testing. We decided to try out some of the more advanced features we're not currently using in all customer environments, focusing on one in particular: EBS item-level restoration.
In a typical DR scenario, restoring an entire EBS volume from a snapshot is the standard approach, but it can be slow and costly if you only need a single file or directory. The item-level restore feature promised a more granular, efficient solution.
Our test process was straightforward:
- Take an EBS Snapshot: We started by snapshotting a running EBS volume containing a sample file system.
- Create a Search Index: We then used AWS Backup to build a search index covering the individual files in that snapshot.
- Perform Item-Level Restoration: With the index in place, we could search for and restore specific files or directories directly from the snapshot (without launching a new volume) and place them in an S3 bucket.
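For readers who would rather script the indexing and search steps above than click through the console, here is a minimal Python sketch of the request payloads involved. It is an illustration, not the exact code from our Hackday: the helper names and every argument value are hypothetical, and the field names reflect our reading of the boto3 `backup` and `backupsearch` clients, so check them against the current API reference before relying on them. The restore-to-S3 step is not shown.

```python
"""Sketch: payloads for indexing an EBS recovery point and searching it."""


def index_settings_request(vault_name, recovery_point_arn, iam_role_arn):
    """Build parameters for backup.update_recovery_point_index_settings()."""
    return {
        "BackupVaultName": vault_name,
        "RecoveryPointArn": recovery_point_arn,
        "IamRoleArn": iam_role_arn,
        "Index": "ENABLED",  # ask AWS Backup to build the file index
    }


def search_job_request(recovery_point_arn, path_fragment):
    """Build parameters for backupsearch.start_search_job().

    Assumption: the scope and filter shape below should be verified
    against the current Backup Search API reference.
    """
    return {
        "SearchScope": {
            "BackupResourceTypes": ["EBS"],
            "BackupResourceArns": [recovery_point_arn],
        },
        "ItemFilters": {
            "EBSItemFilters": [
                {"FilePaths": [{"Value": path_fragment, "Operator": "CONTAINS"}]}
            ]
        },
    }


def enable_index(vault_name, recovery_point_arn, iam_role_arn):
    """Send the index request. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work offline

    backup = boto3.client("backup")
    return backup.update_recovery_point_index_settings(
        **index_settings_request(vault_name, recovery_point_arn, iam_role_arn)
    )
```

Index creation is asynchronous, so a search only returns results once the index for the recovery point has finished building.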
Our key takeaway? This is a game-changer for operational recovery. It gives us the ability to recover individual files for a user quickly and efficiently, dramatically reducing the Recovery Time Objective (RTO) for common "Oops, I deleted that" scenarios.
From Backups to Bunkers: Rethinking the DR Strategy
Our hands-on testing led to a wider discussion about comprehensive DR/BCP strategies, inspired by some recent industry talks on the topic. We undertook a detailed technical review of AWS Backup, explored console scenarios, and debated implementation paths. Along the way we reflected on our existing "databunker" solution and on the fact that a modern backup strategy isn't one-size-fits-all. Instead, it must be layered to address different types of threats.
We broke it down into three distinct backup models:
- Operational Backups
The Problem: Accidental deletion, data corruption, or minor operational errors.
The Solution: This is what most people think of as "backups." They are frequent (e.g., daily snapshots or point-in-time recovery points) with a short-term retention policy. The EBS item-level restore we tested is a perfect tool for this, allowing for fast, granular recovery with minimal disruption.
- Disaster Recovery (DR) Backups
The Problem: A site-level or regional disaster (e.g., a data centre flood, fire, or large-scale AWS outage).
The Solution: This is about business continuity. These backups must be automatically replicated to a separate, geographically distant AWS Region. The key here is not just data availability but the orchestration to bring services back online in that new Region. This involves testing, automation (for example, with AWS CloudFormation), and clear RPO/RTO targets.
- Cyber Attack / Ransomware "Databunker" Backups
The Problem: A malicious actor gains access to your environment and actively tries to destroy or encrypt your data and your backups.
The Solution: This is the "databunker" concept. These backups must be protected from your own systems. The key principles are:
- Immutability: Using features like AWS Backup Vault Lock (WORM - Write-Once-Read-Many) in compliance mode to ensure that once a backup is written, it cannot be deleted or altered, even by the account's root user, for a set retention period.
- Logical Air-Gapping: Storing these critical backups in a completely separate "databunker" AWS account with highly restricted, break-glass-only access.
- Least Privilege: Ensuring that the credentials used for daily operations do not have permission to delete or modify these long-term, immutable backups.
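To make the three layers above more concrete, here is a rough Python sketch of the AWS Backup and IAM request documents they might translate into. Treat it as a starting point under stated assumptions, not our production configuration: the plan name, rule name, schedule, and retention periods are hypothetical, and the helper functions are our own illustration. Cross-account copies into a databunker account also need a vault access policy and AWS Organizations setup that are not shown here.

```python
"""Sketch: one backup plan, one vault lock, one deny policy."""
import json


def layered_backup_plan(operational_vault, dr_vault_arn):
    """Daily operational rule with an EBS index and a copy to a DR vault."""
    return {
        "BackupPlanName": "layered-protection",  # hypothetical name
        "Rules": [
            {
                "RuleName": "daily-operational",
                "TargetBackupVaultName": operational_vault,
                "ScheduleExpression": "cron(0 5 ? * * *)",  # daily, 05:00 UTC
                "Lifecycle": {"DeleteAfterDays": 14},  # short retention
                "IndexActions": [{"ResourceTypes": ["EBS"]}],
                "CopyActions": [
                    {
                        # Vault in another Region (or another account)
                        "DestinationBackupVaultArn": dr_vault_arn,
                        "Lifecycle": {"DeleteAfterDays": 35},
                    }
                ],
            }
        ],
    }


def vault_lock_request(vault_name, min_days, max_days, cooling_off_days=3):
    """Parameters for backup.put_backup_vault_lock_configuration().

    Setting ChangeableForDays selects compliance mode; AWS requires a
    cooling-off period of at least 3 days.
    """
    if cooling_off_days < 3:
        raise ValueError("cooling-off period must be at least 3 days")
    if min_days > max_days:
        raise ValueError("min_days cannot exceed max_days")
    return {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_days,
        "MaxRetentionDays": max_days,
        "ChangeableForDays": cooling_off_days,
    }


def deny_backup_tampering_policy():
    """IAM policy document that blocks destructive AWS Backup actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBackupTampering",
                "Effect": "Deny",
                "Action": [
                    "backup:DeleteRecoveryPoint",
                    "backup:DeleteBackupVault",
                    "backup:UpdateRecoveryPointLifecycle",
                    "backup:PutBackupVaultAccessPolicy",
                    "backup:DeleteBackupVaultAccessPolicy",
                ],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_backup_tampering_policy(), indent=2))
```

The plan dictionary would be passed as the `BackupPlan` argument of `create_backup_plan`, and the deny policy attached to the roles used for day-to-day operations. Keeping the lock parameters in a small validated helper makes it harder to apply a compliance-mode lock with an unintended retention window, which matters because the lock cannot be undone once the cooling-off period ends.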
What We Learned
We not only validated powerful features like EBS item-level restore but also solidified our strategic approach to modern data protection.
AWS Backup has evolved from a simple snapshot scheduler into a robust, centralised platform that can and should be the cornerstone of a multi-layered DR strategy. By combining its features with a clear understanding of operational, DR, and cyber-threat models, we can design solutions for our customers that are not just resilient but truly hardened against the worst-case scenarios. Contact us today to discuss how you can recover from disaster quickly and ensure continuity for your business.