AWS Disaster Recovery Options

A useful comparison chart of DR options from a CloudAcademy AWS Architect (Associate) Certification course:

| Options | Backup & Restore | Pilot Light | Warm Standby | Multi Site |
|---|---|---|---|---|
| Description | Like using AWS as a Virtual Tape Library | Minimal version of environment running on AWS | Scaled down version of fully functional environment always running | Fully operational version of fully functional environment always running off site or in another region |
| Services Used | AWS Storage Gateway, Import/Export, Glacier, S3 | AMIs, ELBs, CloudFormation, RDS replication | AMIs, ELBs, CloudFormation, RDS replication | All |
| RTO | High (8-24 hours) | Moderate (4-8 hours) | Minimal (< 4 hours) | Lowest (< 60 minutes) |
| RPO | Since the last backup; up to 24 hours | Since the last snapshot. While core pieces of system are in place, some installation and preparation may be required. | Since the last data write if a master / slave multi-AZ DB. May be asynchronous only, which would increase the RPO. | Choice of data replication influences RPO |
| Cost considerations | Low. Recovery time may involve getting tapes/media delivered to site; disk/tape management. | Low. Keeping all services / libraries / patches up to date adds an administrative overhead. | Medium. Environment can be used for dev/test, offsetting cost. | High. The ongoing cost of maintenance / operation needs to be factored in. |
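
As a rough illustration of how the chart can be read, the sketch below picks the cheapest option whose worst-case RTO still meets a target. The function name `cheapest_option` and the choice to treat the upper end of each RTO range as the worst case are assumptions made for this example; the hour values are taken from the chart, not from any AWS guarantee.

```python
from typing import Optional

# (option, worst-case RTO in hours, relative cost), read from the chart
# and listed from cheapest to most expensive. Treating the upper end of
# each RTO range as the worst case is an assumption of this sketch.
DR_OPTIONS = [
    ("Backup & Restore", 24.0, "Low"),
    ("Pilot Light", 8.0, "Low"),
    ("Warm Standby", 4.0, "Medium"),
    ("Multi Site", 1.0, "High"),
]


def cheapest_option(max_rto_hours: float) -> Optional[str]:
    """Return the first (cheapest) option whose worst-case RTO fits the target.

    Returns None when even Multi Site cannot meet the requested RTO.
    """
    for name, worst_case_rto, _cost in DR_OPTIONS:
        if worst_case_rto <= max_rto_hours:
            return name
    return None
```

For example, a business that can tolerate six hours of downtime would land on Warm Standby under these assumptions, since Pilot Light may take up to eight hours. RPO is not modelled here and would need a separate check.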

Traditional DR vs DR on Public Cloud

  1. Update Status Page
  2. Restore Datastore(s) in prodY from latest prodX
     - DB
     - Authentication
     - Authorization
     - Cache
     - Blob Storage
  3. Restore backend microservices
     - Bootstrap services with particular focus on upstream and downstream dependencies
  4. Swap CloudFront distribution(s)
  5. Swap API endpoint(s) via DNS
     - Update DNS records to point to prodY API endpoints
  6. Verify recovery is complete
     - Redeploy stack from user account to verify service level
  7. Update Status Page
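
The recovery steps above have to run in a fixed order, and a failed step should halt the failover rather than let later steps run against a half-restored environment. A minimal sketch of that idea follows. The step names come from the list; the `run_runbook` function and the `placeholder` action are hypothetical stand-ins for real automation, and the sketch assumes the detail items (DB, Cache, DNS record updates and so on) are handled inside their parent step.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], bool]]


def placeholder() -> bool:
    """Hypothetical stand-in for real automation; always reports success."""
    return True


# Top-level steps in the order given in the list above.
RUNBOOK: List[Step] = [
    ("Update Status Page", placeholder),
    ("Restore Datastore(s) in prodY from latest prodX", placeholder),
    ("Restore backend microservices", placeholder),
    ("Swap CloudFront distribution(s)", placeholder),
    ("Swap API endpoint(s) via DNS", placeholder),
    ("Verify recovery is complete", placeholder),
    ("Update Status Page", placeholder),
]


def run_runbook(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first one that reports failure.

    Returns the names of the completed steps and the name of the failed
    step, or None if every step succeeded.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None
```

Returning the completed steps alongside the failed one gives the operator a clear point to resume from after the underlying problem is fixed.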
