Disaster Recovery

Basics

  • Effective DR & BCP costs money.
  • There is an ongoing tradeoff between the cost of maintaining DR/BCP readiness and the time and cost of failure after a BCP event occurs.
  • The 4 main types of DR/BCP architectures are:
    • Backup & Restore
    • Pilot Light
    • Warm Standby
    • Active/Active Multi Site
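
The tradeoff above can be sketched as simple arithmetic: the yearly cost of staying ready plus the expected cost of being down. This is a minimal sketch only. The class, the function names and every figure below are hypothetical placeholders, not AWS prices or measured recovery times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    monthly_readiness_cost: float  # hypothetical cost of keeping the option ready
    recovery_hours: float          # hypothetical time to recover after an event


def yearly_cost(option, events_per_year, outage_cost_per_hour):
    """Readiness cost for one year plus the expected cost of outages."""
    readiness = option.monthly_readiness_cost * 12
    outage = events_per_year * option.recovery_hours * outage_cost_per_hour
    return readiness + outage


# Placeholder figures only, chosen to show the shape of the tradeoff.
OPTIONS = [
    DrOption("Backup & Restore", 100.0, 24.0),
    DrOption("Pilot Light", 500.0, 4.0),
    DrOption("Warm Standby", 2000.0, 1.0),
    DrOption("Active/Active Multi Site", 8000.0, 0.0),
]


def cheapest(options, events_per_year, outage_cost_per_hour):
    """Return the option with the lowest combined yearly cost."""
    return min(
        options,
        key=lambda o: yearly_cost(o, events_per_year, outage_cost_per_hour),
    )
```

With these made-up numbers, a low hourly outage cost favours Backup & Restore and a very high one favours Active/Active, which is the general pattern the tradeoff describes.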

Storage

  • Instance Store Volumes - No resilience to failure.
  • EBS - Data is replicated within a single AZ. If that AZ fails, the data is inaccessible.
  • S3 - Data is replicated across multiple AZs in a region. EBS snapshots are stored in S3, so taking snapshots gives more resilience than the volume alone. For global resilience, either back up the S3 objects to other regions or enable cross-region replication.
  • EFS - The file system is replicated across multiple AZs.
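
As an illustration of the S3 cross-region replication option, here is a minimal sketch that builds the argument for boto3's `put_bucket_replication`. The helper names, the rule ID and the ARNs are assumptions made for the example. Replication also requires versioning to be enabled on both the source and destination buckets.

```python
def build_replication_config(role_arn, destination_bucket_arn, rule_id="dr-copy"):
    """Build the ReplicationConfiguration argument for put_bucket_replication."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": rule_id,  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: rule applies to every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


def enable_replication(s3_client, source_bucket, config):
    """Apply the configuration; s3_client is expected to be a boto3 S3 client."""
    return s3_client.put_bucket_replication(
        Bucket=source_bucket, ReplicationConfiguration=config
    )
```

In real use, `s3_client` would be something like `boto3.client("s3")` and the destination bucket would live in a different region from the source.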

Compute

  • There are no truly global compute services in AWS.
  • EC2 - An instance has no resilience to host failure. Auto Scaling groups that span different AZs are useful for regional resilience.
  • Lambda - Regionally resilient. Only the failure of an entire region can cause Lambda to fail.
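
To make the Auto Scaling point concrete, here is a minimal sketch that builds the keyword arguments for boto3's `create_auto_scaling_group`. The helper name, its defaults and the subnet IDs are hypothetical, and it assumes each subnet passed in belongs to a different AZ.

```python
def build_asg_request(name, launch_template_name, subnet_ids, min_size=2, max_size=4):
    """Build kwargs for create_auto_scaling_group spanning several subnets."""
    if len(subnet_ids) < 2:
        # One subnet means one AZ, so an AZ failure would take out every instance.
        raise ValueError("use subnets in at least two AZs for regional resilience")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        # The API takes the subnets as a single comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }
```

The result could then be passed on as `boto3.client("autoscaling").create_auto_scaling_group(**request)`.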

Database

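  • DynamoDB - Regionally resilient. It is a public service that runs outside the VPC.
  • RDS - With Multi-AZ enabled, has one primary and one standby replica in a different AZ.
  • Aurora - Cluster storage keeps multiple copies of the data across several AZs. The whole region has to fail for storage to be impacted.
  • Global databases can be created by replicating tables into multiple regions (for example DynamoDB global tables or Aurora Global Database).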

Networking

  • VPC, VPC Router & IGW are regionally resilient. Subnets, though, are limited to a single AZ.
  • Elastic Load Balancers are regionally resilient.
  • Interface Endpoints are AZ resilient. Deploying one Interface Endpoint in each AZ gives regional resilience.
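
The resilience levels listed in these notes can be collected into a small lookup table, which makes it easy to ask what a given failure takes down. This is only a sketch of the notes themselves. The `Failure` enum, the table and the function names are invented for the example, and the entries assume a single instance, volume or endpoint with no extra redundancy configured.

```python
from enum import IntEnum


class Failure(IntEnum):
    HOST = 1
    AZ = 2
    REGION = 3


# Smallest failure that makes each resource unavailable, as stated in the notes.
BREAKS_ON = {
    "Instance Store": Failure.HOST,
    "EC2 instance": Failure.HOST,
    "EBS": Failure.AZ,
    "Subnet": Failure.AZ,
    "Interface Endpoint": Failure.AZ,
    "S3": Failure.REGION,
    "EFS": Failure.REGION,
    "Lambda": Failure.REGION,
    "DynamoDB": Failure.REGION,
    "Aurora storage": Failure.REGION,
    "VPC": Failure.REGION,
    "Elastic Load Balancer": Failure.REGION,
}


def survives(resource, failure):
    """True if the resource stays available through the given failure."""
    return failure < BREAKS_ON[resource]


def affected_by(failure):
    """Sorted names of every resource that the given failure takes down."""
    return sorted(name for name, limit in BREAKS_ON.items() if limit <= failure)
```

For example, `survives("S3", Failure.AZ)` is true while `survives("EBS", Failure.AZ)` is false, matching the Storage notes.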