AWS EC2 Auto Recovery Using CloudWatch

CloudWatch can automatically recover an EC2 instance that fails a system status check. A key benefit is that the recovered instance is identical to the original: it keeps its instance ID, private IP addresses, Elastic IP addresses, any auto-assigned public IPv4 address, instance metadata, and its existing EBS volumes. Data held only in memory is lost during recovery.

Modern Update (2026): Automatic recovery is now supported for most deployed Amazon EC2 instances. Most current-generation instances (Nitro-based) support Simplified Automatic Recovery, which can be configured directly from the EC2 Instance console without manually building a CloudWatch alarm from scratch.
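As a rough illustration of checking this setting programmatically, the sketch below reads the AutoRecovery value from a response shaped like the EC2 DescribeInstances output and wraps the ModifyInstanceMaintenanceOptions call. The helper names (auto_recovery_settings, set_simplified_recovery) are hypothetical, and the client is assumed to be a boto3 EC2 client created elsewhere with suitable permissions.

```python
from typing import Any, Dict


def auto_recovery_settings(describe_instances_response: Dict[str, Any]) -> Dict[str, str]:
    """Map each instance ID to its AutoRecovery setting.

    Expects a dict shaped like the EC2 DescribeInstances response.
    Instances that report no MaintenanceOptions are marked "unknown".
    """
    settings: Dict[str, str] = {}
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MaintenanceOptions", {})
            settings[instance["InstanceId"]] = options.get("AutoRecovery", "unknown")
    return settings


def set_simplified_recovery(ec2_client: Any, instance_id: str, enabled: bool) -> Any:
    """Turn simplified automatic recovery on ("default") or off ("disabled").

    ec2_client is assumed to be a boto3 EC2 client, for example
    boto3.client("ec2"); it is passed in so this sketch has no hard
    dependency on boto3 being installed.
    """
    return ec2_client.modify_instance_maintenance_options(
        InstanceId=instance_id,
        AutoRecovery="default" if enabled else "disabled",
    )
```

A typical use would be to pass the result of ec2_client.describe_instances() to auto_recovery_settings and review any instance that does not report "default".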

Every EC2 instance is monitored for two distinct types of status checks that report as metrics to CloudWatch:

  • System status checks: These identify AWS infrastructure issues, such as hardware failures, network connectivity loss, or power outages in the data center.
  • Instance status checks: These identify software or configuration issues, such as corrupted file systems, incompatible kernels, or exhausted memory.
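
To make the distinction concrete, here is a minimal sketch that inspects one entry shaped like the InstanceStatuses items returned by the EC2 DescribeInstanceStatus API and reports which of the two checks is impaired. The function name impaired_checks is hypothetical, and the sample entry is made up for illustration.

```python
from typing import Any, Dict, List


def impaired_checks(status_entry: Dict[str, Any]) -> List[str]:
    """Return the names of the status checks reported as impaired.

    status_entry is one item from the "InstanceStatuses" list of a
    DescribeInstanceStatus-shaped response. "system" points at the
    underlying AWS infrastructure; "instance" points at the guest
    software or configuration.
    """
    impaired: List[str] = []
    if status_entry.get("SystemStatus", {}).get("Status") == "impaired":
        impaired.append("system")
    if status_entry.get("InstanceStatus", {}).get("Status") == "impaired":
        impaired.append("instance")
    return impaired


# Illustrative entry: the host is unhealthy, the guest OS is fine.
sample = {
    "InstanceId": "i-0123456789abcdef0",
    "SystemStatus": {"Status": "impaired"},
    "InstanceStatus": {"Status": "ok"},
}
print(impaired_checks(sample))
```

Only a "system" result is a candidate for recovery onto new hardware; an "instance" result usually calls for investigating the operating system or application instead.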

The auto recovery option specifically targets system status check failures. When a CloudWatch alarm on the StatusCheckFailed_System metric enters the ALARM state, the alarm's recover action automatically migrates the instance to a new physical host.
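
The alarm described here can be sketched with the CloudWatch PutMetricAlarm API. The example below only builds the request parameters, so it can be inspected without touching an AWS account, and a thin wrapper hands them to a client. The alarm name, the 60-second period, the two evaluation periods, and the helper names are assumptions chosen for illustration, not values prescribed by AWS; the optional notify_topic_arn parameter is likewise hypothetical.

```python
from typing import Any, Dict, Optional


def build_recovery_alarm(
    instance_id: str,
    region: str,
    notify_topic_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build PutMetricAlarm parameters for an EC2 recover action."""
    # The recover action is addressed with a region-specific ARN.
    actions = [f"arn:aws:automate:{region}:ec2:recover"]
    if notify_topic_arn:
        # Optionally notify a topic when the alarm fires.
        actions.append(notify_topic_arn)
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # assumed naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,             # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,   # consecutive failing windows (assumed)
        "Threshold": 1.0,         # the metric is 1 when the check fails
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": actions,
    }


def create_recovery_alarm(cloudwatch_client: Any, instance_id: str, region: str,
                          notify_topic_arn: Optional[str] = None) -> Any:
    """Create the alarm using a boto3 CloudWatch client supplied by the caller."""
    params = build_recovery_alarm(instance_id, region, notify_topic_arn)
    return cloudwatch_client.put_metric_alarm(**params)
```

Keeping the parameter builder separate from the API call makes the alarm definition easy to review or unit test. Passing an SNS topic ARN as notify_topic_arn adds a notification alongside the recover action.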

Requirements and Considerations

  • Instance type: The instance must run in a VPC and be EBS-backed.
  • It is available for the majority of current instance types in all AWS regions.
  • Placement Groups: Recovered instances remain in their original placement group.
  • Notifications: It is highly recommended to link these alarms to an Amazon SNS topic to receive immediate alerts when a recovery event is triggered.

For the most up-to-date configuration steps, see the official Amazon EC2 Instance Recovery documentation.

Author’s Note: This article reflects my personal professional experience and opinions. While my insights are informed by my professional history, these views are my own and do not represent the official position of my former employer.

About the Author: Jacob Marks is an engineering leader with over 20 years of experience, including a decade at Amazon Web Services (AWS) where he led teams in EC2 Core Platform and the development of the AWS Payment Cryptography service.
