AWS Faces Challenges Again in US-EAST-1 Region

22 hours ago
A new incident in the Amazon Web Services (AWS) US-EAST-1 region has disrupted service for customers. On October 28, 2023, at 3:36 PM PDT, AWS reported increased latencies for EC2 launches in a single Availability Zone (use1-az2). The incident follows an earlier issue in the region that caused significant service outages.

AWS Service Disruptions in US-EAST-1 Region

The reported problems included throttled requests for EC2 resources. AWS advised users that retrying failed requests could resolve the issue. However, customers also saw launch failures for Elastic Container Service (ECS) tasks on both EC2 and Fargate. The elevated failure rates affected a subset of customers in the US-EAST-1 region.
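AWS's advice to retry throttled requests is usually implemented with exponential backoff and jitter, so that clients do not all retry at once. The following is a minimal sketch of that pattern, not AWS's own code: `ThrottledError`, `retry_with_backoff`, and the default delay values are hypothetical names and numbers chosen for illustration, and the AWS SDKs ship their own configurable retry handling.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling or transient launch error."""


def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep, rng=random.random):
    """Run call() until it succeeds or max_attempts is reached.

    Before each retry, wait a random time between 0 and
    min(max_delay, base_delay * 2**attempt) seconds ("full jitter").
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
```

A caller would wrap its own launch function, for example `retry_with_backoff(lambda: launch_task())`, where `launch_task` is whatever function issues the request and raises `ThrottledError` when throttled. The `sleep` and `rng` parameters exist only so the waiting behavior can be replaced in tests.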

Impact on ECS and EMR Services

AWS’s status page indicated that ECS runs in the affected AWS cells. Customers were warned that their container instances might disconnect from ECS, causing tasks to stop unexpectedly. EMR Serverless, which lets customers run big data tools such as Hadoop, was also affected as a result of the ECS issues.

  • EC2 launches had increased latencies.
  • ECS task launch failures were noted.
  • EMR Serverless services were also impacted.

At 5:31 PM PDT, AWS posted an update stating that it was refreshing ECS clusters which were functioning correctly in order to mitigate the problem. It reported progress in recovering the affected ECS cells, but cautioned that customers would not see improvements immediately. AWS estimated that full resolution would take two to three hours from the time of that update.

Wider Implications and Service Monitoring

Ten services were reported as affected by this incident, including App Runner, Batch, CodeBuild, and the Elastic Kubernetes Service. Not every service experienced outages, however: AWS’s architecture allows resources to remain operational across the six Availability Zones within US-EAST-1.

As the situation developed, AWS updates said its team was working to recover services for affected customers. By the following morning, a further status update confirmed that the issues had been resolved, a sign that the response was effective despite the initial setbacks.