- AWS outage in December linked to AI tool error.
- Financial Times reported 13-hour disruption to system.
- Amazon said incident affected single cost-management service.
- One AWS region impacted, broader infrastructure unaffected.
Amazon confirmed on Friday that a service handling cost management at Amazon Web Services (AWS), its cloud arm, went offline in December.
The incident, first reported by the Financial Times, was linked to errors involving AI tools at AWS. The report said AWS experienced two outages in December, one of which was a 13-hour disruption to a customer-facing system.
The Financial Times, citing people familiar with the matter, reported that engineers allowed an AWS AI code-writing tool, described as an agentic system able to make autonomous decisions, to make certain changes. The tool reportedly decided to overwrite and recreate the environment, which led to the lengthy disruption.
Amazon disputed the broader characterization of the incident. In an emailed statement to Reuters, an AWS spokesperson said the event disrupted a single AWS feature, a service that manages costs, rather than AWS as a whole.
Limited Scope of Disruption
The spokesperson characterized the issue as short-lived and attributed it to user error. The interruption affected a system that lets customers monitor usage costs in one of AWS's 39 global regions.
The outage occurred in one of AWS's two mainland China regions, and the spokesperson said that only one service was affected, not AWS infrastructure as a whole.
AWS operates dozens of globally distributed cloud data centers, delivering compute, storage and analytics services to enterprises and governments. Its cost-management tools let customers monitor usage and optimize cloud spending, which matters greatly to organizations running large cloud workloads.
The Financial Times report suggested the failure stemmed from an engineering decision involving the internal AI-enabled coding tool. Agentic AI systems are designed to act independently, for example by making changes to code or infrastructure configurations.
Amazon has not confirmed the details of the AI tool's behavior described in the report, but said the effects were limited and did not affect AWS operations as a whole.
Automation and AI Under the Microscope
The event highlights the growing use of artificial intelligence in managing cloud infrastructure. Software development and operations tasks, such as configuration changes and system upgrades, are increasingly being automated with AI-assisted coding systems.
As cloud providers expand AI capabilities on their platforms, safeguards and control mechanisms are becoming central to both internal oversight and customer trust.
AWS is one of Amazon's main profit drivers and serves a large customer base that depends on continuous service availability. The company said the December event was very minor and affected only one service in one location.
Amazon's statement gave no further information about the second outage reported in the Financial Times story.
Recommended FAQs
What caused the AWS outage in December?
The Financial Times reported that a December disruption was linked to an error involving an internal AI-powered coding tool. Amazon confirmed the disruption and said it affected a cost-management service rather than AWS's broader cloud infrastructure.
Did the AWS outage affect all cloud services?
No, Amazon said the disruption was limited to a single cost-management feature in one region. The wider AWS infrastructure and most customer services were not impacted.
How long did the AWS disruption last?
According to the Financial Times report, one of the outages lasted about 13 hours. Amazon described the event as short-lived and limited in scope.
Was an AI system responsible for making changes that led to the outage?
The Financial Times reported that an AI code-writing system made autonomous changes that contributed to the issue. Amazon did not confirm those specific details and said the effects were limited.
Why is the incident significant for AI in cloud operations?
The event highlights growing reliance on AI tools to manage infrastructure and automate coding tasks. It also raises questions about safeguards and oversight as companies expand AI-driven automation in critical systems.