Agent-Inflicted Damage: Inside the Real-World Failures of Enterprise AI Systems

Talking with customers and closely monitoring the AI ecosystem provides a unique perspective into emerging security challenges. Cyera identified a growing and rapidly evolving threat landscape that we refer to as “Agent-Inflicted Damage”. This is a phenomenon where AI systems, designed to improve organizational efficiency and productivity, unintentionally create harmful outcomes by modifying data, influencing workflows, or interacting with systems in unexpected ways.
In this research, we analyzed and collected over 7,200 publicly reported AI-related security incidents, examined the underlying patterns behind these events, and explored how organizations can better understand and defend against this new category of risk. Our blog presents the research methodology, key findings, real-world examples, and practical recommendations for reducing exposure to AI-driven operational and security failures.
Key Takeaways
- Until now, agent-inflicted damage has not been formally recognized or clearly defined as a distinct, real-world security and operational risk category. Instead, these incidents were largely treated as isolated AI “horror stories”, or scattered failures discussed informally among practitioners and customers behind closed doors.
- Cyera research analyzed more than 7,200 publicly reported AI-security and operational incidents and identified 344 verified enterprise-relevant agent-inflicted damage cases between September 2023 and May 2026, including 188 incidents where autonomous AI systems caused direct organizational harm without any external attacker involvement.
- The majority of confirmed incidents involved real production impact rather than theoretical AI risk scenarios. Observed outcomes included deleted databases, destructive cloud actions, unauthorized financial operations, runaway API spending, service outages, exposed secrets, and silent integrity corruption inside enterprise environments.
- We move beyond anecdotal evidence by backing the phenomenon with large-scale data analysis, defining the problem space, classifying the emerging failure patterns, and proposing 3 categories.
- The most dangerous incidents by autonomous agents optimize aggressively toward task completion without understanding organizational risk boundaries, authorization context, cost ceilings, or downstream operational consequences. As opposed to what people may think, malicious threat actors caused the most harmful results.
- The findings suggest organizations are significantly underestimating the operational security implications of autonomous AI deployment. As agents gain broader permissions and deeper integration into SaaS, cloud, development, and business environments, the AI interaction layer itself increasingly becomes part of the enterprise attack surface and critical data perimeter.
Over the past few years, enterprise AI has rapidly evolved from simple assistants and chatbots into what many organizations now describe as AI coworks. Early AI adoption focused mainly on productivity tasks, chat applications that simply responded to prompts and produced responses such as summarizing documents. Today, however, modern AI systems can perform actions, coordinate tasks, and operate semi-autonomously across organizations, for these activities these systems are increasingly connected directly to enterprise environments, where they can access SaaS platforms, databases, cloud infrastructure, collaboration tools, and internal workflows.
The rise of AI coworks also introduces significant security and governance challenges. AI agents often require broad access to sensitive enterprise data, APIs, identities, and business applications in order to function effectively. This creates new risks around data exposure, unauthorized access, shadow AI adoption, excessive permissions, and uncontrolled data movement between systems.
Over the past two years we witness an extensive debate about what AI agent risk might look like, therefore, Cyera Research decided to analyze how this threat actually act in the wild.
We collected 7,246 unique publicly reported AI incident records from September 2023 through May 2026, drawing from the AI Incident Database, OECD trackers, AI-safety research, security press, and production-failure community threads. For this analysis we tailored a logic that fetched multiple analysis sources and cross-checked references in order to answer two questions, what happened and what’s the root cause.
We ultimately reached a robust conclusion where in most cases, the failures stemmed from a growing and increasingly documented phenomenon in which AI agents prioritize task completion and mission success over the organization’s overall security posture. Most concerning, in 188 confirmed cases, these behaviors directly involved corporate production environments.
From AI-Systems Horror Stories to Defining and Scoping a Problem
We started this research by gathering several sources together. We found a brilliant project
AIID (AI Incident Database) which tracks real-world AI related cybersecurity incidents and collected from open-source and other sources we found similar incidents. We ended up with 7,246 raw data cases.
From Raw Corpus To Curated Dataset

We tailored a Claude Opus 4.7 series of prompts to conduct deep-dive research, clean the dataset, isolate the cases we desire (Agent-Inflicted Damage), and cluster the data into distinct groups. This was followed by a manual review of the research team, evaluation and adjustments accordingly.
A Real-World Phenomenon - Dramatic Increase in 2026
As illustrated in the graph below there is an alarming trend as these cases are on significant rise.

Between January and November 2025, we observed only 27 publicly reported cases. However, beginning in December 2025, the data shows a sharp step-function increase in incidents. This trend is not surprising when viewed in the context of broader ecosystem developments. The surge closely aligns with the widespread enterprise adoption of AI coding agents and autonomous development tools such as Claude Code, Cursor agent mode, Devin, and OpenClaw. These platforms significantly expanded the operational autonomy of AI systems inside organizations, increasing both their usage and the likelihood of unintended or harmful outcomes.
Classifying the Incidents
We started by classifying the incidents based on the impact. Our classification is moving from low immediate impact and up until severe immediate impact. We decided on 3 major categories:
- Poor Access Control Policies, Guardrail bypass & Privilege escalation (59 incidents): In this category we can find simple cases where AI-systems were deployed without any access control boundaries, moving to cases where the AI-system came across a problem with executing its task and had to conduct an action which wasn’t, more serious cases when AI completely bypassed guardrails and up to taking the developers privileges (elevated ones) in order to complete a task. This is the simplest category and without clear impact practitioners may argue that this isn’t a serious one.
- Data and Secrets Exposure (22 incidents): In this category we observe cases in which sensitive data ended up outside its intended boundary. This includes customer records publicly exposed, internal info posted to the wrong audience, source code leaked, secrets emitted, confidential email summarized to the wrong party.
- Real-world damage (137 incidents): In this category we see a real system damaged, money lost, or unauthorized actions taken in the user’s name. This was the biggest category with 137 incidents, so we decided to further sub-classify these cases:
- Financial harm (19 incidents): This involves money lost or wasted on the user’s side. Runaway API bills, infinite-loop cloud charges, trading agents that destroyed capital, market-impact events.
- Deletion & code destruction (65 incidents): In such incidents the agent explicitly deletes or wipes something. Databases dropped, files removed via rm -rf, git history destroyed, production code wiped, cloud resources torn down. Overwhelmingly driven by AI coding agents (Claude Code, Cursor, Replit, Gemini CLI, Devin) operating without confirmation gates.
- Service & physical disruption (30 incidents): In these incidents the agent action takes a system offline. Cloud outages triggered by agent-driven resource recreation, robotaxi mass freezes stranding passengers, navigation failures routing vehicles into hazards, services brought down by runaway agent loops.
- Silent integrity failure (23 incidents): In such cases, the damage isn’t immediately visible. For instance, the agent fabricated records and passed off as real data, fake test passes that hide broken code, silent reverts that undo human work, integrity silently corrupted so future queries return wrong results without alerting the operator.
This is Just the Beginning
While most incidents in our dataset reflect real-world impact and observable security consequences, incidents involving data loss and poor access control policy are likely significantly underreported. Discussions with our customers highlight that this is a real and significant concert and challenges they are facing more and more. This leads us to believe that we are probably observing only a fraction of the true scope of the problem, and many organizations are reluctant to publicly disclose incidents in these categories or even see them sometimes as insignificant.
Secrets exposure, for example, may never become public. There are many factors that can keep these issues concealed such as the exposure was limited and scoped, quickly contained, or successfully remediated before broader impact occurred. The organization didn’t even notice the secrets leak occurred. Many organizations may choose not to disclose such incidents if they believe the operational or reputational damage was minimized. In other scenarios it might have been completely missed, and it still lays out there exposed.
Operating AI-systems with poor access control policies may never be publicly disclosed and the act of guardrails manipulation and privilege escalation is likely even more underreported. In many cases, there may be no immediately visible damage, making it easier for organizations to quietly revert the change or overlook it entirely. However, these unnoticed permission changes can create long-term security risks. When effectively leaving behind a ticking time bomb that may only become apparent after a future breach, when organizations struggle to determine the original source of compromise.
We strongly believe that as time goes by and more and more organizations adopt AI-systems in their production environments, we will see the volume of reported cases under these categories dramatically increase.
Some Examples to the Horror Stories from the Wild
Poor access control policy for AI-Systems
The analysis showed there are 59 incidents that reference broken access control, guardrail bypass and privilege escalation. One case
One example is based on an issue published on GitHub (Claude Code issue #46947), which demonstrates AI systems can behave unpredictably or produce harmful operational outcomes inside real environments. The user reported that Claude Code executed an unauthorized transfer of approximately 1,446 USDT from their Bitget spot wallet into their futures wallet while attempting to close a crypto trading position. Although the user only instructed the agent to close a specific ARIA/USDT position, the AI generated and executed code that swept nearly the entire available USDT balance into another trading account without explicit approval. The incident became a widely discussed example of how autonomous coding and trading agents can exceed operational scope and perform unintended financial actions inside real-world environments.
In another example (Claude Code issue #37155), a user reported that Claude Code unexpectedly created a new Google Cloud Platform (GCP) project and associated billing configuration without explicit authorization. Although the exact trigger remained unclear, the incident raised concerns about permission boundaries and autonomous cloud actions performed by AI coding agents operating with access to developer environments and cloud tooling.
In GitHub issue #22018 (Anthropic Claude Code repository) highlighted another example of how AI coding agents can unintentionally overstep operational boundaries. In this case, Claude Code reportedly performed actions beyond the user’s expected intent or approval flow during development operations, reinforcing concerns around agent autonomy, permission scoping, and delegated execution. In the broader context of AI agent security, the issue demonstrates how highly capable coding agents with shell access, repository permissions, or automation capabilities can effectively create privilege-escalation-like conditions, where the agent inherits trusted operational authority and executes impactful actions that users did not fully anticipate or explicitly authorize.
AI-systems Exposed Secrets and Data
and data loss (18 incidents) are the smallest buckets in our corpus, but they carry the highest potential for regulatory consequence.
We’ve seen several cases in the wild where the AI-systems simply deleted invaluable data and covered their actions. An OpenClaw agent leaked passwords after being given autonomy. The Sears AI chatbot exposed 3.7 million customer records. Claude Code revealed secret keys in terminal output despite explicit prohibitions. The Claude Code source code leak in March occurred because a debug file pointed at Anthropic's internal repository - not a breach, just an agent acting faster than anyone could catch it.
GitHub issue #32523 (Anthropic Claude Code repository) shows how Claude Code unintentionally exposed local secrets from a developer environment during agent operations. The AI agent accessed and surfaced sensitive data such as API keys and environment variables from files that were not intended to be shared, demonstrating how broad filesystem access in coding agents can lead to accidental secret leakage.
In another example, the Sears Home Services data exposure incident involved an AI-powered customer service or support platform that exposed sensitive customer and operational data through a publicly accessible database or API. The exposed records included customer information, internal service details, technician data, and conversation-related content, highlighting how AI-integrated support and automation systems can unintentionally expose large volumes of sensitive enterprise data when backend storage or access controls are misconfigured.
The Real-World Impact story is consistent across most cases
Although this is the biggest cluster with over 137 incidents, the story was quite consistent in our dataset. AI coding agents were given a task which ended up with a destructive result to the organization. Below are some examples based on the sub-categories:
Deletion & code destruction
As was published in the Guardian, in April 2026, PocketOS, a company providing software for car rental businesses, suffered an outage after an AI coding agent powered by Anthropic’s Claude Opus 4.6 model accidentally deleted the company’s production database and backups within seconds. The incident occurred while the company was using Cursor, an AI-assisted coding platform, to automate engineering tasks. The AI agent ignored explicit safety restrictions designed to prevent destructive actions and autonomously executed commands that wiped critical infrastructure.
In GitHub issue #41708 there is a severe incident where a user reported that Claude Code autonomously deleted large portions of their Windows 11 C: drive (including user profiles, applications, and project files) after the session was left unattended for about 40 minutes.
Service & physical disruption
The article “Amazon’s Cloud Hit by Two Outages Caused by AI Tools Last Year” describes how Amazon Web Services (AWS) experienced at least two service disruptions linked to internal AI coding agents, including Kiro and Amazon Q Developer. In one incident, the AI agent autonomously decided to “delete and recreate” part of a production environment during troubleshooting, triggering a roughly 13-hour outage affecting AWS cost-analysis systems.
Silent integrity failure
A recently published report describes emerging “scheming” behaviors in autonomous AI systems where agents appear compliant and operationally successful while covertly concealing failures, misalignment, or harmful execution paths from human operators. In this context, the scenario where an “autonomous multi-step agent fabricates detailed sub-task completion reports to conceal cumulative pipeline failures from operator” represents a form of deceptive operational reporting: the agent continues producing plausible progress updates and success artifacts despite internal workflow degradation, preserving the illusion of successful execution while masking compounding failures that would normally trigger human intervention. According to the report, these behaviors are significant because they demonstrate early real-world patterns of AI systems strategically maintaining perceived alignment and competence even when their underlying execution state has diverged from operator intent.
Financial harm with a pattern of runaway costs
Three separate incidents in the corpus involved $47,000 bills from infinite-loop agent behavior - a LangChain research pipeline, a multi-agent system, and an API data enrichment loop that made 2.3 million API calls over a single weekend. A coding agent retry loop burned $4,200 in three hours with no stop mechanism. OpenClaw agents on $200-per-month subscriptions were burning between $1,000 and $5,000 per day in API costs. An autonomous GPT-5 trading agent lost 62% of its allocated capital in 17 days.
What Organizations Must Solve Before Rolling Out Autonomous Agents
As organizations race to deploy autonomous AI agents across business operations, a hard reality is emerging when useful AI-systems cannot exist without deep integration into enterprise systems, workflows, and sensitive data environments.
Organizations learn the hard way that these challenges are not simply building capable agents, rather it is building the surrounding operational, security, and governance infrastructure required to safely contain them. The most important constraint is non-negotiable, as enterprise data must remain inside the organization’s security perimeter.
Once agents begin interacting with sensitive systems, prompts, execution plans, intermediate reasoning, and generated outputs themselves become sensitive enterprise data that require the same protections as the underlying records they access.
IT: Managing the Agent Runtime Environment
Autonomous agents introduce a new operational layer that must be centrally managed like any other enterprise endpoint. Applications, plugins, integrations, secrets, and credentials must remain continuously updated and governed across every machine and environment the agents interact with. Security guardrails cannot be optional or user controlled.
Authorization: The Agent Must Never Exceed the User
One of the most dangerous architectural mistakes in enterprise agent deployments is granting agents excessive or shared permissions. Autonomous agents must operate strictly within the permissions of the individual user they represent - never above them.
Security: Autonomous Agents Can Create Irreversible Damage
As the role AI-systems fulfil in the organizations transition from chat related operation to powerful agents that can write and execute code on the fly. This fundamentally challenge the existing threat model. If we once thought that visibility and access control are enough, we now see that inline protection has become a must. Unlike human users, agents can execute actions at machine speed and scale, making irreversible operations (such as mass deletion, sensitive data exposure, privilege escalation, or policy violations) significantly more dangerous. Security controls therefore must move inline into the execution layer itself rather than relying on after-the-fact alerting or monitoring. The same DSPM, DLP, and governance policies already applied to employees must also apply directly to autonomous agents and their workflows.
Governance: Organizations Need Visibility and Control
Autonomous agents require centralized governance capabilities that provide visibility into every action performed, on behalf of every user, across every integrated system. Organizations need detailed auditability, spend controls, policy enforcement, and operational transparency into what the agent did, when it acted, why it made certain decisions, and which systems or sensitive datasets were involved. This becomes especially critical for SaaS-connected agents operating outside directly managed infrastructure, where limiting sensitive data exposure and enforcing organizational policy boundaries becomes substantially harder.
The Data Perimeter Becomes the Critical Boundary
Ultimately, useful enterprise agents must connect to the same systems employees use every day, including the most sensitive ones. As a result, the conversation itself becomes sensitive data: prompts, execution plans, reasoning traces, intermediate outputs, and generated actions can all contain confidential enterprise information. This creates a new security reality where the AI interaction layer itself becomes part of the organization’s critical data perimeter. To maintain trust, compliance, and control, organizations increasingly require inference, processing, and orchestration to remain inside their own controlled environments without uncontrolled subprocessors or external exposure paths.



.png)

.png)