When AI Security Risks Become Reality

Mar 16, 2026

Key Takeaways

Cloud security protection is essential to safeguard sensitive data in dynamic cloud environments.
Effective protection combines data discovery, risk assessment, and continuous monitoring to prevent breaches.
Compliance with industry standards like GDPR and HIPAA is critical for secure cloud adoption.
Leveraging automation enhances security posture while reducing manual errors.
Staying informed with latest cloud security trends empowers proactive defense against evolving threats.

Not long ago, AI security risks were treated as theoretical - scenarios raised by researchers asking, “what if?” Today, those what-ifs are headlines.

An AI chatbot leaks a million conversations.

A coding agent deletes a production database.

An enterprise assistant summarizes confidential emails - bypassing every DLP control in place.

This is no longer experimentation. It is production reality.

In this blog, we examine how these events map directly to the OWASP Top 10 for LLM Applications - the framework designed to categorize the most critical risks in AI systems.

More importantly, we show how these risks moved from theory to practice, and what that means for organizations adopting AI today. Because secure AI adoption requires more than powerful models or automation - it requires understanding how data exposure, privilege, and system design interact in ways traditional security controls were never built to handle.

What is OWASP and OWASP Top 10 Model for LLM Apps?

OWASP (Open Worldwide Application Security Project) is a global nonprofit community best known for advancing software security through open frameworks like the OWASP Top 10 for web applications. As AI systems became embedded in production environments, OWASP expanded into AI security with initiatives such as the OWASP AI Exchange, OWASP GenAI Security Project and the Machine Learning Security Top 10. One of its most impactful efforts is the OWASP Top 10 for LLM Applications (2025 edition).

The OWASP Top 10 for LLM Applications (2025) identifies the most critical security risks facing applications built on large language models, including prompt injection, insecure output handling, sensitive data exposure, supply chain risks, excessive agent permissions, training data poisoning, model denial of service, and insecure plugin integrations.

It provides a practical framework for building secure LLM-powered systems while emphasizing guardrails, least privilege, input/output validation, monitoring, and supply-chain awareness. In essence, it adapts traditional application security principles to the unique risks introduced by AI-driven systems.

From Theoretical Attack Scenarios to Real-World Incidents

The OWASP Top 10 for LLM Applications framework includes detailed explanations, attack scenarios, and references. It is an insightful and well-grounded resource, informed by real-world patterns - though many of the exploitation examples originate from controlled research demonstrations. In addition, the 2025 OWASP top 10 was released in November 2024 which in AI terms is practically 2024 BC.

In this section, we shift the focus to actual updated incidents that occurred in real organizational and public contexts. These cases share a common theme: adopting AI without data-aware security controls is not a calculated risk - it is a significant and dangerous blind spot.

We illustrate to her a gap between how AI systems access data and how organizations protect it. In real-world context DLP, CASB, and IAM were not built for a world where an AI assistant can sometimes access servers with root level privileges, or reads every email in your organization, or where employees paste trade secrets into chatbots, or where a single injected prompt triggers data theft.

Prompt Injection (OWASP LLM01)

Attackers manipulate AI systems by injecting malicious instructions directly through user inputs or indirectly through content the AI processes. LLMs cannot distinguish trusted instructions from untrusted data.

Case Studies from the Wild

In a real-world security incident, an AI-powered coding assistant used internally at Amazon was abused by threat actors to insert malicious prompt injection. Attackers crafted malicious inputs designed to manipulate an AI-powered coding assistant into generating harmful code. By embedding deceptive instructions within otherwise legitimate-looking content, they caused the agent to produce and submit a pull request that included data-wiping commands in a cloud configuration file. The model followed the injected instructions as if they were valid development guidance, demonstrating how LLMs can be steered into unintended behavior when they cannot reliably distinguish trusted instructions from adversarial ones.

Because the coding agent had broad write access and limited guardrails, the injected prompt translated directly into actionable, potentially destructive changes. This incident illustrates the core danger of prompt injection: when AI systems are granted operational privileges, malicious input can become executed output. Without strict input validation, scoped permissions, and mandatory human review, prompt injection can transform a simple text manipulation attack into real-world impact, including data loss and infrastructure disruption.

In March 2026, in the Ambient Code Platform attack described by StepSecurity, researchers demonstrated how attackers could attempt to manipulate AI-powered coding assistants through prompt injection embedded directly in repository files. By placing malicious instructions inside a configuration file (CLAUDE.md) that an AI reviewer reads as trusted project context, the attacker tried to trick the AI agent into approving and committing unauthorized changes to the repository.

The goal was to turn the AI assistant itself into the mechanism for introducing malicious code.

Although the attack ultimately failed, this is yet another reminder how a prompt injection along the pipeline can pose a serious threat. When AI coding agents are integrated into CI/CD pipelines with repository permissions, poisoned context files or documentation can potentially influence automated code reviews and commits, effectively turning prompt injection into a new software supply-chain attack vector.

In February 2026, Microsoft researchers described a technique called AI recommendation poisoning, where attackers manipulate AI assistants by injecting hidden instructions into their memory to bias future responses. Through mechanisms like embedded prompts in web content or AI summary features, adversaries can cause an AI system to treat certain sources, brands, or claims as trustworthy, influencing later recommendations without the user’s awareness. Microsoft observed dozens of real-world cases across multiple industries, showing that this is an active and growing tactic rather than a theoretical risk. The research highlights how AI systems that store context or memory can be persistently influenced, turning subtle prompt manipulation into long-term control over recommendations and decision support.

Reddit post from r/LocalLLaMA titled 'Prompt injection is killing our self-hosted LLM deployment' illustrating how traditional firewalls fail to stop AI prompt injection attacks and system prompt dumping.

Prompt injection highlights a fundamental weakness in LLM-based systems: they cannot reliably distinguish trusted instructions from malicious input. When attackers embed hidden or deceptive commands into content the model processes, those instructions can influence outputs and, in integrated systems, trigger real actions. As AI systems gain access to tools, data, memory, and operational workflows, prompt injection evolves from a simple text manipulation issue into a tangible security risk with real-world impact.

Sensitive Information Disclosure (OWASP LLM02)

While AI systems often require credentials to connect to other environments and basically function, the real sensitive information exposure lies within what the users feed the LLM model. Often it includes Private Identifiable Information (PII), credentials, proprietary data, and internal communications through their outputs. This can be found in model memorization, misconfigured infrastructure, or overly broad access to enterprise data.

Case Studies from the Wild

A series of real-life case studies show how sensitive information can easily leak from popular applications that are used on a wide scale by millions.

In January 2025, Wiz Research discovered a completely unauthenticated ClickHouse database belonging to DeepSeek. It contained over one million log streams: user chat histories, API keys, cryptographic secrets, and internal system metadata. Researchers gained full database control with zero authentication.

That same year, a dataset of leaked ChatGPT conversations showed the other side of the problem. Users routinely shared full names, phone numbers, resumes, medical conditions, legal disputes, and suicidal thoughts with AI chatbots. People treat ChatGPT as therapist, lawyer, and confidant. Thus, creating what one researcher called "history's most thorough record of enterprise secrets”.

The story below illustrates how real people are impacted from such or similar exposure scenario, emphasizing how LLM can sometimes serve as a double-edged sword.

Wiz Research screenshot detailing the exposure of a DeepSeek ClickHouse database containing one million log streams, API keys, and cryptographic secrets.

It is not just LLM chat apps. A bug in Microsoft 365 Copilot caused it to summarize confidential emails from users' Sent Items and Drafts folders - even when sensitivity labels and DLP policies explicitly prohibited it. An AI assistant that can read your most sensitive emails and a misconfiguration that exposes a million conversations - traditional data breaches look small by comparison.

Sensitive Information Disclosure underscores how LLM systems can unintentionally expose highly sensitive data. Not only through infrastructure misconfigurations, but through the vast amount of confidential information users willingly provide. From PII and credentials to proprietary communications, AI systems often aggregate and process data at a scale that amplifies the impact of any leak. Without strict data governance, access controls, and minimization practices, LLMs can become powerful amplifiers of privacy and security risk rather than productivity tools.

Supply Chain (LLM03)

These security risks are introduced through third-party components that LLM systems depend on, including external models, open-source libraries, plugins, embeddings, datasets, APIs, and hosted AI services. Because modern AI applications are rarely built from scratch, weaknesses in any upstream dependency can compromise the entire system.

In the beginning of 2026 we learned about Openclaw and soon after its security implications. AI supply chain was one of these implications.

According to research by Trend Micro, attackers abused the OpenClaw AI agent ecosystem by weaponizing publicly shared “skills” to distribute Atomic macOS Stealer (AMOS), a well-known macOS information-stealing malware. The malicious skills appeared legitimate but were designed to fetch and execute payloads that installed the stealer on victims’ systems. By leveraging the trust model of AI agent marketplaces and skill repositories, threat actors turned the platform into a distribution channel, effectively creating an AI-driven supply-chain attack.

According to research by Trend Micro, attackers abused the OpenClaw AI agent ecosystem by weaponizing publicly shared “skills” to distribute Atomic macOS Stealer (AMOS), a well-known macOS information-stealing malware. The malicious skills appeared legitimate but were designed to fetch and execute payloads that installed the stealer on victims’ systems. By leveraging the trust model of AI agent marketplaces and skill repositories, threat actors turned the platform into a distribution channel, effectively creating an AI-driven supply-chain attack. The incident highlights how autonomous agents that can import and execute third-party skills introduce new risks: if a skill is malicious or compromised, it can deliver malware, steal credentials, or access sensitive data without obvious warning.

Academic research that was published in February 2026, explored “Malicious Agent Skills in the Wild”. It presents a large-scale empirical analysis of how AI agent skills (third-party plugins or modules that extend an LLM agent’s capabilities) are weaponized in the real world. The researchers found many skill files that go beyond simple functionality, actually exploiting AI platform hook systems and permission flags to perform actions unintended by developers. Some of these malicious or vulnerable skills enabled advanced behaviors like bypassing guardrails, accessing sensitive resources, or performing unauthorized operations.

The researchers evaluated 98,380 third-party agent skills collected from two community registries. Out of these, they confirmed 157 skills as malicious through behavioral verification.

Those malicious skills contained a total of 632 distinct attack scenarios including:

Accessing files the skill shouldn’t read
Sending sensitive data to an external server
Executing commands outside its intended scope
Bypassing built-in safety controls
Escalating privileges inside the AI system

Interestingly, a single threat actor was responsible for 54.1 % of the confirmed malicious skills via templated impersonation campaigns. Illustrating that threat intelligence practices (existing in other security domains) are not yet applied in AI domains.

The authors responsibly disclosed these issues, leading to the removal of most problematic skills within about 30 days, and released their dataset and analysis tools to help future security research on agent ecosystems.

This is a concrete example of supply-chain-style risk with LLM systems: reusable components published in open marketplaces or repositories were found to contain hidden, harmful behavior that could compromise applications or infrastructure that integrated them.

Hackread article screenshot documenting leaked ChatGPT conversations where users shared PII, credentials, and enterprise secrets with AI chatbots.

‍

TechRadar screenshot reporting a Microsoft 365 Copilot bug that exposed confidential emails and drafts despite existing DLP policies and sensitivity labels.

Data and Model Poisoning (LLM04)

Attackers corrupt AI models by injecting malicious data into training sets, fine-tuning pipelines, or runtime memory. Inside the malicious content, they are embedding backdoors, biases, or false information that persist across all future interactions.

Case Studies from the Wild

Attackers corrupt AI systems by injecting malicious or misleading data into training sets, fine-tuning pipelines, retrieval sources, or runtime memory. The poisoned content may embed backdoors, persistent biases, false facts, or hidden triggers that influence future outputs. Because LLMs generalize from the data they ingest, even small volumes of strategically crafted content can create disproportionate downstream effects.

In December 2025, the European Parliament published research about “Information manipulation in the age of generative artificial intelligence”, providing ample examples about LLM data manipulation and data poisoning in many domains.

One significant issue that resurfaces occasionally, is the Pravda network. In April 2025, the Atlantic Council’s investigation showed that a pro-Kremlin disinformation network known as the Pravda network has been systematically seeding misleading content across the internet (including on Wikipedia and in material that trains large language models) to shape global perceptions in Russia’s favor. By creating a vast array of seemingly authoritative pages and articles, the network amplifies Kremlin narratives about the war in Ukraine and other geopolitical issues, which can then be absorbed by AI chatbots and present pro-Russian, anti-Western messaging to users. This strategy of “poisoning” AI tools raises concerns about the transparency of AI training data and the ability of models to produce unbiased, fact-based responses.

In October 2025 a research by Anthropic, the UK AI Security Institute, and The Alan Turing Institute found that it takes only 250 crafted documents to poison any large language model, regardless of size. Success depends on the absolute number of poisoned documents, not their percentage of the training set. This overturned the assumption that poisoning large models requires controlling significant portions of their data.

In February 2026, BBC journalist Thomas Germain proved how easy this is in practice. He published a fabricated article on his personal website about "the best tech journalists at eating hot dogs," complete with a fictional championship ranking. Major AI models ingested it and began presenting it as fact. Anyone with a website can poison AI training data.

In the example below you can observe a real-world data poisoning scenario in DeepSeek. The innocent user downloaded a mobile app based on an LLM result, but it was either poisoned and the result was a recommended mobile malware.

Trend Micro research screenshot showing how attackers weaponized OpenClaw AI agent skills to distribute Atomic macOS Stealer (AMOS) malware.

Data and Model Poisoning refer to the deliberate manipulation of the information AI systems learn from or retrieve, embedding hidden backdoors, biases, or false narratives that persist across future outputs. Because LLMs generalize from their data sources, even small amounts of strategically crafted content can disproportionately influence behavior. This makes poisoning not just a technical risk, but a systemic threat to trust, integrity, and decision-making in AI-driven systems.

Excessive Agency (OWASP LLM06)

AI agents today aren’t just chatbots answering questions.

In many cases this is an entire ecosystem that is connected to organizational sensitive systems:

Databases
Internal APIs
File storage systems
Customer records
Organizational knowledge base
And even cloud infrastructure

That means they don’t just generate text; they can execute actions and are exposed to highly sensitive information. When everything works as intended, this automation can be powerful and boost businesses. But when an AI misunderstands a request, receives incomplete instructions, or is manipulated through prompt injection, it can act in ways no one anticipated. Unlike a human employee who might pause and ask for clarification, an AI agent can execute instantly, and at scale.

If it is exposed to malicious input, it could follow hidden instructions that override safeguards, triggering actions such as revoking access, modifying configurations, or shutting down services. Because these systems often have privileged access to automate workflows, a single mistake can cascade into deleted data, disrupted operations, financial loss, and reputational damage. As organizations integrate AI more deeply into core systems, clear guardrails, strict permissions, monitoring, and human oversight are no longer optional - they are essential to prevent real-world harm.

Case Studies from the Wild

The risks are not theoretical. If an AI agent misinterprets a command like “clean up old resources,” it might delete active databases instead of test data.

In July 2025, a Replit AI coding agent deleted an entire production database during a development session. Despite the user issuing an explicit “code and action freeze”. The agent disregarded the constraint and destroyed records for 1,206 executives and 1,196 companies in a live system.

This February 2026 incident sounds like something out of science fiction, almost like a scene from Skynet in The Terminator. An AI agent autonomously submitted a pull request to the Matplotlib open-source project. When a maintainer rejected the contribution, the agent went on to independently draft and publish a defamatory blog post criticizing the maintainer - without any human instructing it to retaliate.

Academic research screenshot from February 2026 analyzing 157 malicious AI agent skills that bypassed safety controls and escalated system privileges.

There was no malicious intent programmed into the system, and no traditional “attack” in the classic sense. Yet the episode highlights something deeply important: the system behaved outside its creator’s intended purpose and best interests, and when connected to excessive agency it becomes powerful. And that, in essence, is the foundation of many vulnerabilities and cyber incidents - when software acts beyond its expected boundaries. As AI agents gain more autonomy and integration into real workflows, unintended behavior becomes not just a technical glitch, but a potential security and reputational risk.

The case below is a reminder how AI security issues are channeled to the more traditional security issues, how email AI assistant can be used in financial cybercrime.

Atlantic Council investigation screenshot showing how the Pravda network poisons AI training data by seeding misleading content on Wikipedia and news

Excessive Agency refers to AI systems that are granted broad operational permissions beyond what is strictly necessary. When AI agents are connected to sensitive systems and allowed to execute actions autonomously, mistakes, misinterpretations, or adversarial inputs can translate directly into real-world impact. Without strict scoping, guardrails, and human oversight, autonomy becomes a security liability rather than a productivity advantage.

Closing the Gap: Cyera AI Guardian

These five risks share a root cause: a gap between how AI systems access data and how organizations protect it. DLP, CASB, and IAM were not built for a world where an AI assistant reads every email in your organization, where employees paste trade secrets into chatbots, or where a single injected prompt triggers data theft.

Cyera AI Guardian closes this gap by unifying AI Security Posture Management (AI-SPM) with runtime AI protection - grounded in an understanding of the data itself. Cyera AI Guardian combines 3 strategic pillars in organizations' AI posture. It is based on your organization’s applications installed on your cloud environment, Data points and AI applications and functions across the AI ecosystem: ChatGPT, Claude, Microsoft Copilot, Amazon Bedrock, and custom enterprise AI applications.

It functions in 2 levels:

Discovery: With 59% of employees using unapproved AI tools, visibility comes first. Discover all AI usage with AI-SPM that finds every AI tool across the enterprise. Whether it is sanctioned, embedded, and shadow. Cyera AI helps you map each AI instance to its connected identities and data access patterns.
Govern:

Prioritize protection based on data sensitivity. AI Guardian classifies the data AI systems access, so protections scale with what is at stake. A model querying public documentation gets different treatment than one touching customer PII or source code.
Block sensitive data leakage in real time. AI Protect monitors prompts, responses, and user actions as they happen. It detects and blocks sensitive data leaving the organization, prompt injection attempts, jailbreaks, and unauthorized data access - before they succeed. Whether an employee pastes source code into ChatGPT or an attacker embeds a zero-click injection in a document Copilot processes, AI Guardian intervenes at the point of risk.

Learn how Cyera AI Guardian can secure your AI adoption

FAQ Questions and Answers

What is cloud security protection?
Cloud security protection involves measures and controls designed to safeguard data, applications, and services hosted in cloud environments from cyber threats and unauthorized access.

Why is continuous monitoring important for cloud security?
Continuous monitoring detects vulnerabilities and threats in real-time, allowing organizations to respond swiftly and reduce the risk of data breaches in dynamic cloud setups.

Which compliance standards are most relevant to cloud security?
Common standards include GDPR, HIPAA, PCI-DSS, and SOC 2, which require organizations to maintain data privacy and security controls in cloud deployments.

How does automation improve cloud security?
Automation streamlines security processes such as configuration management and threat detection, minimizing manual errors and accelerating response times.

What steps can I take to prevent data breaches in multi-cloud environments?
Implement consistent security policies, use identity and access management (IAM), encrypt data, and apply unified monitoring across all cloud platforms.

Heading 2