AI Agents are breaking data security, here’s how to fix it

May 20, 2026
Share

The data security playbook most enterprises run today was written for a world where humans were the primary identity. A user is authenticated into a system and based on authorization controls accesses data. Every meaningful control across multiple domains, including DLP, DSPM, IAM, ITDR, and CASB are based on that premise.

That world has been flipped upside down by agents. Agentic workflows, which are increasingly autonomous, are being adopted en masse by enterprises.

  • Developers running coding agents on their laptops.
  • Finance teams using agents to analyze proprietary data in Snowflake.
  • Marketing teams operating with customer data in CRMs.
  • And perhaps even a homegrown public facing application branded with your company making decisions you have no audit trail for.

This is a stark difference from prompt-and-response chatbots. Agents are autonomous and semi-autonomous identities that can read your sensitive data, call tools, hit APIs, and take action. The playbook you have was not designed for them. Below, we’ll detail what breaks followed by suggested fixes.

How agents are breaking traditional controls

1. Identities equal humans

Your IAM and ITDR stack are built around human identities. SSO, MFA, user roles, provisioning, etc., all are built with a human in mind. Agents are faster than humans and operate at much larger scale. They act 24/7, in bursts, across thousands of API calls, often inheriting permissions from the user who created them. Agents look like compromised accounts to traditional ITDR tools. The signal-to-noise problem is brutal.

2. Sensitive data lives in systems, and we control the systems

DSPM and DLP were built to discover and protect data at rest in known systems. Agents add a new failure mode: they retrieve sensitive data from one system, hold it in context, transform it, and act on it elsewhere. The data isn't moving in the way DLP was built to detect. It's being reasoned over and acted on. Your data visibility needs should give you context on the agent actions, not the file state.

3. Perimeter based defenses. 

Scanning prompts and responses is important, but the real risk in an agent occurs much earlier. It's in the permissions. The tool call you didn't know it had access to, the connector that gives it write access to a public knowledge base, the retrieval from a vector store that was never sanitized. It’s also about how the agent was built and the intent behind its actions.

4. Visibility and governance are the same

Spreadsheets full of approved AI use cases are the new shadow IT. The agent count inside any large enterprise is growing rapidly as low-code platforms get democratized to non-engineers. Static inventories built through manual processes won’t keep up. Governance has to be continuous and machine-driven or else it becomes nothing more than compliance theater.

The new playbook: discover, govern, protect, validate

Discover

Discovery for agents is harder than it was for cloud workloads or SaaS apps. Agents live in many places that traditional discovery doesn't reach:

  • Native agent platforms (Copilot Studio, Agentforce, Bedrock Agents, Foundry Agents)
  • Self-hosted agent services across AWS, Azure, and GCP
  • Endpoint agents, e.g., Claude Code, Cursor, or CodeEx, running on developer laptops, often with credentials that reach production
  • Browser-based public AI tools connecting to MCP servers and tool registries that few security teams have ever audited

Real discovery needs to account for these use cases by utilizing a mix of native platform APIs, network telemetry, OAuth signals, log-based detection, and endpoint mapping. If your agent inventory only covers what your engineering team self-reported, chances are you have only hit the tip of the iceberg.

Govern

Discovery is a list, while governance ensures accountability.

For each agent, you need three things: a clear sense of intention (what was this agent built to do?), a precise picture of its blast radius (what data and systems can it actually reach, given its tools and effective permissions?), and a risk which accounts for posture, behavior, autonomy level, and sensitive data exposure into a quantifiable metric a security leader can act on.

The most useful governance question is one almost no one can answer today: given everything this agent has access to, is its actual behavior consistent with its stated intent? An agent built to summarize support tickets shouldn't be reading the M&A folder. The fact that this is hard to detect today is exactly why agent governance has to be continuous and data-aware, not relegated to a single static point-in-time access review.

Protect

How to protect against threats in real-time? How to block malicious actions? Scanning prompts and responses is crucial, but alone it’s insufficient. Real agent protection means policy enforcement inside the loop:

  • Block a tool call before it executes if it would violate data residency
  • Redact sensitive context before it reaches a third-party model
  • Halt an agent that suddenly starts accessing data outside its purpose envelope
  • Detect prompt injection embedded in retrieved content, not just user input

The technical bar is high because the worst agent failures don't look like attacks. They look like an agent doing exactly what it was told, against data it shouldn't have been able to reach. Protection sits between the agent and the data. Close enough to the action to enforce, smart enough not to break legitimate workflows.

Validate

The final step is going to become popular once AI regulation matures. It centers around the question: How do you know your controls actually worked?

Validation means red-teaming your own agents automatically and at scale. Probing them with prompt injections, jailbreaks, exfiltration attempts, and out-of-policy tool calls, and producing evidence that the guardrails held. Without this layer, you have an idea of your posture but lack proof. With validation evidence you have something a board, an auditor, or a regulator can rely on.

The future of data security

Securing AI agents cannot rely on the status quo or a tech stack from a pre-AI era. It's a rearchitecture of how data security and identity security work together, along with a new agentic runtime layer that didn't exist before.

That's the principle Cyera has been building on. Our AI security solution already combines discovery and governance for AI assets and agents along with a protect layer for what crosses between models, agents, and data. Both sit on the same data graph and platform that powers our DSPM and DLP. This makes it possible to tie any agent, on any platform, directly to the sensitive data inside its blast radius, while accounting for human, machine, and agentic identities.

For the agentic vision of enterprises to take off, security will have to scale with it. The companies that get this right won't be the ones with the fanciest model. They'll be the ones that can answer a simple question, on demand, for any agent in their environment: what data did you touch, why, and was it allowed?

If your current playbook can't answer that, it's already broken. The good news is the new one is being written now, and you still have time to lead it.

Share