Detection Is Fast. Understanding Is Not. Why File-Access Incidents Stall - and How Impact Clarity Changes the Outcome

Incident response has improved in a measurable way over the last few years. Organizations are detecting intrusions sooner and, in many cases, containing them faster. Large incident response datasets attribute these improvements to broader telemetry coverage, stronger internal detection, and automation that reduces manual effort in investigation and response pipelines¹².
But one thing has not kept pace: what happens after risky access is found.
In my work with security teams, I see this pattern often. Detection and containment are in motion, but teams still struggle to explain the impact clearly and quickly. That part remains slow and expensive.
Main Highlights
- Detection and containment have improved, but impact understanding has not. Organizations can now identify risky file access quickly, yet still struggle to explain what data was touched and why it matters when decisions are time-critical.
- Behavior alone cannot define blast radius or risk. Identical access patterns can represent radically different levels of exposure depending on data meaning, ownership, and regulatory obligation; access logs alone cannot provide that context.
- Closing the gap requires translating access into data-centric impact. Real progress comes from combining data awareness, full coverage across environments, and lineage over time, so teams can reduce uncertainty before costs and consequences escalate.
A typical data incident begins with an alert showing a user account rapidly accessing or deleting thousands of files. The activity is confirmed, the user identified, and the account is disabled. Credentials are rotated and timelines reviewed. Operationally, the system functions as intended.
The real challenge emerges when the focus shifts from the facts to their significance.
Legal wants to know if breach notification obligations are triggered. A business leader wants to know if customer or employee data is involved. Communications wants to understand what can be said confidently without later revision. The security team can answer exactly what they saw: the account, the paths, the volumes, the timestamps. What they cannot answer quickly is what the organization actually needs to decide: what kind of data was accessed, and why it matters.
This gap is not abstract. It has a price tag.
Scoping too broadly has direct, repeatable cost. In IBM’s 2024 dataset, post-breach response activities and lost business together account for $2.8M on average, more than half of the total $4.88M average breach cost⁴. These costs come from concrete actions: standing up customer support, offering monitoring services, absorbing downtime, and managing churn. Scoping too narrowly brings its own cost: rework, follow-on disclosures, regulatory scrutiny, and reputational damage when the story changes.
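To make that trade-off concrete, here is a back-of-envelope sketch. Every number in it is an illustrative assumption, not a figure from the cited reports; the point is only that the expected cost of a scoping decision is driven by uncertainty about what was exposed, not just by how many people are notified.

```python
# Back-of-envelope comparison of scoping decisions under uncertainty.
# All figures below are illustrative placeholders, not data from any report.

def expected_cost(notify_records: int, per_record_cost: float,
                  p_missed_exposure: float, rework_cost: float) -> float:
    """Expected cost of a scoping decision: direct notification/response cost
    plus the chance of having scoped too narrowly and paying later for rework,
    follow-on disclosure, and reputational damage."""
    return notify_records * per_record_cost + p_missed_exposure * rework_cost

# Broad scope: notify everyone who *might* be affected; little residual risk.
broad = expected_cost(notify_records=500_000, per_record_cost=5.0,
                      p_missed_exposure=0.01, rework_cost=2_000_000)

# Narrow scope without data context: fewer notifications, but a real chance
# the story changes and a second, costlier disclosure follows.
narrow_blind = expected_cost(notify_records=50_000, per_record_cost=5.0,
                             p_missed_exposure=0.40, rework_cost=2_000_000)

# Narrow scope *with* data context: the same smaller population, but high
# confidence that nothing outside it was exposed.
narrow_informed = expected_cost(notify_records=50_000, per_record_cost=5.0,
                                p_missed_exposure=0.02, rework_cost=2_000_000)

print(f"broad:            ${broad:,.0f}")            # ~$2,520,000
print(f"narrow, blind:    ${narrow_blind:,.0f}")     # ~$1,050,000 but fragile
print(f"narrow, informed: ${narrow_informed:,.0f}")  # ~$290,000
```

Broad scope buys certainty at a fixed price; narrow scope is cheap only when the organization can actually defend it.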
The pressure to decide is amplified by attacker behavior. In many incident response cases, data exfiltration occurs within hours of compromise³. That compresses the decision window. Even when containment is fast, the organization still must decide what matters before the investigation is complete.
Part of what makes this hard is how incident response timelines are still framed. Response is often treated as a single clock, but in reality there are at least three: time to detect, time to contain, and time to understand impact. The industry has made measurable progress on the first two, and the drivers are well understood¹². The third clock operates at a different pace.
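Tracking the third clock explicitly only requires recording one more timestamp per incident. The sketch below is a minimal illustration; the field names, and the definition of "impact understood" as the moment scope and obligations could be stated with enough confidence to act, are assumptions rather than an established standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class IncidentTimeline:
    """Hypothetical incident record covering the three clocks described above."""
    started_at: datetime             # best estimate of when the risky access began
    detected_at: datetime            # first alert or detection
    contained_at: datetime           # account disabled, credentials rotated, etc.
    impact_understood_at: datetime   # scope and obligations stated with confidence

    @property
    def time_to_detect(self) -> timedelta:
        return self.detected_at - self.started_at

    @property
    def time_to_contain(self) -> timedelta:
        return self.contained_at - self.detected_at

    @property
    def time_to_understand(self) -> timedelta:
        return self.impact_understood_at - self.detected_at

incident = IncidentTimeline(
    started_at=datetime(2024, 6, 3, 2, 10),
    detected_at=datetime(2024, 6, 3, 9, 40),
    contained_at=datetime(2024, 6, 3, 11, 5),            # hours: the fast clocks
    impact_understood_at=datetime(2024, 6, 21, 17, 0),   # weeks: the slow clock
)
print(incident.time_to_detect, incident.time_to_contain, incident.time_to_understand)
```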
Understanding impact is not about collecting more access logs. It is about translating access into business meaning: what category of data was involved, who it relates to, how it is used, what contractual or regulatory obligations attach to it, and how confident the organization can be in its conclusions at a given moment. These answers rarely live in one system and almost never map cleanly to file paths.
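One way to picture that translation is as a context record attached to a logical dataset rather than to a file path. The sketch below is purely illustrative; the fields and the example values are assumptions, not a real schema.

```python
from dataclasses import dataclass

@dataclass
class DataContext:
    """Illustrative context record: what responders need beyond the file path."""
    dataset: str            # logical dataset, not a storage location
    category: str           # e.g. customer contracts, payroll exports
    relates_to: list[str]   # whose data it is (customers, employees, ...)
    obligations: list[str]  # notification rules, contract clauses, etc.
    confidence: float       # how sure we are of this classification (0 to 1)

customer_contracts = DataContext(
    dataset="emea-customer-contracts",
    category="customer contracts with personal data",
    relates_to=["customers (EMEA)"],
    obligations=["GDPR breach assessment", "customer contract notification clauses"],
    confidence=0.9,
)
```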
In many organizations, file incidents surface through behavioral access alerts: sudden spikes in reads, unusual download volume, or a new principal touching a sensitive area. UEBA-style detections are effective at flagging “this access is worth attention.”
But that signal does not show the blast radius. It tells you something changed, but not what was touched in business terms, where else that data lives, or whether exposure spread to other systems. This gap is also why file-access UEBA often produces false positives and low-value signals for investigations. When “unusual” frequently means normal business activity and the system lacks data context, anomaly is a poor stand-in for risk. MDR reporting shows this at scale: only a small fraction of investigated alerts turn out to be real incidents⁵.
That weakness becomes obvious when behavior is identical but risk is not.
Two alerts might look the same: ten thousand files read in an hour, by the same role, on the same device, from the same location. In one case, the files are draft design documents and public marketing materials. In the other, they are customer contracts with personal data, employee records with payroll details, or regulated datasets that come with notification rules and contract requirements. The behavior looks the same, but the risk and cost are completely different.
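A toy triage function makes the contrast explicit. The thresholds and fields below are invented for illustration; the only point is that two alerts with identical behavioral features diverge the moment data context is attached.

```python
# Two behaviorally identical alerts: same volume, role, device, location.
alerts = [
    {"files_read": 10_000, "role": "sales-ops", "device": "LT-4821",
     "data_category": "draft designs and public marketing material",
     "regulated": False, "personal_data": False},
    {"files_read": 10_000, "role": "sales-ops", "device": "LT-4821",
     "data_category": "customer contracts and payroll records",
     "regulated": True, "personal_data": True},
]

def triage(alert: dict) -> str:
    """Toy triage: behavior sets the baseline, data context sets the outcome."""
    behavioral_anomaly = alert["files_read"] > 1_000   # identical for both alerts
    if not behavioral_anomaly:
        return "close"
    if alert["regulated"] or alert["personal_data"]:
        return "escalate: assess notification obligations"
    return "investigate: low data impact"

for alert in alerts:
    print(alert["data_category"], "->", triage(alert))
```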
This is why “sensitive or not” is not deep enough.
What matters is awareness. Not “customer data,” but a specific customer’s active contracts for a regulated region. Not “HR,” but current-employee payroll exports for a particular legal entity. Not “finance,” but revenue models used in board reporting. Not “legal,” but documents under privilege for a specific matter. Each carries different escalation thresholds, notification obligations, and cost implications. Without that specificity, uncertainty is treated as risk, and scope expands defensively.
The challenge compounds because data rarely stays in one place.
A common enterprise pattern looks like this: a user’s laptop syncs a “Customer Contracts” folder via OneDrive. The same files live in SharePoint and are mirrored into a shared drive for a legacy workflow, and a subset is exported into a SaaS system to support renewals. When suspicious access is detected, responders may have endpoint telemetry, SharePoint logs, file server events, SaaS exports, and collaboration sharing records. Each tool can show activity in its own domain. None can easily answer the question the organization is actually asking: what was the blast radius of the data across these systems, over time, and what obligations does that trigger?
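Answering that question is essentially a graph problem over copy and sync relationships. The sketch below mirrors the scenario above with hypothetical location names; a breadth-first walk from the place where suspicious access was observed yields every other location that holds the same data.

```python
from collections import deque

# Hypothetical copy/sync/export relationships for one logical dataset.
# Edges point from a source location to the places its content flows to.
copies = {
    "sharepoint:/sites/sales/Customer Contracts": [
        "onedrive:alice/Customer Contracts",      # laptop sync
        "fileserver:/legacy/contracts-mirror",    # legacy workflow mirror
        "saas:renewals-app/contract-exports",     # subset exported for renewals
    ],
}

def blast_radius(start: str, edges: dict[str, list[str]]) -> set[str]:
    """Every location that holds (a subset of) the same data as the place
    where suspicious access occurred."""
    # Build an undirected adjacency: for scoping purposes, either end of a
    # copy/sync relationship represents the same data.
    adj: dict[str, set[str]] = {}
    for src, dsts in edges.items():
        for dst in dsts:
            adj.setdefault(src, set()).add(dst)
            adj.setdefault(dst, set()).add(src)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, set()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Suspicious access was observed on the laptop sync; the same data also lives
# in SharePoint, the legacy share, and the SaaS export.
print(blast_radius("onedrive:alice/Customer Contracts", copies))
```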
This is why breaches spanning multiple environments take longer to resolve and cost more on average in major breach datasets². It is not simply that detection is harder. Impact reconstruction becomes cross-silo reasoning under time pressure.
At that point, many teams fall back on manual scoping: sampling files, asking owners, stitching timelines across endpoints, SaaS, and cloud storage. The work gets done, eventually. But it is slow, difficult to repeat, and poorly suited to moments where decisions must be made quickly and defended later.
This is the thread that connects alert fatigue, slow scoping, and high breach cost. Anomaly detection is not broken; access visibility alone simply cannot explain the blast radius when risk depends on data meaning, ownership, and obligation.
So what closes the gap?
Not another alert. The missing capability is the ability to explain access in terms of the data itself: what it is, where it exists, how it moved, and what obligations follow.
In practice, that requires three things working together.
- First, data awareness. Teams need context that turns a generic file access into a clear understanding of which dataset was involved, who it relates to, and what obligations apply. This context is what lets teams decide if an alert is an incident and respond accurately.
- Second, coverage without blind spots. Responders need to see all the places the same data can exist: endpoints, shared drives, cloud storage, SaaS platforms, collaboration tools, and on-prem servers. Without full coverage, teams are left guessing or forced into slow manual checks.
- Third, data lineage that supports real conclusions. Coverage shows where data lives. Lineage shows what happened to it: where it started, how it moved, when permissions changed, and where sharing expanded. For example, tracing a file shared externally back to a restricted contract repository with regulated customer data lets the team know the exposure is serious and respond with the right urgency.
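To illustrate the lineage point, a reverse walk over parent links can answer where an externally shared file originated and how its source repository is classified. The locations, links, and classifications below are hypothetical.

```python
# Hypothetical parent links: each location points to where its content came from.
origin_of = {
    "external-share:guest-link/Q3-contract.pdf": "onedrive:bob/renewal-pack",
    "onedrive:bob/renewal-pack": "sharepoint:/sites/sales/Customer Contracts",
    "sharepoint:/sites/sales/Customer Contracts": None,   # the source repository
}

# Hypothetical classification of source repositories.
classification = {
    "sharepoint:/sites/sales/Customer Contracts":
        "restricted: regulated customer contracts (notification obligations apply)",
}

def trace_to_source(location: str) -> list[str]:
    """Follow parent links back to the original repository."""
    path = [location]
    while origin_of.get(location):
        location = origin_of[location]
        path.append(location)
    return path

lineage = trace_to_source("external-share:guest-link/Q3-contract.pdf")
source = lineage[-1]
print(" -> ".join(lineage))
print("source classification:", classification.get(source, "unknown"))
```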
These capabilities work together to reduce uncertainty when decisions need to be made quickly. They help teams avoid overscoping and focus on what matters, especially when the cost of mistakes is high.
The industry is getting faster at finding and containing incidents. But file-access risk depends on understanding the meaning of the data. To translate access into impact, teams need to know what the data is, where it exists, and how it moved. Speed alone does not reduce risk if the context is missing. It just moves the risk elsewhere.
References
1. Mandiant M-Trends 2024: global median dwell time decreased to 10 days in 2023, from 16 days in 2022.
2. IBM Cost of a Data Breach Report 2024: organizations using security AI and automation detected and contained incidents 98 days faster on average; breaches spanning multiple environments took longer to resolve and cost more.
3. Unit 42 Incident Response Report 2024: approximately 45% of cases involved data exfiltration in less than one day after compromise.
4. IBM Cost of a Data Breach Report 2024: average breach cost of $4.88M, with $2.8M attributed to lost business and post-breach response activities.
5. Arctic Wolf Security Operations Report 2025: only a small fraction of investigated alerts become confirmed security incidents, reflecting high volumes of low-signal behavioral alerts.





