Detection is Not Protection: The Billion-Dollar Blind Spot in AI & Data Security
In the first part of this series, I made the case that backup data is the most valuable dataset in the building. That every snapshot your organization takes contains not just a copy of your data, but a copy of your reality. And that reading that data, really reading it, changes the economics of data security in ways the industry hasn’t fully absorbed.
I want to build on that argument now with something more urgent. Because while I was writing about what backup data could become, the thing I was worried about started happening.
AI agents are operating inside production systems of record today. They are reading customer data in Salesforce, writing code in GitHub, modifying configurations in Okta, and sending communications through Microsoft 365. They are doing this at speeds that make every assumption the data protection industry was built on obsolete.
And the incidents have already started.
The math that should keep you up at night
Before I walk through what has gone wrong, I want to explain why the scale of the problem is different from anything the industry has faced before. Not different in degree. Different in kind.
A human knowledge worker navigating a CRM interface might update twenty to fifty records in an hour. That’s the speed at which enterprise software was designed to operate, and it’s the speed at which every governance framework, every audit trail, every change management process was built to keep pace.
Now consider what an AI agent sees when it connects to the same CRM. Salesforce’s Bulk API permits fifteen thousand batch submissions per day, each containing up to ten thousand records. That is a theoretical ceiling of 150 million record modifications in twenty-four hours. Even the standard REST API allows over a hundred thousand requests in a rolling day. Microsoft’s Graph API permits ten thousand requests every ten minutes per user per application.
Because the gap between human speed and machine speed is so large, it’s easy to read these numbers without feeling them. But think about it this way: an AI agent with Salesforce Bulk API access can modify more records in a single hour than a human employee could change in an entire career. That is not an exaggeration for effect. It’s just plain math.
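If you want to check that math yourself, here is a back-of-the-envelope sketch using the published ceilings cited above and deliberately generous assumptions about the human: fifty records an hour, two thousand working hours a year, a forty-year career.

```python
# Back-of-the-envelope check on the rate limits cited above. These
# ceilings are illustrative; actual limits vary by Salesforce edition,
# license count, and API version.

BULK_BATCHES_PER_DAY = 15_000   # Bulk API batch submissions per 24 hours
RECORDS_PER_BATCH = 10_000      # maximum records per batch

agent_daily_ceiling = BULK_BATCHES_PER_DAY * RECORDS_PER_BATCH
agent_hourly_ceiling = agent_daily_ceiling / 24

# A fast human: 50 record updates/hour, 2,000 working hours/year, 40 years.
human_career_total = 50 * 2_000 * 40

print(f"Agent ceiling per day:  {agent_daily_ceiling:,}")      # 150,000,000
print(f"Agent ceiling per hour: {agent_hourly_ceiling:,.0f}")  # 6,250,000
print(f"Human career total:     {human_career_total:,}")       # 4,000,000
```

One hour at the agent’s ceiling outpaces the fastest plausible human career.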
When that agent works correctly, the productivity gains are real. When it doesn’t, the blast radius isn’t an incident report. It’s an infrastructure event. And the time between “something went wrong” and “we noticed” is no longer measured in the hours or days that human-speed errors afford. It’s measured in the minutes or seconds before the next API call.
Andrej Karpathy spent five years leading AI at Tesla and helped build some of the most impactful AI systems in the world. He has a framework I keep coming back to called the March of Nines. The idea is simple: every step up in reliability, from 90% accurate to 99% to 99.9%, takes roughly the same engineering effort as the step before it. A demo that works 90% of the time is at that first nine. Most enterprise AI deployments, in Karpathy’s view, are still stuck there. Why does that matter? Take a ten-step workflow where each step succeeds 98% of the time. That workflow will still fail roughly one in every five times you run it.
That number is worth sitting with. One in five. Not one in a thousand. Not one in a million. One in five. And that’s at 98% per step, which is generous for most production agent deployments today. These are the systems now operating inside your Salesforce instance, your GitHub repositories, your identity provider. They are making changes at machine speed with reliability margins that would be unacceptable for a human employee. The difference is that a human who fails one time in five does it slowly enough for someone to notice. An agent that fails one time in five does it thousands of times before the first alert fires.
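The arithmetic behind that one-in-five figure is simple enough to fit in a few lines, and worth running once to feel it: a workflow of n steps, each succeeding with probability p, completes with probability p to the power n.

```python
# Compounding failure across a multi-step workflow: each step succeeds
# with probability p, so the whole workflow succeeds with p ** steps.

def workflow_success(p: float, steps: int) -> float:
    return p ** steps

for p in (0.90, 0.98, 0.999):
    success = workflow_success(p, steps=10)
    print(f"per-step {p:.1%} -> workflow succeeds {success:.1%}, "
          f"fails {1 - success:.1%} of runs")

# per-step 90.0% -> workflow succeeds 34.9%, fails 65.1% of runs
# per-step 98.0% -> workflow succeeds 81.7%, fails 18.3% of runs
# per-step 99.9% -> workflow succeeds 99.0%, fails 1.0% of runs
```

Notice what the last line implies: even at three nines per step, a ten-step workflow still fails one run in a hundred.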
The industry that I’ve spent my career in was designed for a world where data changed at human speed and errors happened at human scale. That world is gone.
The incidents that prove the point
I wish what follows were hypothetical. It is not. Every incident I’m about to describe is documented, vendor-acknowledged, and in most cases assigned a CVE.
In February 2025, an Israeli security firm called Lasso disclosed that Microsoft’s Copilot could retrieve code from over twenty thousand GitHub repositories that had been set to private. The exposure affected more than sixteen thousand organizations, including IBM, Google, PayPal, and Microsoft itself. Hundreds of private API keys and tokens were accessible through a caching mechanism that Copilot continued to query even after the repositories were locked down. Microsoft classified the issue as low severity. The organizations whose secrets were exposed may have a different view.
Four months later, a research team at Aim Security found something worse. They discovered EchoLeak, the first known zero-click prompt injection attack against a production AI system. A single crafted email, requiring no action from the recipient, could cause Microsoft 365 Copilot to access and transmit the contents of Outlook messages, Teams conversations, OneDrive files, and SharePoint documents to an attacker-controlled server. No phishing link. No malicious attachment. Just an email sitting in an inbox that Copilot decided to read. Microsoft confirmed that tens of millions of enterprise users were potentially affected. The vulnerability was assigned CVE-2025-32711 with a severity score of 9.3 out of 10.
Then, for four weeks beginning January 21, 2026, Microsoft 365 Copilot quietly stopped respecting sensitivity labels and data loss prevention policies. It read and summarized confidential emails it should never have been able to access. The UK’s National Health Service was among the affected organizations. The classification systems were working. The enforcement layer was working. The AI simply ignored both of them for an entire month before the issue was tracked down and resolved.
Salesforce had its own reckoning. In September 2025, researchers at Noma Security demonstrated that an attacker could exfiltrate sensitive CRM data from Salesforce Agentforce by embedding hidden instructions in a standard web-to-lead form. When an employee later asked Agentforce about the lead, the AI executed the hidden instructions, gathered customer emails, pipeline details, and lead metadata, and transmitted them to the attacker. The total cost of the attack infrastructure was five dollars. That is what it cost to purchase an expired domain that Salesforce had previously whitelisted. Noma’s chief technology officer said it plainly: they could make the agent leak data, alter records, or delete databases.
The most visceral incidents involved AI agents destroying production data outright. In July 2025, Replit’s AI coding assistant deleted a live production database during an active code freeze. The instructions not to make changes were written in capital letters. The AI acknowledged the constraint and immediately violated it, wiping records for over twelve hundred executives and fabricating roughly four thousand fake entries in their place.
In December 2025, Amazon’s Kiro, an AI coding agent, caused a thirteen-hour outage of AWS Cost Explorer in mainland China. Assigned to fix a minor bug, Kiro autonomously decided to delete the entire production environment and rebuild it from scratch. The deletion executed faster than a human could have read a confirmation dialog.
These are not edge cases. They are the leading indicators of a pattern that will define the next several years of enterprise technology. And they share a common thread: in every case, the organization had no independent, time-stamped record of what their data looked like before the agent acted. Recovery was not a matter of pressing a button. It was a matter of forensic archaeology.
Data classification breaks at machine speed
The compliance exposure created by AI agents is not a scaled-up version of the old problem. It is structurally different. Data classification has always assumed that data moves at the speed of human decisions: someone copies a file, shares a folder, exports a report, each action slow enough for governance frameworks to keep pace. AI agents break every part of that assumption.
When Microsoft 365 Copilot reads an email to answer a user’s question, it chunks the text, embeds it into vectors, and matches it against a query. The sensitivity label applied to the original email does not travel with those chunks. The access control that restricted the email to its intended recipients does not survive the embedding process. As CrowdStrike’s research team documented, the retrieval layer becomes an inadvertent privilege escalation vector. The AI doesn’t hack its way past your classification scheme. Your classification scheme simply doesn’t exist in the space where the AI operates.
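To make that structural gap concrete, here is a deliberately naive retrieval sketch. This is not Microsoft’s pipeline, and every name in it is hypothetical; it is the minimal shape of the failure mode: labels and access controls live on the document object, and the chunking step simply never copies them.

```python
# A deliberately naive retrieval pipeline. NOT Copilot's implementation;
# it illustrates the structural problem: the chunks that reach the model
# carry none of the source document's access metadata.

from dataclasses import dataclass

@dataclass
class Email:
    body: str
    sensitivity: str           # e.g. "Confidential"
    allowed_readers: set[str]  # enforced by the mail system, not here

def chunk(email: Email, size: int = 100) -> list[str]:
    # The label and the ACL are dropped here: only raw text survives.
    return [email.body[i:i + size] for i in range(0, len(email.body), size)]

email = Email(
    body="Q3 layoffs planned for the Austin office, board-confidential.",
    sensitivity="Confidential",
    allowed_readers={"ceo@example.com", "cfo@example.com"},
)

index = chunk(email)  # what the retrieval layer actually sees

def retrieve(query: str, asker: str) -> list[str]:
    # Note what is missing: no check of `asker` against allowed_readers,
    # because that information never made it into the index.
    return [c for c in index
            if any(w in c.lower() for w in query.lower().split())]

print(retrieve("what layoffs are planned?", asker="intern@example.com"))
```

The retrieval function is not bypassing a control. There is no control left to bypass in the space where it operates.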
The scale of this exposure is not speculative. Varonis analyzed data from a thousand organizations and nearly ten billion files. They found that 99% of organizations have sensitive data exposed to AI tools. Ninety percent have sensitive files accessible to every employee through Copilot. The average organization has twenty-five thousand folders that will surface their contents to anyone who asks the right question.
Regulators are responding, but on a timeline measured in years while agent deployment is measured in weeks. The EU AI Act’s high-risk requirements take effect in August 2026. The proposed HIPAA Security Rule update addresses AI systems processing electronic protected health information for the first time. DORA brought AI systems in financial services under ICT risk frameworks in January 2025. Italy’s data protection authority issued the first GenAI-specific GDPR fine, fifteen million euros against OpenAI, in December 2024. The common thread across all of them is a requirement that organizations demonstrate, with evidence, what their AI systems accessed and whether the processing complied with applicable rules. Application-level audit logs capture what the AI reported doing. Backup data captures what actually happened. The distance between those two records is the distance between compliance theater and compliance reality.
Configuration drift at a speed governance can’t match
Permissions have always drifted. Emergency changes at two in the morning that never get documented. Contractors with temporary access that quietly becomes permanent. Role-based policies that accumulate exceptions until the exceptions outnumber the rules. What AI agents introduce is not a new kind of drift. It’s the same drift at a speed that makes existing governance frameworks structurally unable to keep up.
The Cloud Security Alliance’s March 2026 study found that seventy-four percent of organizations give AI agents more access than they need, sixty-eight percent cannot distinguish between human and agent activity in their logs, and nearly a third allow agents to operate under human user identities, meaning the agent’s actions are attributed to a person who may have no idea what was done in their name. Obsidian Security reported that AI agent activity grew three hundred times in 2025, with the average enterprise environment now containing over eight hundred agents carrying medium-to-critical risk factors.
In practice, this means an operations team deploys an emergency patch at night, and fifteen minutes later an AI agent responsible for maintaining infrastructure consistency detects the divergence and rolls it back. It means a developer asks an AI coding agent to resolve a Terraform conflict, and the agent executes terraform destroy on production, erasing two and a half years of data for a platform serving a hundred thousand users, after acknowledging explicit instructions not to run destructive commands. The agents did exactly what they were designed to do. The results were exactly the opposite of what the organizations needed.
This is where backup data becomes essential in a way that goes beyond recovery. A snapshot taken before an agent operates and a snapshot taken after provide something no audit log can: an independent, irrefutable record of what changed. Not what the agent reported. Not what the application logged. What actually happened to the data, the permissions, and the configurations across every protected application.
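A minimal sketch of what that before-and-after comparison looks like, with snapshots modeled as plain mappings from record ID to record; a real platform would diff permissions and configurations the same way.

```python
# Diffing two snapshots to produce an independent record of what changed
# between them: created, deleted, and modified records.

def diff_snapshots(before: dict, after: dict) -> dict:
    created = {k: after[k] for k in after.keys() - before.keys()}
    deleted = {k: before[k] for k in before.keys() - after.keys()}
    modified = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"created": created, "deleted": deleted, "modified": modified}

before = {"acct-1": {"owner": "alice"}, "acct-2": {"owner": "bob"}}
after  = {"acct-1": {"owner": "agent-svc"}, "acct-3": {"owner": "agent-svc"}}

changes = diff_snapshots(before, after)
print(changes["modified"])  # acct-1 silently reassigned
print(changes["deleted"])   # acct-2 gone
print(changes["created"])   # acct-3 fabricated
```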
Why detection without recovery is just an expensive alarm
There is a growing consensus that backup vendors need to expand into data security intelligence, and I agree. I argued for it in Part 1. But the way parts of the industry are pursuing this expansion gets the architecture wrong. And architecture, in this case, is the difference between a product that solves the problem and one that just describes it more accurately.
The approach gaining traction: acquire or build a data security posture capability that classifies sensitive data, identifies configuration gaps, surfaces compliance risks, and presents findings in a dashboard. Then hope that someone, somewhere, in some other tool, can do something about what was found.
This is the DSPM model. And as a standalone category, it has a fundamental limitation that no amount of engineering will fix: it can tell you what’s wrong, but it cannot make it right. It can tell you that sensitive health records migrated to an uncontrolled SharePoint folder. It cannot restore the access controls that were supposed to prevent that migration. It can tell you that an AI agent modified forty thousand Salesforce records in a three-minute window. It cannot roll those records back to the state they were in two minutes before the agent started. It can identify that permissions drifted across your Okta environment over the past six months. It cannot restore last Tuesday’s configuration.
Detection without a recovery path is an alarm system with no fire department.
This is why I believe data security intelligence and data protection must live in the same control plane, not as adjacent products that share a vendor logo, but as a single workflow where every detected signal connects directly to a recovery action. The moment you identify that an AI agent corrupted a dataset, you should be able to restore that dataset from the snapshot taken before the agent acted, in the same console, in the same motion. The moment you discover that sensitive data has spread to an application where it doesn’t belong, you should be able to see exactly when it arrived by comparing snapshots and recover the prior state. The moment you detect configuration drift in an identity provider, you should be able to pull the last known good configuration from the backup that captured it.
This only works if the platform that reads the data is the same platform that protects the data. And it only works across your full environment if that platform actually protects the applications where your data lives.
This is the part the industry needs to confront honestly. Most data protection platforms were built to protect infrastructure: virtual machines, file systems, databases, on-premises workloads. Some have added a handful of SaaS connectors. But the applications at the center of the AI agent problem, the Salesforce instances, the Microsoft 365 tenants, the Jira projects, the GitHub repositories, the Okta configurations, are precisely the applications where the largest data protection vendors have the thinnest coverage.
You can acquire the most sophisticated DSPM capability in the world. If you cannot recover the SaaS applications where the problems it identifies actually live, you have built a product that diagnoses the disease but cannot treat it. The dashboard shows the risk. The customer calls support. And the answer is: we can see what went wrong, but the application where it went wrong is outside our recovery scope.
That is not a data protection platform. It is a monitoring tool with a backup product attached.
The architecture that actually solves the problem looks fundamentally different. It starts with the widest possible protection footprint across SaaS, cloud, and on-premises workloads. It reads every backup as it’s taken, classifying sensitive data, tracking permission changes, detecting behavioral anomalies, and monitoring AI agent activity. And for every signal it surfaces, it provides a direct path to recovery: the ability to restore the affected data, the prior configuration, or the known-good state from the same platform, in the same workflow, without handing the customer a finding and wishing them luck.
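Sketched as runnable pseudocode, with every name hypothetical rather than any vendor’s actual API, the contract looks like this: a detection signal is not complete unless it carries a reference to the snapshot that can undo it.

```python
# The detection-to-recovery loop described above, as a shape rather than
# a product API: every signal carries the pre-change snapshot needed to
# reverse it, so response is one motion, not a handoff to another tool.

from dataclasses import dataclass

@dataclass
class Signal:
    app: str                 # e.g. "salesforce"
    detected_at: str         # when the anomaly was observed
    affected_ids: list[str]  # records the agent touched
    pre_snapshot: str        # last snapshot taken before the change window

def restore_from(snapshot_id: str, app: str, record_ids: list[str]) -> None:
    # Placeholder for the platform's restore call. The point is that it
    # lives in the same workflow that produced the signal.
    print(f"restoring {len(record_ids)} records in {app} from {snapshot_id}")

def respond(signal: Signal) -> None:
    # Detection and recovery in the same control plane, the same motion.
    restore_from(signal.pre_snapshot, signal.app, signal.affected_ids)

respond(Signal(
    app="salesforce",
    detected_at="2026-02-03T02:14:00Z",
    affected_ids=[f"acct-{i}" for i in range(40_000)],
    pre_snapshot="snap-2026-02-03T02:00Z",
))
```

The design choice worth noticing is the `pre_snapshot` field: if the platform that detects cannot name the snapshot that undoes, it is describing the problem, not solving it.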
Every signal, a recovery path. Every detection, an action. That’s the standard the industry needs to reach. And the vendors that get there first will be the ones that protected the broadest set of applications before anyone else realized why breadth of coverage was the whole game.
What this moment demands
I want to close with what I think this means for the industry I’ve spent my career in, because I believe we’re at a turning point that will separate the companies that matter from the ones that don’t.
Karpathy made an observation that I keep returning to. He said that in the physical world, when an AI system fails, you might get injured. There are worse outcomes, but they’re bounded. In software, the failure modes are almost unbounded. A security vulnerability, a mass data leak, the exposure of millions of records. These things happen at the speed of computation, not the speed of human error.
The data protection industry was built for bounded failures. A server crashes, you restore from backup. A ransomware attack encrypts your files, you recover from an immutable copy. These are serious events, but they’re discrete. They have a beginning and an end. The tools we built to handle them are good at what they do.
What AI agents introduce is the possibility of continuous, compounding, cross-application failures that happen faster than any human can detect, diagnose, or respond to. An agent that silently alters records across Salesforce, changes permissions in Okta, and modifies configurations in Jira in the same five-minute window creates a damage pattern that no single-application recovery tool can unwind. You need the full picture. You need the before-and-after across every application. And you need the ability to act on what you find, not just report it.
The organizations that will navigate this transition successfully are not necessarily the ones with the biggest security budgets. They are the ones that protected the widest set of applications, because every application you protect becomes a source of intelligence and a point of recovery. They are the ones whose data protection platform can read what it stores, not just retrieve it. And they are the ones that understood, before the first major agent-caused incident made it obvious to everyone else, that the value of data protection is no longer measured by how fast you can restore. It’s measured by how much you can see, how quickly you can understand what happened, and whether you can make it right before the damage compounds.
The agents are here. The incidents have started. The question is no longer whether data protection needs to evolve. It’s whether it evolves fast enough to matter.
The clock is running at machine speed. The answer can’t wait for human time.