As AI gets more capable and more deployed, the failure surface grows with it. Stanford's 2025 AI Index recorded 233 AI-related incidents in 2024 — a record high and a 56.4% increase over 2023. That figure is almost certainly an undercount: it captures the incidents that were reported and catalogued, not the quiet failures that never made it into a public database. The direction of travel, though, is unambiguous.
The widening gap
- Incidents are rising as deployment rises — more systems, more inputs, more ways to fail.
- Standardised safety evaluation and responsible-AI benchmarking lag behind capability; organisations recognise risks faster than they mitigate them.
- Regulation is racing to catch up: the number of US state-level AI laws passed rose to 131 in 2024, up from 49 in 2023.
The pattern underneath all three is the same. We are very good at making models more capable and shipping them faster. We are much slower at building the evaluation, monitoring and governance that would tell us when one of those models is about to do something we'll regret. Capability is a research problem with enormous investment behind it; safety is an operational discipline that has to be built into every deployment by hand. The two are not advancing at the same rate.
The story isn't that AI is dangerous. It's that capability is compounding faster than the practices meant to keep it safe.
Where incidents actually come from
In practice, most failures are not exotic. A model hallucinates a fact and a user acts on it. A prompt-injection attack smuggles instructions through user-supplied text. A system that worked fine in testing drifts as real-world inputs diverge from the evaluation set. An automated action fires on a wrong inference and there's no human between the model and the consequence. These are mundane, and that's the point — they are preventable with ordinary engineering discipline, not heroics.
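The drift case in particular lends itself to a simple automated check. What follows is a minimal Python sketch, not a definitive implementation: it scores a model against a fixed set of labelled cases and flags a drop against a stored baseline. The names `accuracy` and `has_regressed`, the exact-match scoring and the 0.05 tolerance are all illustrative assumptions; a real harness would use whatever metric suits the task.

```python
from typing import Callable, Iterable, Tuple


def accuracy(model: Callable[[str], str], cases: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (prompt, expected) cases the model answers exactly right."""
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one evaluation case")
    correct = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return correct / len(cases)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when the current score has fallen more than `tolerance` below baseline."""
    return (baseline - current) > tolerance


if __name__ == "__main__":
    # Hypothetical stand-in for a real model call.
    def stub_model(prompt: str) -> str:
        return "4" if prompt == "2+2" else "unknown"

    cases = [("2+2", "4"), ("capital of France", "Paris")]
    score = accuracy(stub_model, cases)
    # Assumed baseline of 0.9 recorded at launch; alert if we have slipped.
    if has_regressed(baseline=0.9, current=score):
        print(f"regression: score {score:.2f} is below baseline")
```

Run on a schedule against fresh samples of production traffic, a check like this turns drift into an alert rather than a user complaint.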
The reason they keep happening is a mismatch in incentives and pace. Shipping a capability is rewarded immediately and visibly; the safety work that would have caught the failure is invisible right up until the moment it would have paid off. A team under pressure to launch will, by default, spend its last week polishing the demo rather than red-teaming the failure modes, and the incident lands months later, when the connection back to that decision has faded. Some of the 56.4% rise reflects wider deployment and better reporting, but much of it is what that systematic under-investment looks like in aggregate.
Why the gap is structural, not temporary
It would be comforting to assume the safety gap closes on its own as the field matures. It doesn't, because the two sides scale differently. Capability improvements are largely centralised — a handful of labs make models better and everyone benefits at once. Safety, by contrast, has to be re-implemented in every single deployment: your validation, your monitoring, your human-review thresholds, your logging. There is no central upgrade that makes everyone's production system safer. That asymmetry means the gap is the default state, and only deliberate effort by each team building on top of these models closes it locally.
What responsible teams do
- Treat model output as untrusted: validate, constrain, and never let it act unsupervised on anything irreversible.
- Red-team before launch, monitor after, and keep a human in the loop where stakes are high.
- Log inputs and outputs so an incident can actually be investigated rather than guessed at.
- Build an evaluation harness so you can detect regressions and drift instead of waiting for a user to find them.
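As a rough illustration of the validation, human-checkpoint and logging practices above, here is a minimal Python sketch of a gate that sits between a model and the actions it proposes. Everything named here is hypothetical: the action names, the JSON shape of the model output and the `gate` function itself are assumptions made for the example, not a prescribed design.

```python
import json
import logging
from dataclasses import dataclass
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model_gate")

# Hypothetical policy: which actions exist, and which cannot be undone.
ALLOWED_ACTIONS = {"send_reply", "issue_refund", "delete_account"}
IRREVERSIBLE_ACTIONS = {"issue_refund", "delete_account"}


@dataclass
class Decision:
    status: str  # "executed", "needs_review" or "rejected"
    reason: str


def gate(prompt: str, raw_output: str, execute: Callable[[dict], None]) -> Decision:
    """Validate a model-proposed action before anything is allowed to act on it."""
    decision = _decide(raw_output, execute)
    # Log input, output and outcome together so an incident can be reconstructed.
    log.info(json.dumps({
        "prompt": prompt,
        "output": raw_output,
        "status": decision.status,
        "reason": decision.reason,
    }))
    return decision


def _decide(raw_output: str, execute: Callable[[dict], None]) -> Decision:
    # Treat the output as untrusted: it must parse and name a known action.
    try:
        action = json.loads(raw_output)
    except json.JSONDecodeError:
        return Decision("rejected", "output is not valid JSON")
    name = action.get("name") if isinstance(action, dict) else None
    if not isinstance(name, str) or name not in ALLOWED_ACTIONS:
        return Decision("rejected", "unknown or missing action name")
    # Irreversible actions never run unsupervised; they wait for a human.
    if name in IRREVERSIBLE_ACTIONS:
        return Decision("needs_review", "irreversible action requires human approval")
    execute(action)
    return Decision("executed", "reversible action passed validation")
```

The design choice worth noting is that the default is refusal: anything that does not parse, is not on the allow-list, or cannot be undone stops short of execution, and every decision leaves a log line behind.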
What this means for a business
Safety isn't a launch checkbox; it's an operating discipline that runs as long as the system does. The organisations that come out of this period well will be the ones that resource safety like they resource reliability — with owners, budgets and on-call rotations, not a one-off review before go-live. That is not a tax on speed; it is what lets you ship ambitious things without your name appearing in next year's incident count.
A useful reframe for sceptical stakeholders: every one of those 233 incidents happened to an organisation that almost certainly believed its system was fine right up until it wasn't. The cost of the safety practices that could have caught many of them (validation, monitoring, a human checkpoint on irreversible actions, honest logging) is small and predictable. The cost of the incident is large, unpredictable, and lands at the worst possible time, often with reputational and regulatory tails attached. Framed as risk management rather than virtue, the investment is straightforwardly worth it. The teams that internalise this don't move slower; they move with the confidence that comes from knowing they'll see a problem coming.
If you want a clear-eyed read on the risks in a system you're planning or already running, we can help.
Sources
- Stanford HAI — 2025 AI Index