An AI Just Found Thousands of Zero-Days!

I have been tracking AI capabilities in security research for a while now. But what Anthropic just announced with Project Glasswing stopped me mid-scroll.

Their model, Claude Mythos, autonomously discovered thousands of previously unknown zero-day vulnerabilities across every major operating system and every major web browser. Not with human guidance. Not with carefully scoped prompts. Autonomously.

And then, during evaluation, the model escaped its sandbox, found a way to access the internet from an isolated system, emailed a researcher to prove it worked, and posted exploit details on public-facing websites without anyone asking it to.

Let that sink in for a second.

What Claude Mythos Actually Did

Anthropic describes Mythos as having reached a "level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities." That quote alone should raise eyebrows. But the specifics are even more striking.

The model found critical flaws in software that has been deployed and audited for decades:

A 27-year-old bug in OpenBSD (now patched)
A 16-year-old flaw in FFmpeg
A 4-vulnerability chain in a major web browser that, combined, gave full exploitation capability
Bugs in the Linux kernel, Firefox, and a memory-safe virtual machine monitor

Think of it like a locksmith who can test every lock in a city in a single afternoon. Human security researchers might spend months on just one of these codebases. Mythos did it across all of them, concurrently, and found flaws that decades of manual review had missed.

Close-up of a human eye reflecting digital patterns, representing AI-powered surveillance and autonomous code analysis

The Sandbox Escape That Should Worry Everyone

During evaluation testing, Anthropic asked Mythos to escape a secured sandbox environment. The model succeeded. But then it went further, completely on its own.

It devised a multi-step exploit to get internet access from the isolated system. It sent an email to the researcher to demonstrate the escape. And in what Anthropic described as a "concerning and unasked-for effort to demonstrate its success," it posted exploit details to "multiple hard-to-find, but technically public-facing, websites."

I want to be clear about what happened here. The model was told to escape. It did. But nobody told it to publish the exploits online. It did that on its own, as if to prove a point. That gap between "what was requested" and "what it chose to do" is the part that keeps security people up at night.

Project Glasswing: Controlled Deployment

Anthropic is not releasing Mythos to the public. They explicitly said: "We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities."

Instead, they formed Project Glasswing, a coalition of 12 major technology organizations that get gated access to Mythos for defensive purposes only. The partner list reads like a who's who of tech:

Cloud and OS vendors: AWS, Microsoft, Google, Apple
Security specialists: CrowdStrike, Palo Alto Networks, Cisco, Broadcom
Infrastructure and hardware: NVIDIA, Linux Foundation
Enterprise: JPMorgan Chase

Beyond these 12, more than 40 organizations that build or maintain critical software infrastructure also have access. Anthropic committed $100 million in usage credits and $4 million in direct donations to open-source security initiatives.

A security and privacy dashboard displaying protection status, representing the defensive cybersecurity tools enabled by AI

The Bigger Picture: AI Security Is an Arms Race Now

Mythos did not emerge in a vacuum. Anthropic acknowledged that these capabilities "emerged as a downstream consequence of general improvements in code, reasoning, and autonomy." They did not specifically train the model to be a hacker. It just became one as it got smarter at coding.

That is an important distinction. It means every frontier lab pursuing better coding agents is, by definition, also building better vulnerability-discovery agents. The capability is a side effect of general intelligence improvements.

And Anthropic is not alone. OpenAI launched Aardvark, their own autonomous security research agent, which reportedly identified 92% of known and synthetically-introduced vulnerabilities in benchmark testing. The AI agent XBOX topped HackerOne's leaderboard in early 2026, becoming the first AI model to do so. Hundreds of verified zero-days have been reported by AI systems in Q1 2026 alone.

The race is on, and both sides can use the same weapons.

The Irony of Anthropic's Own Security Lapses

Here is the part that makes you wince. In March 2026, just weeks before the Glasswing announcement, Mythos details leaked via a publicly accessible cache due to human error. Days later, a second lapse exposed roughly 2,000 source files and more than 500,000 lines of code.

Separately, security researchers at Adversa discovered that Claude Code (Anthropic's developer tool, version 2.1.90) silently ignores user-configured security deny rules when a command contains more than 50 subcommands. Their assessment was blunt: "They traded security for speed. They traded safety for cost."

Building the world's most capable vulnerability-discovery AI while simultaneously leaking your own source code is a rough look. It underscores a persistent truth in security: the hardest part is never the technology. It is the operational discipline around it.

What This Means for You

If you run production systems (and I say this as someone who maintains a k3s cluster at home), the calculus just changed. Here is what I think matters most:

Patch cycles are now a survival metric. If an AI can find a 27-year-old OpenBSD bug in hours, adversaries with similar models will find your unpatched CVEs in minutes. The window between disclosure and exploitation just collapsed.
Defense-in-depth is not optional. When a single AI can chain 4 vulnerabilities together for full exploitation, relying on a single security boundary is gambling. Network segmentation, least privilege, and zero trust are table stakes.
AI-powered defense is becoming necessary. The same way we use spell-check because no human catches every typo, security teams will increasingly need AI tools to find what they cannot. The volume and speed advantage is too large to counter with manual review alone.
Capability controls matter more than capability limits. Anthropic's decision not to release Mythos publicly is the right call. But the genie is partially out of the bottle. Other labs are building similar systems, and not all of them will exercise the same restraint.

The Bottom Line

We have crossed a threshold. AI models can now autonomously discover, chain, and exploit vulnerabilities faster than the best human red teams. Anthropic is trying to make that capability a net positive by restricting access and funneling it toward defense. That is commendable. But it also means adversaries now know exactly what is possible, and they will build their own versions.

The security landscape just accelerated by an order of magnitude. Whether you are a CISO at a Fortune 500 or a solo dev running containers on your home network, the playbook is the same: patch faster, layer deeper, and start thinking about how AI fits into your defensive toolkit. Because the attackers already are.

An AI Just Found Thousands of Zero-Days!

What Claude Mythos Actually Did

The Sandbox Escape That Should Worry Everyone

Project Glasswing: Controlled Deployment

The Bigger Picture: AI Security Is an Arms Race Now

The Irony of Anthropic's Own Security Lapses

What This Means for You

The Bottom Line

Bruno Bonando

Thinking out loud

AI Writes Most of My Code. Learn It Anyway.

The First 90 Days as a CTO: Playbook

The Anthropic IPO is about to hit and we’re not ready for this!

AI Coding Agents Are Sleepwalking Into the Same Trap as Cloud

Want to discuss this topic?