Morning Overview

Anthropic’s Mythos AI is now restricted to a handpicked list of governments and utilities — the most capable cybersecurity model ever built, walled off from the public

In May 2026, the UK’s AI Security Institute confirmed something the cybersecurity world had been bracing for: an AI model completed a full, realistic corporate-network intrusion chain without human help. Anthropic’s Claude Mythos Preview solved all 32 steps of a simulated attack scenario called “The Last Ones,” succeeding in multiple runs where every other frontier AI system tested against the same benchmark fell short. Anthropic then locked the model behind a restricted-access program, limiting it to a handpicked group of governments and critical-infrastructure operators. The rest of the security industry is now grappling with a pointed question: what do you do when the most capable offensive cyber tool ever evaluated is one you cannot touch?

What the UK government found

The core evidence comes from the AI Security Institute, an independent body operating under the UK’s Department for Science, Innovation and Technology. AISI built a 32-step cyber range designed to mirror a realistic corporate-network intrusion, covering the full kill chain from initial reconnaissance through lateral movement to data exfiltration. Think of it as a controlled obstacle course that mimics the sequence a skilled human attacker would follow to compromise a real enterprise environment.

Mythos Preview completed the entire sequence in multiple evaluation runs. No other frontier model tested against the same benchmark finished all 32 steps. AISI published its evaluation methodology and findings through official government channels, and the technical design of the cyber range draws on research documented in a related arXiv paper that details token budgets, evaluation scaffolding, and scoring criteria.

AISI did test other frontier AI systems against the same benchmark, but it has not publicly named those models or disclosed how far each one progressed. That gap matters: a competitor that stalled at step 8 tells a very different story about the state of the field than one that reached step 30.

Three coordinated vulnerability disclosures connected to Anthropic’s offensive-capability research have also appeared in the National Vulnerability Database. The entries, CVE-2026-27654, CVE-2026-32316, and CVE-2026-33721, show that the same research thread produced real-world vulnerability findings that were responsibly reported. Whether Mythos Preview discovered those flaws autonomously or served as an assistive tool for human researchers is not specified in the public filings. The distinction is significant: fully autonomous vulnerability discovery is a fundamentally different risk category from AI-augmented human research, with direct implications for audit trails, logging, and attribution.

What we still don’t know

The biggest gap in the public record is the access policy itself. No primary document from Anthropic has surfaced that names the approved governments and utilities, explains the vetting criteria, or lays out the contractual conditions. The restriction to a handpicked group is consistent with the model’s limited availability and with Anthropic’s own Responsible Scaling Policy, which uses an AI Safety Level (ASL) framework to gate deployment of models that cross capability thresholds. But the specific terms of the Mythos Preview program remain undisclosed.

AISI’s findings are institutionally credible, but the institute has not released raw run transcripts or full token-budget logs. Independent researchers cannot yet reproduce the 32-step completion or verify exactly how the model navigated each stage. The cybersecurity community is, for now, relying on AISI’s reputation rather than open replication.

The UK’s National Cyber Security Centre has weighed in on the broader picture. An NCSC blog post states that cyber defenders need to prepare for frontier AI, framing autonomous offensive capability as a near-term operational reality. That guidance aligns with the NCSC’s existing Cyber Essentials framework, which sets baseline security standards for UK organizations and is increasingly positioned as a floor rather than a ceiling.

Why this is different from existing offensive tools

Security professionals have worked with automated offensive tools for years. Frameworks like Metasploit and commercial platforms like Cobalt Strike can chain together exploits, but they depend heavily on human operators to select targets, interpret results, and adapt when defenses intervene. What sets Mythos Preview apart, based on the AISI evaluation, is the model’s ability to autonomously navigate a multi-step intrusion chain that requires decision-making at each stage: scanning, exploiting, pivoting, escalating privileges, and extracting data across a realistic network topology.

That does not mean Mythos Preview is a push-button weapon. The evaluation was conducted in a controlled environment with specific assumptions about network architecture, logging, and defensive posture. Real-world networks are messier, and active defenders add unpredictability that a simulated range cannot fully capture. But the benchmark establishes a new performance ceiling. For the first time, an AI system has demonstrated end-to-end competence across the full attack lifecycle in a setting designed to be hard.

What defenders and policymakers should do now

For cybersecurity teams, the practical response does not require waiting for Anthropic to publish its access list. A model that can autonomously chain 32 steps together is still constrained by the opportunities the target environment presents. Closing off the easiest pathways forces even advanced systems to work harder and increases the chance of detection at each stage.
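The arithmetic behind that last point can be sketched in a few lines of Python. This is an illustrative model, not anything drawn from the AISI evaluation: it assumes every step of the chain carries the same chance of being detected and that steps are independent of one another, which real networks will not satisfy exactly. The function names and the per-step probabilities are hypothetical.

```python
def evasion_probability(p_detect_per_step: float, steps: int) -> float:
    """Chance an intruder clears every step unseen.

    Assumption (illustrative only): each step is detected independently
    with the same probability, so the chances of slipping past multiply.
    """
    if not 0.0 <= p_detect_per_step <= 1.0:
        raise ValueError("p_detect_per_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - p_detect_per_step) ** steps


def detection_probability(p_detect_per_step: float, steps: int) -> float:
    """Chance the intruder is caught at least once across the whole chain."""
    return 1.0 - evasion_probability(p_detect_per_step, steps)


if __name__ == "__main__":
    # Hypothetical per-step detection rates applied to a 32-step chain.
    for p in (0.01, 0.05, 0.10):
        caught = detection_probability(p, 32)
        print(f"per-step detection {p:.0%} -> caught somewhere: {caught:.1%}")
```

Under these simplified assumptions, even modest per-step detection odds compound quickly over a long chain, which is why forcing an attacker through more monitored steps pays off.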

The NCSC’s Cyber Essentials checklist remains a useful starting point: strong access control, disciplined patch management, malware protection, secure configuration, and well-defined network boundaries. Organizations that have treated those basics as optional now face a sharper incentive. An autonomous attacker that can chain exploits does not get tired, does not lose focus, and does not skip steps. Flat networks with unpatched services and exposed credentials are exactly the terrain where such a system would thrive.
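A team that wants to track those five areas across its estate could start with something as simple as the sketch below. It is a minimal illustration, not an NCSC tool: the control labels mirror the list above rather than any official schema, and the `Host` record, the host names, and the `gap_report` helper are all hypothetical.

```python
from dataclasses import dataclass, field

# The five areas named above, as informal labels (not an official NCSC schema).
CONTROLS = (
    "access_control",
    "patch_management",
    "malware_protection",
    "secure_configuration",
    "network_boundaries",
)


@dataclass
class Host:
    """Hypothetical inventory record: a host and the controls it satisfies."""
    name: str
    controls_met: set[str] = field(default_factory=set)


def gap_report(hosts: list[Host]) -> dict[str, list[str]]:
    """Map each non-compliant host to the controls it is missing."""
    report: dict[str, list[str]] = {}
    for host in hosts:
        missing = [c for c in CONTROLS if c not in host.controls_met]
        if missing:
            report[host.name] = missing
    return report


if __name__ == "__main__":
    # Made-up example inventory.
    inventory = [
        Host("web-01", set(CONTROLS)),
        Host("db-01", {"access_control", "malware_protection"}),
    ]
    for name, missing in gap_report(inventory).items():
        print(f"{name}: missing {', '.join(missing)}")
```

How each control is actually verified on a host is the hard part and is deliberately left out; the point is only that gaps become visible once the basics are tracked per machine.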

For policymakers, the questions are harder. Anthropic’s decision to restrict access can be read as responsible stewardship: limiting a highly capable offensive system to trusted actors reduces the immediate risk of misuse. But it also concentrates power in a small, opaque group and leaves the broader security community without hands-on exposure to the tools that may soon define the threat landscape. The result is an information asymmetry where a few organizations can experiment with cutting-edge offensive AI while most defenders rely on secondhand descriptions of what it can do.

That asymmetry feeds into unresolved policy debates. Should there be formal export controls on models that demonstrate end-to-end offensive capability against realistic corporate environments? How should liability work if a restricted model is misused by an approved customer or compromised by a third party? What auditing and red-teaming requirements should apply before such systems are deployed operationally, even by governments? The United States has begun addressing some of these questions through executive orders on AI safety and CISA’s evolving guidance on AI-related threats, but no regulatory framework yet accounts for a model with Mythos Preview’s demonstrated capabilities.

A narrow window, but a clear direction

The public record on Mythos Preview is still thin: one government evaluation, three CVE entries, and a set of policy signals from the NCSC. There are no full run transcripts, no named competitors, and no published access criteria. But what is available points in a consistent direction: autonomous offensive AI has moved from concept to demonstrated capability, validated by an independent government lab with no commercial stake in the outcome.

The techniques Mythos Preview used to clear that 32-step benchmark will not stay unique forever. Other frontier labs are working on similar capabilities, and the competitive pressure to match or exceed this result is already building. Defenders who treat this as a distant, theoretical concern risk finding themselves several steps behind in a race that has already started. The model is behind closed doors today. The capabilities it represents will not stay there.


*This article was researched with the help of AI, with human editors creating the final content.

