Anthropic’s new ‘Mythos’ AI model is so capable at cybersecurity that governments, banks, and utilities now need approval just to use it

An AI model built by Anthropic just forced the United Kingdom to do something it has never done with a piece of software: require banks, power companies, and government agencies to get approval before they deploy it. The model, called Mythos Preview, scored so high on classified cybersecurity evaluations that the UK’s AI Safety Institute concluded it could execute the kind of multi-step network intrusions that previously required a skilled human hacking team. In response, regulators have placed Mythos under a controlled-access regime that treats it less like a product and more like a weapon.

The decision, disclosed in government communications in early 2026, marks the first time a national regulator has imposed deployment restrictions on a specific AI model based on offensive cyber capability alone.

What the UK safety tests actually found

The AI Safety Institute, which operates under the UK’s Department for Science, Innovation and Technology, ran Mythos Preview through two rounds of testing designed to measure how well it could attack, not defend, computer networks.

The first round used capture-the-flag challenges, the same offensive-security puzzles that professional penetration testers use to sharpen their skills and compete against each other. Mythos cleared 73 percent of them, a result the Institute classified as expert-level in its published evaluation.

The second round was far more ambitious. Evaluators dropped Mythos into a simulated corporate network and gave it a 32-step attack scenario, starting from initial access and ending with full compromise of the target environment. Each step represented a distinct offensive action: exploiting a vulnerability, escalating privileges, moving laterally between systems, or exfiltrating data. Across all runs, the model averaged 22 of the 32 steps. In three out of ten attempts, it completed the entire chain from start to finish.
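The arithmetic behind those figures can be sketched in a few lines of Python. This is an illustration only, not part of the Institute's methodology: it assumes the 22-step average was taken over the same ten attempts mentioned above, and the function and variable names are made up for the example.

```python
# Figures reported in the article; the assumption that the average covers
# the same ten attempts is ours, not the Institute's.
TOTAL_STEPS = 32   # steps in the simulated attack chain
RUNS = 10          # attempts
MEAN_STEPS = 22    # average steps completed per attempt
FULL_RUNS = 3      # attempts that completed all 32 steps


def summarize(total_steps, runs, mean_steps, full_runs):
    """Return (mean completion fraction, full-chain rate, mean steps in partial runs)."""
    mean_fraction = mean_steps / total_steps
    full_rate = full_runs / runs
    # Total steps across all runs, minus the steps contributed by full runs,
    # spread over the runs that stopped short.
    partial_steps = mean_steps * runs - full_runs * total_steps
    partial_mean = partial_steps / (runs - full_runs)
    return mean_fraction, full_rate, partial_mean


mean_fraction, full_rate, partial_mean = summarize(
    TOTAL_STEPS, RUNS, MEAN_STEPS, FULL_RUNS
)
print(f"Average share of chain completed: {mean_fraction:.1%}")
print(f"Full-chain completion rate: {full_rate:.0%}")
print(f"Average steps in incomplete runs: {partial_mean:.1f}")
```

Under that assumption, the seven attempts that fell short still averaged about 17.7 of the 32 steps, which suggests that even the failed runs got more than halfway through the chain.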

What unnerved evaluators was not just the completion rate but how the model got there. The Institute’s report noted that Mythos showed initiative in chaining vulnerabilities together, requiring less and less human guidance as the scenario progressed. When one exploit path failed, it pivoted to alternatives without being told to do so.

A supporting research preprint published on arXiv describes the methodology behind both test environments. Beyond the corporate network scenario, the researchers built a separate seven-step range modeled on industrial control systems, the kind of operational technology that runs power grids, water treatment plants, and factory floors. In that setting, Mythos navigated network segmentation, identified exposed control interfaces, and issued commands that, in a real facility, could have altered physical processes like valve positions or pump speeds.

Why more computing power makes the problem worse

One of the most consequential findings in the arXiv preprint involves what happens when you give Mythos more time to think. Researchers scaled the model’s inference-time compute budget from 10 million tokens to 100 million tokens and observed substantial jumps in attack success rates. In practical terms, this means the same model, with no additional training, became a significantly better attacker simply by being allowed to run longer.

Longer runs let Mythos explore alternative exploit paths when an initial approach failed, recover from dead ends, and refine its privilege-escalation techniques without any additional human prompting. The implication is straightforward and troubling: the main bottleneck between a partial breach and a full one is raw computing budget, and that budget gets cheaper every year as cloud prices fall.

For defenders, this introduces a variable that traditional patch cycles and perimeter defenses were never built to handle. A deployment that appears safe under tight compute limits could become far more dangerous if those limits are relaxed, bypassed, or simply outpaced by falling hardware costs. Standard risk assessments, which tend to treat a model’s capability as fixed at the time of evaluation, may not capture this kind of sliding-scale threat.

What Anthropic has and hasn’t said

Anthropic has not publicly detailed what internal safeguards it has placed on Mythos Preview or whether it has imposed its own access restrictions independent of the UK government’s requirements. The company’s Responsible Scaling Policy, published in 2023 and updated since, commits to evaluating frontier models for catastrophic risks, including cyber capabilities, before wide release. But the company has not issued a public statement explaining how Mythos fits within that framework or whether the UK evaluation results triggered internal escalation procedures.

That silence leaves a gap in the public record. Readers and policymakers are left to assess the risk based almost entirely on the government’s findings, without knowing what additional controls the model’s own creator may have already put in place.

The policy response is real but still vague

The approval requirement itself has been confirmed in government communications, but key details remain unpublished. The AI Safety Institute has not disclosed the exact criteria organizations must meet to receive clearance, the full list of covered entities, or the enforcement mechanism for noncompliance. No timeline for finalizing those criteria has been made public.

No bank, utility, or government agency has publicly described how it intends to use Mythos or what defensive applications it might serve. That gap matters because the same capabilities that make the model dangerous on offense could, in theory, make it valuable for automated threat detection, red-team simulation, or vulnerability discovery. An AI that can reliably chain exploits could map out how a real attacker would move through a network, giving defenders a chance to harden the most likely paths before they are used against them.

The National Cyber Security Centre has warned broadly that frontier AI is shifting the threat landscape faster than defenders can adapt, but it has not published a formal risk assessment specific to Mythos. Whether the approval requirement reflects a judgment about this particular model or a broader policy triggered by its timing remains unclear from public records.

There is also an open question about whether restricting access could hurt defenders as much as it constrains attackers. If the same compute-scaling mechanism that boosts offensive performance also improves defensive tasks, like finding vulnerabilities before adversaries do, then limiting access to Mythos could slow down the people trying to protect networks. No published study has tested whether defensive use benefits from inference-time compute scaling at the same rate as offensive use, leaving regulators to make a judgment call without complete evidence.

Why the Colonial Pipeline precedent looms large

UK officials have drawn a direct line between the Mythos findings and the 2021 Colonial Pipeline ransomware attack in the United States. That intrusion, carried out entirely by human operators, shut down fuel distribution across the U.S. East Coast for days and exposed how brittle critical infrastructure can be when a single network is compromised.

The difference now is speed and scale. A human red team can only work so many hours and target so many systems at once. An AI agent like Mythos can, in principle, run continuously across multiple environments, probing for weaknesses at a pace no human team could sustain. If the Colonial Pipeline attack demonstrated what a small group of skilled hackers could do to a national fuel supply, the Mythos evaluation raises the question of what happens when that level of capability becomes available to anyone with enough cloud credits and a willingness to bypass safety controls.

What happens next will shape AI regulation for years

Two questions will determine how this story is remembered. The first is scope: how much of the risk comes from Mythos itself, and how much comes from the broader ecosystem of tools, cloud resources, and vulnerable systems that surround it? A model that scores 73 percent on CTF challenges is dangerous, but it operates in a world where many networks remain poorly patched and poorly segmented. Fixing the model’s access controls without fixing the infrastructure it targets addresses only half the problem.

The second question is precedent. If the UK’s approval requirement works, other jurisdictions will study it. The European Union’s AI Act already provides a framework for regulating high-risk AI systems, and U.S. policymakers have been debating similar authorities. A successful UK model could accelerate those efforts. A botched one, heavy on restriction and light on clarity, could convince other governments that model-specific controls are unworkable.

Either way, Mythos Preview has crossed a line that regulators had been watching for but had not yet seen an AI model reach. The question is no longer whether an AI system can match a skilled human hacker. The UK government’s own tests say it can. The question now is what the world does about it.

*This article was researched with the help of AI, with human editors creating the final content.
