Anthropic’s Mythos model, described in private testing as capable of constructing complex cyber exploits, has been distributed to roughly 150 organizations across more than 15 countries under a controlled-access initiative called Project Glasswing. The rollout raises a pointed question for governments and security teams worldwide: can selective distribution of an offensive-capable AI system generate defensive intelligence faster than it widens the attack surface? Available evidence traces the model’s evaluation to a tiered benchmarking methodology and a UK government research thread tracking rapid advances in autonomous cyber capability, but critical details about access terms, recipient identities, and usage restrictions have not been publicly disclosed.
What is verified so far
The strongest technical anchor for the Mythos rollout is a benchmarking framework called ExploitBench. The academic paper describes a structured method for measuring how well large language model agents perform exploitation tasks using 16 progressively harder flags, with each flag representing a distinct exploitation objective. That laddered design allows researchers to pinpoint exactly where a model’s offensive capability plateaus rather than relying on a single pass/fail score, and it supports reproducible comparisons between systems.
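To make the laddered idea concrete, here is a minimal sketch of how a 16-flag result might be summarized. It is not the paper's scoring code: the `summarize` function, the `LadderSummary` type, and both result lists are hypothetical, invented purely for illustration. The sketch reports how many flags were captured in total and the tier at which a model first stalls.

```python
from dataclasses import dataclass

NUM_FLAGS = 16  # ladder size reported for ExploitBench


@dataclass
class LadderSummary:
    total_captured: int  # flags captured anywhere on the ladder
    plateau: int         # consecutive flags captured before the first miss


def summarize(captured: list[bool]) -> LadderSummary:
    """Summarize one model's run over the flags, ordered easiest to hardest."""
    if len(captured) != NUM_FLAGS:
        raise ValueError(f"expected {NUM_FLAGS} results, got {len(captured)}")
    plateau = 0
    for ok in captured:
        if not ok:
            break
        plateau += 1
    return LadderSummary(total_captured=sum(captured), plateau=plateau)


# Invented results, NOT taken from the paper.
public_model = [True] * 6 + [False] * 10
private_model = [True] * 11 + [False, True] + [False] * 3

public_summary = summarize(public_model)
private_summary = summarize(private_model)
print(public_summary)
print(private_summary)
```

Reporting the plateau alongside the total is one way to show where capability stalls, which a single pass/fail score would hide.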
One of the paper’s central findings is that a private frontier model demonstrates higher-end exploit construction capability compared with publicly deployed models. The private system is able to clear more of the upper-tier flags, particularly those involving multi-step reasoning, tool use, and adaptation to partial failures. The paper does not name this private model, but the performance gap between it and open systems is the basis for concern about what happens when such capability moves beyond a controlled lab. If the private frontier model referenced in ExploitBench aligns with Mythos, as the Project Glasswing context suggests, then the 150-organization distribution represents the first large-scale field exposure of a system that outperforms anything the public has tested on this benchmark.
A separate research thread from the UK AI Safety Institute tracks how quickly autonomous AI cyber capability is advancing. That thread, published on the institute’s website as a blog analysis of capability growth, surfaced through the citation trail from Project Glasswing’s initial update, establishing a direct link between the UK government’s safety research and the controlled distribution program. The AISI post focuses on the pace of improvement in autonomous offensive tools, including scenarios where models chain tools together or operate with limited human oversight. That framing provides institutional context for why a staged rollout, rather than open release, was chosen for Mythos.
Taken together, these two primary records confirm three things: the evaluation methodology is real and peer-reviewable, a private model already outperforms public alternatives on exploitation tasks, and at least one national AI safety body is actively monitoring the speed of these gains in connection with Project Glasswing. They also show that the concern is not speculative; it is grounded in measured performance differences and in government-level attention to the emerging risk profile.
What remains uncertain
No publicly available document lists the roughly 150 organizations said to be receiving Mythos access. The 15-country figure circulates in descriptions of the program’s scope, but available sources do not show which nations are included or how recipient organizations were selected. Without an official registry or even a partial disclosure, independent verification of the distribution’s geographic and institutional reach is not possible at this time. This opacity makes it difficult to assess whether access is concentrated in well-resourced security teams or spread more broadly across industry and academia.
The ExploitBench paper provides a clear methodology for grading exploitation capability, yet it does not publish granular test results tied to a model identified as Mythos. The “private frontier model” label in the paper leaves room for interpretation. It could refer to Mythos, to an earlier Anthropic prototype, or to a system from another developer entirely. Readers should therefore treat the connection between ExploitBench’s top-performing model and the Mythos distribution as strongly implied by the Project Glasswing context but not explicitly confirmed in the paper itself. Any statement that equates the two should be read as an informed inference rather than a documented fact.
Access terms represent another gap. Controlled red-team exercises and security research are the expected use cases, but no official statement details the data-sharing agreements, usage restrictions, or incident-reporting obligations imposed on participating organizations. Whether recipients can share findings with third parties, retain model outputs for long-term analysis, or integrate Mythos into their own defensive toolchains is unknown based on available sources. Similarly, it is unclear whether any technical safeguards (such as usage logging, rate limits, or constrained tool integrations) are enforced at the API level to reduce the risk of offensive misuse.
The UK AISI blog post, while clearly linked to Project Glasswing through its citation trail, does not reproduce the full text of the Glasswing initial update. That means the institutional rationale for the program, any risk thresholds Anthropic set before distribution, and any conditions under which access could be revoked are all drawn from secondary references rather than a primary policy document. The absence of a public governance charter leaves open questions about how Anthropic and its government counterparts would respond if Mythos-derived techniques began appearing in real-world attacks.
How to read the evidence
Two categories of evidence support the Glasswing story, and they carry different weights. The ExploitBench paper is a primary academic source with a versioned methodology, a defined evaluation structure of 16 tiered flags, and a reproducible design. Any claim about how frontier models perform on exploitation tasks can be checked against that paper’s framework. It is the strongest available record for understanding what “cyber-capable” means in measurable terms, and it anchors discussions of Mythos in concrete performance metrics rather than anecdotes.
The AISI blog post sits one step removed. It is an institutional source from a government safety body, which gives it credibility on questions of policy intent and capability trajectory. But because it was surfaced through a citation trail rather than as a standalone disclosure of Project Glasswing, it functions more as contextual confirmation than as a primary record of the program’s structure or terms. Readers evaluating the initiative should weight the ExploitBench methodology more heavily for technical claims and treat the AISI thread as evidence that government researchers view the pace of autonomous cyber capability as a serious concern tied to this specific family of systems.
Absent from the evidence base are several elements that would normally accompany a program of this sensitivity. There is no public threat model describing how Anthropic expects Mythos to be used and misused, no detailed access policy outlining eligibility and revocation criteria, and no incident-reporting framework that would allow external observers to track whether the rollout is generating net defensive benefit. There is also no independent audit or third-party oversight mechanism described in available sources. These omissions do not prove that such safeguards are absent in practice, but they do mean that outside stakeholders must rely on trust rather than verification.
For now, the most defensible reading is that Mythos, evaluated under the ExploitBench framework and monitored in parallel by the UK AISI research thread, represents a meaningful step change in AI-enabled exploitation capability. Project Glasswing is an attempt to channel that capability into controlled hands, but its opaque access structure and limited public documentation make it difficult to judge whether the balance of risk and benefit is being managed effectively. Until more primary records emerge, such as a public governance charter, anonymized evaluation results explicitly tied to Mythos, or an official list of participating sectors, analysts will have to navigate a landscape where the technical evidence is solid, the policy context is suggestive, and some of the most important operational details remain out of view.
*This article was researched with the help of AI, with human editors creating the final content.