Goldman Sachs CEO David Solomon says he is “hyper-aware” of the cybersecurity dangers posed by increasingly capable AI models, and his bank is now working directly with Anthropic to stress-test how advanced language models could be turned against financial infrastructure.
The collaboration, first reported by The Guardian in April 2026, centers on evaluating whether AI agents can execute realistic, multi-step cyberattacks on corporate networks. It comes as governments and researchers are reaching a shared conclusion: the offensive capabilities of frontier AI models are advancing faster than most organizations are prepared to defend against.
What Goldman and Anthropic are actually doing
According to Solomon’s reported remarks, the effort is linked to testing conducted by the UK AI Security Institute, which designed a 32-step simulated attack scenario. That scenario replicates the full chain of actions a hostile AI agent would need to breach a corporate network: initial reconnaissance, vulnerability exploitation, lateral movement across internal systems, and data exfiltration.
The 32-step framework is not unique to this collaboration. According to The Guardian’s reporting, a preprint published on arXiv describes a nearly identical benchmark for measuring how far AI agents have progressed in executing long-horizon offensive operations. The overlap suggests that both the UK government and the academic research community may be converging on similar methods for quantifying AI-driven attack capability.
The Guardian report also references a specific vulnerability, identified as CVE-2026-4747, with a corresponding National Vulnerability Database page. According to the report, the NVD entry includes CVSS scoring and a CISA-ADP assessment, meaning the flaw has been documented, rated for severity, and can be looked up and patched by security teams in their own environments. This grounds the Goldman-Anthropic work in a real, specific weakness rather than a theoretical scenario, though readers should verify the NVD entry directly.
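Verifying that entry is straightforward. The sketch below, written against the public NVD REST API (version 2.0), fetches the record for the CVE named in the report and prints any CVSS v3.1 scores it carries. It assumes the entry exists and that the Python requests library is installed; it is an illustration of how a security team might check the secondhand characterization, not anything Goldman or Anthropic has published.

```python
import requests

# Minimal sketch: query the public NVD REST API (v2.0) for the CVE cited
# in the report. The CVE ID comes from The Guardian's reporting; verify
# the entry exists before trusting secondhand severity characterizations.
NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"
CVE_ID = "CVE-2026-4747"

resp = requests.get(NVD_API, params={"cveId": CVE_ID}, timeout=30)
resp.raise_for_status()
data = resp.json()

if not data.get("vulnerabilities"):
    print(f"{CVE_ID} not found in NVD; treat reported details as unverified.")
else:
    cve = data["vulnerabilities"][0]["cve"]
    # CVSS v3.1 metrics can include entries from more than one source
    # (e.g. NIST and a CISA-ADP assessment), distinguishable by the
    # "source" field on each metric record.
    for metric in cve.get("metrics", {}).get("cvssMetricV31", []):
        cvss = metric["cvssData"]
        print(f"{CVE_ID}: {cvss['baseScore']} ({cvss['baseSeverity']}) "
              f"per {metric.get('source', 'unknown source')}")
```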
The broader threat picture
Goldman’s move does not exist in a vacuum. The UK National Cyber Security Centre published a forward-looking assessment projecting that AI-enabled tools will increase both the scale and speed of cyber operations through 2027. The report warns that AI will sharpen attackers’ ability to exploit known vulnerabilities and that embedding AI into critical systems will widen the attack surface available to adversaries.
Separately, benchmark-oriented research evaluating leading language models on automated exploitation tasks, using components from the DARPA AI Cyber Challenge, found that newer models show measurable progress in carrying out multi-step exploitation sequences. The study’s focus on reproducible, stepwise tasks mirrors the 32-step scenario described in connection with the UK AI Security Institute, reinforcing the pattern: AI models are getting meaningfully better at automating parts of the attack chain that previously required skilled human operators.
For the financial sector specifically, the stakes are acute. Banks sit on vast stores of sensitive customer data, process trillions of dollars in daily transactions, and operate interconnected systems where a breach at one institution can cascade. An AI agent capable of automating even a portion of a sophisticated attack sequence could compress what once took weeks of human effort into hours.
What is still unknown
For all the attention the collaboration has drawn, significant gaps remain in the public record.
Neither Goldman Sachs nor Anthropic has released a joint statement, white paper, or technical disclosure describing the scope or findings of their work together. The details available come from Solomon’s reported remarks and the UK AI Security Institute’s testing, not from a formal partnership announcement. Whether this is a narrow red-team exercise or a broader, ongoing program covering multiple models and threat scenarios is unclear.
The Guardian’s report references an Anthropic model by the name “Mythos,” but that name has not been independently confirmed by Anthropic or Goldman Sachs in any public disclosure. Readers should treat the model designation as sourced to the report rather than as an established product name.
The UK AI Security Institute’s 32-step test results have not been published in full. Whether the AI model under evaluation completed all 32 steps, partially succeeded, or failed at a particular stage is not established in any available source. That distinction matters enormously: an AI that can execute 5 of 32 steps poses a fundamentally different risk than one that can execute 30, particularly if the completed steps cluster around the most dangerous phases of an attack.
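To make that distinction concrete, consider a hypothetical scoring sketch. The phase names, step counts, and weights below are illustrative assumptions, not the Institute's published methodology; the point is only that where completed steps fall in the attack chain can matter more than how many there are.

```python
# Hypothetical illustration (not the Institute's methodology): weight each
# phase of a 32-step chain by how dangerous completing it would be, then
# compare two agents that differ in raw step counts and phase coverage.
PHASE_WEIGHTS = {  # illustrative weights, not sourced values
    "reconnaissance": 1,
    "exploitation": 3,
    "lateral_movement": 4,
    "exfiltration": 5,
}

def weighted_risk(steps_completed: dict[str, int], steps_per_phase: int = 8) -> float:
    """Sum phase weights scaled by the fraction of steps completed in each."""
    return sum(
        PHASE_WEIGHTS[phase] * (done / steps_per_phase)
        for phase, done in steps_completed.items()
    )

# Agent A: 5 of 32 steps, all early-chain reconnaissance.
agent_a = {"reconnaissance": 5, "exploitation": 0,
           "lateral_movement": 0, "exfiltration": 0}
# Agent B: 30 of 32 steps, spanning every phase including exfiltration.
agent_b = {"reconnaissance": 8, "exploitation": 8,
           "lateral_movement": 8, "exfiltration": 6}

print(weighted_risk(agent_a))  # 0.625: early-phase progress only
print(weighted_risk(agent_b))  # 11.75: near-complete chain coverage
```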
The precise CVSS severity score for CVE-2026-4747 has not been detailed in reporting beyond confirming its presence in the NVD. Security teams should consult the official entry directly rather than relying on secondhand characterizations of the flaw’s severity.
Perhaps most notably, no financial regulator has published guidance on AI-cyber risk mitigation frameworks designed for this kind of bank-AI company collaboration. Goldman and Anthropic appear to be operating ahead of any formal regulatory playbook, which means there is no external standard against which to measure the rigor of their approach.
Why this matters beyond Wall Street
The Goldman-Anthropic effort signals a shift in how major institutions are thinking about AI risk. Rather than waiting for regulators to set rules or for a breach to force action, at least one top-tier bank is proactively probing how frontier models could be weaponized against its own systems.
That approach raises its own questions. Other major banks, including JPMorgan Chase, have invested heavily in AI for internal operations but have not publicly disclosed comparable adversarial testing partnerships with AI developers. Whether Goldman’s move pressures competitors to follow suit, or whether similar efforts are already underway behind closed doors, remains to be seen.
For Anthropic, the collaboration serves a different purpose. The company has positioned itself as a safety-focused AI lab, and working with a systemically important bank to identify dangerous capabilities in its own models reinforces that brand. It also generates the kind of real-world stress-testing data that lab-only evaluations cannot replicate.
Regulators will eventually have to decide how to evaluate and standardize these arrangements. Until then, the clearest takeaway from the available evidence is narrower than the headlines suggest: one major bank is testing how a specific, documented vulnerability could be exploited by a state-of-the-art AI model, in a threat environment that national authorities already expect to worsen. The responsible move for the rest of the industry is to engage with the primary data, patch what can be patched, and stop treating AI-driven cyber risk as a problem for the future. It is already here.
This article was researched with the help of AI, with human editors creating the final content.