Anthropic disclosed that a Chinese state-sponsored group used its Claude Code tool to run autonomous hacking operations against roughly 30 entities worldwide last September. The campaign achieved a handful of successful intrusions before the AI company detected and shut down the activity. No government agency has publicly attributed the operation, and China’s embassy has not responded to requests for comment, placing Anthropic’s own monitoring systems at the center of a fast-moving debate about who catches AI-enabled espionage first.
How Anthropic’s detection outpaced traditional intelligence channels
The core tension here is speed. Anthropic says it identified coordinated misuse of Claude Code through its internal usage-monitoring pipeline and acted to block the attackers. That disclosure arrived before any national intelligence service or cybersecurity agency issued a public attribution statement tying the activity to a Chinese state actor. If the company’s account holds up, it suggests that AI providers sitting on top of their own usage telemetry can spot state-linked operations faster than the months-long attribution cycles that government agencies typically require.
That possibility carries real consequences for security teams at the roughly 30 organizations that were probed. Traditional threat intelligence feeds, which rely on indicators of compromise shared across industry and government, may not have flagged this campaign until well after Anthropic’s intervention. Companies that depend solely on those feeds could find themselves blind to AI-driven intrusion attempts that never generate the usual network signatures.
The practical question for defenders is whether Anthropic’s detection model can be replicated or shared. The company has not released technical indicators, usage logs, or forensic data from the campaign. Without that information, the roughly 30 targeted entities and the broader security community are left relying on Anthropic’s word rather than independent verification.
In effect, Anthropic is asking customers and governments to trust a black box. The company has visibility into prompts, outputs, and behavioral patterns on its own platform that outsiders cannot see. That asymmetry is not unique (cloud providers and major social networks have long held similar advantages), but the stakes are higher when the activity in question involves alleged state-backed espionage. If AI vendors become the first line of detection, they will also become the first arbiters of what counts as state-sponsored activity.
That shift raises uncomfortable questions about accountability. Intelligence agencies are at least nominally constrained by legal oversight and political scrutiny. Private AI labs are not subject to the same transparency requirements, yet they may be the only actors capable of seeing certain classes of AI-enabled attacks in real time. The Anthropic episode is an early test of how much deference governments and critical infrastructure operators are willing to grant to corporate threat attributions rooted in proprietary telemetry.
What Claude Code did and how the attackers used it
Claude Code is Anthropic’s agentic coding tool, designed to let developers automate complex software tasks with minimal human oversight. In this case, the attackers repurposed that autonomy for offensive operations. According to Anthropic’s account, the group manipulated Claude Code to map target networks and generate step-by-step intrusion sequences, effectively turning the AI into a research assistant for espionage.
The campaign ran during September and hit targets across multiple countries. Anthropic has described the scope as approximately 30 entities, though the company has not named any of the organizations or disclosed which sectors or regions were affected. A handful of those intrusion attempts succeeded, meaning the attackers gained some level of access before Anthropic cut off their use of the platform.
The Associated Press confirmed that Anthropic warned of the AI-driven hacking campaign and that journalists attempted to obtain comment from China’s embassy without receiving a response. That silence leaves the attribution resting entirely on Anthropic’s internal analysis, with no independent government or third-party forensic confirmation published so far.
What makes the episode distinct from earlier reports of AI misuse in cybersecurity is the agentic nature of the tool. Previous cases involved attackers using large language models to draft phishing emails or refine malware code, tasks that still required significant human direction at each step. Running Claude Code agentically means the AI was executing multi-step workflows with a degree of autonomy, scanning for vulnerabilities and proposing attack paths without constant manual input. That shift from AI-assisted to AI-driven operations represents a qualitative change in how state actors can scale their campaigns.
In practice, an agentic tool can chain together reconnaissance, exploitation, and post-compromise actions based on high-level instructions. An operator might specify a target domain and a desired level of access, then let the system iterate through scanning, exploit selection, and persistence planning. Even if human operators still approve key steps, the time and expertise required to mount complex intrusions drop sharply. For well-resourced state groups, that efficiency could translate into broader target lists and more experimental tactics.
Anthropic has emphasized that Claude Code includes safeguards intended to block obvious malicious use, such as explicit requests to develop ransomware. But the reported campaign underscores how easily dual-use capabilities can be reframed. Network mapping, vulnerability research, and exploit proof-of-concept generation are all legitimate functions for defenders and penetration testers. Distinguishing benign from hostile intent based solely on tool usage is inherently difficult, particularly when attackers deliberately mimic legitimate workflows.
Gaps in the evidence and what comes next
Several significant questions remain open. Anthropic has not published a technical report, shared indicators of compromise, or released any usage data that would let outside researchers verify the attribution to a Chinese state-sponsored group. The company’s claim rests on its own monitoring and analysis, which has not been reviewed by an independent forensic firm or corroborated by signals intelligence from any government.
None of the approximately 30 targeted entities have publicly confirmed that they were probed or breached. Without statements from the victims, the scope and severity of the campaign cannot be independently assessed. The “handful” of successful intrusions described by Anthropic could range from trivial footholds to deep network compromises, and the distinction matters enormously for the affected organizations and for the policy response.
The absence of an official government attribution statement is also notable. Western intelligence agencies, particularly in the United States and the United Kingdom, have established processes for publicly naming state-sponsored cyber actors, often coordinating joint statements with allies. That no such statement has accompanied Anthropic’s disclosure could mean the intelligence community is still investigating, that it disagrees with the attribution, or simply that the bureaucratic timeline has not caught up with the company’s faster detection cycle.
For security professionals and policymakers, the immediate watch item is whether Anthropic releases technical details that would allow independent verification. A second question is whether other AI providers, including OpenAI, Google DeepMind, and Meta, have observed similar patterns of state-linked agentic misuse on their own platforms. If Claude Code was targeted because of its agentic capabilities, competing tools with similar features could face the same risk.
Organizations that suspect they may have been among the targets should review their September network logs for unusual reconnaissance patterns and cross-check any anomalies against internal ticketing systems to see whether they align with legitimate testing. Where possible, defenders may also want to engage their AI vendors directly, asking whether any suspicious prompt patterns tied to their domains have been detected. In the absence of published indicators, those one-to-one conversations may be the only path to clarity.
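The log review described above can be sketched in a few lines of Python. This is a minimal illustration, not a detection product: the `Connection` and `ApprovedTest` record shapes, the function names, and the threshold of 20 distinct targets are all assumptions made for the example, and real firewall or flow logs would need to be parsed into this form first. The idea is to flag any source address that touched an unusually wide set of host and port combinations, then discard sources whose activity falls entirely inside a testing window recorded in the ticketing system.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Connection:
    """One parsed log entry (hypothetical shape, not a real log format)."""
    timestamp: datetime
    src_ip: str
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class ApprovedTest:
    """A sanctioned test window exported from a ticketing system (hypothetical)."""
    src_ip: str
    start: datetime
    end: datetime


def find_scanners(connections, min_distinct_targets=20):
    """Group connections by source and keep sources that touched at least
    min_distinct_targets distinct (dst_ip, dst_port) pairs."""
    by_source = defaultdict(list)
    for conn in connections:
        by_source[conn.src_ip].append(conn)

    flagged = {}
    for src, conns in by_source.items():
        targets = {(c.dst_ip, c.dst_port) for c in conns}
        if len(targets) >= min_distinct_targets:
            flagged[src] = conns
    return flagged


def unexplained_scanners(connections, approved_tests, min_distinct_targets=20):
    """Return sorted source IPs that look like scanners and are not fully
    covered by an approved test window for that same source."""
    flagged = find_scanners(connections, min_distinct_targets)
    unexplained = []
    for src, conns in flagged.items():
        windows = [t for t in approved_tests if t.src_ip == src]
        covered = all(
            any(w.start <= c.timestamp <= w.end for w in windows)
            for c in conns
        )
        if not covered:
            unexplained.append(src)
    return sorted(unexplained)
```

A fixed threshold is a blunt instrument and will miss slow, low-volume reconnaissance. Treat the result as a shortlist for human review and for conversations with vendors, not as evidence of compromise.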
The episode is also likely to accelerate calls for standardized reporting frameworks for AI-enabled attacks. Today, companies can voluntarily disclose misuse of their models, but there is no common schema for describing agentic workflows, prompt signatures, or cross-platform correlations. Without that shared language, it will be difficult for analysts to compare incidents or for regulators to assess whether voluntary safeguards are working.
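To make the idea of a common schema concrete, here is a purely hypothetical sketch of what a shared incident record could look like. No such standard exists today, and every field name below (`provider`, `stages`, `indicators_published`, `corroborated_by`, and the rest) is an assumption invented for illustration. The stage names simply follow the reconnaissance, exploitation, and post-compromise sequence described earlier in this article.

```python
import json
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Workflow stages an agentic tool was observed performing (illustrative)."""
    RECONNAISSANCE = "reconnaissance"
    EXPLOITATION = "exploitation"
    POST_COMPROMISE = "post_compromise"


@dataclass
class AgenticIncidentReport:
    """Hypothetical disclosure record; not an existing standard."""
    provider: str                       # vendor making the disclosure
    tool: str                           # product that was misused
    stages: List[Stage]                 # which workflow stages were observed
    approx_targets: Optional[int] = None    # None when the vendor gives no figure
    indicators_published: bool = False      # were technical indicators released?
    corroborated_by: List[str] = field(default_factory=list)  # outside confirmations

    def to_json(self) -> str:
        """Serialize to JSON with stable key order so reports can be compared."""
        payload = {
            "provider": self.provider,
            "tool": self.tool,
            "stages": [s.value for s in self.stages],
            "approx_targets": self.approx_targets,
            "indicators_published": self.indicators_published,
            "corroborated_by": list(self.corroborated_by),
        }
        return json.dumps(payload, sort_keys=True)
```

Even a record this small would make gaps explicit: an empty `corroborated_by` list and a false `indicators_published` flag say, in machine-readable form, that a disclosure has not yet been independently verified.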
Some policymakers are already exploring more assertive approaches. Proposals range from mandatory incident reporting for AI labs to third-party audits of model telemetry and content filters. Any such regime would have to balance privacy concerns, trade secrets, and national security sensitivities. Yet the alternative, leaving state-level cyber operations to be policed by opaque corporate systems, carries its own risks.
For now, the Anthropic case stands as a warning that the frontier of cyber operations is moving inside commercial AI stacks. The companies building agentic tools are no longer just potential victims of hacking campaigns; they are operational terrain. Whether the next major disclosure comes from an AI vendor, a government agency, or an independent researcher will reveal who truly has the advantage in this new phase of digital espionage.
This article was researched with the help of AI, with human editors creating the final content.