A growing body of academic research shows that the internal architecture of deep neural networks running on NVIDIA GPUs can be reconstructed by remote attackers who never touch the target hardware. The technique, described in a preprint called Mercury, uses measured execution traces to recover model details automatically, raising hard questions about the security of AI workloads in shared cloud environments. The findings sit alongside at least three other independent research efforts that exploit different side-channel signals, from memory access patterns to electromagnetic emissions, to achieve similar results.
What is reported so far
The Mercury preprint, hosted on the arXiv repository, describes an automated remote side-channel attack against NVIDIA deep learning acceleration hardware. The attack is described as not requiring physical proximity or direct access to the victim’s system. Instead, it captures execution traces generated during inference and uses those signals to reconstruct the target model’s architecture without any prior knowledge of the network’s design. The work is significant because it removes the manual reverse-engineering step that earlier side-channel attacks depended on, potentially making the threat more practical for real adversaries.
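To make the idea concrete, the sketch below shows one way a trace-to-architecture pipeline of this general kind could be structured. It is an illustration, not Mercury's actual method: the feature choices (per-kernel duration and bytes moved), the thresholds, and the rule-based classifier are all invented for this example, and the real attack uses far richer signals and learned models.

```python
# Hypothetical sketch of a trace-to-architecture pipeline in the spirit of
# execution-trace attacks: timing features of each GPU kernel invocation are
# classified into layer types, then merged into a layer sequence. All
# thresholds and labels are invented for illustration only.

def classify_kernel(duration_us, bytes_moved):
    """Toy rule-based classifier: label a kernel by its timing profile."""
    intensity = bytes_moved / max(duration_us, 1e-9)  # bytes per microsecond
    if intensity >= 50:
        return "dense"       # memory-heavy fully connected layers
    if duration_us > 500:
        return "conv"        # long, compute-bound kernels
    return "activation"      # short elementwise kernels

def reconstruct_architecture(trace):
    """Collapse a sequence of (duration_us, bytes_moved) samples into layers."""
    layers = []
    for duration_us, bytes_moved in trace:
        label = classify_kernel(duration_us, bytes_moved)
        # Merge consecutive kernels of the same type into one logical layer.
        if not layers or layers[-1] != label:
            layers.append(label)
    return layers

# Simulated trace: two conv blocks, each followed by an activation,
# then a dense head.
trace = [(800, 20_000), (60, 1_000), (700, 15_000), (50, 900), (200, 40_000)]
print(reconstruct_architecture(trace))
# → ['conv', 'activation', 'conv', 'activation', 'dense']
```

Even this toy version shows why the automation claim matters: once trace features map mechanically to layer labels, no manual reverse engineering is needed per target.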
Mercury is not an isolated result. A separate preprint titled NeuroUnlock reports that a deep neural network’s architecture can be recovered even when the model has been deliberately obfuscated. That work targets memory access patterns on an NVIDIA GPU as the leaky side-channel, arguing that standard software protections may fail to conceal structural details from a sufficiently motivated observer. The implication is direct: encrypting model files or hiding configuration parameters may not stop an attacker who can monitor how the GPU behaves while running the model.
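The following toy calculation illustrates why encrypting weights at rest does not hide structure at run time. It is not NeuroUnlock's technique; it simply assumes a hypothetical dense layer with fp16 weights (2 bytes each) and a known input width, and shows how the hidden width falls out of observed memory traffic alone.

```python
# Illustrative only: even with encrypted model files, the volume of weight
# data a GPU streams per layer during inference can reveal layer dimensions.
# Assumes fp16 weights (2 bytes each) and a known input width; both are
# assumptions made for this sketch, not facts from the preprint.

def infer_hidden_width(bytes_read, input_width, bytes_per_weight=2):
    """Estimate a dense layer's output width from observed weight traffic."""
    return bytes_read // (input_width * bytes_per_weight)

# A dense layer mapping 768 inputs to 3072 outputs streams 768*3072*2 bytes.
print(infer_hidden_width(768 * 3072 * 2, input_width=768))  # → 3072
```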
A third line of research, EZClone, moves the threat from architecture snooping toward full model cloning. That preprint describes how aggregate GPU execution profiles can be used to predict a DNN’s architecture and then improve model extraction through a technique the authors call shape distillation. Where Mercury and NeuroUnlock focus on recovering what a network looks like, EZClone aims to bridge the gap to replicating what it does, a distinction that matters for anyone whose competitive advantage depends on proprietary model design.
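The generic technique that shape distillation accelerates is query-based model extraction: once the attacker knows or guesses the victim's architecture, they train a matching "student" on the victim's own outputs. The sketch below reduces this to a deliberately trivial case, a black-box linear model recovered from two queries; EZClone's actual pipeline targets deep networks and GPU profiles, not this simplification.

```python
# Minimal illustration of query-based model extraction. The "victim" is a
# toy linear model the attacker can only query as a black box; knowing its
# shape (linear, one input) makes recovery exact from two queries. All
# names and values here are invented for the example.

def victim(x):
    """Black-box model the attacker can only query, never inspect."""
    return 3.0 * x + 1.0

def extract_linear(query_points):
    """Fit slope and intercept of a linear victim from queries alone."""
    x0, x1 = query_points
    y0, y1 = victim(x0), victim(x1)
    slope = (y1 - y0) / (x1 - x0)
    intercept = y0 - slope * x0
    return slope, intercept

slope, intercept = extract_linear((0.0, 1.0))
print(slope, intercept)  # → 3.0 1.0
```

The point of the example is the leverage architecture knowledge provides: with the shape fixed, the remaining work is ordinary curve fitting, which is why closing the architecture side channel matters even when weights stay secret.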
The most recent addition to this research thread is Kraken, a preprint that extends electromagnetic side-channel attacks to deep neural networks and includes exploratory discussion of leakage from large language models. Kraken considers both near-field and far-field measurement scenarios, suggesting an attacker may not necessarily need to be in the same room, or even the same building, as the target hardware. The far-field dimension is especially relevant for data centers where tenants share physical infrastructure but assume electromagnetic isolation.
What remains uncertain
None of these preprints have been accompanied by public statements from NVIDIA addressing specific mitigation strategies for the side-channel vulnerabilities they describe. Whether NVIDIA has implemented hardware-level countermeasures, or plans to, is not confirmed by any source in the current reporting. Without that information, it is unclear how exposed current-generation data center GPUs remain to these attacks in practice.
There is also no publicly documented case of these techniques being used outside a controlled research setting. All four preprints describe laboratory demonstrations or simulations rather than incidents of real-world model theft. That gap matters because academic proof-of-concept attacks often depend on conditions, such as low-noise environments or specific firmware versions, that may not hold in production deployments. At the same time, the absence of documented exploitation does not mean it has not occurred; organizations that lose proprietary models through side channels may not detect the theft at all.

The business impact on AI companies remains speculative. No affected firm, whether a cloud provider or a model developer, has released data quantifying losses tied to side-channel model extraction. Analyst commentary in secondary reporting has suggested the risk is growing, but those assessments lack the empirical grounding that would make them reliable. Until an organization publicly attributes a breach to one of these techniques, the economic dimension of the threat stays theoretical.
Institutional guidance is similarly thin. ArXiv, which is supported by member institutions and operated through Cornell University’s infrastructure, does not appear to have published specific protocols for handling security-sensitive AI research disclosures. The platform’s role as a preprint server means these papers are available without peer review, which accelerates public awareness but also means the claims have not yet been subjected to formal academic scrutiny. ArXiv’s reliance on community donations further underscores that it is not a dedicated vulnerability coordination body with the resources or mandate to manage responsible disclosure across the global AI ecosystem.
How to read the evidence
All four papers (Mercury, NeuroUnlock, EZClone, and Kraken) are primary technical works published as preprints. They contain original experimental data and methodology rather than secondhand analysis. That makes them the strongest available evidence for the specific attack vectors they describe. However, preprint status means the results have not passed peer review, and readers should treat the claimed success rates and attack conditions with appropriate caution until independent replication or journal publication confirms them.
The research collectively points to a structural weakness in how GPU-based AI inference leaks information through observable physical and computational signals. Each paper attacks a different signal type: Mercury uses execution traces, NeuroUnlock targets memory access patterns, EZClone analyzes aggregate execution profiles, and Kraken captures electromagnetic emissions. The diversity of attack surfaces suggests the problem is not a single bug that a patch can fix but rather an inherent property of running neural networks on general-purpose accelerators.
One assumption that deserves scrutiny is the common framing that model architecture is the primary secret worth protecting. In practice, a model’s value often depends as much on its training data, fine-tuning recipes, and deployment-specific optimizations as on its layer structure. Recovering architecture through side-channels is a necessary first step for cloning, but it may not be sufficient to reproduce a model’s real-world performance. EZClone’s shape distillation approach begins to address that gap, yet the distance between extracting a skeleton and producing a competitive replica remains significant for large, heavily tuned systems.
The Kraken preprint’s inclusion of LLM leakage discussion is worth watching closely. Large language models represent the highest-value targets for model theft today, and if electromagnetic side-channels can reliably expose their structure or behavior, the incentive for attackers will be substantial. At the same time, Kraken’s experiments are early and limited in scope. The paper does not demonstrate theft of a frontier-scale LLM, and it is unclear how well the techniques would scale to models that span many GPUs, use aggressive parallelism strategies, or run inside heavily virtualized cloud environments.
Readers should also consider the role of the publication venue. Because these works are hosted on a preprint server that emphasizes rapid dissemination, they bypass the slower filters of journal review and conference rebuttal. ArXiv’s own help documentation describes its function as a distribution platform, not an arbiter of correctness. That makes it valuable for surfacing emerging threats quickly, but it also places more responsibility on security teams, cloud operators, and independent researchers to validate claims before redesigning infrastructure or policies around them.
What comes next
From a defensive standpoint, the research highlights an uncomfortable reality: many current protections for AI intellectual property focus on the static artifacts of a model—its weights, configuration files, and source code—rather than the dynamic behavior of the hardware that runs it. The side-channel attacks described in Mercury, NeuroUnlock, EZClone, and Kraken sidestep encryption and access control entirely by observing how GPUs execute workloads. That suggests future mitigation will need to include noise injection, scheduling randomization, or architectural changes that reduce the information content of observable signals, not just stronger file-level security.
Cloud providers and large model developers face a particular challenge. Multi-tenant GPU clusters are designed for efficient resource sharing, not strict isolation of physical side-channels. If even a fraction of the attack techniques in these preprints prove robust under real-world conditions, operators may need to reconsider how they colocate high-value workloads, how they expose low-level performance counters, and whether they can safely offer features like detailed profiling to untrusted tenants.
For now, the evidence base is narrow but consistent: multiple independent groups, using different methodologies and focusing on different leakage channels, report that they can infer meaningful details about neural networks running on NVIDIA hardware without direct access to the models themselves. Until those claims are either refuted by replication failures or blunted by publicly documented hardware countermeasures, organizations that depend on proprietary AI systems should treat side-channel exposure as a live, if still largely theoretical, risk.
*This article was researched with the help of AI, with human editors creating the final content.