Researchers at Peking University have documented a striking disconnect between journal policies on artificial intelligence and what scientists actually disclose in their papers. Their analysis of more than 5.2 million papers published between 2021 and 2025 across 5,114 journals found that roughly 70% of those journals now require authors to declare AI assistance, yet actual disclosure rates remain vanishingly low. The finding points to a systemic transparency failure that threatens to erode trust in published science at a time when AI writing tools are spreading rapidly through academia.
Millions of Papers, Almost No Disclosure
The study by Yongyuan He and Yi Bu at Peking University is one of the largest empirical efforts to measure the gap between policy and practice. After examining 5,114 journals and more than 5.2 million papers, the researchers concluded that most AI policies amount to disclosure requirements rather than outright bans. Yet the volume of papers that actually acknowledge AI assistance remains a small fraction of total output, even as linguistic detection tools suggest AI-generated text is present in a growing share of submissions.
That mismatch raises a pointed question: are disclosure mandates functioning as genuine safeguards, or are they simply box-checking exercises that journals adopt to appear responsible? A separate bibliometric survey mapped publisher instructions on generative AI use, cataloging a range of approaches from disclosure requirements to authorship constraints to outright prohibitions. The variety of policies itself may be part of the problem. When rules differ sharply from one journal to the next, authors face inconsistent expectations, and enforcement becomes nearly impossible to standardize.
Telltale Phrases That Give AI Away
While policy debates play out in editorial boardrooms, independent researchers have been building a detection record from the ground up. Alex Glynn compiled a dataset of suspected undeclared usage in 768 published academic works, identified through idiosyncratic artifacts that large language models leave behind. These artifacts include phrases no human researcher would plausibly write in a peer-reviewed paper, and they appear across publications from well-known publishers and respected outlets, suggesting that undisclosed AI use is not confined to fringe journals.
Reporting by Nature staff documented hundreds of flagged papers containing telltale chatbot phrases such as “as an AI language model” and “as of my last knowledge update.” In some cases, those phrases were later quietly removed from published versions, suggesting that authors or publishers recognized the problem after the fact but chose silent correction over transparent retraction. That pattern of stealth editing, rather than open acknowledgment, deepens concerns about how seriously the research community treats undisclosed AI use and how often problematic text may be sanitized without leaving a visible trail.
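The kind of screening these efforts rely on is conceptually simple. The sketch below is a minimal, illustrative Python example of a phrase-based check for chatbot boilerplate; the phrase list, function name, and sample text are hypothetical and do not represent Glynn's dataset or any publisher's actual tooling.

```python
# Illustrative phrases of the kind reported in flagged papers; real screening
# efforts rely on larger, curated lists and manual review of each match.
SUSPECT_PHRASES = [
    "as an ai language model",
    "as of my last knowledge update",
    "regenerate response",
]

def flag_suspect_phrases(text: str) -> list[str]:
    """Return any suspect phrases found in a manuscript (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in SUSPECT_PHRASES if phrase in lowered]

if __name__ == "__main__":
    sample = ("As an AI language model, I cannot verify the clinical outcomes "
              "described above, but the results appear consistent.")
    print(flag_suspect_phrases(sample))  # ['as an ai language model']
```

A match is not proof of misconduct on its own, which is why detection projects pair automated flags with human inspection of the published text.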
Retractions Signal Real Consequences
The stakes are not hypothetical. PLOS ONE issued a formal retraction for a 2024 article on blended learning and traditional instruction, originally published as PLoS ONE 19(3): e0298220, with the retraction recorded as PLoS ONE 19(4): e0302484. The notice cites serious concerns about the integrity of the work, including issues linked to AI-generated content and the reliability of the underlying scholarship. In this case, the journal opted to remove the article from the citable record rather than rely on corrections alone, underscoring that AI-related breaches can rise to the level of full withdrawal.
Radiology Case Reports took a similar step, issuing a removal notice for a case report on managing an iatrogenic portal vein and hepatic artery injury in a four-month-old patient, originally published in volume 19, pages 2106 through 2111, with the removal notice appearing at 19(8):3598. Here, too, questions about the authenticity and reliability of the text contributed to the decision to excise the article from the literature. These cases illustrate a pattern that extends well beyond any single discipline: when AI artifacts slip into clinical case reports or education research alike, the contamination risk touches fields where published findings directly shape medical decisions and classroom practices.
Why Researchers Stay Silent
If disclosure policies exist and enforcement actions are visible, why do so many researchers still avoid acknowledging AI assistance? A recent review of disclosure behavior, published in October 2025, found that self-reported AI usage patterns across multiple fields were sharply inconsistent with what objective detection studies revealed in the same bodies of work. In other words, researchers report far lower rates of AI use than the linguistic evidence in their own papers would suggest. The gap suggests that non-disclosure is not simply an oversight but a deliberate choice shaped by professional incentives and perceived risks.
Research from the University of Arizona offers one explanation: disclosing AI use can backfire. The findings, published in May 2025, showed that honesty about AI assistance in tasks like writing cover letters can trigger negative judgments from evaluators, who tend to rate disclosed AI-supported work as less impressive or less authentic than ostensibly “human-only” output. Applied to academic peer review, this dynamic creates a perverse incentive structure: authors who disclose risk having their work viewed as less original or less credible, while those who stay quiet face minimal consequences unless a detection effort happens to flag their paper.
Policy Gaps Demand Structural Fixes
The conventional assumption among publishers has been that clear policies will produce honest behavior. The Peking University analysis shows that this assumption is flawed: even where journals explicitly require disclosure, compliance remains low and undeclared AI usage appears to be widespread. Combined with the heterogeneous guidance documented across major publishers, the result is a fragmented regulatory landscape in which authors can easily rationalize silence or cherry-pick the most permissive interpretation of the rules. The problem is not simply a lack of instructions; it is the absence of credible incentives and verification mechanisms to back those instructions up.
Addressing this gap will require structural changes rather than incremental tweaks. Journals could move toward standardized disclosure templates that distinguish between different kinds of AI assistance (such as language polishing, data analysis, or figure generation), so that authors can report use without fearing that any mention of AI will be treated as misconduct. Editorial workflows might incorporate routine checks for obvious chatbot artifacts, not as a punitive dragnet but as a quality-control step akin to plagiarism screening. Funding agencies and institutions, for their part, could explicitly recognize responsible AI use as compatible with research integrity, signaling that transparency will be rewarded rather than punished.
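To make the idea of a standardized disclosure template concrete, the sketch below shows one hypothetical way such a declaration could be structured as data. The field names and categories are assumptions for illustration; no publisher template with this exact schema is described in the reporting above.

```python
from dataclasses import dataclass

# Hypothetical categories a standardized template might distinguish,
# mirroring the kinds of assistance named in the paragraph above.
ASSISTANCE_TYPES = ("language_polishing", "data_analysis", "figure_generation")

@dataclass
class AIDisclosure:
    """One AI-assistance declaration attached to a manuscript submission."""
    tool_name: str               # the model or product used
    assistance_type: str         # one of ASSISTANCE_TYPES
    description: str             # brief note on how the tool was used
    human_verified: bool = True  # authors confirm they reviewed the output

    def __post_init__(self):
        if self.assistance_type not in ASSISTANCE_TYPES:
            raise ValueError(f"Unknown assistance type: {self.assistance_type}")

# Example declaration accompanying a submission
disclosure = AIDisclosure(
    tool_name="generic LLM assistant",
    assistance_type="language_polishing",
    description="Grammar and phrasing edits on the introduction and discussion.",
)
```

A structured form like this would let editors and reviewers see at a glance what kind of help was used, rather than treating every mention of AI as a single undifferentiated red flag.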
Ultimately, the evidence from millions of articles, retractions in high-stakes fields, and behavioral studies of disclosure all point in the same direction: voluntary honesty under vague, inconsistent rules is not enough. Without coordinated standards, credible enforcement, and cultural norms that treat transparent AI use as a professional obligation, rather than a liability, the disconnect between policy and practice will continue to widen. The scientific community now faces a choice between quietly tolerating that gap or building the infrastructure needed to close it before trust in the literature frays beyond repair.
*This article was researched with the help of AI, with human editors creating the final content.*