
Artificial intelligence was supposed to accelerate discovery, but a growing number of researchers say it is flooding science with junk. Instead of careful experiments and hard-won insights, they describe a rising tide of auto-generated manuscripts that look like scholarship yet collapse under basic scrutiny.
What is emerging is not just a quality problem but a crisis of trust, as academics warn that AI-written “research” is eroding confidence in peer review, distorting the record and making it harder to tell signal from noise. I see a field that risks losing its own sense of reality if it cannot distinguish rigorous work from what experts now bluntly call slop.
How AI slop became a defining problem for research
The core complaint from specialists is not that artificial intelligence exists in the lab, but that it now shapes the scientific literature itself in ways that are shallow, repetitive and often wrong. In technology and computer science, where AI tools are closest to the subject matter, academics describe a wave of papers that recycle the same architectures, benchmarks and buzzwords with little genuine novelty, a pattern that has led some to argue that artificial intelligence research has a slop problem from top to bottom. One detailed account of the trend notes that even within AI itself, senior figures see a culture of “vibe coding,” where models are tweaked until they appear to work without clear hypotheses or robust evaluation, and where the real state of the research is increasingly hard to judge behind the glossy claims embedded in conference submissions and preprints.
Outside computer science, the same pattern is spreading as generic language models are pointed at every discipline from medicine to management studies. I see manuscripts that follow identical templates, padded with boilerplate literature reviews and synthetic citations, yet offering no new data or theory. The result is a research ecosystem where volume is rewarded over substance, and where the line between genuine inquiry and automated pastiche is increasingly blurred, a dynamic that critics now describe as the normalization of AI slop in the very venues that once set the standard for rigor.
Experts’ blunt verdict: “complete slop”
When senior researchers describe AI-written manuscripts as “complete slop,” they are not reaching for a metaphor; they are giving a technical assessment of work that fails basic tests of method, originality and coherence. In one widely discussed account, experts reviewing AI-generated submissions reported papers that stitched together plausible-sounding text with fabricated references, misused statistics and superficial experiments, leading to the blunt headline that AI “research” papers are complete slop, experts say, a phrase that has since become shorthand for a broader collapse in standards.
What I hear from reviewers is that these manuscripts often look polished at first glance, with tidy abstracts, neat figures and exhaustive reference lists, yet fall apart when anyone checks whether the cited work exists or whether the described experiments could actually have been run. The slop label reflects that mismatch between surface and substance, a sense that the text has been optimized for passing automated checks and overworked editors rather than for advancing knowledge, and that the people deploying these tools are more interested in padding CVs than in answering real questions.
Imaginary journals, fabricated citations and the hallucination problem
One of the most alarming symptoms of AI slop is the sudden appearance of references to journals, conferences and articles that do not exist. Librarians and information specialists report a surge in requests for phantom publications, as readers try to track down citations that were confidently invented by language models and then copied, unchecked, into manuscripts. A detailed investigation describes how the International Committee of the Red Cross warned that artificial intelligence models are making up research papers, and how this has spurred record numbers of requests for imaginary journals.
In practice, this means that a clinician reading a paper on a new treatment, or an engineer checking a claimed breakthrough, may be relying on a chain of citations that collapses into thin air. I see this as more than an annoyance; it is a direct attack on the scaffolding of scholarship, where the ability to trace claims back through verifiable sources is what allows science to correct itself. When AI tools hallucinate references and authors fail to verify them, the literature becomes a hall of mirrors, and the burden of sorting fact from fiction shifts from authors to already overstretched librarians and readers.
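None of that burden can be automated away entirely, but some of the triage can. As a rough, hypothetical sketch rather than a workflow described in any of the reporting above, a librarian or reader could first check whether a manuscript’s DOIs resolve at all: the short Python example below queries the public Crossref API for each DOI and flags those that return nothing, treating a missing record as a prompt for manual follow-up rather than proof of fabrication. The example DOIs and the helper name are illustrative assumptions.

```python
import requests

def check_dois(dois, timeout=10):
    """Flag DOIs that do not resolve in Crossref's public metadata API.

    A missing record is only a signal for manual follow-up: legitimate
    DOIs registered with other agencies (e.g. DataCite) can also be
    absent from Crossref.
    """
    suspect = []
    for doi in dois:
        url = f"https://api.crossref.org/works/{doi}"
        try:
            resp = requests.get(url, timeout=timeout)
        except requests.RequestException:
            suspect.append((doi, "network error"))
            continue
        if resp.status_code != 200:
            suspect.append((doi, f"HTTP {resp.status_code}"))
    return suspect

if __name__ == "__main__":
    # Hypothetical DOIs pulled from a manuscript's reference list.
    for doi, reason in check_dois(["10.1038/nature12373", "10.9999/fake.2025.001"]):
        print(f"Could not verify {doi}: {reason}")
```

Even a crude pass like this only narrows the search; a reference can resolve and still misrepresent what the cited paper says, which is why the human checking it purports to replace still matters.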
Publishers, profits and the business model of slop
Behind the flood of low-quality AI manuscripts sits a publishing system that has powerful incentives to accept as much content as possible. Large commercial publishers earn staggering profits from article processing charges and subscription bundles, while outsourcing much of the quality control to unpaid reviewers who are now expected to spot not only traditional errors but also AI-generated nonsense. A searching examination asks whether the staggeringly profitable business of scientific publishing is bad for science, and argues that the current model makes it harder to spot low-quality papers, a concern that raises the broader question of whether the business of scientific publishing is broken.
In that environment, AI slop is not an accident; it is a feature of a system that rewards volume and speed. I see journals that advertise rapid turnaround and high acceptance rates, then quietly rely on overburdened editors and superficial checks to move manuscripts through the pipeline. When the marginal cost of generating another AI-written paper is close to zero, and the marginal revenue from publishing it is significant, the result is a marketplace where slop is profitable, and where the reputational damage is spread across the entire scientific community rather than borne by the companies that cash in.
Script kiddies, “prompt patsies” and the human drivers of AI slop
It is tempting to blame the models, but the real engine of AI slop is human behavior. Commentators who track procurement and technology trends describe a new class of users who lean on generative tools to churn out reports, white papers and pseudo-academic articles with minimal expertise. One sharp critique points to script kiddies, “prompt patsies” who are not prompt engineers in any meaningful sense, and consultants and analysts with no deep domain knowledge as key contributors to the deluge, arguing that they are inundating organizations with derivative content that adds little value, a pattern analyzed in a piece asking why we are inundated by AI slop.
In academia, the same archetypes appear in the form of students, early-career researchers and even senior authors who see AI as a shortcut to publication. I have seen drafts where the methodology section reads like a generic template, the discussion repeats clichés about “future work,” and the only real effort went into prompting a model and lightly editing the output. The problem is not that these users are evil; it is that the incentives around them reward quantity, buzzword compliance and surface polish, and AI tools make it trivial to hit those marks without doing the underlying intellectual work.
“Research slop is human slop”: the deeper cultural failure
Some of the most thoughtful critics argue that AI is not creating slop so much as amplifying habits that were already present in research culture. One commentator puts it bluntly: research slop is human slop, and the arrival of generative tools simply makes existing failures more visible at scale. In a detailed response to recent articles on AI research tools and the future of research, Noam Segal notes that Antin and Holbrook themselves acknowledge that generative tools make existing failures more visible at scale, and he uses that concession to argue that the real issue is how people design studies, interpret data and chase prestige, a perspective laid out in the essay titled Research slop is human slop.
I find that framing useful because it shifts the focus from tools to norms. If a lab already cuts corners, cherry-picks results or treats publication as a numbers game, then giving it a language model will predictably produce more of the same, just faster. Conversely, groups that prize transparency, preregistration and open data tend to use AI as a drafting aid or coding assistant while keeping humans in charge of the reasoning. The technology magnifies whatever culture it encounters, which means that cleaning up AI slop ultimately requires confronting the human slop that came first.
Peer review under strain and the physicists’ warning
Nowhere is the impact of AI slop felt more acutely than in peer review, where volunteers are asked to sift through an ever-growing stack of submissions. In physics, where preprints and rapid dissemination are the norm, researchers are already sounding the alarm that there are simply too many papers, not just to review, but also to read and use. One widely shared discussion among physicists captures this mood, with a commenter stating that the grim truth is that there really are just too many papers and warning that this is especially true in fields where AI tools are used to generate text, a sentiment recorded in a thread on how physicists are split on AI use in peer review.
As AI-written manuscripts swell the pipeline, reviewers face a choice between spending hours disentangling synthetic prose and fabricated citations, or skimming and hoping that obvious problems will surface later. I have spoken to scientists who now decline review requests because they suspect the paper was drafted by a model, and to others who quietly paste sections into their own AI tools to summarize or check for inconsistencies. That feedback loop, where AI is used to police AI, risks turning peer review into a procedural formality rather than a substantive check, and it leaves early-career researchers in particular wondering whether the system can still recognize and reward careful work.
Inside the AI labs: “vibe coding” and the illusion of progress
Within AI research itself, some of the harshest criticism comes from insiders who see a widening gap between the rhetoric of breakthroughs and the reality of incremental, poorly understood tweaks. One researcher quoted in detailed coverage of the field says, “I’m fairly convinced that the whole thing, top to bottom, is just vibe coding,” referring to the practice of adjusting models until they appear to work without clear theoretical grounding. That same reporting notes that in some cases AI tools are used to draft entire manuscripts, and that one reviewer, Zhu, said he suspected many papers were written with AI, concerns laid out in an account of how physicists and computer scientists see AI-written papers.
I read that phrase, “vibe coding,” as a diagnosis of a deeper methodological drift. When models are so complex that even their creators struggle to explain why they work, it becomes tempting to treat performance metrics as the only arbiter of truth and to let AI tools generate the narrative that wraps around them. The result is a literature full of confident claims about alignment, generalization and emergent abilities, but thin on falsifiable hypotheses or reproducible protocols, a pattern that makes it easier for slop to pass as innovation because the standards for what counts as understanding have quietly eroded.
Societal stakes: from model collapse to public distrust
The consequences of AI slop are not confined to academic careers or journal reputations; they reach into how societies understand reality itself. Analysts at Khazanah Research Institute warn that the greatest risk in today’s information environment is not simply false information but losing faith in what is true, and they argue that AI-generated content can accelerate that loss by flooding public discourse with plausible but inaccurate narratives that crowd out careful reporting and scholarship, a concern explored in their analysis of AI slop, society and model collapse.
In science, that erosion of trust can be subtle but profound. When clinicians, policymakers or journalists encounter retracted studies, fabricated citations or AI-written reviews that gloss over uncertainty, they may start to treat the entire research enterprise as just another content stream, interchangeable with social media feeds and corporate white papers. I see a real risk that if slop continues to spread unchecked, the public will not only doubt specific findings but will also lose confidence in the idea that careful, cumulative inquiry can still deliver reliable knowledge, a loss that would be far harder to repair than any single flawed paper.
Regulators, platforms and the scramble to respond
As AI slop spills beyond academia, regulators and platforms are being forced to react. In technology policy circles, the issue is now framed alongside concerns about data scraping and model training, with authorities scrutinizing how companies like Google use online content for AI models and how that, in turn, shapes the information environment. Reporting on the EU opening an investigation into Google’s use of online content for AI models, and on Australia launching its own probes, sits alongside warnings from experts that artificial intelligence research has a slop problem that complicates oversight.
I see a parallel scramble inside universities, where ethics boards and research offices are rushing to draft policies on AI use in writing and analysis. Some institutions now require authors to disclose when they have used language models, while others are experimenting with automated detectors that try to flag AI-generated text, even though those tools themselves are prone to error. The risk is that policy will lag far behind practice, and that by the time formal rules catch up, the norms of cutting and pasting from models into manuscripts will be so entrenched that rolling them back will feel like an attack on productivity rather than a defense of integrity.
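To see why those detectors are so easy to fool, it helps to look at the crude signals many of them lean on. The sketch below is a toy illustration, not any institution’s actual detector: it scores a passage by its perplexity under the publicly available GPT-2 model, on the rough assumption that very predictable text is more likely to be machine generated. The model choice, the threshold and the sample sentence are all assumptions made for demonstration, and the approach misfires easily, since polished human prose can score low and lightly edited AI output can score high.

```python
# Toy perplexity-based "AI text" heuristic, illustrating why such
# detectors are unreliable rather than how any real product works.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average per-token perplexity of the text under GPT-2."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

def looks_machine_written(text: str, threshold: float = 30.0) -> bool:
    # The threshold is arbitrary; in practice there is no clean cutoff,
    # which is exactly why these tools produce false positives.
    return perplexity(text) < threshold

if __name__ == "__main__":
    sample = "The results demonstrate the effectiveness of the proposed method."
    print(perplexity(sample), looks_machine_written(sample))
```

A single scalar score compressed into a yes-or-no verdict is a fragile basis for accusing an author of misconduct, which is why policies that hinge on detector output alone tend to punish the wrong people.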
Can science dig itself out of the slop?
For all the gloom, there are concrete steps that could curb AI slop without banning useful tools. I hear growing support for measures such as mandatory data and code sharing, stricter requirements for preregistration in fields prone to p-hacking, and clearer authorship standards that make it harder to hide behind anonymous “contributions” when a paper turns out to be largely machine written. Journals could invest in better training for editors and reviewers on how to spot fabricated citations and generic AI prose, and they could slow the publication treadmill by prioritizing fewer, higher quality articles over endless special issues and conference proceedings.
Ultimately, though, the fix will have to be cultural. If hiring committees, grant panels and promotion boards continue to reward sheer publication counts, then AI slop will remain a rational response to a warped incentive structure. I believe the only sustainable path is to realign prestige with practices that are hard to fake at scale: careful experimental design, transparent methods, meaningful collaboration and a willingness to publish negative or ambiguous results. Artificial intelligence can support that work, but only if researchers treat it as a tool to sharpen thinking rather than a shortcut to pad their CVs. Otherwise, the experts calling today’s AI “research” papers complete slop may look, in hindsight, like the ones who were still being polite.