
Artificial intelligence is supposed to be the sharp edge of modern science, yet a growing chorus of researchers now argue that a large share of AI papers are little more than padded, low‑value output. Instead of careful experiments and hard‑won insights, they see a flood of formulaic manuscripts, recycled ideas and automated prose that they say is warping incentives across universities and conferences. The result, in their view, is a research ecosystem where quantity is rewarded over quality and where “slop” is not a punchline but a structural feature.
As funding, prestige and policy attention converge on AI, the stakes of this critique are high. When experts warn that the literature is clogged with weak or misleading work, they are not only complaining about bad writing; they are questioning whether the field can still separate genuine breakthroughs from noise. That tension now sits at the center of debates about peer review, academic careers and how much trust anyone should place in the latest AI study.
How “slop” became a diagnosis for AI research
The word “slop” has migrated from internet culture into academic criticism because it captures something specific about the current AI moment: the sense that content is being produced faster than it can be meaningfully evaluated. In conversations with researchers, I hear the same pattern described again and again: a torrent of papers that look impressive on the surface but collapse under scrutiny, with vague methods, cherry‑picked benchmarks and little connection to real‑world problems. The label is not about a single flawed study; it is about a system that seems optimized to churn out more of the same.
One flashpoint came when an AI author publicly boasted of having written over 100 papers on artificial intelligence, a volume that some experts described as a “disaster” for the state of AI research. That case crystallized a broader worry that the field is rewarding sheer output rather than depth, with reviewers and editors struggling to keep up as submissions spike. When critics call the situation “a mess,” they are pointing to a structural imbalance between the ease of generating new manuscripts and the much slower work of validating whether any of them actually advance knowledge.
The incentives that reward volume over rigor
Behind the slop diagnosis sits a familiar academic story: incentives that push scholars to publish as much as possible, as quickly as possible. Hiring committees, promotion panels and grant reviewers often rely on crude metrics such as paper counts and citation tallies, which makes it rational for researchers to slice their work into the smallest publishable units and to chase trendy topics. In AI, where conferences can make or break careers, the pressure to appear prolific is especially intense, and that pressure now intersects with tools that make it easier than ever to generate text and code.
In a widely shared post, Luke Nottage, a professor at Sydney Law School, described how academic “slop” publications are encouraged by systems that prize publication volume over quality, a critique he later edited but did not retract. His argument is that when careers hinge on counting outputs, AI tools become a tempting way to inflate those numbers, whether by drafting boilerplate introductions or spinning off marginal variants of the same experiment. The result is a feedback loop in which institutions reward the very behavior that critics say is degrading the literature.
Peer review under strain from AI’s paper explosion
Peer review was never designed for a world in which thousands of AI manuscripts arrive in a single conference cycle, many of them produced with the help of generative tools. Reviewers describe being overwhelmed by sheer volume, forced to make quick judgments on complex methods and to trust that authors are accurately reporting their results. In that environment, it becomes easier for weak or derivative work to slip through, especially when it is dressed up with fashionable terminology and large language model gloss.
Researchers discussing the state of AI conferences have pointed to review standards that are buckling under the load, with some venues reportedly facing submission numbers in the five figures. One widely cited discussion described a major AI conference handling around 11,000 submissions for its 2025 meeting, a scale that makes thorough evaluation of each paper nearly impossible. When acceptance decisions are made under that kind of pressure, the line between rigorous research and polished slop can blur, especially for work that sits at the edges of reviewers’ expertise.
From artificial intelligence research to a broader slop ecosystem
The complaints about AI papers do not exist in isolation; they are part of a wider reckoning with how generative tools are reshaping knowledge production. In discussions of artificial intelligence research, academics have warned that the same dynamics driving low‑quality online content are now appearing in scholarly work, with automated systems helping to fill journals and preprint servers with text that looks authoritative but adds little. The concern is not only that some authors are cutting corners, but that the entire ecosystem is drifting toward a model where speed and surface polish matter more than careful reasoning.
Those worries echo a broader critique of the internet’s AI‑generated sludge. Commentators have described how generative systems are flooding websites with low‑effort articles, product reviews and listicles, a trend one analysis framed as the worse side of AI’s impact on the web. The argument is that slop does not stay confined to screens; it shapes what people read, how they make decisions and which voices get heard. When that same logic seeps into academic publishing, the risk is that the scientific record itself starts to resemble a content farm, with a few solid contributions buried under layers of algorithmically assisted filler.
Why some disciplines are pushing back harder than others
Not every corner of academia is equally comfortable with AI’s encroachment into research and evaluation. Surveys of university staff show sharp divides between fields, with some disciplines embracing AI tools as productivity aids and others treating them as a threat to core scholarly values. The arts and humanities, in particular, have emerged as strongholds of resistance, arguing that automated systems cannot replicate the interpretive and critical work that defines their scholarship.
One recent study of attitudes toward using AI in national research assessments found that most academics were strongly opposed to deploying such tools in the next cycle of the Research Excellence Framework, known as REF 2029. The report highlighted Watermeyer’s observation that opposition to AI was concentrated in certain disciplines, particularly the arts and humanities, where scholars worry that automated evaluation would flatten nuance and reward formulaic writing. Their stance is a reminder that the slop problem is not just about bad papers; it is about who gets to define quality and whether that judgment can be safely outsourced to machines.
How AI tools quietly shape the look and feel of papers
Even when AI is not explicitly credited as a co‑author, its fingerprints are increasingly visible in the structure and language of research papers. Large language models can draft introductions, smooth awkward phrasing and suggest plausible citations, all of which can make a manuscript appear more polished than the underlying work deserves. For overworked researchers, these tools are attractive shortcuts, but they also risk standardizing the voice of scientific writing, making it harder for reviewers to spot when something is off.
In the AI community itself, some of the most pointed criticism has focused on how generative tools enable what might be called “template research,” where authors plug new datasets or minor algorithm tweaks into a familiar narrative arc. The case of the author claiming over 100 AI papers is emblematic of this pattern, with critics arguing that such volume is only possible when much of the writing and even some of the experimental framing are automated. The danger is that the literature fills up with papers that look rigorous but are, in effect, variations on a script generated by the same underlying models they purport to study.
The human cost of navigating a slop‑filled literature
For students, policymakers and practitioners trying to make sense of AI, the proliferation of low‑quality papers creates a different kind of burden. Instead of being able to trust that peer‑reviewed work has cleared a meaningful bar, they must learn to sift through a noisy landscape, separating robust findings from hype and half‑baked claims. That task is especially daunting for people outside the core research community, who may lack the technical background to spot subtle methodological flaws or inflated performance metrics.
Researchers themselves are not immune to this fatigue. Many describe spending increasing amounts of time triaging which papers are worth reading, relying on informal networks, social media recommendations and reputation cues to filter the flood. Discussions on platforms where AI scientists gather often circle back to the same frustration: that the signal‑to‑noise ratio is deteriorating, with slop papers clogging search results and citation graphs. When a widely shared thread about quality in academic contributions went viral, it resonated precisely because so many readers recognized the experience of wading through pages of derivative work to find a single genuinely insightful study.
Why fixing the slop problem will require institutional change
It is tempting to frame the AI slop problem as a matter of individual ethics, a story about a few bad actors gaming the system with automated tools. The reality is more uncomfortable. As long as universities, funders and conference organizers reward volume, speed and buzz, researchers will have strong incentives to use every tool at their disposal to maximize output. AI simply amplifies those incentives, making it easier to produce more papers, more quickly, with less effort.
Some academics are calling for a reset that would shift evaluation away from raw counts and toward deeper engagement with a smaller number of substantial contributions. That could mean promotion criteria that emphasize a handful of major works, conference policies that cap submissions per author, or funding schemes that reward replication and negative results. It could also mean more transparent discussions about how AI tools are used in drafting and analysis, so that reviewers can better judge where human judgment ends and automation begins. Without such changes, the risk is that the field continues to drift toward what critics in online discussions of artificial intelligence research already describe as a mess, with slop normalized as the background noise of scientific life.