Morning Overview

AI product summaries boost buying intent even with a 60% hallucination rate

A preprint study on large language models found that AI-generated product summaries made readers 32% more likely to say they would buy, even though the same models hallucinated facts roughly 60% of the time. The finding, drawn from experiments using a well-known Amazon review dataset, raises a pointed question for online shoppers and the platforms that serve them: what happens when the text that nudges people toward a purchase is fluent, persuasive, and frequently wrong?

The Numbers Behind the Persuasion

The research, titled “Quantifying cognitive bias,” measured several ways that language models distort how people process product information. Participants who read AI-generated summaries reported a 32% higher likelihood of purchasing compared with control conditions. The same models exhibited a 26.42% framing change, meaning they systematically shifted how product attributes were presented, and a 10.12% primacy bias, in which items mentioned first received disproportionate weight. On questions about facts beyond the models’ training data, the hallucination rate hit 60.33%.

Those four metrics tell a coherent story. The models do not just summarize; they reshape how information lands. Framing effects and primacy bias are well-documented cognitive shortcuts in human psychology. When a summary leads with a product’s strengths and buries its drawbacks, or when it fabricates a specification the buyer cannot quickly verify, the result is a reader who feels informed but may not be. The 32% purchase-intent lift is not happening despite the errors. It may be happening partly because of them, since confident, detail-rich text reads as authoritative even when the details are invented.

How the Experiment Was Built

The study drew its product reviews from a large-scale Amazon corpus commonly referenced as Ni et al. (2019). That dataset compiles millions of user-written opinions across dozens of product categories, from electronics to books, and has become a standard benchmark for recommendation and summarization research. By feeding real consumer reviews into language models and then testing how the resulting summaries affected human judgment, the researchers isolated the gap between what shoppers actually said and what the AI told new readers they said.

This experimental design matters because it mirrors what major e-commerce platforms are already doing. When a retailer uses a language model to condense hundreds of reviews into a quick paragraph, the output carries the authority of crowd wisdom but the voice of a single algorithm. The study’s results suggest that voice is not neutral.

When Labeling AI Backfires

A separate line of research complicates the picture further. Experiments from Washington State University found that simply using the term “artificial intelligence” in product descriptions reduced purchase intentions across diverse categories. The negative response was even stronger for high-risk offerings such as medical devices or financial tools, where consumers appear especially wary of automated decision-making.

Put these findings side by side and a paradox emerges. AI-generated summaries increase buying intent when readers do not know the text came from a machine. But the moment a platform discloses the AI’s role, trust drops, especially for purchases where accuracy matters most. That tension creates a perverse incentive: retailers benefit from deploying AI summaries quietly and lose customers when they are transparent about it. For shoppers, the practical takeaway is blunt. The most persuasive summary on a product page may be the one least likely to carry a disclosure label.

Speed, Price, and the Decision Shortcut

Econometric research from Arizona State University and Nankai University adds another dimension. Ziru Li and Jialin Nie examined how AI-generated product summaries affect purchasing speed and found that the effect was most pronounced for lower-priced products. That aligns with a basic principle of consumer behavior: when the financial stakes are small, people rely more heavily on shortcuts. A concise, confident summary eliminates the need to scroll through dozens of reviews, and at a $15 price point, few buyers will cross-check the AI’s claims against the original text.

Separate work on online platforms confirms that product reviews have become a primary reference point for purchase decisions. As AI summaries replace the act of reading individual reviews, they concentrate influence in a single algorithmically generated paragraph. The efficiency gain is real, but so is the information loss.

Cognitive Bias as a Measurable Output

The bias patterns identified in the preprint are not new to AI research. Earlier work by overlapping authors, including Echterhoff, Alessa, and McAuley, introduced a framework called BiasBuster, along with a dataset of thousands of test prompts designed to evaluate and mitigate systematic distortions in model outputs. The new study extends that agenda from model behavior in the abstract to concrete consumer outcomes: not just whether a summary is biased, but whether that bias reliably nudges people toward different choices.

Framing change in this context captures how the model reorders or rephrases information relative to the underlying reviews. A product that receives mixed feedback on durability but glowing comments on style might emerge from the summarization process as “stylish and well-built,” with durability complaints relegated to a vague mention of “some minor issues.” Primacy bias then amplifies the opening claim; readers anchor on the first attributes they see and discount later qualifications.

Hallucinations add yet another twist. When the model confidently states that a blender is “BPA-free” or that headphones support a specific codec, it can create an illusion of due diligence. The study’s 60.33% hallucination rate on out-of-training questions suggests that, in many cases, the most specific-sounding details are precisely where reality and text diverge. For low-cost items, this may translate into minor disappointments. For higher-stakes products, it can mean safety risks or financial exposure.

Platform Incentives and Policy Gaps

For e-commerce platforms, these findings sharpen an uncomfortable trade-off. AI summaries demonstrably increase conversion, especially for cheaper goods, by speeding up decisions and smoothing over conflicting reviews. At the same time, labeling those summaries as machine-generated can depress demand, particularly in sensitive categories. The rational business move, absent regulation or reputational pressure, is to lean into AI while minimizing conspicuous disclosures.

Regulators and consumer advocates are beginning to focus on transparency around automated recommendations, but the research suggests that disclosure alone may not be enough. If a brief label reduces trust without equipping shoppers to detect hallucinations or bias, it risks becoming a box-ticking exercise. More substantive safeguards might include side-by-side access to raw reviews, standardized summaries of negative feedback, or automated flags when a model appears to invent specifications not present in user comments.
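To make the last of those safeguards concrete, here is a minimal sketch of what an automated flag for invented specifications might look like. It is not drawn from the study or any platform’s actual system: the function names, the token-overlap approach, and the 0.5 threshold are all illustrative assumptions. A production system would use an entailment or claim-verification model rather than simple word overlap.

```python
# Illustrative sketch of an "unsupported claim" flag: check whether each
# factual-sounding claim in an AI summary has lexical support somewhere in
# the underlying reviews. Names and threshold are hypothetical examples.

def token_overlap(claim: str, reviews: list[str]) -> float:
    """Return the best fraction of a claim's content words found in any single review."""
    stopwords = {"the", "a", "an", "is", "are", "and", "or", "it", "this", "with"}
    claim_words = {w for w in claim.lower().split() if w not in stopwords}
    if not claim_words:
        return 0.0
    best = 0.0
    for review in reviews:
        review_words = set(review.lower().split())
        best = max(best, len(claim_words & review_words) / len(claim_words))
    return best

def flag_unsupported_claims(summary_claims: list[str], reviews: list[str],
                            threshold: float = 0.5) -> list[str]:
    """Flag claims whose content words mostly never appear in the reviews."""
    return [c for c in summary_claims if token_overlap(c, reviews) < threshold]

reviews = [
    "blends smoothies quickly but the lid cracked after a month",
    "motor is powerful and easy to clean",
]
claims = ["powerful motor", "bpa-free pitcher"]  # second claim appears nowhere
print(flag_unsupported_claims(claims, reviews))  # → ['bpa-free pitcher']
```

Even a crude check like this captures the core idea behind the proposed safeguard: a specification that no reviewer ever mentioned is exactly the kind of confident detail worth surfacing to the shopper for verification.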

There is also a competitive dimension. Retailers that invest in careful prompt design and post-processing to curb hallucinations may find themselves at a short-term disadvantage against rivals that tolerate more aggressive, flattering summaries. Until accuracy and fairness become part of how platforms are evaluated (by regulators, watchdogs, or consumers), the market will tend to reward persuasion over precision.

What Shoppers Can Do Now

For individual buyers, the safest response is not to abandon AI-assisted summaries entirely but to treat them as a starting point rather than a verdict. When a product matters (because it is expensive, safety-critical, or hard to return), skimming a sample of original reviews can reveal whether the summary downplays recurring complaints. Specific technical claims, like compatibility or ingredient details, are worth checking against the manufacturer’s description rather than trusting a generative model’s confident tone.

Consumers can also be alert to linguistic tells. Overly enthusiastic language, vague references to “some users” without quantification, or oddly precise but unverified specifications are all signs that the summary may be optimizing for persuasion rather than accuracy. In that sense, the rise of AI-generated content simply raises the stakes on an old rule of online shopping: if a description sounds too perfectly tailored to your hopes, it deserves a second look.

The emerging research record, spanning cognitive bias induction, disclosure effects, econometric analysis of purchase speed, and the centrality of reviews in digital commerce, points in the same direction. Generative models are not just another interface layer on top of existing information. They are active participants in shaping what people believe about products and how quickly they decide. As platforms race to deploy these tools at scale, the question is no longer whether AI will influence what we buy, but whether anyone will be accountable when its fluent mistakes push us toward the wrong choice.


*This article was researched with the help of AI, with human editors creating the final content.