Cybersecurity researchers just found that ChatGPT implicitly trusts the Markdown links around it — and hackers are already exploiting that trust for phishing

Picture this: you paste a link into ChatGPT and ask for a summary. The model obliges, returning a clean, confident breakdown of the page’s contents. What it doesn’t tell you is that it just followed a hidden instruction embedded in that page’s Markdown, one that told it to slip a phishing link into its response. To the model, that instruction looked no different from the ones OpenAI’s own developers wrote.

This is not a hypothetical scenario. Cybersecurity researchers have identified a structural flaw in how large language models process user-supplied content: the models treat Markdown links, pasted text, and fetched web pages as trusted input, blending them seamlessly with developer-set instructions. The technique used to exploit this flaw is called prompt injection, and as of mid-2026, it remains one of the most stubborn unsolved problems in AI security.

Why the flaw exists

The UK’s National Cyber Security Centre (NCSC), the government body responsible for advising on digital threats across the country, published a detailed technical analysis explaining the root cause. In that analysis, the NCSC states that prompt injection arises because developers combine trusted instructions with untrusted content and then treat the model’s output as though a reliable boundary separates the two. No such boundary exists.

Traditional software solved a version of this problem decades ago. SQL injection, the classic database attack, was tamed because engineers could enforce a strict separation between executable code and user-supplied data. Prepared statements, parameterized queries, input sanitization: these tools worked because databases have clearly defined channels. A large language model does not. It processes everything (system prompts, user messages, fetched documents, pasted Markdown) in a single stream of tokens. A Markdown link occupies the same processing space as the instruction telling the model to refuse harmful requests.

The OWASP Foundation, which maintains the widely referenced security standards used across the software industry, ranked prompt injection as the number-one risk in its Top 10 for Large Language Model Applications. That ranking reflects a consensus among security professionals that this is not a niche bug but a fundamental architectural limitation.

What makes Markdown links especially dangerous

ChatGPT and competing models are no longer standalone chatbots. They are embedded in email clients, search tools, browser extensions, and enterprise productivity suites. When a model fetches a webpage, summarizes an email, or processes a shared document, it ingests content that third parties can manipulate.

Markdown links are a particularly effective attack vector because they look innocuous. A malicious actor who controls even a small portion of a web page can embed hidden instructions in Markdown-formatted text that the model will parse and follow. Security researcher Johann Rehberger has published multiple proof-of-concept demonstrations showing how attackers can use this technique to exfiltrate conversation data by tricking the model into rendering a Markdown image tag that sends information to an external server. The attack is invisible to the user: the model simply appears to be doing its job.

The NCSC’s incident reporting portal, which aggregates threat data from organizations across the UK, provides the institutional framework through which these risks are tracked. The agency treats prompt injection as an active category of risk, not a theoretical concern, that organizations deploying AI tools need to account for in their security planning.

What we still don’t know

Despite the clear technical basis for the threat, several significant gaps remain in the public record.

No official incident report or telemetry dataset has been published quantifying how many successful phishing attempts have originated specifically from Markdown-based prompt injection in ChatGPT. Researchers have demonstrated the technique repeatedly in controlled settings, and security teams have flagged it in threat advisories, but the scale of real-world exploitation remains unmeasured in any publicly available study.

OpenAI has implemented safety layers over time, including content filtering and an instruction hierarchy system designed to give developer-set prompts priority over user-supplied input. But the company has not released detailed public documentation explaining whether those defenses specifically address the Markdown trust problem the NCSC describes. Without that transparency, outside researchers cannot assess whether current mitigations target the root cause or only its surface symptoms.

The NCSC itself stops short of prescribing a definitive fix. Its position is that prompt injection may be fundamentally harder than SQL injection precisely because no clean architectural separation between instructions and data exists within current model designs. Whether future architectures will introduce such a separation, or whether the field will develop reliable runtime defenses, remains an open question.

There is also the question of attacker adaptation. Security researchers have demonstrated proof-of-concept injection attacks using image alt-text, embedded document metadata, and hidden text rendered in white-on-white formatting. If Markdown link handling is hardened in frontier models, attackers have multiple fallback channels. No public data confirms these alternative vectors are being used at scale in active phishing campaigns, but the technical groundwork is already laid.

Why skepticism is now a security requirement

Traditional phishing is tracked through email security gateways, browser warnings, and financial fraud reports. Prompt injection phishing does not trigger those same detection systems. The attack happens inside the model’s output, which users typically trust because it appears to come from a helpful assistant rather than an external source. That trust is the vulnerability.

Until model architectures evolve to enforce a real boundary between instructions and data, the most practical defense is skepticism. Treat any link a language model generates with the same caution you would apply to a link in an unsolicited email. Hover before you click. Verify URLs independently. And if a model’s response includes an unexpected call to action, such as logging into an account or downloading a file, assume the instruction may not have come from the model’s developers.

The NCSC’s core finding is worth restating plainly: large language models cannot currently tell the difference between a legitimate instruction and a malicious one disguised as content. Every Markdown link, every pasted document, every fetched web page is a potential entry point. That is not a bug that a patch will fix. It is a property of how these systems are built, and it will shape how we use them for years to come.

More from Morning Overview

*This article was researched with the help of AI, with human editors creating the final content.

IG

FB

PIN

LI

X

Global Font

Cybersecurity researchers just found that ChatGPT implicitly trusts the Markdown links around it — and hackers are already exploiting that trust for phishing

Why the flaw exists

What makes Markdown links especially dangerous

What we still don’t know

Why skepticism is now a security requirement

Cassian Holt

Author

A new study finds no safe level of drinking, tying one daily drink to cancer and heart disease

A wall of dust dropped Arizona interstates to near-zero visibility in seconds

A flesh-eating bacteria has killed eight along the Gulf and is creeping north

Regulators demanded a fix after robotaxis kept driving into fire and flood scenes

Experts say the US has quietly lost the measles-free status it held for 25 years

More in Cybersecurity

Cybersecurity

The FBI says over 10,000 fake toll and package texts are hitting drivers in 10 states

Cybersecurity

Crypto ‘pig butchering’ rings bled victims for billions before police froze $701 million

Cybersecurity

Hackers claim they stole personal records on 45 million people from Rite Aid

Cybersecurity

Fake IRS letters now carry QR codes that lead to bank-draining copycat sites

Cybersecurity

Card skimmers at the gas pump are draining more than $1 billion a year from drivers

Cybersecurity

A jury duty scam nearly cost an Ohio woman $13,000 in fake bail demands

Cybersecurity

GPS spoofing is now misleading the navigation on more than 1,500 flights a day

Cybersecurity

Global crackdowns froze $700 million as ‘pig-butchering’ scams kept spreading

IG

FB

PIN

LI

X

IG

FB

PIN

LI

X

Cybersecurity researchers just found that ChatGPT implicitly trusts the Markdown links around it — and hackers are already exploiting that trust for phishing

Why the flaw exists

What makes Markdown links especially dangerous

What we still don’t know

Why skepticism is now a security requirement

Author

Get weekly updates with the latest news and tips!

More in Cybersecurity

IG

FB

PIN

LI

X