Picture this: you paste a link into ChatGPT and ask for a summary. The model obliges, returning a clean, confident breakdown of the page’s contents. What it doesn’t tell you is that it just followed a hidden instruction embedded in that page’s Markdown, one that told it to slip a phishing link into its response. To the model, that instruction looked no different from the ones OpenAI’s own developers wrote.
This is not a hypothetical scenario. Cybersecurity researchers have identified a structural flaw in how large language models process user-supplied content: the models treat Markdown links, pasted text, and fetched web pages as trusted input, blending them seamlessly with developer-set instructions. The technique used to exploit this flaw is called prompt injection, and as of mid-2026, it remains one of the most stubborn unsolved problems in AI security.
Why the flaw exists
The UK’s National Cyber Security Centre (NCSC), the government body responsible for advising on digital threats across the country, published a detailed technical analysis explaining the root cause. In that analysis, the NCSC states that prompt injection arises because developers combine trusted instructions with untrusted content and then treat the model’s output as though a reliable boundary separates the two. No such boundary exists.
Traditional software solved a version of this problem decades ago. SQL injection, the classic database attack, was tamed because engineers could enforce a strict separation between executable code and user-supplied data. Prepared statements, parameterized queries, input sanitization: these tools worked because databases have clearly defined channels. A large language model does not. It processes everything (system prompts, user messages, fetched documents, pasted Markdown) in a single stream of tokens. A Markdown link occupies the same processing space as the instruction telling the model to refuse harmful requests.
The OWASP Foundation, which maintains the widely referenced security standards used across the software industry, ranked prompt injection as the number-one risk in its Top 10 for Large Language Model Applications. That ranking reflects a consensus among security professionals that this is not a niche bug but a fundamental architectural limitation.
What makes Markdown links especially dangerous
ChatGPT and competing models are no longer standalone chatbots. They are embedded in email clients, search tools, browser extensions, and enterprise productivity suites. When a model fetches a webpage, summarizes an email, or processes a shared document, it ingests content that third parties can manipulate.
Markdown links are a particularly effective attack vector because they look innocuous. A malicious actor who controls even a small portion of a web page can embed hidden instructions in Markdown-formatted text that the model will parse and follow. Security researcher Johann Rehberger has published multiple proof-of-concept demonstrations showing how attackers can use this technique to exfiltrate conversation data by tricking the model into rendering a Markdown image tag that sends information to an external server. The attack is invisible to the user: the model simply appears to be doing its job.
The NCSC’s incident reporting portal, which aggregates threat data from organizations across the UK, provides the institutional framework through which these risks are tracked. The agency treats prompt injection as an active category of risk, not a theoretical concern, that organizations deploying AI tools need to account for in their security planning.
What we still don’t know
Despite the clear technical basis for the threat, several significant gaps remain in the public record.
No official incident report or telemetry dataset has been published quantifying how many successful phishing attempts have originated specifically from Markdown-based prompt injection in ChatGPT. Researchers have demonstrated the technique repeatedly in controlled settings, and security teams have flagged it in threat advisories, but the scale of real-world exploitation remains unmeasured in any publicly available study.
OpenAI has implemented safety layers over time, including content filtering and an instruction hierarchy system designed to give developer-set prompts priority over user-supplied input. But the company has not released detailed public documentation explaining whether those defenses specifically address the Markdown trust problem the NCSC describes. Without that transparency, outside researchers cannot assess whether current mitigations target the root cause or only its surface symptoms.
The NCSC itself stops short of prescribing a definitive fix. Its position is that prompt injection may be fundamentally harder than SQL injection precisely because no clean architectural separation between instructions and data exists within current model designs. Whether future architectures will introduce such a separation, or whether the field will develop reliable runtime defenses, remains an open question.
There is also the question of attacker adaptation. Security researchers have demonstrated proof-of-concept injection attacks using image alt-text, embedded document metadata, and hidden text rendered in white-on-white formatting. If Markdown link handling is hardened in frontier models, attackers have multiple fallback channels. No public data confirms these alternative vectors are being used at scale in active phishing campaigns, but the technical groundwork is already laid.
Why skepticism is now a security requirement
Traditional phishing is tracked through email security gateways, browser warnings, and financial fraud reports. Prompt injection phishing does not trigger those same detection systems. The attack happens inside the model’s output, which users typically trust because it appears to come from a helpful assistant rather than an external source. That trust is the vulnerability.
Until model architectures evolve to enforce a real boundary between instructions and data, the most practical defense is skepticism. Treat any link a language model generates with the same caution you would apply to a link in an unsolicited email. Hover before you click. Verify URLs independently. And if a model’s response includes an unexpected call to action, such as logging into an account or downloading a file, assume the instruction may not have come from the model’s developers.
The NCSC’s core finding is worth restating plainly: large language models cannot currently tell the difference between a legitimate instruction and a malicious one disguised as content. Every Markdown link, every pasted document, every fetched web page is a potential entry point. That is not a bug that a patch will fix. It is a property of how these systems are built, and it will shape how we use them for years to come.
More from Morning Overview
*This article was researched with the help of AI, with human editors creating the final content.