ExpressVPN has flagged a significant data exposure involving 3.7 million AI chatbot records, including chat logs, transcripts, and audio recordings, all sitting in databases that required no password to access. The discovery, detailed by cybersecurity researcher Jeremiah Fowler, raises pointed questions about how companies deploying AI-powered customer service tools handle the sensitive data those systems collect. With businesses racing to adopt AI chatbots for cost savings and efficiency, the incident exposes a gap between the speed of deployment and the rigor of data protection.
What is verified so far
The core facts trace back to Jeremiah Fowler, a cybersecurity researcher who described the exposure in an interview broadcast by Illinois Public Media. According to Fowler, the leak involved “three separate databases with 3.7 million records.” Those records contained chat logs between users and AI-powered customer service bots, written transcripts of those interactions, and audio recordings. None of the three databases were protected by a password, meaning anyone who located them could browse their contents freely.
The nature of the exposed data is what makes this case particularly alarming for affected users. Chat logs and transcripts from AI customer service interactions can contain personal details that people share while troubleshooting account issues, processing returns, or asking about billing. Audio recordings add another layer of risk, since voice data can be used for biometric identification or social engineering attacks. Fowler’s account confirms the records were accessible without authentication, which means the window of exposure may have been long, though the exact duration has not been publicly established.
The 3.7 million figure is specific and consistent across available reporting. Fowler used that number directly in his interview, and the framing points to a single company’s AI customer service infrastructure rather than a broad platform breach affecting multiple vendors. The identity of the company whose databases were exposed, however, has not been confirmed in the publicly available transcript. ExpressVPN’s role appears to have been in surfacing or publicizing the discovery, though the primary technical findings come from Fowler’s research.
What stands out here is the simplicity of the security failure. This was not a sophisticated hack or a zero-day exploit. The databases lacked basic access controls. That distinction matters because it suggests the exposure resulted from a configuration oversight rather than a targeted attack, a pattern that security professionals have flagged repeatedly as AI tools scale faster than the teams responsible for securing them.
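To make the failure concrete: a database left open this way will typically answer any request sent to its public port, no exploit required. The sketch below assumes an Elasticsearch-style deployment and a hypothetical hostname, since neither the database technology nor the affected host has been named in the reporting; it shows the kind of trivial check researchers like Fowler routinely run.

```python
import requests

# Hypothetical endpoint; the affected company and its servers have not been named.
URL = "http://chatlogs.example.com:9200/_cluster/health"

resp = requests.get(URL, timeout=5)
if resp.status_code == 200:
    # A 200 response with cluster metadata, returned without any credentials,
    # is the signature of the "no password" misconfiguration described above.
    print("Unauthenticated access:", resp.json())
else:
    print("Endpoint refused the request:", resp.status_code)
```

The point is not the tooling but the triviality: finding an exposure like this takes a single web request, not an attack.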
What remains uncertain
Several important details are missing from the public record. The company responsible for the exposed databases has not been named in the available reporting. Without that identification, affected users have no way to know whether their data was part of the leak or to take protective steps such as changing passwords or monitoring accounts for fraud. It is also unclear whether the databases have since been secured or whether they remained accessible after Fowler’s disclosure.
The specific types of personal information contained in the 3.7 million records have only been described in general categories: chat logs, transcripts, and audio recordings. Whether those records included names, email addresses, phone numbers, payment details, or other directly identifiable information has not been confirmed. The severity of the exposure for individual users depends heavily on that distinction. A database of anonymized chatbot transcripts carries a very different risk profile from one containing full customer profiles paired with voice recordings.
There is also limited information about how long the databases were publicly accessible. Security exposures of this type can persist for days, weeks, or months before discovery, and the timeline shapes both the likelihood that bad actors accessed the data and the regulatory consequences the responsible company may face. Data protection laws in many jurisdictions, including the European Union’s General Data Protection Regulation and various U.S. state privacy statutes, impose notification requirements that hinge on when a breach was discovered and how quickly affected individuals were informed.
The relationship between ExpressVPN and Fowler’s research also deserves clarification. ExpressVPN is primarily known as a virtual private network provider, not a security research firm. Whether ExpressVPN commissioned Fowler’s investigation, collaborated on it, or simply amplified his findings through its platform is not spelled out in the university materials that host the interview transcript. That distinction affects how readers should weigh the motivations behind the disclosure. Security research published through a commercial brand sometimes serves dual purposes: genuine public interest and brand positioning.
How to read the evidence
The strongest piece of primary evidence is Fowler’s own account, delivered in a recorded interview and preserved in transcript form by Illinois Public Media, an institutional broadcaster affiliated with the University of Illinois system. That institutional context lends credibility to the reporting. Fowler is an established cybersecurity researcher whose previous work on database exposures has been cited by major outlets. His description of the three databases and the 3.7 million record count is specific, internally consistent, and delivered in a format where follow-up questions were possible.
What the evidence does not include is independent verification from a second researcher or a public disclosure from the affected company. In cybersecurity reporting, the gold standard involves coordinated disclosure: a researcher finds a vulnerability, notifies the responsible party, and then publishes findings after the issue is resolved. Whether that process was followed here is not confirmed in the available sources. Readers should treat the 3.7 million figure as credible but single-sourced until additional confirmation emerges.
The audio and broadcast archives from Illinois Public Media provide supporting context for the interview but do not add independent data points about the breach itself. They confirm that the interview took place and that Fowler made the claims attributed to him, which is useful for attribution but does not expand the factual base. No primary database logs, screenshots, or technical documentation have been made publicly available to corroborate the specific contents of the exposed records.
For readers trying to assess personal risk, the practical takeaway is limited by the gaps in the public record. Without knowing which company’s AI chatbot system was involved, individuals cannot determine whether they interacted with the affected service. General precautions still apply: monitoring financial accounts for unusual activity, being cautious about unsolicited communications that reference past customer service interactions, and reviewing privacy settings on platforms that use AI-driven support tools.
The broader pattern
The pattern this incident fits into is worth examining on its own terms. As companies roll out AI chatbots to handle customer inquiries, they are effectively creating new data collection points that can capture highly sensitive information at scale. Each chatbot session may seem mundane, but across millions of interactions, the logs can reveal detailed profiles of customers’ identities, habits, and vulnerabilities. When those logs are stored in cloud databases without even basic password protection, the risk is no longer hypothetical.
Misconfigured databases have long been a common source of data leaks, but the rise of AI adds new dimensions. AI systems often require large volumes of training and feedback data, which encourages organizations to retain more information for longer periods. Customer service teams may also assume that because a chatbot is an automated interface, the conversations it handles are less sensitive than those with human agents. Fowler’s findings suggest the opposite: people share the same kinds of details with bots that they would with a person, and sometimes more, because the interaction feels impersonal and routine.
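One mitigation the retention problem invites is scrubbing obvious identifiers from transcripts before they are stored for analytics or model feedback. The sketch below is a minimal illustration in Python; the regex patterns are simplified stand-ins, and a production system would rely on dedicated PII-detection tooling rather than a handful of expressions.

```python
import re

# Illustrative patterns only; real deployments need far more robust detection.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(transcript: str) -> str:
    """Replace likely identifiers with labeled placeholders before storage."""
    for label, pattern in PATTERNS.items():
        transcript = pattern.sub(f"[{label}]", transcript)
    return transcript

print(redact("My email is jane.doe@example.com and my card is 4111 1111 1111 1111"))
# -> "My email is [EMAIL] and my card is [CARD]"
```

Even a crude pass like this changes what a leaked transcript store is worth to an attacker, which is precisely the distinction drawn earlier between anonymized logs and full customer profiles.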
This incident also underscores how uneven governance can be when AI tools are introduced quickly. In many organizations, the teams responsible for deploying chatbots are not the same as those charged with information security or privacy compliance. Without clear oversight, basic safeguards (such as enforcing authentication on databases or limiting which fields are stored long-term) can fall through the cracks. The result is what Fowler appears to have uncovered: vast repositories of conversational and audio data exposed to the open internet.
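Closing that gap can be as simple as automating the check that was evidently never run here. A minimal sketch follows, assuming a MongoDB-backed store and a hypothetical internal hostname; the actual technology behind the exposed databases is unconfirmed.

```python
from pymongo import MongoClient
from pymongo.errors import OperationFailure

# Hypothetical host; stands in for any datastore backing a chatbot pipeline.
URI = "mongodb://chatlogs.example.internal:27017"

def auth_is_enforced(uri: str) -> bool:
    """Return True if the server rejects a credential-less admin command."""
    client = MongoClient(uri, serverSelectionTimeoutMS=3000)
    try:
        client.admin.command("listDatabases")
        return False  # command succeeded with no credentials: exposed
    except OperationFailure:
        return True   # the server demanded authentication
    finally:
        client.close()

if not auth_is_enforced(URI):
    raise SystemExit("Database answers without credentials; lock it down.")
```

Run as part of a deployment pipeline, a guardrail like this fails the build before an unauthenticated datastore ever reaches the open internet.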
From a regulatory perspective, anonymous or pseudonymous chat transcripts might fall into a gray area, but voice recordings and any data tied to identifiable accounts are more clearly within the scope of modern privacy laws. Regulators have increasingly signaled that “security by configuration” failures, such as leaving a database exposed without a password, will not be treated leniently simply because no confirmed malicious access has been documented. If the company behind these 3.7 million records operates in regions with strict breach-notification rules, it may eventually face questions from watchdogs and data protection authorities.
For now, public understanding of the case rests heavily on Fowler’s testimony and the institutional credibility of the outlet that aired it. Until the responsible company is named or an official incident report is released, the story remains an illustrative warning about leaked AI chatbot messages and recordings rather than a fully mapped breach. It highlights how easily AI customer service deployments can turn into large-scale privacy liabilities when basic security practices are overlooked, and it leaves consumers with a familiar but unresolved concern: sensitive data may be at risk, even when the systems collecting it are marketed as cutting-edge and intelligent.
This article was researched with the help of AI, with human editors creating the final content.