Morning Overview

Hospitals roll out AI chatbots for patients, splitting doctors

When Hartford HealthCare flipped the switch on a new AI chatbot inside its patient portal, the promise was simple: describe your symptoms, and the system will figure out whether you need a virtual visit, an in-person appointment, or just reassurance. No phone tree. No waiting days for a callback.

By spring 2026, the tool, called PatientGPT, is live across the Connecticut-based system’s network of hospitals and clinics. Built in partnership with K Health, a company whose AI-driven primary care platform has been described in its own press materials as serving millions of users, PatientGPT pulls from a patient’s electronic health record in real time and runs around the clock. It is one of the most visible examples of a broader shift: hospitals are embedding generative AI directly into the communication layer between patients and clinicians, and the medical profession is deeply divided over whether that is progress or peril.

What PatientGPT actually does

The chatbot sits inside Hartford HealthCare’s existing patient portal and mobile app. A patient who logs in with a new concern, say persistent headaches or a rash that will not clear, can type a description and answer follow-up questions generated by the AI. Based on the conversation and the patient’s medical history, PatientGPT routes them to one of several outcomes: a same-day virtual visit, a scheduled appointment with a specialist, or guidance that the issue can be monitored at home. The system is designed to act as a triage layer, not a diagnostician. It does not prescribe medications or deliver test results.
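To make the routing idea concrete, here is a minimal sketch of triage logic of the kind a tool like PatientGPT might apply. The function names, categories, and keyword matching are hypothetical illustrations, not drawn from Hartford HealthCare's or K Health's actual implementation, which would rely on a far more sophisticated model.

```python
from dataclasses import dataclass
from enum import Enum

class Disposition(Enum):
    SAME_DAY_VIRTUAL = "same-day virtual visit"
    SPECIALIST_APPOINTMENT = "scheduled specialist appointment"
    HOME_MONITORING = "monitor at home with guidance"

@dataclass
class TriageResult:
    disposition: Disposition
    rationale: str

def triage(symptom_text: str, red_flags: set[str], history_flags: set[str]) -> TriageResult:
    """Hypothetical triage layer: routes a concern, never diagnoses or prescribes."""
    words = set(symptom_text.lower().split())
    if words & red_flags:
        # Urgent language in the description bumps the patient to same-day care.
        return TriageResult(Disposition.SAME_DAY_VIRTUAL, "urgent symptom language detected")
    if words & history_flags:
        # A concern touching a documented chronic condition goes to the specialist.
        return TriageResult(Disposition.SPECIALIST_APPOINTMENT, "related to documented history")
    return TriageResult(Disposition.HOME_MONITORING, "no urgency or history match")
```

The point of the sketch is the shape of the system, not its internals: a bounded set of outcomes, a rationale attached to each decision, and no path that ends in a diagnosis or a prescription.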

For patients, the practical change is speed. Portal messages sent the traditional way often sit in a queue until a nurse or medical assistant can review them, a process that can take hours or, over weekends, days. The AI layer compresses that wait by sorting urgency up front. For the health system, the bet is that smarter routing will reduce unnecessary emergency department visits and free clinical staff to focus on patients who need hands-on care.

The NYU Langone evidence

Hartford HealthCare’s launch did not happen in a vacuum. Peer-reviewed research from NYU Langone Health, published in npj Digital Medicine, has been tracking what happens when generative AI is woven into clinician workflows at a major academic medical center.

One study drew on electronic health record audit logs from October 2023 through August 2024 and documented how AI-generated draft replies were embedded directly in Epic’s InBasket, the messaging interface most large U.S. hospitals use. Clinicians could accept the AI draft as a starting point, edit it, or ignore it and write from scratch. The research also detailed how NYU Langone refined its AI prompts over time, moving from an initial configuration to updated versions shaped by clinician feedback. That iterative process matters because it shows these deployments are not one-and-done installations. They require ongoing tuning.
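The audit-log method can be pictured with a small sketch. Assuming hypothetical event names (the study's actual Epic log schema is not reproduced here), classifying each message thread by how the clinician handled the AI draft might look like this:

```python
from collections import Counter

def classify_reply(events: list[str]) -> str:
    """Classify one message thread by how the clinician used the AI draft.

    Event names are illustrative stand-ins, not Epic's actual audit-log codes.
    """
    if "draft_inserted" not in events:
        return "no_draft_offered"
    if "draft_discarded" in events:
        return "ignored_wrote_from_scratch"
    if "draft_edited" in events:
        return "edited_draft"
    return "accepted_draft_as_is"

# Tallying dispositions across threads measures adoption, not reply quality.
threads = [
    ["draft_inserted", "reply_sent"],
    ["draft_inserted", "draft_edited", "reply_sent"],
    ["draft_inserted", "draft_discarded", "reply_sent"],
]
print(Counter(classify_reply(t) for t in threads))
```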

A separate peer-reviewed analysis, also in npj Digital Medicine, examined the guardrails and failure modes surrounding AI-drafted replies. (Notably, the DOI for this paper contains a “2026” identifier, suggesting it may have been published online ahead of its formal print date, a common practice in academic journals.) Among the findings: the technology sometimes generated drafts for messages that did not actually require a response, adding review burden rather than reducing it. Turnaround-time improvements were modest. And in controlled testing, clinically meaningful errors or omissions in AI drafts were sometimes missed by the reviewing clinician. A wrong medication suggestion, a missed allergy, a dosage pulled from the wrong part of a chart: any of these could slip through if a busy doctor trusts the draft and clicks send.

Why doctors are split

That error risk sits at the heart of the divide. Physicians who support AI-assisted messaging point to the sheer volume of portal traffic. A single primary care doctor can receive hundreds of patient messages a week, many of them routine refill requests or follow-up questions that consume time without requiring complex clinical judgment. For those messages, an AI draft that gets the tone and content mostly right can save real minutes across a shift.

Physicians on the other side of the debate argue that the time saved composing a reply gets eaten by the cognitive effort of verifying whether the draft is accurate. Checking medication names, confirming the advice aligns with current guidelines, making sure the response reflects a specific patient’s history rather than generic language: all of that demands attention. And the risk is asymmetric. A correct draft saves a few minutes. An incorrect draft that goes unnoticed can cause harm that takes far longer to undo.

This article draws on published studies and a corporate press release rather than original interviews. No clinicians, patients, or hospital executives were quoted directly, and readers should weigh the sourcing accordingly.

The split is also hard to resolve empirically. Without standardized metrics for draft quality or agreed-upon thresholds for acceptable error rates, individual health systems are making deployment decisions based on internal pilots rather than shared benchmarks. That variability means a patient’s experience with AI-assisted communication may differ substantially from one hospital to the next.

What remains unproven

Hartford HealthCare’s announcement describes what PatientGPT is designed to do, but independent outcome data on its accuracy, patient satisfaction, or clinical safety have not been published as of May 2026. Whether the chatbot reduces emergency department visits, shortens wait times, or changes diagnostic accuracy for the system’s patient population is an open question. The tool’s ability to access a patient’s medical record also raises unresolved issues around data governance, consent workflows, and how errors are flagged when the AI misreads a chart entry or surfaces outdated information as current.

At NYU Langone, the audit-log study captured utilization patterns but did not measure patient outcomes directly. Knowing that clinicians chose to start with an AI draft tells us about adoption, not about whether the final reply improved care. The guardrails analysis flagged the risk of undetected errors, yet the frequency and severity of those errors across a full patient population have not been quantified in a large-scale safety trial. Researchers have called for such trials, but as of spring 2026, none have reported results that would settle the question.

What patients and clinicians should do now

For patients encountering these tools at their own hospital or clinic, the practical guidance is straightforward: read any AI-generated message or recommendation carefully before acting on it. If a message about a new symptom, a medication change, or a test result seems confusing or incomplete, request clarification from a human clinician or call the office directly. The convenience of an instant answer should not override the need for clear, personalized guidance when the stakes are high.

Clinicians reviewing AI drafts face a parallel responsibility: treating each draft as a starting point to be verified, not a finished reply. That means checking medication names and doses, confirming alignment with current guidelines, and making sure the message reflects the specific patient’s history. Some health systems are already building in safeguards, such as requiring explicit attestation that a draft has been reviewed, or limiting AI use to lower-risk message types like administrative questions and routine follow-up scheduling.
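A minimal sketch of those two safeguards, an eligibility filter plus an attestation gate, could look like the following. The category names, flags, and error messages are assumptions for illustration, not any vendor's actual policy or API.

```python
# Hypothetical policy: which message categories may receive an AI draft.
AI_ELIGIBLE = {"administrative", "scheduling", "routine_follow_up"}

def send_reply(draft: str, category: str, ai_drafted: bool,
               clinician_attested: bool) -> str:
    """Enforce two hypothetical safeguards before a reply goes out."""
    if ai_drafted and category not in AI_ELIGIBLE:
        # Higher-risk topics such as new symptoms or test results stay human-only.
        raise PermissionError(f"AI drafting not permitted for '{category}' messages")
    if ai_drafted and not clinician_attested:
        # Hard stop: no AI-assisted reply leaves without explicit clinician sign-off.
        raise PermissionError("Clinician must attest to reviewing the AI draft")
    return draft  # In a real system this would hand off to the portal's send path.
```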

Health system leaders, meanwhile, are setting the boundaries. Decisions about which message types are eligible for AI assistance, how clinicians are trained, and what monitoring catches problems early will shape whether these tools earn trust or erode it. Audit logs, spot checks of AI-assisted replies, and patient feedback surveys can all surface patterns of error or dissatisfaction. Transparency matters too: clearly labeling AI-assisted messages and explaining, in patient-facing materials, how the technology works gives patients the information they need to make informed choices about their own care.
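Spot checks of the kind described above can be as simple as random sampling for human review. This sketch assumes a hypothetical list of AI-assisted reply identifiers and a fixed audit rate; the 5 percent default is an assumption, not a published benchmark.

```python
import random

def sample_for_spot_check(ai_assisted_reply_ids: list[str], rate: float = 0.05,
                          seed: int | None = None) -> list[str]:
    """Pull a random sample of AI-assisted replies for human quality review."""
    if not ai_assisted_reply_ids:
        return []
    rng = random.Random(seed)  # Seeding makes a given audit batch reproducible.
    k = max(1, round(len(ai_assisted_reply_ids) * rate))
    return rng.sample(ai_assisted_reply_ids, k)
```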

Where the safety evidence trails the deployment pace

The technology is moving faster than the safety evidence. Hartford HealthCare is not alone; health systems across the country are piloting or expanding AI tools in patient communication, often on timelines driven by competitive pressure and vendor partnerships rather than published outcome data. Used cautiously, tools like PatientGPT and AI-generated message drafts may ease access and reduce some of the friction that has built up in digital health communication over the past decade. Used carelessly, they risk amplifying problems that already plague healthcare: uneven quality, rushed decisions, and gaps in understanding, now wrapped in a layer of algorithmic confidence.

The question facing every hospital board considering these tools is not whether generative AI will appear in more patient portals. It will. The question is whether the investment in evaluation and human oversight will match the investment in deployment. So far, the evidence suggests it has not.


*This article was researched with the help of AI, with human editors creating the final content.