When a patrol officer in Oklahoma City finishes a domestic disturbance call, the paperwork used to take the better part of an hour. Now, in a growing number of departments, body camera audio from that call is fed into a generative AI system that spits out a structured incident report in minutes. The officer reviews it, makes corrections, and files it before the next call comes in. Detectives get written accounts faster. Supervisors clear backlogs. On paper, everyone wins.
But a federal courtroom in Chicago has already shown what can go wrong. U.S. Magistrate Judge Jeffrey Cummings found in early 2025 that an immigration enforcement agent used ChatGPT to draft a use-of-force report that directly contradicted what body camera footage captured. The AI-generated text was fluent and confident. It was also wrong. That case has become a reference point for a question now facing police departments, courts, and city governments nationwide: what happens when the tool that saves officers time also distorts the evidentiary record?
How the technology works in practice
The U.S. Department of Justice’s Office of Community Oriented Policing Services published a brief overview in its January 2025 Dispatch newsletter describing how some police agencies convert body camera recordings into written narratives through AI-based drafting and automated transcription. The piece offers a high-level summary rather than a detailed study, and it does not include error-rate data or independent evaluations of the tools it describes. Still, it confirms the basic workflow: an officer’s body camera captures audio during an encounter, software transcribes and organizes that audio into a structured report, and the officer is expected to review and edit the draft before submitting it.
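For readers who want a concrete picture of that workflow, the sketch below is purely illustrative and is not drawn from any vendor's actual software: the transcription and drafting steps are stand-in functions, and every name in it (the DraftReport structure, the incident number, the officer) is hypothetical. What it captures is the one point the DOJ piece does confirm: the machine output is only a draft, and nothing is filed until a human reviews it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class DraftReport:
    incident_id: str                     # hypothetical identifier
    transcript: str                      # text produced by the transcription step
    narrative: str                       # machine-drafted narrative awaiting review
    reviewed_by: Optional[str] = None    # stays empty until an officer signs off
    filed_at: Optional[datetime] = None

def transcribe(audio_path: str) -> str:
    """Stand-in for the speech-to-text step; a real system would call a
    transcription service here."""
    return "[transcript of body camera audio]"

def draft_narrative(transcript: str) -> str:
    """Stand-in for the generative drafting step; a real system would send
    the transcript to a language model with a structured-report prompt."""
    return "Incident narrative derived from transcript:\n" + transcript

def file_report(report: DraftReport, officer: str, corrections: str) -> DraftReport:
    """The review gate: the draft cannot be filed until an officer approves it,
    and any corrections replace the machine-generated text."""
    if corrections:
        report.narrative = corrections
    report.reviewed_by = officer
    report.filed_at = datetime.now()
    return report

if __name__ == "__main__":
    transcript = transcribe("bodycam_audio.wav")
    draft = DraftReport("OKC-2025-0001", transcript, draft_narrative(transcript))
    # Nothing is submitted until a human reviews and, if needed, corrects the draft.
    filed = file_report(draft, officer="Officer A. Example", corrections="")
    print(filed.reviewed_by, filed.filed_at)
```

Everything the article goes on to describe, from uncorrected errors to undisclosed AI involvement, turns on how seriously that final review step is taken in practice.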
Oklahoma City’s police department is among the municipal agencies cited in that DOJ publication as early adopters. Exactly how many departments nationwide have deployed AI-assisted report writing is unclear; no federal agency or industry group has published a comprehensive count, and estimates from vendors and press coverage vary widely. The pattern driving adoption, however, is consistent across the departments that have gone public: staffing shortages, rising call volumes, and the persistent reality that officers often spend more time writing about incidents than responding to them. Vendors marketing these products describe them as “public safety-grade” tools, a label that implies reliability standards but has no uniform definition across the industry.
At the federal level, the DOJ now maintains a public inventory of its own AI systems, linking each entry to associated Privacy Impact Assessments. The inventory reveals how broadly artificial intelligence has already been woven into justice-related functions, from case triage to document analysis. A separate 2024 report from the DOJ’s Office of Legal Policy lays out risk categories and governance frameworks for criminal justice AI, emphasizing transparency, human oversight, and mechanisms for redress when automated systems fail.
The standards that exist and the gaps they leave
The National Institute of Standards and Technology published its Generative Artificial Intelligence Profile, known as NIST AI 600-1, in July 2024. The framework addresses validity and reliability of AI outputs, privacy protections, explainability, and accountability when generated content is used in high-stakes settings. For law enforcement specifically, NIST treats each application of generative AI, whether for report drafting, transcription, redaction, or analysis, as carrying a distinct risk profile that agencies must evaluate before deployment.
What NIST and the DOJ have not done is mandate specific accuracy benchmarks or require departments to publish error rates. No publicly available data from any local police agency, including Oklahoma City, documents how often officers correct machine-generated narratives before filing them. The DOJ’s AI inventory points to Privacy Impact Assessments, but the component-level documentation for specific policing tools has not been released in a form that allows independent review. The gap between vendor promises and operational reality remains difficult to measure from the outside.
Where the evidence points to trouble
The Chicago case is the clearest public example of that gap collapsing. According to the Associated Press report, the immigration agent relied on ChatGPT to draft a use-of-force account based on minimal inputs rather than a careful review of what the body camera actually recorded. Judge Cummings’ findings laid bare a core vulnerability: generative AI produces polished, authoritative-sounding prose regardless of whether the underlying facts are accurate. In a courtroom, that matters enormously. Police reports shape charging decisions, influence plea negotiations, and become part of the permanent court record.
The risk is not limited to one rogue agent. Generative AI models are built to predict plausible language, not to verify facts. When an officer’s body camera captures a chaotic scene with overlapping voices, background noise, and ambiguous actions, the AI still produces a clean narrative. The question is whether that narrative faithfully reflects what happened or smooths over the uncertainty in ways that favor one version of events.
Systematic research on how frequently generative AI introduces fabricated or distorted details into police reports does not yet exist in the public record. Neither the DOJ’s Office of the Inspector General nor NIST has published findings specific to generative AI failures in criminal justice field operations. Defense attorneys, judges, and policymakers are left to extrapolate from individual cases rather than rely on comprehensive audits.
The disclosure problem
Perhaps the most pressing unresolved issue is whether anyone outside the police department knows when a report was drafted by AI. No uniform federal rule requires agencies to tell courts, defense counsel, or the public that a police report originated with an automated system. Some departments may treat AI involvement as an internal workflow detail, no different from using a word processor, rather than a fact that must appear in discovery materials or on the face of the document itself.
That distinction matters for due process. If a defendant’s attorney does not know a narrative was machine-generated, there is no opportunity to probe how much of the language reflects the officer’s independent recollection versus the model’s pattern-based predictions. A report that reads as a firsthand account but was actually assembled by software from fragmented audio raises questions that current disclosure practices are not designed to answer.
Vendors argue their products are purpose-built for law enforcement, with guardrails that prevent hallucination and bias. Federal guidance, by contrast, treats those same risk categories as unresolved challenges requiring ongoing management. Whether any vendor’s guardrails would have prevented the kind of fabrication documented in the Chicago case has not been independently tested in a published study. Without standardized benchmarks or third-party evaluations, distinguishing between a genuinely safer product and a more aggressively marketed one is nearly impossible.
What to watch for
As of May 2026, the adoption curve for AI-assisted police reporting continues to steepen, driven by the same staffing and efficiency pressures that launched it. Several developments will determine whether these tools strengthen or undermine the integrity of the justice system in the months ahead.
The first is whether any police department voluntarily publishes accuracy audits showing how often AI-drafted reports require substantial correction. So far, none has. The second is whether state legislatures or courts begin requiring disclosure of AI involvement in police reports used as evidence. A handful of states have introduced broader AI transparency bills, but none has enacted a law specifically addressing AI-generated law enforcement documents. The third is whether independent researchers gain access to deployed systems for testing, something vendors have not facilitated and agencies have not required.
For defense attorneys weighing challenges to AI-drafted evidence, for city council members evaluating procurement contracts, and for residents whose encounters with police may now generate machine-written reports, the practical first step is the same: ask whether the agency using these tools has published its accuracy metrics, its review protocols, and its policy on disclosing AI involvement in court filings. If those documents do not exist or are not public, the accountability gap that federal experts have identified and that a Chicago courtroom exposed remains wide open.
*This article was researched with the help of AI, with human editors creating the final content.