Morning Overview

AI coding tools are doubling output, with code quality holding up

Generative AI coding assistants are producing measurable speed gains for software engineering teams, with some tasks reaching roughly double the previous pace. Yet a closer look at the code itself reveals a less tidy picture: duplication rates have surged, raising hard questions about whether the productivity boost carries hidden long-term costs that teams are only beginning to reckon with.

Federal Research Points to Real Speed Gains

The clearest signal that AI tools are accelerating development work comes from a presentation delivered at the Department of Homeland Security’s Software Assurance Forum. Anna Minkiewicz’s December 2024 presentation on the impact of generative AI on software engineering activities examined where AI assistants deliver the most value across the development lifecycle. Documentation and test generation stood out as the areas seeing the largest acceleration, with output roughly doubling compared to traditional workflows. That finding matters because documentation and testing are often the tasks developers deprioritize under deadline pressure. This means AI tools may be filling a gap that human teams have long struggled to close on their own.

The DHS forum context is significant. Government software projects tend to be large, compliance-heavy, and slow-moving. If AI coding tools can meaningfully speed up work in that environment, the implication for faster-moving private-sector teams is that gains could be even larger when bureaucratic overhead is lighter. Minkiewicz’s research focused on which engineering activities benefit most, rather than offering a blanket endorsement of AI-assisted coding. That distinction is worth keeping in mind: the speed gains are real but concentrated in specific task categories, not spread evenly across all development work.

Duplication Rates Tell a Different Story

Speed, though, is only half the equation. When GitClear analyzed millions of lines of code written between 2020 and 2024, the firm found an eightfold increase in duplicated code, a pattern that MIT Sloan Management Review identified as a measure of declining code quality. That is not a minor statistical blip. An eightfold jump in duplication means developers, or the AI tools assisting them, are copying and pasting logic rather than writing clean, modular solutions that can be maintained over time.

Duplicated code creates compounding problems. When the same block of logic appears in eight places instead of one, every future bug fix or feature change must be applied eight times. Miss one instance and the codebase develops inconsistencies that are difficult to trace. For engineering managers, this pattern translates directly into higher maintenance costs, longer debugging cycles, and a growing risk of subtle errors that slip through standard review processes. The productivity gains from AI-assisted coding may be real in the short term, but the GitClear data suggests those gains could be partially offset by rising technical debt.
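The maintenance math is easy to see in miniature. In the hypothetical Python sketch below (the function and constant names are illustrative, not drawn from any cited codebase), a validation rule pasted into two call sites means a future policy change, say a new length limit, must be edited in both places, while the refactored version changes in exactly one:

```python
# Duplicated: the same rule pasted inline at each call site.
def save_username(name: str) -> bool:
    """Copy 1 of the validation rule."""
    if not name or len(name) > 32:
        return False
    return True  # ...store the name...

def save_display_name(name: str) -> bool:
    """Copy 2 of the same rule; a future limit change must also edit this."""
    if not name or len(name) > 32:
        return False
    return True  # ...store the name...

# Refactored: the rule lives in one shared helper,
# so a policy change touches exactly one line.
MAX_NAME_LENGTH = 32

def is_valid_name(name: str) -> bool:
    """Single source of truth for the naming rule."""
    return bool(name) and len(name) <= MAX_NAME_LENGTH
```

Miss one of the pasted copies during a change and the two save paths silently enforce different rules, which is precisely the inconsistency the GitClear data suggests is accumulating at scale.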

Why the Tension Between Speed and Quality Persists

The disconnect between faster output and rising duplication is not accidental. It reflects how most AI coding assistants work today. Tools like GitHub Copilot and similar autocomplete systems generate code by predicting the next likely sequence of tokens based on patterns in their training data. When a developer asks for a function that resembles something common in open-source repositories, the tool is likely to produce a plausible answer quickly. But “plausible” and “well-architected” are not the same thing. AI assistants optimize for local correctness within a single file or function. They do not have a global view of the codebase, so they cannot know that the same logic already exists in another module. The result is structurally sound code that happens to be redundant.

This is where the headline claim that “code quality is holding up” needs careful framing. On a per-function basis, AI-generated code often passes tests and meets functional requirements. Reviewers looking at individual pull requests may see nothing wrong. The quality problem shows up only at scale, when someone examines the codebase as a whole and notices the same patterns repeated across dozens of files. Traditional code review processes were not designed to catch this kind of drift, which is why the GitClear findings are striking: the degradation was invisible to standard quality gates but obvious in aggregate data.

There is also a cultural component. Many teams still measure productivity through visible throughput: number of pull requests merged, tickets closed, or lines of code added. AI tools excel at boosting those metrics, which can unintentionally reward behaviors that increase duplication. Developers under pressure to deliver may accept AI-generated snippets as-is, even when a better solution would be to refactor existing code or extend a shared library. Without explicit incentives to minimize redundancy, the path of least resistance becomes the default.

What Teams Can Do About It

The practical question for engineering leaders is not whether to use AI coding tools. Adoption is already widespread and accelerating, driven by genuine productivity benefits and competitive pressure. The real question is how to capture the speed gains without letting duplication spiral out of control.

One approach gaining traction is pairing AI code generation with automated quality auditing tools that scan for duplication, dead code, and architectural drift on every commit. Static analysis platforms and newer AI-powered review systems can flag repeated patterns before they accumulate. The logic is straightforward: if AI tools are generating code faster than humans can review it, the review layer itself needs to be partially automated and tuned to catch the specific failure modes that AI assistants introduce.
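The core idea behind such scanners can be sketched in a few lines: hash fixed-size windows of normalized source lines and flag any window that appears in more than one place. This toy detector is an assumption-laden simplification, as production tools tokenize properly and catch clones with renamed identifiers, but it shows the mechanism a commit-time duplication gate relies on:

```python
import hashlib
from collections import defaultdict

def normalize(line: str) -> str:
    """Strip comments and whitespace so formatting differences don't hide clones."""
    return line.split("#")[0].strip()

def find_duplicate_blocks(files: dict[str, str], window: int = 6) -> dict:
    """Return blocks of `window` normalized lines that occur in more than one place.

    `files` maps a filename to its source text. Toy clone detector: real
    scanners work on token streams, but the windowed-hash idea is the same.
    """
    seen = defaultdict(list)  # block hash -> [(filename, start_line), ...]
    for name, text in files.items():
        lines = [normalize(l) for l in text.splitlines()]
        for i in range(len(lines) - window + 1):
            chunk = "\n".join(lines[i:i + window])
            if not chunk.strip():
                continue  # ignore windows that are entirely blank
            digest = hashlib.sha1(chunk.encode()).hexdigest()
            seen[digest].append((name, i + 1))
    # keep only hashes seen at two or more locations
    return {h: locs for h, locs in seen.items() if len(locs) > 1}
```

Wired into a pre-commit hook or CI job, a check like this fails the build when a new commit pushes the duplicate count past a threshold, which is the kind of automated guardrail the paragraph above describes.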

Another strategy is to limit where AI assistants operate freely. The DHS forum research suggests that documentation and test generation are the highest-value use cases. Restricting AI-assisted coding to those domains, while keeping core application logic under tighter human oversight, could preserve the speed benefits while reducing the risk of duplication in critical systems. This is especially relevant for government and regulated-industry teams, where a maintenance burden in production code can have outsized consequences.

Teams can also adapt their development practices. Coding standards can be updated to emphasize reuse of existing modules and to require a quick search for similar functionality before accepting AI-generated solutions. Code review checklists can include explicit questions about duplication and opportunities to consolidate logic. Pair programming sessions that incorporate AI tools can encourage developers to treat the assistant as a brainstorming partner rather than an oracle, prompting them to refactor and generalize patterns instead of copying them verbatim.

Training plays a role as well. Many developers have learned to prompt AI tools effectively for speed, but fewer have been trained to use them responsibly for maintainability. Workshops that walk through real examples of duplicated AI-generated code, followed by refactoring exercises, can help teams internalize the trade-offs. The goal is not to slow developers down but to shift their mental model from “generate more code” to “generate the right code once and reuse it well.”

The Bigger Picture for Software Teams

The current evidence paints a split-screen view of AI-assisted development. On one side, teams are genuinely producing more output. Documentation gets written. Tests get generated. Features ship faster. On the other side, the codebase itself is accumulating structural problems that will demand attention later. Neither side of this picture invalidates the other, and the mistake would be to treat either the optimistic or pessimistic reading as the full story.

What the data from both the DHS forum presentation and the GitClear analysis point toward is a maturation challenge. First-generation AI coding tools were built to maximize suggestion throughput. The next generation will need to account for codebase-level quality, not just function-level correctness. Until that shift happens, the burden falls on engineering organizations to build guardrails: metrics that track duplication, processes that encourage reuse, and governance that defines where AI can be used freely and where it must be constrained.

For teams willing to confront these trade-offs directly, AI assistants can still be a net positive. The same tools that generate redundant code can help identify it, suggest refactorings, and keep documentation synchronized with evolving systems. The key is to recognize that productivity is not just about how fast code is written but how resilient that code remains over years of change. Generative AI has proven it can move the first part of that equation. The next phase of adoption will determine whether it can support the second.

*This article was researched with the help of AI, with human editors creating the final content.