Morning Overview

Uber CTO teases massive AI coding push: ‘A real reset for engineering’

Uber’s chief technology officer has signaled that artificial intelligence will fundamentally reshape how the company builds software, describing the shift as “a real reset for engineering.” The comments, delivered at a major AI conference, come as Uber already runs an internal AI tool that processes tens of thousands of code changes each week and saves an estimated 1,500 developer hours per week. The ambition is clear: move AI from a helper role into something closer to an autonomous engineering partner.

What the Uber CTO Said at Interrupt 2025

Uber’s technology chief, Praveen Neppalli Naga, presented at LangChain’s Interrupt 2025: The AI Agent Conference. The event brought together companies building and deploying AI agents, and Uber’s presence on the speaker roster placed the ride-hailing giant alongside firms pushing the boundaries of agentic software systems.

The phrase “a real reset for engineering” suggests something beyond incremental tooling improvements. Naga’s framing points to a structural rethinking of developer workflows, where AI agents do not simply autocomplete code but take on broader, more complex tasks across the software development lifecycle. That distinction matters because most enterprise AI coding tools today still function as sophisticated assistants rather than independent actors capable of planning, executing, and reviewing multi-step engineering work.

The conference recap from LangChain confirms Uber as a featured speaker and highlights the event’s focus on practical AI agents, though a full transcript of Naga’s remarks has not been published. Without verbatim quotes beyond the headline phrase, the exact scope of Uber’s planned AI engineering overhaul remains partly opaque. What is available, however, is hard evidence of what Uber has already shipped internally.

uReview: 65,000 Code Changes a Week Under AI Scrutiny

The strongest proof that Uber is serious about AI-driven engineering sits in its own codebase. The company built and deployed an internal GenAI code review system called uReview, which now analyzes more than 90% of the roughly 65,000 diffs, or code changes, that Uber engineers produce each week. Those numbers come directly from Uber’s engineering post, which offers a detailed technical description of the system’s architecture and performance.

The scale is striking. Processing that volume of code changes means uReview touches nearly every meaningful piece of new or modified software at Uber on a weekly basis. The system estimates it saves approximately 1,500 hours of developer time per week, which works out to about 39 developer-years annually. For a company that operates a global platform spanning ride-hailing, food delivery, and freight logistics, reclaiming that much engineering capacity creates room for faster feature development and shorter release cycles.
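The developer-year figure follows from straightforward arithmetic, which can be checked with a back-of-envelope calculation. Note that the 2,000-hour work year used below (roughly 40 hours a week for 50 weeks) is a common convention, not a basis Uber has stated:

```python
# Convert Uber's self-reported weekly savings into developer-years.
# The 2,000-hour developer-year is our assumption, not Uber's stated basis.
HOURS_SAVED_PER_WEEK = 1_500      # Uber's internal estimate
WEEKS_PER_YEAR = 52
HOURS_PER_DEVELOPER_YEAR = 2_000  # ~40 hours/week * 50 weeks

annual_hours_saved = HOURS_SAVED_PER_WEEK * WEEKS_PER_YEAR
developer_years = annual_hours_saved / HOURS_PER_DEVELOPER_YEAR

print(f"{annual_hours_saved:,} hours/year ≈ {developer_years:.0f} developer-years")
# 78,000 hours/year ≈ 39 developer-years
```

Shifting the assumed work year up or down by a few hundred hours moves the result a few developer-years in either direction, which is one more reason to treat the figure as directional.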

Those savings deserve some scrutiny, though. The 1,500-hour figure is self-reported by Uber and based on internal estimates rather than independent measurement. Developer time saved through automated code review is notoriously hard to quantify because the counterfactual (how long a human reviewer would have taken) varies widely by code complexity, reviewer experience, and organizational review culture. The number is directionally useful but should be treated as an approximation rather than a precise accounting.

Why Automated Review Is Not the Same as Automated Coding

A common mistake in coverage of AI coding tools is conflating code review with code generation. uReview sits on the review side: it reads code that humans have already written and flags potential issues, suggests improvements, and accelerates the feedback loop between author and reviewer. That is valuable, but it is a different challenge from having AI agents write production code from scratch or autonomously plan and execute multi-file changes.

Naga’s “real reset” language implies Uber wants to push further along that spectrum. If the company is investing in agentic systems that can handle more than review, that would represent a qualitative leap. Writing code, testing it, deploying it, and monitoring its behavior in production all involve judgment calls that current AI models handle unevenly. Review is a natural starting point because the cost of a bad suggestion is low: a human engineer simply ignores it. The cost of a bad autonomous deployment is much higher.
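The asymmetry described above, where cheap-to-ignore outputs can run unattended while costly ones need sign-off, is often encoded as an explicit human-in-the-loop policy. The sketch below is purely illustrative; the action tiers and approval rules are hypothetical and do not describe uReview's actual design:

```python
from enum import Enum, auto

class Action(Enum):
    COMMENT = auto()  # advisory feedback: a human can simply ignore it
    EDIT = auto()     # modifies source code
    DEPLOY = auto()   # touches production behavior

# Illustrative policy: the cheaper it is to ignore a bad output,
# the more autonomy the agent gets. Hypothetical, not Uber's policy.
REQUIRES_HUMAN_APPROVAL = {
    Action.COMMENT: False,
    Action.EDIT: True,
    Action.DEPLOY: True,
}

def gate(action: Action, human_approved: bool = False) -> bool:
    """Return True if the agent may proceed with this action."""
    if REQUIRES_HUMAN_APPROVAL[action]:
        return human_approved
    return True
```

Under a policy like this, an AI reviewer posting comments proceeds freely, while the same agent attempting an edit or deployment is blocked until a human approves.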

This tension between efficiency and reliability is the central tradeoff that any company scaling AI coding tools must manage. Uber’s approach with uReview, which its engineering team describes as emphasizing scalable and trustworthy safeguards, suggests the company is aware of the risk. But as AI agents take on more responsibility in the development pipeline, the surface area for subtle bugs grows. An automated reviewer catching a formatting issue is routine. An automated agent introducing a logic error in a payment flow is a different category of problem entirely.

What This Means for Engineering Teams Beyond Uber

Uber’s moves carry weight because of the company’s engineering scale. With roughly 65,000 code changes flowing through its systems every week, any tool that works at that volume becomes a proof point for other large organizations considering similar investments. If uReview can maintain quality at more than 90% coverage across that many diffs, it offers a template, or at least a reference architecture, for enterprise AI code review.

The broader implication is about role evolution rather than role elimination. Saving 39 developer-years annually does not mean Uber laid off 39 engineers. It means those engineers spent less time on manual review and, presumably, more time on higher-order work such as system design, complex debugging, and cross-team coordination. That reallocation is the optimistic case. The less comfortable question is what happens as AI agents handle progressively more of the development cycle. If review is automated today and generation follows tomorrow, the skills that define a software engineer’s daily work shift significantly.

For individual developers, the practical takeaway is that familiarity with AI-assisted workflows is becoming a baseline expectation at major tech employers. Engineers who understand how to work alongside AI tools, how to evaluate their output, and how to catch their failures will be better positioned than those who treat the tools as either magic or threat. Knowing how to frame good prompts, interpret model suggestions, and integrate agentic systems into existing CI/CD pipelines is on its way to becoming part of standard engineering literacy.
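To make the CI/CD integration point concrete, here is a minimal sketch of a non-blocking AI review step. The `run_ai_review` function is a hypothetical stub standing in for a model call (a real system would send the diff to an LLM endpoint and parse structured comments), and the advisory-only exit code reflects one common integration pattern, not any specific vendor's:

```python
import subprocess

def collect_diff(base: str = "origin/main") -> str:
    """Grab the change under review; in CI this runs against the PR branch."""
    return subprocess.run(
        ["git", "diff", base, "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

def run_ai_review(diff: str) -> list[str]:
    """Hypothetical stub: flags unresolved TODOs in added lines.
    A real implementation would call a model here instead."""
    return [
        f"Unresolved TODO in added line: {line[1:].strip()}"
        for line in diff.splitlines()
        if line.startswith("+") and "TODO" in line
    ]

def main() -> int:
    for comment in run_ai_review(collect_diff()):
        print(f"::notice::{comment}")  # GitHub Actions annotation syntax
    return 0  # advisory: never fail the pipeline on AI feedback alone
```

Returning success regardless of findings keeps the step purely advisory, mirroring the low-cost-of-bad-suggestion property that makes review a safe first deployment for AI in the pipeline.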

The Gap Between Vision and Verification

Uber’s public disclosures create an interesting split. On one side, the company has published concrete, measurable data about uReview’s deployment and impact, including coverage rates and estimated hours saved. On the other, the broader “reset for engineering” vision articulated at Interrupt 2025 remains largely conceptual in the public record. Without detailed examples of future workflows, it is difficult to assess how far and how fast Uber intends to push toward autonomous or semi-autonomous engineering agents.

This gap between vision and verification is not unique to Uber. Across the industry, vendors and large tech firms frequently describe ambitious AI roadmaps that outpace what has been demonstrated at scale. The difference here is that Uber has already operationalized at least one substantial AI system in a core engineering workflow. That makes its rhetoric about agentic development more credible than purely speculative claims, but it does not eliminate the need for careful validation as the company experiments with more powerful tools.

For now, the safest conclusion is that Uber is using AI to aggressively optimize an existing human-centric process rather than to replace engineers outright. Code review, as a gatekeeping function for quality and security, lends itself to augmentation: humans remain responsible for the final decision, while the machine handles repetitive scanning and pattern recognition. Moving beyond that model will raise harder questions about accountability, incident response, and regulatory exposure if AI-written code causes real-world harm.

How Other Organizations Might Respond

Other large engineering organizations are likely to study Uber’s experience closely. A system that can review the vast majority of code changes, surface actionable feedback, and integrate into established tooling offers a compelling case for investment. At the same time, replicating uReview’s results would require not only technical sophistication but also cultural alignment: teams must be willing to trust AI-generated comments, adjust review norms, and refine processes as the tool learns.

Smaller companies may not match Uber’s scale, but they can still borrow key ideas. Starting with narrow, well-scoped use cases, instrumenting them with clear metrics, and iterating based on developer feedback can help avoid both underuse and overreliance. The lesson from Uber’s deployment is less about specific model choices and more about treating AI systems as long-term infrastructure that must be monitored, tuned, and governed rather than as one-off productivity hacks.
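Instrumenting a pilot with clear metrics can be as simple as tracking how often engineers act on the tool's suggestions. The sketch below is a minimal, assumed design, not a description of how uReview measures itself; field names and the metric choice are illustrative:

```python
from dataclasses import dataclass

@dataclass
class ReviewMetrics:
    """Minimal instrumentation for an AI review pilot: if engineers act on
    few suggestions the tool is noise; if they act on most, coverage can
    widen. Fields and the single metric here are illustrative."""
    suggestions: int = 0
    accepted: int = 0

    def record(self, was_accepted: bool) -> None:
        self.suggestions += 1
        if was_accepted:
            self.accepted += 1

    @property
    def acceptance_rate(self) -> float:
        return self.accepted / self.suggestions if self.suggestions else 0.0

# Example: four suggestions, three acted on.
metrics = ReviewMetrics()
for outcome in [True, False, True, True]:
    metrics.record(outcome)
print(f"acceptance rate: {metrics.acceptance_rate:.0%}")  # 75%
```

Pairing a simple signal like this with developer feedback gives a team an early read on whether the tool is earning trust before expanding its scope.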

As AI agents mature, the engineering profession will likely see a shift toward work that is harder to formalize into prompts and patterns: understanding ambiguous product requirements, navigating organizational constraints, and making tradeoffs under uncertainty. Uber’s “reset for engineering” may ultimately be less about machines taking over coding and more about humans redefining what counts as the most valuable engineering work.

*This article was researched with the help of AI, with human editors creating the final content.