In Northern Virginia, home to the densest cluster of data centers on Earth, utility Dominion Energy has a queue of facilities waiting years for grid connections. In Dublin, Ireland’s state grid operator has paused new data center hookups in parts of the capital. And across the Nordic countries, once considered a safe haven for power-hungry servers, local authorities are pushing back on expansion plans. The common thread: artificial intelligence is consuming electricity at a pace that existing infrastructure was never built to handle.
The International Energy Agency quantified the problem in its early 2026 assessment of global data center power use. Data center electricity demand jumped 17% in 2025 compared with the prior year, outpacing growth in most other industrial sectors. AI-focused facilities grew even faster, the agency found, because training and running large language models requires dense clusters of GPUs operating around the clock. The IEA used the phrase “tightening bottlenecks” not as a forecast but as a description of conditions already in place.
Power, not chips, is the new bottleneck
For years, the AI industry’s speed limit was set by chip supply. Companies raced to secure Nvidia’s latest GPUs, and waitlists stretched for months. That constraint has not disappeared, but a second one has overtaken it. When a data center cannot draw enough electricity from the local grid, the GPUs inside sit idle or run below capacity. No amount of silicon solves a wiring problem.
The scale difference between AI workloads and traditional computing makes the squeeze worse. Researchers at the IEA and elsewhere estimate that a single query to a large language model can draw roughly six to ten times the electricity of a conventional web search, though exact figures vary by model size and hardware generation. Multiply that by hundreds of millions of daily queries across ChatGPT, Gemini, Claude, and dozens of smaller services, and the aggregate load on regional grids becomes significant enough to compete with residential and industrial demand.
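A rough back-of-envelope calculation shows how quickly those per-query figures compound. The numbers in this sketch are illustrative assumptions, not measured values: roughly 0.3 watt-hours per conventional web search, the midpoint of the six-to-ten-times range cited above, and a guess of 500 million daily LLM queries.

```python
# Back-of-envelope sketch of aggregate daily LLM energy demand.
# Every figure here is an illustrative assumption, not a measured value.

SEARCH_WH = 0.3        # assumed energy per conventional web search, watt-hours
LLM_MULTIPLIER = 8     # midpoint of the rough 6-10x range cited above
DAILY_QUERIES = 500e6  # assumed daily LLM queries across all major services

llm_wh = SEARCH_WH * LLM_MULTIPLIER        # ~2.4 Wh per query
daily_mwh = llm_wh * DAILY_QUERIES / 1e6   # convert Wh to MWh

print(f"Assumed energy per LLM query: {llm_wh:.1f} Wh")
print(f"Aggregate daily load: {daily_mwh:,.0f} MWh per day")
# At these assumptions: about 1,200 MWh per day, roughly the daily
# consumption of a mid-sized town, and that is before training runs.
```

Change any one assumption and the total shifts, which is exactly why the IEA publishes ranges rather than point estimates. But under almost any plausible set of inputs, the aggregate lands in territory that grid planners cannot ignore.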
The IEA’s broader Energy and AI report underscores how uncertain the trajectory remains. The agency published wide forecast bands for data center electricity consumption through the end of the decade, acknowledging that chip efficiency improvements, renewable energy buildouts, and government policy choices could push outcomes in sharply different directions. That uncertainty is itself a warning: neither providers nor grid planners can confidently promise that supply will catch up to demand on any fixed schedule.
What providers have and haven’t said
No major AI company, including OpenAI, Google, or Anthropic, has published detailed breakdowns of how token volume translates to electricity consumption per query. Google and Microsoft have released annual sustainability reports that disclose total data center energy use, but those figures bundle AI workloads with cloud computing, search indexing, and other services, making it impossible to isolate the AI-specific draw.
Behind the scenes, trade press reports citing unnamed sources suggest that several providers have discussed internal contingency plans for compute scarcity. The options reportedly on the table include dynamic pricing, peak-hour throttling that would leave free-tier users with longer wait times or reduced output quality, and geographic load-shifting, routing inference requests to regions with cheaper, more abundant power. As of late April 2026, however, no provider has made an official product announcement confirming usage caps or tiered throttling tied to energy constraints. Until that happens, the link between grid strain and consumer-facing limits remains a logical inference rather than a documented policy.
That gap matters for the millions of developers and small businesses now woven into AI-powered workflows. A startup that routes customer support through an LLM-based chatbot or a design studio that depends on image generation APIs has real operational exposure if access is suddenly rationed or repriced. The absence of transparent capacity planning from providers leaves those users guessing.
Why the grid can’t just catch up
Chip designers operate on roughly two-year cycles for major efficiency gains. Power infrastructure moves on a fundamentally different clock. Permitting and building a new natural gas plant in the United States takes four to seven years. Utility-scale solar and battery projects can move faster but still face interconnection queues that stretch 18 months or more in congested regions. Nuclear, including the small modular reactors that several tech companies have invested in, remains a decade or more from meaningful commercial output in most jurisdictions.
Grid operators in the U.S. mid-Atlantic region, home to “Data Center Alley” in Loudoun County, Virginia, have been transparent about the mismatch. PJM Interconnection, the regional transmission organization, has flagged that new large-load requests are arriving faster than transmission upgrades can be approved. In Ireland, EirGrid’s moratorium on new Dublin-area data center connections, first imposed in 2022, has been only partially relaxed despite political pressure from the tech sector.
Efficiency gains on the hardware side offer some relief. Each new generation of Nvidia and AMD accelerators delivers more inference per watt, and techniques like model quantization and speculative decoding reduce the compute cost of individual queries. But those improvements are running a race against adoption: as AI tools become embedded in more products and more users send more queries, total energy demand can rise even as per-query efficiency improves. The IEA’s data suggests that, at least through 2025, adoption won outright.
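A toy model makes the dynamic concrete. The growth rates below are illustrative assumptions, not forecasts: per-query energy falling 20% a year while query volume grows 60% a year.

```python
# Toy model of the efficiency-vs-adoption race described above.
# The 20% and 60% annual rates are illustrative assumptions.

energy_per_query = 1.0  # normalized energy per query
queries = 1.0           # normalized daily query volume

for year in range(5):
    total = energy_per_query * queries
    print(f"year {year}: per-query {energy_per_query:.2f}, "
          f"volume {queries:.2f}, total {total:.2f}")
    energy_per_query *= 0.80  # each query gets 20% cheaper to serve
    queries *= 1.60           # but volume grows 60%

# Total demand still grows ~28% per year (0.80 * 1.60 = 1.28)
# even though every individual query gets steadily cheaper.
```

Under those assumptions, total demand rises by more than a quarter each year despite real efficiency gains, which is the same shape the IEA observed through 2025.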
What this means for anyone using AI tools
The practical takeaway for developers and businesses is to build flexibility into workflows now, before any rationing arrives. That means identifying a fallback if a primary model provider introduces rate limits or price increases. Options include smaller open-source models capable of running on local hardware, a second commercial API from a different provider, or hybrid setups that reserve cloud-based LLMs for tasks where quality justifies the cost.
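In code, a fallback chain can be as simple as trying providers in order of preference. The sketch below is a minimal illustration; the three backend functions are hypothetical placeholders, and in practice you would substitute your actual SDK calls and a locally hosted open-source model.

```python
# Minimal provider-fallback sketch. All three backends are hypothetical
# placeholders; swap in your real API clients and local model runner.

class ProviderUnavailable(Exception):
    """Raised when a provider rate-limits or rejects a request."""

def call_primary(prompt: str) -> str:
    # Placeholder for your primary commercial API call.
    raise ProviderUnavailable("simulated rate limit")

def call_secondary(prompt: str) -> str:
    # Placeholder for a second commercial API from a different provider.
    raise ProviderUnavailable("simulated outage")

def call_local(prompt: str) -> str:
    # Placeholder for a smaller open-source model on local hardware.
    return f"[local model response to: {prompt!r}]"

def complete(prompt: str) -> str:
    """Try backends in order of preference, degrading gracefully."""
    for backend in (call_primary, call_secondary, call_local):
        try:
            return backend(prompt)
        except ProviderUnavailable:
            continue
    raise RuntimeError("all backends exhausted")

print(complete("Summarize today's support tickets."))
```

The point is not the specific libraries but the shape: a single entry point that degrades gracefully means a pricing change or rate limit becomes a quality trade-off rather than an outage.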
Monitoring provider status pages and pricing announcements is the simplest early-warning system. Any changes to access tiers will almost certainly appear there before they take effect. For teams with heavier usage, tracking per-query costs and token consumption over time can reveal whether a provider is quietly adjusting throughput before making a formal announcement.
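Logging that telemetry need not be elaborate. Here is one possible shape for a per-request record; the field names are assumptions, and you would adapt them to whatever usage metadata your provider's API actually returns.

```python
# Sketch of lightweight per-request usage telemetry. Field names are
# assumptions; adapt them to your provider's actual usage metadata.

import csv
import time

def log_usage(path: str, model: str, prompt_tokens: int,
              completion_tokens: int, cost_usd: float) -> None:
    """Append one usage record so price or throughput drift shows up early."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([
            time.strftime("%Y-%m-%dT%H:%M:%S"),
            model,
            prompt_tokens,
            completion_tokens,
            prompt_tokens + completion_tokens,
            f"{cost_usd:.6f}",
        ])

# Example: record one call's usage after parsing the API response.
log_usage("usage_log.csv", "example-model", 412, 128, 0.0031)
```

A week of records is enough to plot cost per thousand tokens over time; a sudden bend in that line is worth investigating before it shows up on an invoice.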
The broader pattern is one of physical infrastructure lagging behind software ambition. Transistor density can double on a schedule, but concrete, steel, and copper move at the speed of permits and construction crews. That mismatch means the AI industry’s growth rate is, for the first time, being shaped not by breakthroughs in a research lab but by how fast the world can build the power systems to keep the lights on inside the machines.
*This article was researched with the help of AI, with human editors creating the final content.*