Morning Overview

The true cost of every ChatGPT prompt revealed

Every time a user types a question into ChatGPT, the computational machinery behind that response draws on data centers, specialized chips, and electricity at a scale most people never consider. The cost of a single prompt may seem negligible to the person asking it, but behind the scenes, the infrastructure required to deliver that answer connects to multibillion-dollar hardware deals and energy commitments measured in gigawatts. Understanding who pays for all of this, and how those costs ripple through the technology industry, reveals why the price of artificial intelligence is far higher than the subscription fee on a monthly credit card statement.

Chips, Capacity, and the Capital Behind Each Query

The economics of running a large language model at global scale start with hardware. AI inference, the process of generating a response to a prompt, requires massive clusters of specialized processors running around the clock. That demand has pushed companies like OpenAI into direct negotiations with chipmakers on a scale normally reserved for sovereign infrastructure projects. OpenAI is currently pursuing a multibillion-dollar chip deal that would give it a stake of roughly 10% in AMD, according to the Financial Times. The deal involves gigawatts of computing capacity, a unit of measurement more commonly associated with power grids than software companies.

That framing matters because it resets how the industry thinks about cost. When infrastructure is measured in gigawatts, the expense of a single ChatGPT prompt is not simply a fraction of a server’s hourly rate. It is a tiny slice of a capital commitment that spans chip fabrication, cooling systems, land acquisition, and long-term electricity contracts. Each prompt, in isolation, costs fractions of a cent. But the fixed costs required to make that prompt possible run into the tens of billions of dollars before a single user ever hits “send.” The distinction between marginal cost per token and total infrastructure cost per gigawatt of capacity is where the real financial story lives, because investors ultimately care less about the price of an individual API call than about whether these enormous upfront bets can ever be recovered.
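To make that distinction concrete, the rough Python sketch below compares the marginal cost of a prompt with its share of amortized infrastructure spending. Every figure in it, the capital outlay, the amortization period, the annual prompt volume, and the per-prompt compute cost, is an illustrative assumption rather than a disclosed number from OpenAI or its partners.

```python
# Illustrative sketch of the marginal-vs-amortized cost distinction described above.
# Every number here is an assumption chosen for readability, not a figure from
# OpenAI, Microsoft, or the Financial Times.

MARGINAL_COST_PER_PROMPT = 0.002       # assumed incremental compute cost per prompt, in dollars
INFRASTRUCTURE_CAPEX = 30_000_000_000  # assumed upfront spend on chips, land, and buildout, in dollars
AMORTIZATION_YEARS = 5                 # assumed useful life over which that spend must be recovered
PROMPTS_PER_YEAR = 500_000_000_000     # assumed annual query volume across all users

amortized_fixed_cost_per_prompt = INFRASTRUCTURE_CAPEX / (AMORTIZATION_YEARS * PROMPTS_PER_YEAR)
fully_loaded_cost_per_prompt = MARGINAL_COST_PER_PROMPT + amortized_fixed_cost_per_prompt

print(f"Marginal cost per prompt:        ${MARGINAL_COST_PER_PROMPT:.4f}")
print(f"Amortized fixed cost per prompt: ${amortized_fixed_cost_per_prompt:.4f}")
print(f"Fully loaded cost per prompt:    ${fully_loaded_cost_per_prompt:.4f}")
```

Under these assumptions, the amortized share of fixed costs exceeds the marginal compute cost several times over, which is exactly the gap that determines whether the upfront bets can be recovered.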

Microsoft’s Books Show Where the Money Goes

OpenAI does not operate its inference infrastructure alone. Microsoft, its largest investor and cloud partner, shoulders a significant share of the compute burden through Azure. The clearest window into those costs comes from Microsoft’s own financial disclosures. The company’s detailed 10-K filing for the fiscal year ended June 30, 2025, filed with the U.S. Securities and Exchange Commission, outlines the capital expenditure, depreciation, and cost-of-revenue dynamics that define the hyperscale cloud business. While the document does not isolate OpenAI-specific unit economics, it provides an authoritative look at the broader infrastructure cost environment that makes AI inference possible at this scale, especially as Microsoft highlights rising investments tied to AI services.

What the filing makes clear is a company spending aggressively on data center buildouts while managing growing depreciation charges on the hardware it has already deployed. Cloud segment economics in the 10-K reflect the tension between accelerating AI-driven demand and the physical limits of how quickly new capacity can come online. For every ChatGPT prompt that flows through Azure, Microsoft absorbs a share of the cost through higher capital spending, shorter useful lives for servers and networking gear, and increased operating expenses for power and maintenance. The financial burden does not vanish just because users pay a flat subscription. Someone is absorbing the difference, and Microsoft’s balance sheet underscores that the gap between what users pay and what the infrastructure costs remains substantial, at least in the near term.
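The effect of shorter useful lives is easiest to see with a straight-line depreciation sketch. The hardware cost and the two schedules below are assumptions chosen for illustration, not figures taken from Microsoft’s 10-K.

```python
# Straight-line depreciation sketch showing how shortening server useful lives
# raises annual charges. The hardware cost and the two useful-life figures are
# assumptions for illustration, not values from Microsoft's filings.

hardware_cost = 10_000_000_000  # assumed cost of deployed AI servers and networking gear, in dollars

def annual_depreciation(cost: float, useful_life_years: int) -> float:
    """Straight-line depreciation: spread the cost evenly over the useful life."""
    return cost / useful_life_years

for life in (6, 4):  # e.g. a six-year schedule versus a shorter four-year one
    print(f"{life}-year useful life: ${annual_depreciation(hardware_cost, life):,.0f} per year")
```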

Energy Costs That Dwarf the Subscription Fee

Electricity is the most persistent and least visible cost embedded in every AI-generated response. A single data center cluster running inference workloads can consume as much power as a small city, and that consumption continues whether or not every GPU is fully utilized at a given moment. When OpenAI negotiates deals involving gigawatts of capacity, it is effectively locking in energy commitments that will define its cost structure for years. The price of that electricity varies by region, grid reliability, and the mix of renewable versus fossil fuel sources, but the sheer volume required means even modest fluctuations in energy prices translate into hundreds of millions of dollars in annual operating costs. This is why AI providers increasingly pay attention to where they site new facilities and how they secure long-term power purchase agreements.
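The sensitivity to energy prices follows from simple arithmetic: one gigawatt of sustained load consumes roughly 8.76 million megawatt-hours a year, so every $10-per-megawatt-hour swing in wholesale prices moves the annual bill by nearly $90 million per gigawatt. The sketch below walks through that calculation with assumed wholesale prices rather than any contracted rates.

```python
# Back-of-the-envelope energy cost for one gigawatt of sustained load, and how a
# modest change in wholesale power prices moves the annual bill. The price
# assumptions are illustrative, not figures from any specific contract.

HOURS_PER_YEAR = 8_760
gigawatt_hours_per_year = 1 * HOURS_PER_YEAR              # 1 GW running continuously
megawatt_hours_per_year = gigawatt_hours_per_year * 1_000  # convert GWh to MWh

for price_per_mwh in (50, 60, 80):  # assumed wholesale prices in dollars per MWh
    annual_cost = megawatt_hours_per_year * price_per_mwh
    print(f"At ${price_per_mwh}/MWh: ${annual_cost:,.0f} per gigawatt-year")
```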

For context, one gigawatt of sustained power is often described as enough to supply roughly 750,000 homes, depending on local consumption patterns and assumptions. AI companies are now competing with utilities, manufacturers, and municipalities for access to that power, especially in regions with abundant renewables or relatively low wholesale prices. The result is a new kind of infrastructure arms race where the ability to secure cheap, reliable electricity is as strategically important as the quality of the underlying model. Users who pay $20 per month for a ChatGPT subscription are covering only a fraction of the energy cost associated with their usage. The rest is effectively subsidized by investor capital, higher-margin cloud revenue from enterprise customers, and the expectation that AI workloads will eventually generate enough economic value to justify the upfront and ongoing energy expenditure.
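That homes comparison is itself a rule of thumb that can be sanity-checked: spreading one gigawatt across roughly 750,000 homes implies an average draw of about 1.3 kilowatts per household, broadly in line with typical US residential consumption. The figures below simply restate that rule of thumb; they are not measured data.

```python
# Quick check on the homes comparison above: one gigawatt spread across roughly
# 750,000 homes implies an average household draw of about 1.3 kW. The household
# count is the commonly cited rule of thumb, not a measured value.

sustained_power_watts = 1_000_000_000  # 1 GW
homes_served = 750_000

average_watts_per_home = sustained_power_watts / homes_served
annual_kwh_per_home = average_watts_per_home * 8_760 / 1_000  # watt-hours per year converted to kWh

print(f"Average draw per home: {average_watts_per_home:,.0f} W")
print(f"Implied annual usage:  {annual_kwh_per_home:,.0f} kWh per home")
```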

Why Token Pricing Obscures the Full Picture

OpenAI and its competitors publish per-token pricing for their APIs, and those numbers have fallen dramatically over the past two years. That decline has created a perception that AI inference is becoming cheap and will soon resemble the near-zero marginal cost economics of serving static web pages. In one narrow sense, it is true: the marginal cost of generating a single token has dropped as models become more efficient, hardware improves, and software stacks are optimized. But marginal cost and total cost are not the same thing. The per-token price a developer sees on an API pricing page does not include the amortized cost of building the data center, securing the land, negotiating the power purchase agreement, or acquiring the chips that make the whole system run, nor does it reflect the financing costs associated with those investments.

This gap between visible pricing and invisible infrastructure cost is where the most common misunderstanding about AI economics takes root. A ChatGPT prompt that costs a fraction of a cent in incremental compute still depends on a capital stack that includes billions in chip procurement, years of construction timelines, and energy contracts that stretch into the next decade. The true cost of that prompt is not what OpenAI charges for it. The true cost is the sum of every upstream investment required to make the response possible, divided across every query the system will ever handle. That denominator is growing fast as usage increases, but so is the numerator as companies race to deploy larger models and more capacity. The balance between these forces will determine whether AI providers can keep cutting prices while still moving toward profitability, or whether today’s low token rates amount to a temporary subsidy funded by investors betting on future dominance.
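One way to frame that balance is as a break-even volume: given an assumed margin on each prompt, how many prompts must the system handle before the upstream investment is recovered? The figures below are illustrative assumptions, not reported numbers from any provider.

```python
# Sketch of the "total upstream investment divided across every query" framing.
# Given an assumed margin earned per prompt (price charged minus marginal compute
# cost), how many prompts does it take to recover the upfront infrastructure bet?
# All figures are illustrative assumptions.

upfront_investment = 50_000_000_000   # assumed cumulative spend on chips, data centers, and power deals
price_per_prompt = 0.004              # assumed average revenue per prompt, in dollars
marginal_cost_per_prompt = 0.002      # assumed incremental compute cost per prompt

margin_per_prompt = price_per_prompt - marginal_cost_per_prompt
breakeven_prompts = upfront_investment / margin_per_prompt

print(f"Margin per prompt: ${margin_per_prompt:.4f}")
print(f"Break-even volume: {breakeven_prompts:,.0f} prompts")
```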

What This Means for the AI Industry’s Financial Future

The capital intensity of AI inference creates a structural challenge that no amount of model optimization can fully solve. Even as algorithms become more efficient, the demand for compute is growing faster than efficiency gains can offset because new applications tend to use more tokens, more context, and more complex models. OpenAI’s pursuit of a direct equity stake in a major chipmaker signals that the company views hardware supply as a strategic bottleneck worth billions to secure. That is not the behavior of a company operating in a low-cost, commodity environment; it reflects a belief that control over the upstream supply of accelerators will shape who can afford to serve the next trillion prompts. Similar dynamics show up in Microsoft’s financial disclosures, where elevated capital expenditures and depreciation tied to AI infrastructure suggest that the industry is still in the heavy-investment phase rather than the cash-harvesting stage.

For users and developers, the implication is that today’s AI pricing may not fully reflect the long-run cost of the underlying infrastructure. If investor subsidies eventually taper off, providers could face pressure either to raise prices, introduce more aggressive usage tiers and limits, or find new revenue streams that cross-subsidize intensive workloads. At the same time, the enormous fixed costs create high barriers to entry, favoring a small number of hyperscale players that can afford multibillion-dollar chip commitments and gigawatt-scale energy deals. The price of a ChatGPT subscription, in other words, is the tip of a much larger economic iceberg, one built from silicon, steel, and power lines as much as from code. Whether that iceberg ultimately supports a sustainable business model or becomes a monument to overbuilt optimism will depend on how quickly real-world demand for AI-powered services catches up to the infrastructure already being put in place.


*This article was researched with the help of AI, with human editors creating the final content.