Morning Overview

The race for the singularity as Moore’s Law slows

The prediction that transistor counts on microchips would keep doubling every two years gave the tech industry its growth engine for decades. That engine is losing speed. As physical limits squeeze traditional chip scaling, the companies racing to build ever-larger artificial intelligence systems face a harder question: how do you reach superintelligence when the hardware gains you depend on are shrinking?

A Sixty-Year Forecast Hits Physical Walls

In 1965, engineer and businessman Gordon Moore observed a pattern in semiconductor manufacturing that would shape the next half-century of computing. The trend he identified, later formalized as Moore’s Law, holds that the number of transistors on a microchip doubles roughly every two years, and it became the standard shorthand for how chip density scales. For most of its history, the observation tracked reality closely enough that chip designers, software developers, and entire business models could plan around it.
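
For a sense of how quickly that cadence compounds, here is a back-of-the-envelope sketch in Python; the roughly 2,300-transistor starting point (borrowed from Intel’s 1971 4004) and the strict two-year doubling are illustrative assumptions, not figures from any roadmap.

```python
# Toy illustration of Moore's Law as a compounding curve.
# The starting count (~2,300 transistors, roughly Intel's 4004 from 1971)
# and the strict two-year doubling are illustrative assumptions.

def transistors_after(years: float, start: float = 2_300) -> float:
    """Project a transistor count assuming one doubling every two years."""
    return start * 2 ** (years / 2)

for decade in range(0, 51, 10):
    print(f"after {decade:2d} years: {transistors_after(decade):,.0f} transistors")
```

Five decades of uninterrupted doubling turns a few thousand transistors into tens of billions, roughly the scale of today’s largest accelerator chips, which is why so much of the industry planned around the curve for so long.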

That reliability is now in question. Transistors at the leading edge of fabrication have shrunk to dimensions measured in just a few nanometers, approaching atomic scale. Heat dissipation, quantum tunneling, and manufacturing yield problems all compound at these sizes. An analysis by the OECD directly addresses this slowdown, documenting how transistor-density scaling is decelerating and forcing the industry toward other innovation vectors, including systems-level design, cloud infrastructure, and specialized hardware. The gains have not stopped entirely, but they no longer arrive on the predictable schedule that planners once counted on.

Researchers at institutions like Cornell University have also highlighted that as feature sizes shrink into the single-digit nanometer range, variability and reliability issues make further miniaturization disproportionately expensive. In other words, every additional “generation” of chips demands more capital and more ingenuity for smaller performance returns. The old rhythm of automatic, cheap speedups is gone.

Why AI Labs Cannot Simply Wait for Faster Chips

The timing of this slowdown is awkward. Large language models like GPT-4 have demonstrated that raw scale (more parameters trained on more data with more compute) can produce striking capability jumps. OpenAI’s GPT-4 technical report describes evaluation results, high-level scaling behavior, and safety methodology, yet it withholds specifics about the model size and compute behind those results, a gap that makes it difficult for outside researchers to assess how much of the progress came from hardware brute force versus algorithmic cleverness.

That distinction matters enormously. If frontier AI progress depends mainly on stacking more GPUs in larger data centers, then a slowdown in chip performance gains translates directly into slower AI progress, or dramatically higher costs to maintain the same pace. If, on the other hand, software and algorithmic improvements can compensate, the hardware ceiling becomes less binding. The available evidence suggests both forces are at work, but neither alone is sufficient.

Compounding the challenge, training state-of-the-art models already consumes vast energy and capital. As the easy wins from hardware taper off, each incremental model size increase risks running into practical limits such as power delivery, cooling, and data center land use. The industry’s implicit bet that “future chips will make this cheap” no longer looks safe.

The Post-ITRS Search for New Computing Paths

The semiconductor industry itself recognized the problem years ago. The International Technology Roadmap for Semiconductors, which had guided chip development priorities for decades, effectively ended as it became clear that simple geometric scaling would not continue indefinitely. In its wake, new efforts emerged. A study in Springer’s New Generation Computing journal describes how initiatives like IEEE Rebooting Computing launched alongside the International Roadmap for Devices and Systems, or IRDS, which emphasizes novel architectures, devices, and applications rather than simply cramming more transistors onto existing designs.

This shift carries a significant implication for the AI race. When the primary path to faster computing was shrinking transistors, progress was concentrated among the handful of foundries capable of cutting-edge fabrication. A pivot toward architectural innovation, including neuromorphic chips, optical computing, and domain-specific accelerators, could distribute progress more widely. Smaller labs and university groups can design novel chip architectures even if they cannot afford to build a fabrication plant. Whether that potential translates into real competition with the largest AI companies remains an open question, but the structural conditions for it are stronger than they were a decade ago.

Scaling Laws Are Breaking Across the Board

The problem extends beyond transistor counts. Dennard scaling, which predicted that power density would stay constant as transistors shrank, broke down years before Moore’s Law began faltering. That breakdown is why modern processors hit thermal walls and why clock speeds plateaued in the mid-2000s. A 2023 paper by Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, and Torsten Hoefler in the International Journal of High Performance Computing Applications directly confronts the myths that persist around endless scaling, arguing that the current era reflects a turning point driven by the end of several classical scaling laws.
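
The arithmetic behind that breakdown is compact. Dynamic power in a chip scales roughly with capacitance times voltage squared times clock frequency; under Dennard’s assumptions, supply voltage fell in step with feature size, so power per unit area held steady, and once voltage stopped falling it no longer did. The toy model below, a first-order sketch with arbitrary units and a 0.7 linear shrink per generation chosen purely for illustration, shows the two regimes diverging.

```python
# First-order sketch of Dennard scaling using the textbook dynamic-power
# model P ~ C * V^2 * f. All numbers are in arbitrary units, for illustration only.

def power_density(cap, voltage, freq, area):
    """Dynamic power per unit area for a simplified transistor model."""
    return (cap * voltage**2 * freq) / area

cap, volt, freq, area = 1.0, 1.0, 1.0, 1.0
k = 0.7  # linear shrink factor per generation (~0.7 roughly halves the area)

for gen in range(1, 6):
    cap *= k         # capacitance shrinks with feature size
    freq /= k        # smaller transistors switch faster
    area *= k * k    # area per transistor shrinks quadratically
    dennard = power_density(cap, volt * k**gen, freq, area)  # voltage scales too
    stalled = power_density(cap, volt, freq, area)           # voltage stuck (post-Dennard)
    print(f"gen {gen}: ideal density ~{dennard:.2f}, fixed-voltage density ~{stalled:.2f}")
```

In the ideal case power density stays flat; with voltage pinned, it climbs by roughly a factor of two per generation, which is the thermal wall in miniature. Matsuoka and his co-authors take the end of that kind of free scaling as their starting point.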

The authors’ conclusion is blunt: simply buying bigger machines will not indefinitely deliver proportional performance gains. Instead, they call for “algorithmic scaling,” where improvements in numerical methods, parallelization strategies, and communication patterns yield more work from the same hardware budget. For AI, that translates into techniques like more efficient optimizers, better initialization schemes, sparsity, and smarter data selection.

This is where the dominant narrative about AI progress deserves scrutiny. Much of the public conversation assumes a straight line from GPT-4 to artificial general intelligence, powered by ever-larger training runs. But if the scaling laws that made those training runs affordable are breaking, the straight line bends. The cost of each incremental capability gain rises. That does not mean progress stops. It means the character of progress changes, favoring efficiency, cleverness, and new hardware designs over sheer scale.

What Fills the Gap After Traditional Scaling

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have framed this moment as “the death of Moore’s Law,” asking what might fill the gap once transistor scaling slows, and pointing to a mix of architectural and software advances as the most realistic path forward. For AI, several candidates are already visible.

First, specialized accelerators tailor silicon to the needs of machine learning workloads. Instead of general-purpose CPUs, training clusters increasingly rely on GPUs and custom chips optimized for matrix multiplication, low-precision arithmetic, and high-bandwidth memory. These designs squeeze more useful operations out of each transistor, partially offsetting the slowdown in raw scaling.
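
A minimal sketch of the low-precision idea, using NumPy with illustrative matrix sizes and a simple symmetric quantization scheme rather than any production recipe, shows the appeal: an int8 copy of a weight matrix occupies a quarter of the memory of float32 and still reproduces the matrix product to within a small error.

```python
import numpy as np

# Illustrative low-precision sketch: quantize a float32 weight matrix to int8,
# then compare memory footprint and the accuracy of the resulting matmul.
rng = np.random.default_rng(0)
weights = rng.standard_normal((512, 512)).astype(np.float32)
activations = rng.standard_normal((512, 64)).astype(np.float32)

# Symmetric per-tensor quantization of the weights to int8.
scale = np.abs(weights).max() / 127.0
w_int8 = np.round(weights / scale).astype(np.int8)

exact = weights @ activations
approx = (w_int8.astype(np.float32) * scale) @ activations

print("float32 weight bytes:", weights.nbytes)
print("int8 weight bytes:   ", w_int8.nbytes)
print("relative matmul error:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```

Dedicated matrix units push this further by doing the arithmetic itself in reduced precision, but the memory and bandwidth savings alone are a large part of the win.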

Second, distributed systems engineering has become a central lever. Sophisticated scheduling, model-parallel training, and communication-efficient algorithms can reduce idle time and network bottlenecks, effectively turning a collection of imperfect nodes into a more capable whole. Innovations in cluster design, such as hierarchical interconnects and disaggregated memory, offer further gains that do not depend on smaller transistors.
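
At its core, the data-parallel piece of that machinery is averaged gradients. The sketch below simulates it in a single Python process with synthetic data and a simple linear model; there is no real networking or training framework involved, only the arithmetic that an all-reduce step distributes across machines.

```python
import numpy as np

# Toy, single-process simulation of data-parallel training: each "worker"
# computes a gradient on its own data shard, and an all-reduce-style average
# combines them into one shared update. Data and model are synthetic.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -3.0, 0.5])
X = rng.standard_normal((8_000, 3))
y = X @ true_w + 0.01 * rng.standard_normal(8_000)

n_workers = 4
shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

w = np.zeros(3)
lr = 0.1
for step in range(200):
    # Each worker computes the mean-squared-error gradient on its shard.
    grads = [2 * xs.T @ (xs @ w - ys) / len(ys) for xs, ys in shards]
    # "All-reduce": average the gradients, then every worker applies the same update.
    w -= lr * np.mean(grads, axis=0)

print("recovered weights:", np.round(w, 3))
```

The engineering challenge in practice is hiding the cost of that averaging step, which is where hierarchical interconnects and communication-efficient algorithms earn their keep.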

Third, algorithmic efficiency is beginning to catch up with hardware ambition. Techniques like mixture-of-experts architectures, where only a subset of parameters is active for each input, promise to decouple model capacity from compute cost. Compression, distillation, and quantization can shrink trained models for deployment while preserving most of their performance. These methods echo the call from high-performance computing researchers to treat algorithms as a primary scaling axis, not an afterthought.
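
A minimal sketch of the mixture-of-experts routing idea, with arbitrary dimensions and randomly initialized experts rather than anything trained, shows how top-k selection keeps per-token compute roughly constant even as the total parameter count grows.

```python
import numpy as np

# Illustrative top-k mixture-of-experts routing: a router scores every expert
# for each token, but only the top-k experts actually run for that token.
rng = np.random.default_rng(2)
d_model, n_experts, top_k, n_tokens = 64, 8, 2, 10

router = rng.standard_normal((d_model, n_experts)) * 0.02
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
tokens = rng.standard_normal((n_tokens, d_model))

logits = tokens @ router                      # (n_tokens, n_experts) routing scores
top = np.argsort(logits, axis=1)[:, -top_k:]  # indices of the k best experts per token

outputs = np.zeros_like(tokens)
for t in range(n_tokens):
    chosen = logits[t, top[t]]
    gates = np.exp(chosen - chosen.max())
    gates /= gates.sum()                      # softmax over the selected experts only
    for gate, e in zip(gates, top[t]):
        outputs[t] += gate * (tokens[t] @ experts[e])

print(f"each token ran {top_k} of {n_experts} experts")
```

Here the model carries eight experts’ worth of parameters, but each token pays for only two of them, which is the decoupling of capacity from compute that the technique promises.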

Finally, entirely new device paradigms (neuromorphic chips, analog in-memory computing, and photonic circuits) are being explored as longer-term bets. Their timelines and ultimate impact are uncertain, but they reflect a broader realization: the era when Moore’s Law alone could underwrite exponential AI growth is ending. Future breakthroughs will likely emerge from a dense tangle of hardware innovation, software ingenuity, and system-level design, rather than a single clean curve on a transistor chart.

For AI labs aiming at superintelligence, that means revising expectations. The next leaps may come less from doubling model size and more from rethinking how computation itself is organized. The race is no longer just to build the biggest machine, but to make every joule and every transistor count.


*This article was researched with the help of AI, with human editors creating the final content.