Morning Overview

These mysterious genes may be older than all life on Earth

Researchers Aaron Goldman, Gregory Fournier, and Betul Kacar have identified a rare class of ancient gene families, called universal paralogs, that appear to have duplicated before the last universal common ancestor of all living things existed. Published in Cell Genomics, their analysis argues that these scarce genetic relics carry signals from a period of evolution that predates every branch on the tree of life, offering a direct window into the chemistry that preceded cellular organisms on Earth. Drawing on comparative genomics and phylogenetic tools, the team proposes that these genes are among the few surviving traces of the molecular systems that operated in a world of proto-cells and primitive genomes.

The new work builds on a growing effort to reconstruct conditions on the early Earth using genomic data instead of fossils, which are almost entirely absent for the first billion years of the planet’s history. By focusing on gene families that are both universal and duplicated, the researchers argue they can peel back the layers of later evolution and expose a deeper signal from the era before life split into bacteria, archaea, and eukaryotes. In that view, universal paralogs function like “time capsules” embedded in modern genomes, preserving clues about the earliest biochemical innovations that made life possible.

What Universal Paralogs Reveal

Most genes shared across all three domains of life (bacteria, archaea, and eukaryotes) trace back to a single common ancestor known as LUCA, or the last universal common ancestor. Universal paralogs are different. These are gene families present in at least two copies across nearly all modern genomes, and the duplication event that created those copies happened before LUCA itself emerged. That distinction matters because it means these genes carry phylogenetic signal from an era even older than the deepest root of the tree of life, allowing scientists to ask which molecular systems were already diversified before cellular lineages became distinct.
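The definition above has two parts, and only the first can be checked by simple counting. As a rough sketch of that first part, the Python below screens a table of gene-family copy counts for families that appear in at least two copies in nearly every genome. The function name, the 90 percent threshold, and the toy genomes and families are all hypothetical and are not taken from the Cell Genomics paper. Passing this screen does not show that a duplication predates LUCA; that second part requires phylogenetic analysis.

```python
from typing import Dict, List


def candidate_universal_paralogs(
    copy_counts: Dict[str, Dict[str, int]],
    min_fraction: float = 0.9,  # assumed cutoff for "nearly all genomes"
    min_copies: int = 2,
) -> List[str]:
    """Return families with >= min_copies copies in >= min_fraction of genomes.

    copy_counts maps genome name -> {family name -> number of copies}.
    This checks the presence pattern only, not the timing of the duplication.
    """
    genomes = list(copy_counts)
    if not genomes:
        return []

    # Collect every family seen in any genome.
    families = set()
    for counts in copy_counts.values():
        families.update(counts)

    selected = []
    for family in sorted(families):
        hits = sum(
            1 for g in genomes if copy_counts[g].get(family, 0) >= min_copies
        )
        if hits / len(genomes) >= min_fraction:
            selected.append(family)
    return selected


# Hypothetical toy data: one genome per domain, three made-up families.
toy_counts = {
    "bacterium_A": {"family_X": 2, "family_Y": 3, "family_Z": 1},
    "archaeon_B": {"family_X": 2, "family_Z": 1},
    "eukaryote_C": {"family_X": 2, "family_Z": 1},
}

candidates = candidate_universal_paralogs(toy_counts)
```

In this toy table only family_X survives: family_Y is duplicated in a single lineage, and family_Z is universal but present as a single copy.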

Compared to the hundreds of gene families typically attributed to LUCA, universal paralogs are much rarer, and that scarcity is precisely what makes them useful. Because they duplicated so early, each copy evolved independently across billions of years, accumulating differences that can be read like a molecular clock running in reverse. According to an Oberlin College summary of the work, the functions these paralogs handle cluster around two areas: protein production and membrane transport, suggesting that the machinery for building proteins and moving molecules across primitive boundaries was already diversifying before LUCA consolidated the genetic toolkit we see today.

Genes Older Than Their Own Ancestor

The idea that some genes in LUCA’s genome could be older than LUCA itself is not new, but it has been difficult to test rigorously. A scholarly treatment of the cenancestor concept explains that gene inventories attributed to the universal root can include components that originated in different pre-cenancestral epochs, reflecting a patchwork of lineages that exchanged and inherited genes long before a single, stable ancestor emerged. In plain terms, LUCA was not a blank slate; it inherited genes from earlier, now-extinct populations, and some of those genes had already duplicated and diverged before LUCA’s genome took shape.

Goldman, Fournier, and Kacar’s perspective builds on that foundation by arguing that universal paralogs preserve exactly this kind of pre-LUCA signal. By tracking these rare genes, researchers can investigate how early cells worked and what features of life emerged first, distinguishing which biological capabilities, such as translation, energy metabolism, and membrane integrity, were already in place before the tree of life began branching, and which ones evolved later within specific lineages. In doing so, they offer a framework for separating the truly ancient core of life’s machinery from later innovations layered on top of that foundation.

Rooting the Tree Without an Outgroup

One of the oldest puzzles in evolutionary biology is how to root the tree of life when, by definition, there is no outgroup organism sitting outside all living things. A 2015 analysis in Philosophical Transactions of the Royal Society B noted that rooting the tree of life cannot be achieved using an outgroup, but that universal duplicated genes offer a workaround by providing internal reference points. Because each paralog pair diverged before LUCA, one copy can serve as the outgroup for the other, allowing researchers to infer which branch of life diverged first without appealing to any external organism.
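The logic of reciprocal rooting can be illustrated with a small sketch. The Python below assumes a gene tree that has already been inferred and split at the duplication node, so that each half contains one paralog copy sampled from the three domains. It reads off the deepest split within each half and reports a root only when the two halves agree. The tree encoding, the function names, and the example topologies are hypothetical illustrations, not results from the studies discussed here.

```python
from typing import FrozenSet, Optional, Tuple, Union

# A tree is either a domain label (leaf) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]
Split = FrozenSet[FrozenSet[str]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of domain labels found under this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def root_split(subtree: Tree) -> Split:
    """Return the two groups of domains separated at the base of a subtree."""
    if isinstance(subtree, str):
        raise ValueError("a single leaf has no internal split")
    left, right = subtree
    return frozenset([leaves(left), leaves(right)])


def paralog_root(duplication_tree: Tuple[Tree, Tree]) -> Optional[Split]:
    """Root each paralog subtree with the other copy as its outgroup.

    Returns the shared deepest split if both copies agree, else None.
    """
    copy1, copy2 = duplication_tree
    split1, split2 = root_split(copy1), root_split(copy2)
    return split1 if split1 == split2 else None


# Hypothetical topologies for the two copies of one paralog pair.
copy1 = ("Bacteria", ("Archaea", "Eukarya"))
copy2 = (("Eukarya", "Archaea"), "Bacteria")
agreed = paralog_root((copy1, copy2))

# A conflicting copy, as might arise from horizontal transfer or weak signal.
conflicting_copy = ("Archaea", ("Bacteria", "Eukarya"))
disagreed = paralog_root((copy1, conflicting_copy))
```

When both copies place the same domain on the first branch, the function returns that split; when they disagree, it returns None, which mirrors the conflicting trees that real analyses sometimes produce.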

That technique has been used before, but the new Cell Genomics perspective sharpens the criteria for which gene families qualify and explains why earlier attempts sometimes produced conflicting trees. A separate large-scale analysis of RNA families by Hoeppner, Gardner, and Poole found that most RNA families are restricted to single domains, and the small fraction shared across domains often reflects horizontal transfer rather than vertical inheritance from a common ancestor. This reinforces the argument that only a narrow set of truly universal paralogs can reliably anchor the root, and that researchers need strict filters to separate genuine pre-LUCA duplicates from genes that spread sideways between lineages long after LUCA, potentially scrambling the signal that rooting methods depend on.

Viral Genes and the Pre-Cellular World

The story of pre-LUCA genetics does not end with cellular life. Viruses carry their own set of ancient genes that complicate the picture in revealing ways. Although viruses extensively exchange genes with their hosts, a distinct group of viral hallmark genes are shared by extremely diverse viruses to the exclusion of cellular life forms, hinting at a deep, virus-specific history. These hallmark genes include components of capsid formation and genome replication that appear to have no straightforward counterparts in cells, suggesting that viruses followed their own evolutionary trajectories even as they interacted closely with early cellular lineages.

Detailed comparisons of viral and cellular proteins likewise show that most essential viral genes have no close homologs among cellular genes, which complicates attempts to weave them neatly into the same pre-LUCA narrative as universal paralogs. Instead, researchers propose that ancient viruses and pre-cellular entities coexisted in a dynamic network of genetic exchange, with some genes passing into cellular genomes and others remaining confined to viral lineages. In this context, universal paralogs may represent the subset of early genes that were robust enough to survive both cellular evolution and viral predation, while many other primordial sequences vanished along with the ephemeral organisms that carried them.

Reconstructing Life Before LUCA

Taken together, the study of universal paralogs, RNA families compared across domains, and viral hallmark genes points toward a view of early evolution that is less like a single trunk and more like a braided river. Before LUCA, gene exchange and duplication appear to have been rampant, with multiple lineages sharing and reshuffling genetic material in a way that blurs the boundaries between ancestors. In such a world, a few especially successful gene families (those involved in protein synthesis, membrane transport, and genome maintenance) may have spread widely enough to become universal, leaving behind the faint signatures that researchers now detect as pre-LUCA paralogs.

By refining how those signatures are identified and interpreted, Goldman, Fournier, and Kacar’s work offers a more disciplined roadmap for exploring that hidden chapter of evolution. Their approach does not claim to reconstruct every detail of life before LUCA, but it does carve out a testable space where hypotheses about early biochemistry, environmental conditions, and the origins of key cellular systems can be anchored in comparative genomics. As new genomes are sequenced and analytical methods improve, universal paralogs, together with viral hallmark genes, may help transform what was once pure speculation about the dawn of life into a field grounded in measurable, reproducible patterns written into the DNA of every organism alive today.

This article was researched with the help of AI, with human editors creating the final content.