An AI sifter called RAVEN just pulled roughly 10,000 candidate signals out of old NASA TESS data, validating 118 planets, including 31 nobody had flagged before

A machine-learning pipeline built at the University of Warwick has sifted through roughly four years of archived NASA TESS full-frame images and identified 118 newly validated exoplanets, 31 of which had never been flagged as candidates by any prior search. The tool, called RAVEN (RAnking and Validation of ExoplaNets), also produced more than 2,000 additional high-probability planet candidates now awaiting follow-up observations. The results show that better software can keep pulling new worlds out of data that telescopes collected years ago, without requiring a single additional hour of telescope time.

What RAVEN found in 2.26 million stars

The research team applied RAVEN to TESS SPOC (Science Processing Operations Center) light curves drawn from sectors 1 through 55, covering observations gathered between mid-2018 and mid-2022. The pipeline searched approximately 2.26 million stars characterized by Gaia for periodic dips in brightness consistent with a planet crossing in front of its host star. It targeted orbital periods between 0.5 and 16 days, a range that captures the short-period worlds TESS is best suited to detect during its repeated sky sweeps.

Out of that enormous sample, RAVEN initially flagged roughly 10,000 transit-like signals. Many of these were quickly rejected as obvious stellar variability, instrumental artifacts, or low-significance events. After automated vetting and statistical validation, the team reported 118 newly validated planets and more than 2,000 high-probability candidates that cleared the pipeline’s quality thresholds but still need independent confirmation. Of the 118 validated worlds, 31 were entirely new detections, meaning they did not appear in existing TESS Objects of Interest lists or in the broader literature. The peer-reviewed catalog published in Monthly Notices of the Royal Astronomical Society lays out the full sample alongside the validation statistics for each object.
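To put that funnel in proportion, the short calculation below works through the ratios between each stage. It is only a back-of-the-envelope sketch: the star count and the number of flagged signals are the rounded figures quoted above, and the candidate count is treated as a lower bound.

```python
# Rounded figures as quoted in the article; treat the results as approximate.
stars_searched = 2_260_000   # "approximately 2.26 million stars"
signals_flagged = 10_000     # "roughly 10,000 transit-like signals"
validated = 118              # newly validated planets
new_detections = 31          # validated planets never flagged before
candidates_min = 2_000       # "more than 2,000", so a lower bound


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


print(f"stars yielding a flagged signal:  {percent(signals_flagged, stars_searched):.2f}%")
print(f"flagged signals validated:        {percent(validated, signals_flagged):.2f}%")
print(f"flagged signals kept as candidates (at least): "
      f"{percent(candidates_min, signals_flagged):.0f}%")
print(f"validated planets that are new:   {percent(new_detections, validated):.1f}%")
```

On these rounded numbers, about one flagged signal in 85 ended up validated, and roughly a quarter of the validated planets were new detections.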

The newly validated planets span a range of sizes and orbital configurations, but they are mostly short-period worlds hugging their host stars. Many fall into the super-Earth and sub-Neptune size regime, where planets are larger than Earth but smaller than Neptune. A smaller subset appears to be roughly Earth-sized, though their close-in orbits likely make them far hotter than our own planet. Because TESS is optimized for detecting frequent, shallow dips in brightness, these compact systems are precisely where an algorithm like RAVEN can make the biggest impact.
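How shallow is a "shallow dip"? For a dark planet crossing a uniformly bright stellar disk, the fractional drop in brightness is approximately the square of the planet-to-star radius ratio. The sketch below applies that standard approximation to a Sun-like host. The planet radii are illustrative round values, not measurements from the RAVEN catalog, and limb darkening and grazing geometry are ignored.

```python
# Approximate transit depth: depth ~ (R_planet / R_star)^2.
# Assumes a uniform stellar disk (no limb darkening) and a full, central transit.

R_SUN_IN_EARTH_RADII = 109.2  # solar radius in Earth radii (approximate)


def transit_depth_ppm(planet_radius_earth, star_radius_sun=1.0):
    """Return the fractional dimming in parts per million (ppm)."""
    ratio = planet_radius_earth / (star_radius_sun * R_SUN_IN_EARTH_RADII)
    return ratio ** 2 * 1e6


# Illustrative radii in Earth radii; not values from the RAVEN catalog.
examples = [("Earth-sized", 1.0), ("super-Earth", 1.5), ("sub-Neptune", 3.0)]
for label, radius in examples:
    print(f"{label:12s} {transit_depth_ppm(radius):7.0f} ppm")
```

Under these assumptions an Earth-sized planet dims a Sun-like star by less than 100 parts per million, which is why small planets are easier to pick out around smaller stars and when many transits can be stacked.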

How the Bayesian and ML framework works

RAVEN is not a single algorithm but a two-stage framework. The first stage runs a box least squares (BLS) search across detrended light curves to identify periodic dimming events. This step is deliberately permissive, sweeping up any repeating signal that could plausibly be a transit. The second stage applies a Bayesian and machine-learning validation layer that weighs the probability of a genuine planet transit against several astrophysical false-positive scenarios, including eclipsing binary stars, background eclipsing binaries blended into the target aperture, and purely instrumental noise. The methods description details how the team trained the classifier on both simulated planet injections and real false-positive signals drawn from known contaminants in the TESS data set.
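The first-stage search can be illustrated with a stripped-down box least squares routine. This is a minimal sketch of the general technique, not RAVEN's code: it uses uniform weights, a coarse period grid, and a synthetic light curve, and every function name and parameter value here (`make_light_curve`, `bls_power`, `search_periods`, the 3-day period, the noise level) is a hypothetical choice made for illustration. Only the 0.5 to 16 day search window comes from the article.

```python
import math
import random


def make_light_curve(period, depth, duration, t0=1.0, baseline=27.0,
                     cadence=30.0 / 60.0 / 24.0, noise=0.0003, seed=0):
    """Synthetic box-shaped transits plus Gaussian noise (times in days)."""
    rng = random.Random(seed)
    times, fluxes = [], []
    for i in range(int(baseline / cadence)):
        t = i * cadence
        in_transit = ((t - t0) % period) < duration
        times.append(t)
        fluxes.append(1.0 - (depth if in_transit else 0.0) + rng.gauss(0.0, noise))
    return times, fluxes


def bls_power(times, fluxes, period, n_bins=100, max_width_bins=10):
    """Best box-fit signal residue for one trial period (uniform weights)."""
    n = len(times)
    mean_flux = sum(fluxes) / n
    bin_sum = [0.0] * n_bins   # summed mean-subtracted flux per phase bin
    bin_count = [0] * n_bins   # number of points per phase bin
    for t, f in zip(times, fluxes):
        b = int((t % period) / period * n_bins) % n_bins
        bin_sum[b] += f - mean_flux
        bin_count[b] += 1
    best = 0.0
    for start in range(n_bins):
        s, count = 0.0, 0
        for width in range(1, max_width_bins + 1):
            b = (start + width - 1) % n_bins   # boxes may wrap around phase 1 -> 0
            s += bin_sum[b]
            count += bin_count[b]
            r = count / n                      # fraction of points inside the box
            if 0.0 < r < 1.0 and s < 0.0:      # keep dips, ignore brightenings
                best = max(best, math.sqrt(s * s / (n * n * r * (1.0 - r))))
    return best


def search_periods(times, fluxes, periods):
    """Return (best_period, best_power) over a grid of trial periods."""
    return max(((p, bls_power(times, fluxes, p)) for p in periods),
               key=lambda pair: pair[1])


# Inject a hypothetical 3-day planet, then search the 0.5 to 16 day window.
times, fluxes = make_light_curve(period=3.0, depth=0.002, duration=0.1)
trial_periods = [0.5 + 0.01 * k for k in range(1551)]  # coarse illustrative grid
best_period, best_power = search_periods(times, fluxes, trial_periods)
print(f"strongest periodic dip near {best_period:.2f} days")
```

A production search would use a finer, frequency-spaced grid and per-point uncertainties; the `BoxLeastSquares` class in `astropy.timeseries` is one widely used implementation. The point of the sketch is the permissiveness: the routine reports the strongest repeating dip at every trial period and leaves the question of what caused it to the validation stage.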

In practice, RAVEN computes a set of summary statistics for each candidate signal: transit depth and duration, how well the events line up in phase, how the inferred planet radius compares to expectations for the host star, and whether the shape looks more like a flat-bottomed planet transit or a V-shaped stellar eclipse. These features feed into a probabilistic model that assigns posterior probabilities to competing hypotheses. Only signals that exceed stringent planet-probability thresholds, and that are inconsistent with the most common false-positive scenarios, graduate to the “validated planet” category.
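The step from features to posterior probabilities follows Bayes' rule across mutually exclusive hypotheses. The sketch below shows only that normalization step. The priors, likelihoods, and the 0.99 cut are made-up numbers chosen for illustration, not values from the RAVEN papers, and in a real pipeline the likelihoods would come from trained classifiers and physical models rather than a hand-written table.

```python
def posterior_probabilities(priors, likelihoods):
    """Bayes' rule over mutually exclusive hypotheses.

    posterior[h] = prior[h] * likelihood[h] / sum over all hypotheses.
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical inputs for a single transit-like signal.
priors = {
    "planet": 0.30,
    "eclipsing_binary": 0.30,
    "background_eclipsing_binary": 0.20,
    "instrumental": 0.20,
}
# How well each scenario explains the observed depth, duration, and shape.
likelihoods = {
    "planet": 0.90,
    "eclipsing_binary": 0.002,
    "background_eclipsing_binary": 0.004,
    "instrumental": 0.001,
}

PLANET_THRESHOLD = 0.99  # assumed cut, for illustration only

posterior = posterior_probabilities(priors, likelihoods)
is_validated = posterior["planet"] >= PLANET_THRESHOLD

for hypothesis, probability in posterior.items():
    print(f"{hypothesis:28s} {probability:.4f}")
print("validated" if is_validated else "not validated")
```

The design point is that a signal is not judged on how planet-like it looks in isolation, but on how much better the planet scenario explains the data than every competing scenario combined.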

Before the team turned RAVEN loose on the full archive, they benchmarked it against objects already cataloged in existing exoplanet databases. The pipeline recovered known planets with high accuracy, which gave the researchers confidence that its new detections were credible rather than artifacts of an overly permissive classifier. That calibration step matters because automated planet searches face a constant tension: cast the net too wide and false positives flood the results; set the threshold too tight and real planets slip through. By explicitly quantifying this trade-off with Bayesian statistics, RAVEN aims to keep both missed planets and spurious detections under control.
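The tension between a wide net and a tight threshold can be made concrete with a toy benchmark. The sketch below scores a small, invented set of signals whose true nature is known and reports two quantities at several cuts: completeness (the share of real planets recovered) and reliability (the share of accepted signals that are real). The scores and labels are fabricated for illustration and say nothing about RAVEN's actual benchmark results.

```python
def completeness_and_reliability(scored, threshold):
    """scored: list of (planet_probability, is_real_planet) pairs."""
    accepted = [is_real for prob, is_real in scored if prob >= threshold]
    total_real = sum(1 for _, is_real in scored if is_real)
    true_positives = sum(accepted)
    completeness = true_positives / total_real if total_real else 0.0
    # With nothing accepted there are no false alarms; report 1.0 by convention.
    reliability = true_positives / len(accepted) if accepted else 1.0
    return completeness, reliability


# Invented benchmark: six real planets and four false positives.
benchmark = [
    (0.999, True), (0.995, True), (0.97, True),
    (0.92, True), (0.80, True), (0.55, True),
    (0.96, False), (0.70, False), (0.40, False), (0.10, False),
]

for threshold in (0.5, 0.9, 0.99):
    completeness, reliability = completeness_and_reliability(benchmark, threshold)
    print(f"threshold {threshold:.2f}: completeness {completeness:.2f}, "
          f"reliability {reliability:.2f}")
```

In this toy set the loosest cut recovers every real planet but lets false positives through, while the strictest cut admits no false positives and misses most of the real planets. A validation threshold is a choice of where to sit on that curve.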

Why old data still hold new planets

One of the most striking aspects of the RAVEN results is that all of the discoveries came from data that had already been combed by previous pipelines. TESS has now observed nearly the entire sky, and its early sectors have been in the public archive for years. Yet the Warwick team’s re-analysis shows that algorithmic improvements alone can reveal dozens of additional planets and thousands of strong candidates in those same pixels.

This is partly because early TESS searches tended to focus on brighter stars and higher signal-to-noise transits, leaving marginal signals for later work. It is also because machine-learning techniques have grown more capable at separating subtle planetary signatures from stellar activity and spacecraft systematics. As RAVEN demonstrates, a careful combination of physical modeling and data-driven classification can squeeze more information out of each light curve than classical methods working in isolation.

What remains uncertain

Several open questions surround the RAVEN results. The NASA Exoplanet Archive has not yet published updated dispositions for the 31 newly detected planets, so their official confirmation status is still pending. Likewise, public follow-up databases show limited spectroscopy or high-resolution imaging for most of the 118 validated worlds. Without radial-velocity mass measurements or adaptive-optics imaging to rule out faint background stars, some fraction of these validated planets could still turn out to be false positives masquerading as small planets.

NASA has not issued a public statement confirming that the RAVEN candidates have been integrated into official TESS Object of Interest lists. The University of Warwick press material describes the release of interactive tools and catalogs, but the pathway from validated candidate to confirmed planet typically requires independent observational evidence that goes beyond what any single pipeline can provide on its own. For now, RAVEN’s “validated” label should be read as a strong statistical endorsement rather than a final, community-wide verdict.

The orbital-period window of 0.5 to 16 days also means RAVEN’s search was limited to close-in planets. Worlds on longer orbits, including those in the habitable zones of Sun-like stars, fall outside this particular sweep. Future extensions of the pipeline to longer-period signals would face steeper challenges because fewer transits appear in any given TESS sector, making statistical validation harder and more reliant on external follow-up.
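The arithmetic behind that limitation is simple. Assuming a single TESS sector provides roughly 27.4 days of coverage (a standard figure for the mission that the article itself does not state) and ignoring data gaps, the guaranteed number of transits in one sector falls off quickly with orbital period. The function name and the example periods are illustrative.

```python
import math

SECTOR_DAYS = 27.4  # approximate length of one TESS sector (assumed here)


def guaranteed_transits(period_days, baseline_days=SECTOR_DAYS):
    """Minimum number of transits falling inside a continuous baseline.

    Ignores data gaps; a lucky phase can add one more transit.
    """
    return math.floor(baseline_days / period_days)


for period in (0.5, 3.0, 10.0, 16.0, 50.0, 365.0):
    print(f"period {period:6.1f} d: at least {guaranteed_transits(period)} "
          f"transit(s) per sector")
```

At the 16-day upper edge of RAVEN's window a single sector guarantees only one transit, and a planet on a one-year orbit is guaranteed none, so any periodic search at those periods depends on stitching together multiple sectors.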

How to read the evidence

The strongest evidence here comes from the peer-reviewed journal article and its companion methods paper, both of which supply reproducible metrics: star counts, period ranges, posterior probability thresholds, and performance benchmarks against known catalogs. These are primary sources that other research groups can test by re-running the code on the same light curves and checking whether they recover the same ensemble of planets and candidates.

Readers should distinguish between three levels of certainty. First are the 118 statistically validated planets, which clear RAVEN’s internal thresholds and show no obvious signs of being eclipsing binaries or artifacts. Second are the more than 2,000 high-probability candidates that look promising but did not meet every validation criterion; these are prime targets for telescopes capable of radial-velocity measurements or high-resolution imaging. Third are the many thousands of lower-quality signals that RAVEN rejected, which are unlikely to be planets under any reasonable interpretation of the data.

Within this framework, the claim that RAVEN has uncovered “dozens of hidden planets” in existing TESS data is well supported by the statistical analysis and by the method’s performance on known systems. What remains to be seen is how many of the pipeline’s additional candidates will survive the scrutiny of independent observations. As follow-up campaigns proceed, the community will be able to refine occurrence rates for short-period planets and to test how far sophisticated machine-learning tools can push the frontier of exoplanet discovery in archival data.

*This article was researched with the help of AI, with human editors creating the final content.
