A team of researchers has built an AI system that predicts activity across the entire human brain from movies, speech, and text all at once, and a reported successor version claims to do it at 70 times the resolution. The original model, called TRIBE (TRImodal Brain Encoder), was developed for the Algonauts 2025 challenge and described in a preprint posted to arXiv (identifier 2507.22229) in July 2025. The claimed leap to TRIBE v2, attributed to work connected with Meta, has generated significant attention in the brain-computer interface community, though the technical details behind that 70x figure have not yet appeared in any published document.
How TRIBE v1 works
Most brain-encoding models tackle one job at a time: predict how the visual cortex responds to images, or how auditory regions react to sound. TRIBE takes a different approach. It uses a multimodal transformer, a deep learning architecture that can jointly process several input streams, to handle visual, auditory, and language stimuli simultaneously. The model then predicts functional MRI (fMRI) responses across the whole brain, not just in isolated cortical patches.
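To make the fusion idea concrete, here is a minimal sketch of what a trimodal encoder of this general shape can look like in PyTorch. It is an illustration, not the authors' implementation: the feature dimensions, the two-layer fusion transformer, and the 80,000-voxel readout are all assumptions chosen for the example.

```python
# Minimal sketch of a trimodal fusion encoder (illustrative, not TRIBE's
# actual architecture). All layer sizes and dimensions are assumptions.
import torch
import torch.nn as nn

class TrimodalEncoder(nn.Module):
    def __init__(self, d_video=768, d_audio=512, d_text=768,
                 d_model=256, n_voxels=80_000):
        super().__init__()
        # Project each modality's precomputed features into a shared space.
        self.proj_video = nn.Linear(d_video, d_model)
        self.proj_audio = nn.Linear(d_audio, d_model)
        self.proj_text = nn.Linear(d_text, d_model)
        # A small transformer lets the three streams attend to one another.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        # A linear readout maps the fused representation to every voxel.
        self.readout = nn.Linear(3 * d_model, n_voxels)

    def forward(self, video_feats, audio_feats, text_feats):
        # Each input: (batch, d_modality) features for one fMRI time point.
        tokens = torch.stack([self.proj_video(video_feats),
                              self.proj_audio(audio_feats),
                              self.proj_text(text_feats)], dim=1)
        fused = self.fusion(tokens)             # (batch, 3, d_model)
        return self.readout(fused.flatten(1))   # (batch, n_voxels)

model = TrimodalEncoder()
pred = model(torch.randn(2, 768), torch.randn(2, 512), torch.randn(2, 768))
print(pred.shape)  # torch.Size([2, 80000])
```

The key design point the sketch captures is that the three modalities share one representation before the voxel readout, so the model can exploit correlations between, say, a spoken word and the lip movements on screen.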
fMRI measures blood-oxygen-level changes in the brain as a proxy for neural activity. The data is divided into voxels, essentially three-dimensional pixels, each representing a small cube of brain tissue typically a few millimeters on a side. Traditional encoding models might predict responses for a few thousand voxels in a single brain region. TRIBE v1 attempts to cover the full brain volume.
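For contrast, the traditional single-region approach is often a linearized encoding model: a regularized regression from precomputed stimulus features to each voxel's response. The sketch below uses synthetic data, and every size in it (timepoints, features, voxel count) is illustrative rather than taken from the TRIBE paper.

```python
# Toy version of a traditional single-region encoding model: ridge
# regression from stimulus features to a few thousand voxels.
# All data here is synthetic and all sizes are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_timepoints, n_features, n_voxels = 300, 100, 2_000  # one region's worth

X = rng.standard_normal((n_timepoints, n_features))   # stimulus features
W = rng.standard_normal((n_features, n_voxels))       # "true" weights
Y = X @ W + 0.5 * rng.standard_normal((n_timepoints, n_voxels))  # responses

model = Ridge(alpha=1.0).fit(X[:200], Y[:200])        # fit on a train split
preds = model.predict(X[200:])                        # held-out predictions
r = [np.corrcoef(preds[:, v], Y[200:, v])[0, 1] for v in range(5)]
print([f"{v:.2f}" for v in r])                        # per-voxel accuracy
```

Scaling this recipe from a few thousand voxels in one region to the full brain volume is part of what makes whole-brain encoders like TRIBE a harder problem.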
The model was built and evaluated within the Algonauts 2025 challenge, a competition organized by neuroscience researchers at MIT and other institutions that asks teams to predict brain responses to naturalistic stimuli like movie clips. According to the preprint, TRIBE v1’s trimodal fusion strategy produced strong benchmark results in that competition framework, though independent confirmation of final standings from the challenge organizers has not been published as of June 2026.
The 70x resolution claim
Reports circulating in May and June 2026 describe a successor system, TRIBE v2, that maps brain activity at roughly 70 times the spatial resolution of its predecessor. If accurate, that kind of jump would mean the model can distinguish neural responses at a far finer grain, potentially differentiating activity patterns in brain subregions that previous models treated as a single unit.
But the claim comes with a significant caveat: no preprint, peer-reviewed paper, or official technical report for TRIBE v2 has appeared in the public record. The 70x figure traces back to secondary reporting rather than to a methods section that specifies what “resolution” means in this context, what baseline it improves upon, or how the measurement was conducted.
That ambiguity matters. In brain-imaging AI, “resolution” can refer to spatial granularity (how small a brain region the model can distinguish), temporal precision (how quickly it tracks changes), or simply the number of voxels predicted. A 70x gain in one of those dimensions would be impressive. A 70x gain across all of them would be extraordinary. Without a technical document, there is no way to evaluate which version of the claim is being made.
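The gap between those readings is easy to quantify. If, hypothetically, the 70x figure refers to voxel count, the linear resolution along each axis improves by only the cube root of 70, roughly 4.1x, so a 2 mm voxel would shrink to about 0.5 mm per side. If it instead refers to linear resolution, the voxel count would grow by 70 cubed, about 343,000x. Neither reading is confirmed anywhere:

```python
# Back-of-envelope arithmetic on what "70x resolution" could mean,
# under two hypothetical readings of the unconfirmed claim.
count_gain = 70
linear_gain_if_count = count_gain ** (1 / 3)   # ~4.1x finer per axis
count_gain_if_linear = count_gain ** 3         # 343,000x more voxels
print(f"{linear_gain_if_count:.1f}x per axis, "
      f"{count_gain_if_linear:,}x voxel count")
```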
Meta’s role and what is actually confirmed
The headline association with Meta reflects reporting that connects the TRIBE line of research to the company, but the v1 preprint on arXiv does not include an explicit Meta corporate affiliation in its author list. No attributable public statements from Meta executives or named company researchers discussing TRIBE v2’s architecture, intended applications, or development timeline have surfaced as of June 2026.
What is confirmed is the technical lineage. The v1 preprint positions the original system as a foundation for higher-fidelity iterations, and the authors describe a clear development trajectory toward improved resolution and accuracy. The preprint is hosted on arXiv, the open-access repository operated by Cornell University, which means it is publicly available but has not undergone formal journal peer review.
For now, the strongest anchor point is the v1 paper itself. Researchers can examine its architecture, training data, evaluation methodology, and citations directly. Everything beyond that document, including the v2 branding, the 70x number, and the corporate attribution, rests on reporting that has not yet been matched by a primary technical source.
Why the distinction matters for brain-computer interfaces
Brain-computer interface (BCI) research has accelerated sharply over the past two years, with companies and academic labs racing to decode neural signals for medical, assistive, and eventually consumer applications. Higher-resolution brain models are a prerequisite for many of the field’s most ambitious goals: restoring speech to paralyzed patients, enabling fine motor control of prosthetics, or building non-invasive neural input devices.
A model that genuinely maps whole-brain fMRI responses at 70 times current resolution would represent a meaningful step toward those goals. It could allow researchers to decode not just broad cognitive states but specific intentions, sensory experiences, or linguistic content from brain scans. That has obvious clinical value and equally obvious commercial interest for companies investing in neural interface hardware.
But the history of neuro-AI is littered with headline numbers that did not survive independent replication. Performance gains measured on a single challenge dataset do not always generalize to new subjects, different scanners, or real-world conditions outside a controlled experiment. The competitive dynamics of the field, where publicity and funding often follow bold claims, make it especially important to distinguish between a demonstrated result and a reported one.
Verification gaps that remain open on TRIBE v2
The key document to look for is a TRIBE v2 preprint or formal publication. When it appears, it should specify the exact resolution metric being claimed, the baseline model it improves upon, the validation dataset, and whether the gains hold across different subjects and stimulus types. It should also clarify whether v2 retains the trimodal architecture of its predecessor or modifies the input pipeline.
Independent replication will matter just as much. Outside labs attempting to reproduce the results on their own fMRI data would provide the strongest evidence that the improvement is real and generalizable, not an artifact of overfitting to a particular dataset or scanning protocol.
Until that evidence arrives, TRIBE v1 stands as a verified and publicly available contribution to whole-brain encoding research. The v2 claims, while potentially significant, remain exactly that: claims awaiting the technical documentation that would allow the scientific community to evaluate them on their merits.
*This article was researched with the help of AI, with human editors creating the final content.