Apple is set to open its Worldwide Developers Conference on Monday with a rebuilt Siri assistant powered not by its own models alone but by a custom Google Gemini model with roughly 1.2 trillion parameters, according to people with knowledge of the matter. The deal is valued at about $1 billion a year and represents a dramatic shift for a company that has long built its software stack in-house. The keynote is expected to be Tim Cook’s final as CEO, adding a personal dimension to a product announcement that already carries heavy strategic weight.
A $1 Billion Licensing Gap Between Apple’s Published Models and Siri’s New Engine
The gap between what Apple has documented publicly and what it now needs to ship tells the story. Apple’s own technical paper on its Intelligence Foundation Language Models, available on arXiv, describes an on-device model of about 3 billion parameters paired with a larger server model designed for Private Cloud Compute. Those models handle tasks like summarization, notification triage, and short text generation on iPhones and iPads. They were never designed to manage the kind of open-ended, multi-turn conversations that users increasingly expect from AI assistants.
A 3-billion-parameter on-device model is efficient and fast, but it lacks the raw capacity to reason across long conversation threads, synthesize information from multiple domains, or handle the ambiguous follow-up questions that define natural dialogue. Apple’s Private Cloud Compute server model is larger, but the company has not published its parameter count or benchmark results against frontier-class systems. The decision to license a 1.2-trillion-parameter Gemini model from Google, as Bloomberg reported, suggests that Apple’s internal server model could not close the performance gap on its own.
That arithmetic matters for anyone who uses Siri daily. The difference between a 3-billion-parameter model and a 1.2-trillion-parameter model is not incremental. It is roughly a 400-fold increase in scale, and scale at that level typically translates into better performance on complex queries, code generation, creative writing, and factual recall. Apple appears to have concluded that building a model of that size internally would take longer and cost more than licensing one from Google, even at $1 billion a year.
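The 400-fold figure can be checked directly from the two parameter counts cited above. The short Python sketch below is only an illustration: the constant names are hypothetical, the 3-billion figure is the approximate on-device size from Apple's paper, and the 1.2-trillion figure is the reported, unconfirmed size of the custom Gemini model.

```python
# Parameter counts as cited in the article (both approximate).
# The Gemini figure comes from unnamed sources and is unconfirmed.
ON_DEVICE_PARAMS = 3_000_000_000        # about 3 billion (Apple on-device model)
GEMINI_PARAMS = 1_200_000_000_000       # roughly 1.2 trillion (reported Gemini model)

def scale_ratio(larger: int, smaller: int) -> float:
    """Return how many times larger one parameter count is than another."""
    return larger / smaller

ratio = scale_ratio(GEMINI_PARAMS, ON_DEVICE_PARAMS)
print(f"Reported Gemini model is about {ratio:.0f}x the on-device model")
```

The ratio compares raw parameter counts only. It says nothing about Apple's server model, whose size is unpublished, and parameter count is at best a rough proxy for capability.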
What Apple’s Own Research Papers Reveal About the Capability Ceiling
Apple’s published research provides the clearest window into where its in-house models fall short. The foundation language models paper describes a system optimized for latency and privacy rather than raw reasoning power. The on-device model with about 3 billion parameters runs directly on Apple silicon, keeping user data local. The server model runs in Apple’s Private Cloud Compute environment, which routes queries through encrypted infrastructure so that Apple itself cannot read the data in transit.
That privacy design is a genuine engineering achievement, but the smaller models it relies on come with tradeoffs. Smaller models process queries faster and use less energy, which matters on a phone battery. They also tend to produce less accurate results on tasks that require broad world knowledge or sustained logical chains. Apple’s paper does not include head-to-head benchmarks against Google’s Gemini or OpenAI’s GPT-4-class models, so the exact performance delta is not publicly quantified. The licensing deal itself, however, functions as an implicit benchmark. A company does not pay $1 billion a year for a capability it already possesses.
The research trail also highlights institutional backing. The preprint archive that hosts Apple’s AI papers is maintained by Cornell University, which oversees the service as part of a larger network of academic partners. That infrastructure has become Apple’s preferred venue for disclosing technical details that it does not discuss on earnings calls or in product marketing. At the same time, the archive’s operations depend partly on community support, and Cornell encourages readers to donate to sustain the service.
No Apple filing, earnings call, or press release has confirmed the Gemini licensing terms or the 1.2-trillion-parameter model size. Those details come from people with knowledge of the matter and remain uncorroborated by a second named source or official company statement. That leaves a gap between the capabilities described in Apple’s own research and the much larger system it is now said to be adopting for Siri.
Unresolved Questions About Privacy, Cost, and Control
The biggest open question is how Apple plans to reconcile its privacy commitments with a model trained, and possibly hosted, by Google. Apple’s Private Cloud Compute system was built specifically to prevent third parties from accessing user queries. Routing Siri requests through a Google-trained model, even a custom version, introduces a dependency that Apple has spent years trying to avoid. Neither company has explained publicly how user data will be handled under the new arrangement, whether queries will be processed on Apple infrastructure using licensed Gemini weights, or whether some traffic will flow through Google’s servers.
Cost structure raises its own set of concerns. At about $1 billion a year, the Gemini license would rank among Apple’s largest recurring software expenses. Google already pays Apple billions annually to remain the default search engine on iPhones and iPads, a deal that has drawn antitrust scrutiny from the U.S. Department of Justice. Adding a billion-dollar annual payment in the other direction, from Apple to Google, deepens a financial relationship that regulators are already watching.
For developers building on Apple’s platforms, the practical question is which model will handle which tasks. If Apple’s on-device 3-billion-parameter system continues to power simple requests, while more complex queries are silently handed off to Gemini, application behavior could become harder to predict. A request that stays local on one device might be routed to the cloud on another, depending on context, phrasing, or available network bandwidth. That variability would complicate performance tuning and privacy guarantees for third-party apps that rely on Siri or system-level AI features.
Control is another unresolved issue. By tying Siri’s most advanced capabilities to a licensed model, Apple gives Google leverage over a core part of the iPhone experience. Any changes in pricing, usage limits, or technical roadmap on Google’s side could ripple through Apple’s products. Apple can mitigate that risk by continuing to invest in its own server-scale models, but rebuilding an equivalent to a 1.2-trillion-parameter system is a multi-year effort.
Strategic Implications for Apple, Google, and the AI Ecosystem
Strategically, the reported deal underscores how quickly AI has reshaped power dynamics in consumer technology. For more than a decade, Apple differentiated the iPhone through tight vertical integration: custom chips, custom operating systems, and proprietary services. Outsourcing the brain of Siri to a partner, even partially, cuts against that grain. It suggests that in the generative AI era, model scale and training data may matter more than owning every layer of the stack.
For Google, supplying the engine of Siri is both a revenue stream and a defensive move. If Apple had turned to another provider, Google risked seeing a rival’s model become the default assistant on hundreds of millions of devices. Licensing Gemini to Apple keeps its technology in front of users who might otherwise never touch a Google-branded app, and it reinforces Google’s argument that its AI offerings are essential infrastructure rather than just consumer products.
For the broader ecosystem, the partnership sends a mixed signal. On one hand, it validates the idea that even the largest platform companies may need to license frontier models rather than build everything in-house. On the other, it concentrates even more influence in the hands of a few AI providers whose systems quietly power services across competing platforms. Developers and regulators alike will be watching how Apple explains the arrangement onstage, how much technical detail it provides about data handling, and whether users get meaningful choices about which model processes their voice and text.
As Apple steps onto the WWDC stage, the story behind Siri’s new brain is not just about a bigger model or smarter assistant. It is about how far the company is willing to bend its long-standing principles on privacy, independence, and control to stay competitive in the AI race, and how much of that tradeoff it is prepared to share with the people who use its products every day.
*This article was researched with the help of AI, with human editors creating the final content.