Google CEO Sundar Pichai stood on the I/O 2026 keynote stage and introduced Gemini Spark, a personal AI agent that keeps working on digital tasks around the clock, even after a user shuts their laptop or turns off their phone. The agent runs on dedicated virtual machines inside Google Cloud, not on the device itself, which means it can book flights, compare prices, or manage calendar conflicts while the owner sleeps. That architectural choice, shifting always-on AI processing from personal hardware to shared cloud infrastructure, opens a new set of security and energy questions that neither Google nor independent researchers have fully answered.
Why a 24/7 cloud agent changes the security equation
Pichai described Spark as a “24/7” personal AI agent during his opening keynote, emphasizing that users “don’t need to keep your laptop open” because the system operates on dedicated virtual machines on Google Cloud. That single design decision carries a direct consequence: the attack surface for prompt injection and memory manipulation shifts from a user’s local browser or operating system to a shared cloud environment managed by Google.
Academic security research on background execution patterns supports that concern. A study of long-running agents using heartbeat-style loops found that such systems can suffer from silent memory pollution, where an agent’s persistent state is gradually corrupted by malicious inputs without the user noticing. The paper is not specific to Google, but its findings map directly onto any system that maintains a continuous execution loop in the cloud on behalf of a consumer. When thousands or millions of agents share the same virtual-machine infrastructure, a successful injection technique could scale far beyond what is possible on a single laptop.
In traditional, device-bound workflows, users introduce natural breaks. They close browser tabs, put laptops to sleep, or restart phones, which effectively reset many in-memory contexts. A 24/7 cloud agent, by design, reduces those resets. If Spark maintains long-lived task state across hours or days, adversarial content hidden in a webpage, document, or email could influence the agent long after the original interaction. Without clear guardrails, the line between a helpful autonomous assistant and a compromised automation pipeline becomes blurry.
The hypothesis worth tracking is whether continuous cloud-based agents will concentrate the majority of consumer prompt-injection incidents into shared VM environments within the first year of broad availability. Local devices have natural circuit breakers; cloud agents do not unless their designers build explicit limits on context duration, memory retention, and external tool access. Google has said Spark is “designed to check with you before taking action,” but the company has not published technical details on how it isolates agent memory between sessions or detects injection attempts mid-task. For now, users are being asked to trust that the same cloud that keeps Spark running nonstop can also keep its long-lived memory safe.
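To make the idea of explicit limits concrete, here is a minimal sketch of two such circuit breakers: a retention window that expires old memory entries, and an allowlist for external tools. Nothing here reflects how Spark actually works. The names `AgentGuard`, `MemoryEntry`, `max_age_seconds`, and the tool labels are all hypothetical, chosen only for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str
    source: str        # where the content came from, e.g. "user" or "web"
    created_at: float  # seconds since epoch


@dataclass
class AgentGuard:
    """Hypothetical guard; not based on any published Spark design."""
    max_age_seconds: float   # assumed retention limit for memory entries
    allowed_tools: frozenset  # assumed allowlist of external tools
    memory: list = field(default_factory=list)

    def remember(self, text, source, now=None):
        now = time.time() if now is None else now
        self.memory.append(MemoryEntry(text, source, now))

    def expire(self, now=None):
        """Drop entries older than the retention limit; return how many were dropped."""
        now = time.time() if now is None else now
        kept = [m for m in self.memory if now - m.created_at <= self.max_age_seconds]
        dropped = len(self.memory) - len(kept)
        self.memory = kept
        return dropped

    def may_call(self, tool):
        """Allow a tool call only if the tool is on the allowlist."""
        return tool in self.allowed_tools
```

A retention window of this kind only caps how long injected text can linger. It does not detect the injection itself, which is the harder problem the research describes.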
Spark’s architecture and rollout timeline
Gemini Spark runs on Gemini 3.5 and is built on a platform Google calls Antigravity, according to the company’s official announcements. Josh Woodward, VP of Google Labs, Gemini app and AI Studio, described the product as a system meant to proactively manage tasks under user direction, framing it as the next step in making the Gemini app more agentic. The rollout begins with trusted testers and then expands to a beta for Google AI Ultra subscribers in the United States, giving Google a staged path to observe behavior before a wider release.
Spark fits into a broader category of web agents that can navigate sites, fill out forms, and interact with online services on a user’s behalf. Researchers analyzing web agent energy use explicitly reference earlier Google experiments in this space as examples of systems that blend browsing, automation, and large language models. That work predates Spark and does not include performance data on the new agent, but it establishes that web agents as a category carry measurable compute and energy costs that scale with task complexity and session length. A 24/7 agent, by definition, runs the longest possible sessions and can chain many subtasks together.
Google has not released benchmarks for Spark’s actual energy draw on its Cloud VMs, and no independent measurements exist yet. The company also has not disclosed how many virtual machines it expects to provision per user, whether idle agents consume baseline resources, or how it plans to scale infrastructure as the subscriber base grows beyond initial testers. Those gaps matter because the energy cost of always-on agents is not just an environmental question. It directly affects pricing and product design. If per-user compute costs are high, Google AI Ultra subscription fees could rise, or the company could limit how many concurrent tasks Spark handles, how long sessions persist, or how often the agent polls external services.
The architecture also raises questions about data locality and compliance. Running Spark on shared cloud infrastructure means user data may traverse multiple regions and services as the agent works through tasks like travel booking or invoice processing. Google has not yet detailed how Spark’s virtual machines align with regional data residency controls, nor how long intermediate artifacts such as screenshots, parsed HTML, or tool-call logs are retained. For businesses considering Spark for work accounts, those implementation details will be as important as headline capabilities.
What Google has not addressed about always-on agents
Three specific gaps stand out in the current disclosure. First, Google’s official posts contain no direct response to the class of background-execution security risks identified in academic research on heartbeat-style loops. The company says Spark checks with users before acting, but confirming an action is not the same as detecting whether an agent’s memory has been tampered with between check-ins. If an attacker can slowly bias Spark’s internal state, the confirmation step may simply rubber-stamp a decision that has already been steered in the wrong direction.
Second, no early tester feedback or governance logs have been made public. The trusted-tester phase is usually when real-world edge cases surface: what happens when Spark encounters a CAPTCHA, a two-factor authentication prompt, or a website that actively blocks automated agents? Does the agent gracefully hand control back to the user, or does it retry until it hits rate limits? Google has described the product’s capabilities in broad terms but has not explained how it handles these friction points that web agents routinely face, or how it prevents users from inadvertently violating terms of service on sites that restrict automation.
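The difference between handing control back and retrying until rate limits kick in can be sketched in a few lines. This is an illustration only, not a description of Spark's behavior. The `Outcome` values, the `HUMAN_ONLY` labels, and the `run_step` function are hypothetical names invented for this example.

```python
from enum import Enum


class Outcome(Enum):
    DONE = "done"
    HAND_OFF = "hand_off"  # return control to the user
    GAVE_UP = "gave_up"    # retry budget exhausted


# Hypothetical labels for obstacles that an agent should not try to push through.
HUMAN_ONLY = {"captcha", "two_factor", "automation_blocked"}


def run_step(attempt_step, max_retries=3):
    """Run one step with a bounded retry budget.

    attempt_step() is assumed to return "ok", one of the HUMAN_ONLY labels,
    or any other string, which is treated as a transient error.
    Returns (outcome, number_of_attempts_made).
    """
    for attempt in range(1, max_retries + 1):
        result = attempt_step()
        if result == "ok":
            return Outcome.DONE, attempt
        if result in HUMAN_ONLY:
            # Stop immediately rather than retrying against the obstacle.
            return Outcome.HAND_OFF, attempt
    return Outcome.GAVE_UP, max_retries
```

In this sketch a CAPTCHA or two-factor prompt ends the loop on the first encounter, while transient errors are retried only up to a fixed budget. Whether Spark draws the line in a similar place is exactly what Google has not said.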
Third, the energy and cost profile of continuous operation remains unmeasured for this specific product. The academic work on web agent sustainability provides a framework for estimation, but Spark’s actual resource footprint depends on implementation choices that Google has not disclosed: how aggressively the system compresses or discards old context, how often it wakes up to poll for updates, and whether idle agents are fully suspended or kept in a warm state for fast response. Without those details, policymakers and customers cannot assess whether a proliferation of always-on personal agents will meaningfully increase data-center load or remain a marginal addition to existing AI services.
Those omissions do not mean Spark is inherently unsafe or unsustainable. They do, however, highlight how quickly the industry is moving from request–response chatbots to persistent, autonomous systems whose behavior is harder for end users to audit. A person can reread a single chatbot answer; they cannot easily reconstruct every web page, API call, and intermediate decision an always-on agent processed overnight. That asymmetry makes transparency and independent evaluation more important, not less.
As Gemini Spark moves from trusted testers to paying subscribers, Google will be under pressure to provide more than marketing language. Clear documentation of security boundaries, memory isolation strategies, and energy usage would allow researchers to test whether the risks identified in early web-agent studies show up in practice. Until then, potential users face a trade-off: accept the convenience of a tireless assistant that keeps working after they log off, or wait for stronger evidence that the cloud infrastructure behind it can keep their data secure and its energy demands in check.
This article was researched with the help of AI, with human editors creating the final content.