
Apple is turning the flat photo into a new computing primitive. With its SHARP model, the company says a single snapshot can be transformed into a navigable 3D scene in less than a second on standard hardware, collapsing what used to be a complex graphics pipeline into a near-instant AI inference step. That speed, paired with photorealistic output, positions SHARP as a foundational tool for everything from mobile photography to AR headsets.
Instead of demanding depth sensors or multi-camera rigs, SHARP infers the missing geometry from one image and then renders new viewpoints that stay faithful to the original frame. In practice, that means a portrait, a holiday landscape, or a product shot can suddenly be explored from slightly different angles, with realistic parallax and lighting, without any extra capture work from the user.
What SHARP actually is: Apple’s new 3D engine in code form
At its core, SHARP is not a consumer app but a research-grade engine for what computer vision scientists call monocular view synthesis, the process of generating new camera views from a single input image. Apple describes the work under the title Sharp Monocular View Synthesis in Less Than a Second, emphasizing that the system is designed to synthesize a photorealistic 3D scene from one photograph while preserving sharp details that typically blur in similar models. Instead of treating the image as a flat texture, SHARP reconstructs a scene representation that can be rendered from slightly shifted viewpoints, which is why the resulting motion feels like a real camera move rather than a simple 2D pan.
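Conceptually, that pipeline splits into two steps: infer a scene representation from the single photo once, then render that representation from any nearby viewpoint. The sketch below is a hypothetical Python interface, not code from Apple’s ml-sharp repository; every name in it is a placeholder used only to illustrate that split.

```python
from dataclasses import dataclass
import numpy as np


@dataclass
class CameraOffset:
    """A small relative camera move, in scene-scale units and radians."""
    dx: float = 0.0   # horizontal translation
    dy: float = 0.0   # vertical translation
    dz: float = 0.0   # push-in / pull-back
    yaw: float = 0.0  # slight rotation around the vertical axis


def synthesize_view(model, image: np.ndarray, offset: CameraOffset) -> np.ndarray:
    """Render the scene inferred from `image` as seen from a slightly moved camera.

    `model` stands in for a monocular view-synthesis network: it builds an
    internal 3D representation from one frame, then renders that representation
    from the requested viewpoint. Both method names below are hypothetical.
    """
    scene = model.infer_scene(image)    # one-shot 3D inference from a single photo
    return model.render(scene, offset)  # novel-view rendering of that scene
```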
Apple has also shipped SHARP as a full software project, not just a paper, which matters for developers who want to experiment with the technology directly. The company’s ml-sharp repository documents the architecture and provides the code that accompanies the research, making it possible to reproduce the model’s behavior and integrate it into pipelines that already handle image processing or 3D content. By publishing both the theory and the implementation, Apple is signaling that SHARP is meant to be scrutinized, benchmarked, and, ultimately, built upon.
How Apple is pitching SHARP: “Under a second” from photo to 3D
Apple is framing SHARP as a breakthrough in both speed and accessibility, highlighting that the model can convert 2D images into 3D scenes in under a second on widely available hardware. Launch coverage in December, bylined by Ola and Hassan Bolaji, introduced SHARP as an AI tool that can transform a single picture into a 3D representation in less than a second, and that claim sits at the center of how Apple wants the technology understood. The “under a second” promise is not just a marketing flourish; it is what makes SHARP feel like a natural extension of everyday photography rather than a specialist graphics workflow.
Apple is also leaning on the idea that SHARP works with any image, not just carefully staged or depth-annotated shots, which is why the company and outside observers keep returning to the claim that it can turn ordinary photos into 3D scenes almost instantly. The framing around the December 19 launch underscores how Apple wants SHARP to be seen as part of a broader wave of AI features that run close to real time, rather than as a slow offline renderer that only professionals would tolerate. By anchoring the narrative in speed and simplicity, Apple is clearly targeting both casual users and developers who want to add 3D flair without adding friction.
The research paper and the people behind it
Behind the marketing language sits a formal research effort that Apple has chosen to surface in detail. The company’s own documentation notes that the software project accompanies the paper titled Sharp Monocular View Synthesis in Less Than a Second, explicitly crediting Lars Mes as a key author. That pairing of a named paper and a working codebase is typical of Apple’s recent AI research strategy, where the company publishes technical advances that can later be distilled into consumer features across iOS, macOS, and its hardware line.
The “Less Than a Second” claim in the paper’s title is not just a performance boast; it is a constraint that shapes the model’s architecture and training regime. To hit that target, the researchers had to balance the complexity of the 3D representation against the need to run on a single GPU without exotic hardware, which is why the GitHub project spells out how to create and run the environment, down to flags like “create -n sharp python=3.13” in conda’s syntax, for reproducibility. By naming Lars Mes and tying the work to a concrete repository, Apple is inviting the research community to evaluate SHARP on its merits rather than treating it as a black box.
How SHARP actually turns pixels into depth
From a technical perspective, SHARP’s core trick is learning to infer depth and geometry from patterns that appear across millions of training images, then using that knowledge to reconstruct a plausible 3D scene from a single frame. Apple’s own research overview describes SHARP as a system for generating 3D scenes from 2D photos that is particularly good at discerning common depth patterns, a capability highlighted in a broader roundup of AI and ML research. Instead of explicitly reconstructing a full 3D mesh, SHARP learns a representation that can be rendered into new views while preserving fine details like hair, foliage, and text that often smear in other models.
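The paragraph above describes learned depth inference in the abstract; as a concrete, unofficial illustration, the snippet below runs MiDaS, a publicly available third-party monocular depth estimator on PyTorch Hub, to predict a per-pixel depth map from one ordinary photo. It is not Apple’s model and says nothing about SHARP’s internals, quality, or speed.

```python
import cv2
import torch

# Load a small, publicly available monocular depth model (MiDaS) from PyTorch Hub.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = midas_transforms.small_transform

# Read an ordinary photo and convert it to RGB for the model.
img = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
batch = transform(img)

with torch.no_grad():
    pred = midas(batch)
    # Resize the prediction back to the photo's resolution.
    pred = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2], mode="bicubic", align_corners=False
    ).squeeze()

depth = pred.cpu().numpy()  # relative depth estimate for every pixel
```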
One way to understand the output is to think of the image as being broken into millions of tiny colored points that each carry an estimated depth and orientation. Reporting on the model explains that when millions of these points are put together, they can recreate a 3D scene that looks real from a particular viewing angle, and that SHARP can generate these new views in less than a second while maintaining realistic output. By learning how depth usually behaves in everyday scenes, SHARP can hallucinate the missing geometry in a way that feels convincing, even though it never actually saw the scene from another angle.
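To make the colored-points intuition concrete, the sketch below (plain NumPy, unrelated to Apple’s code) lifts every pixel into a 3D point using a depth map and an assumed pinhole camera, shifts the camera slightly to the side, and splats the points back into a new image. The holes and occlusion artifacts this naive version produces are exactly what a learned model like SHARP is trained to paper over.

```python
import numpy as np


def reproject(image: np.ndarray, depth: np.ndarray, fx: float, fy: float,
              cx: float, cy: float, baseline_x: float) -> np.ndarray:
    """Naive point-based re-rendering from a sideways-shifted camera.

    Each pixel becomes one colored 3D point; no hole filling, blending, or
    occlusion handling is attempted, unlike a learned view-synthesis model.
    Assumes `depth` holds positive per-pixel distances in camera units.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    z = np.maximum(depth, 1e-6)

    x = (u - cx) / fx * z              # back-project pixels to camera-space X
    y = (v - cy) / fy * z              # back-project pixels to camera-space Y

    x_new = x - baseline_x             # camera moved right => points shift left
    u_new = np.round(x_new / z * fx + cx).astype(int)
    v_new = np.round(y / z * fy + cy).astype(int)

    out = np.zeros_like(image)
    valid = (u_new >= 0) & (u_new < w) & (v_new >= 0) & (v_new < h)
    out[v_new[valid], u_new[valid]] = image[valid]   # nearest-point splat
    return out
```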
Performance: under a second on a standard GPU
Speed is where SHARP separates itself from earlier academic systems that could take minutes to render a single new view. Apple’s own description emphasizes that the model runs in under a second on a standard GPU, a point echoed in reporting that Apple built SHARP to predict the scene from just one photo in under a second on a typical GPU rather than relying on specialized accelerators. That performance envelope is crucial if SHARP is ever to be embedded in consumer devices, where latency and battery life are unforgiving constraints.
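A simple way to sanity-check that latency claim on any given GPU is to time a single end-to-end prediction. The harness below is a generic sketch; `predict_scene` is a placeholder for whatever entry point the released code actually exposes, and the only assumption is that it maps one image to a renderable scene.

```python
import time


def check_latency(predict_scene, image, budget_s: float = 1.0) -> float:
    """Time one photo-to-scene prediction against a one-second budget.

    If the model runs on a GPU, synchronize the device before reading the
    clock (e.g. torch.cuda.synchronize()) so queued kernels are included.
    """
    start = time.perf_counter()
    predict_scene(image)               # hypothetical one-shot inference call
    elapsed = time.perf_counter() - start
    verdict = "within" if elapsed < budget_s else "over"
    print(f"scene built in {elapsed * 1000:.0f} ms ({verdict} the {budget_s:.1f}s budget)")
    return elapsed
```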
Apple’s own demo material reinforces the idea that SHARP is not trading quality for speed. The project page notes that SHARP synthesizes a photorealistic 3D scene from a single image and illustrates the effect on photographs from Unsplash, showing subtle camera moves that reveal parallax around foreground objects while keeping textures crisp. Running that kind of view synthesis in less than a second means the model can be used interactively, for example to scrub through different viewpoints with a finger on a touchscreen or to update the scene in sync with head movements in a headset.
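Interactive use follows naturally from the split between inference and rendering: the expensive scene prediction happens once, and only a cheap re-render runs per touch or head-pose update. The helper below is again a hypothetical sketch, reusing a placeholder render call rather than any real SHARP API.

```python
def view_for_touch(model, scene, touch_x: float, touch_y: float,
                   max_shift: float = 0.05):
    """Map a normalized touch position (0..1 per axis) to a small camera move
    and re-render an already-inferred scene from that viewpoint.

    `model.render` is a placeholder; the point is that the per-frame cost is a
    render of a cached scene, not a fresh sub-second inference pass.
    """
    dx = (touch_x - 0.5) * 2 * max_shift   # left/right parallax
    dy = (touch_y - 0.5) * 2 * max_shift   # up/down parallax
    return model.render(scene, (dx, dy, 0.0))  # hypothetical render call
```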
What SHARP means for iPhone photos and everyday users
For everyday photography, SHARP hints at a future where every shot taken on an iPhone or iPad quietly doubles as a lightweight 3D capture. Earlier work in iOS 26 already made it possible to turn two-dimensional photos into simple 3D-like effects, and Apple’s research summary explicitly connects SHARP to that lineage of instant 3D image conversion, describing how the model can generate 3D scenes from 2D images as part of a broader set of AI features. In practice, that could mean Live Photos that no longer just play back a short clip but instead let users nudge the camera angle slightly, or Portrait mode shots that can be reframed in 3D after the fact.
Because SHARP works from a single frame, it can also be applied retroactively to existing photo libraries, not just new captures. Reporting on the model notes that the project includes examples illustrating how well the system can translate regular photographs into a three-dimensional representation that stays close to the original photo. That backward compatibility is what makes SHARP feel like a potential system-level feature rather than a niche creative tool, because it can breathe new life into years of existing images without asking users to change their habits.
Creative workflows: from 2D assets to 3D content
For artists, designers, and game developers, SHARP offers a shortcut from flat assets to 3D-ready content that can slot into existing pipelines. Coverage from the Times of India’s Tech Desk, under the headline “Apple creates AI model that can turn 2D photos into 3D images,” explains how the system will work in practice and notes that the model is designed to preserve consistent scaling in the generated images, which is critical if the output is to be composited into larger scenes without visual glitches.
Because SHARP can generate multiple viewpoints that remain consistent with each other, it can serve as a starting point for 3D assets in tools like Blender, Unity, or Unreal Engine, where artists might use the synthesized views as reference or as input to reconstruction algorithms.
AR, VR, and the race for spatial computing
SHARP’s most obvious long-term impact is on augmented and virtual reality, where believable 3D content is both essential and expensive to produce. Coverage published on December 19, 2025 notes that Apple just dropped SHARP, an AI model that transforms any single 2D photo into a realistic 3D scene in seconds, and explicitly frames it as a building block for future AR and VR tech. If every flat image can be turned into a lightweight 3D scene, then AR apps can populate the environment with far more content without requiring full 3D scans or hand-modeled assets.
That matters for devices like headsets and smart glasses, where users expect their existing media to feel native rather than bolted on. Coverage focused on 3D and design notes that Apple’s SHARP can turn a photo into a 3D scene in under a second and that the resulting scenes look accurate from a particular viewpoint, a detail highlighted in analysis that mentions how Apple could leverage its hardware and software integration. In a headset context, that “accurate from a particular viewpoint” constraint is actually a feature, because the system only needs to look convincing from the user’s current perspective, not from arbitrary angles they will never see.
Why Apple is open-sourcing SHARP and what comes next
One of the more striking aspects of SHARP is how much of it Apple has chosen to share publicly. The company’s GitHub project for Sharp Monocular View Synthesis in Less Than a Second includes not only code but also clear instructions for setting up the environment and reproducing the results, which is unusual for a company that historically kept its most advanced graphics and imaging techniques proprietary. By doing so, Apple is effectively inviting external researchers and developers to stress-test SHARP, adapt it to new domains, and potentially discover failure modes that can be addressed before the technology is baked into consumer products.
At the same time, Apple is careful to frame SHARP as part of a broader AI strategy that spans on-device intelligence, privacy, and tight integration with its hardware. Reporting on the company’s AI efforts notes that some analysts believe that through a combination of its chips, operating systems, and models like SHARP, Apple could gain an advantage in AI-driven features that feel seamless to users, a point echoed in coverage that mentions how GPU-level performance and privacy dashboards are part of the story. If SHARP does become a standard capability across Apple devices, it will likely arrive wrapped in that same narrative of on-device processing and user control, even as the underlying research remains open for the community to inspect.