New Chinese AI claims to beat GPT-5 and Sonnet 4.5

Published on Nov 11, 2025, in AI

Beijing-based Moonshot AI, backed by Alibaba, has unveiled Kimi K2 Thinking, an open-source AI model that the company claims outperforms OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5 on key benchmarks. The model, which is available for free, was announced on November 7, 2025, and positions Moonshot AI as a significant competitor in the global AI landscape. The launch follows the recent release of Anthropic’s Claude Sonnet 4.5, which was touted as the company’s safest AI yet, and comes amid broader discussions about the need for better AI models and new evaluation methods.

The Rise of Moonshot AI

Moonshot AI, backed by Alibaba and based in Beijing, has emerged as a significant player in China’s AI landscape. The company has been developing models intended to compete with global leaders in the field. On November 7, 2025, it launched its new model, designed specifically to challenge GPT-5, highlighting the company’s aggressive expansion in the open-source AI sector. This focus on accessible, high-performance tools reflects Beijing’s push for domestic AI innovation in the face of international competition.

Introducing Kimi K2 Thinking

Kimi K2 Thinking is the latest open-source model from Moonshot AI. It is designed as a versatile AI system capable of handling complex reasoning and generation tasks. The model was released on November 7, 2025, and is freely available to developers and users worldwide, lowering barriers to advanced AI adoption. Its key features include enhanced thinking capabilities that, according to the company, allow it to handle multifaceted queries more effectively than its predecessors.

Benchmark Superiority Claims

According to reports from November 7, 2025, Kimi K2 Thinking outperforms GPT-5 and Claude Sonnet 4.5 on key benchmarks. The reported results point to stronger performance in areas such as logical reasoning and creative problem-solving, based on standardized AI evaluations that measure accuracy and efficiency. Independent analyses, including work on new AI evals from September 30, 2025, are cited as validating the benchmarks used to assess Kimi K2 Thinking’s edge.

Head-to-Head with GPT-5

In direct comparisons, Kimi K2 Thinking is claimed to beat GPT-5, with higher reported scores in reasoning and multimodal tasks. OpenAI’s GPT-5, a cornerstone of Western AI advancements, serves as the primary point of comparison for Moonshot AI’s model, underscoring the competitive positioning of the Chinese entry. Because Kimi K2 Thinking is free, it has a cost advantage over GPT-5’s proprietary access model, which could accelerate its adoption in enterprise settings.

Comparison to Claude Sonnet 4.5

Kimi K2 Thinking is also said to outperform Anthropic’s Claude Sonnet 4.5, particularly in benchmark tests for speed and output quality. Claude Sonnet 4.5, released on September 29, 2025, was touted as Anthropic’s safest AI model yet and prioritizes ethical safeguards. Kimi K2 Thinking, by contrast, claims broader capabilities without similar constraints. This trade-off between safety features and raw performance could influence user preferences, with Kimi K2 Thinking appealing to those seeking unrestricted, high-power AI.

Implications of Open-Source Access

As a free and open-source model, Kimi K2 Thinking democratizes access to top-tier AI, enabling developers worldwide to build on it without licensing fees. This approach contrasts with closed models like GPT-5, fostering innovation in regions with limited resources and aligning with Beijing’s strategy for AI leadership. Potential challenges remain, however, including ensuring responsible use, a concern that echoes discussions around September 30, 2025, on scaling RL training and building better AI models.

Broader Context in AI Evolution

The launch of Kimi K2 Thinking fits into ongoing advancements in the field, including the new AI evals explored on September 30, 2025, which refine how models like this one are tested. Moonshot AI’s model contributes to the global shift toward more capable, accessible AI and challenges dominant U.S. players such as OpenAI and Anthropic. Future developments may involve combining Kimi K2 Thinking with scaled-up RL training techniques to further improve its performance.
