How does Kimi K2.5 (Reasoning) compare with other LLMs?

AskClash compares Kimi K2.5 (Reasoning) against nearby AI models using public benchmark scores, pricing, context window, and access details.

What benchmarks are tracked for Kimi K2.5 (Reasoning)?

The page shows cached public benchmark cells such as HLE, GPQA, SWE-bench, SWE-Pro, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and related model scores when available.

LLM Leaderboard · Proprietary

Kimi K2.5 (Reasoning) benchmarks, pricing, and LLM comparison.

Compare Kimi K2.5 (Reasoning) vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.


Rank #33. AskClash overall score: 18.0

$0.60 / $3.00: input and output token price, when published. Context: 256K.

API: billing and access path cached for this model row.
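To make the listed token prices concrete, here is a minimal sketch of a per-request cost estimate. The page does not state the pricing unit, so the per-1M-token unit below is an assumption, and the function name `estimate_cost_usd` and its parameters are hypothetical, not part of any AskClash or Moonshot AI API.

```python
# Assumption: the listed $0.60 / $3.00 prices are per 1,000,000 tokens.
# The page does not state the unit; adjust TOKENS_PER_PRICE_UNIT if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000

INPUT_PRICE_USD = 0.60   # listed input token price
OUTPUT_PRICE_USD = 3.00  # listed output token price


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_PRICE_USD,
    output_price: float = OUTPUT_PRICE_USD,
) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * input_price
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * output_price
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 20,000 prompt tokens and 4,000 generated tokens.
    print(f"${estimate_cost_usd(20_000, 4_000):.4f}")
```

Reasoning models typically bill their reasoning tokens as output, so the output count usually dominates the estimate. Check the provider's current price sheet before relying on these figures.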

Kimi K2.5 (Reasoning) benchmark snapshot

AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
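The exact AskClash formula is not published on this page. The sketch below shows one plausible way to combine a weighted percentile average with a coverage penalty. The weights, the percentile inputs, and the multiplicative penalty are all assumptions made for illustration, and `overall_score` is a hypothetical name.

```python
from typing import Mapping, Optional


def overall_score(
    percentiles: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> float:
    """Weighted percentile score scaled by benchmark coverage.

    percentiles: benchmark name -> percentile rank (0-100), or None if the
                 model has no cached cell for that benchmark.
    weights:     benchmark name -> relative weight (hypothetical values).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Keep only benchmarks that are both weighted and actually measured.
    covered = {
        name: value
        for name, value in percentiles.items()
        if value is not None and name in weights
    }
    if not covered:
        return 0.0

    covered_weight = sum(weights[name] for name in covered)
    weighted_mean = (
        sum(weights[name] * value for name, value in covered.items())
        / covered_weight
    )

    # Assumed penalty: scale by the share of total weight that is covered,
    # so a model measured on few benchmarks cannot outrank a well-measured one.
    coverage = covered_weight / total_weight
    return weighted_mean * coverage


if __name__ == "__main__":
    # Illustrative inputs only; these are not AskClash's weights or percentiles.
    weights = {"bench_a": 1.0, "bench_b": 1.0, "bench_c": 1.0, "bench_d": 1.0}
    narrow = {"bench_a": 80.0, "bench_b": 60.0, "bench_c": None, "bench_d": None}
    print(overall_score(narrow, weights))
```

Under this assumed penalty, a model with strong percentiles on only half of the weighted benchmarks keeps half of its weighted mean. That is one way a row with 4 benchmark cells could score well on each benchmark and still rank low overall.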

Overall: 18.0

Benchmark cells: 4

Context: 256K

Creator: Moonshot AI

Kimi K2.5 (Reasoning) public benchmark scores

Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.

GPQA: 87.6

SWE-bench: 76.8

MMMU-Pro: 78.5

Tau2: 95.9

Kimi K2.5 (Reasoning) vs other AI models

Use these comparison links to evaluate Kimi K2.5 (Reasoning) against nearby LLMs by benchmark score, price, context window, and provider.

Kimi K2.5 (Reasoning) vs Claude Mythos/Fable 5

Kimi K2.5 (Reasoning) vs Claude Opus 4.8 (Adaptive)

Kimi K2.5 (Reasoning) vs GPT-5.5 xHigh

Kimi K2.5 (Reasoning) vs Claude Opus 4.7 (Adaptive)

Kimi K2.5 (Reasoning) vs GPT-5.4 xHigh

Kimi K2.5 (Reasoning) vs Gemini 3.5 Flash High

Kimi K2.5 (Reasoning) vs Kimi K2.7 Code

Kimi K2.5 (Reasoning) vs Gemini 3.1 Pro

Related AI and tech coverage

Cached AskClash article matches that may provide release, provider, benchmark, pricing, or market context for this model.

Moonshot AI releases Kimi K2.7-Code, claiming 30% lower reasoning token usage compared to K2.6, available under a modified MIT license (Sean Michael Kerner/VentureBeat)

Techmeme