LLM Leaderboard · Open Weight

DeepSeek V4 Flash (Max) benchmarks, pricing, and LLM comparison.

Compare DeepSeek V4 Flash (Max) vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.

Rank: #24. AskClash overall score: 46.0.
Price: $0.14 / $0.28 (input and output token price, when published). Context: 1M.
Access: API (billing and access path cached for this model row).
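
To make the listed prices concrete, here is a rough cost sketch. The page does not state the billing unit, so the code assumes the prices are USD per 1M tokens, a common convention that should be checked against the provider's pricing page. The function name `estimate_cost` is hypothetical.

```python
# Listed prices from this page. ASSUMPTION: the unit is USD per 1M tokens;
# the page itself does not state the unit.
INPUT_PRICE = 0.14
OUTPUT_PRICE = 0.28
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD under the unit assumption above."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Example: a 200k-token prompt with a 4k-token reply.
    print(f"${estimate_cost(200_000, 4_000):.4f}")
```

Cached-input discounts or other billing tiers, if the provider offers them, are not modeled here.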

DeepSeek V4 Flash (Max) benchmark snapshot

AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
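
The exact AskClash formula is not published on this page. As an illustration only, a coverage-penalized weighted percentile score could look like the sketch below. The weights, the linear coverage penalty, and the function name `coverage_penalized_score` are all assumptions, not the leaderboard's actual method.

```python
from typing import Mapping, Optional


def coverage_penalized_score(
    percentiles: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> float:
    """Weighted mean of the available percentiles, scaled by weight coverage.

    percentiles: benchmark name -> percentile in [0, 100], or None if missing.
    weights: benchmark name -> non-negative weight (hypothetical values).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    covered_weight = 0.0
    weighted_sum = 0.0
    for name, weight in weights.items():
        value = percentiles.get(name)
        if value is None:
            continue  # missing cell: adds nothing and lowers coverage
        covered_weight += weight
        weighted_sum += weight * value

    if covered_weight == 0:
        return 0.0

    weighted_mean = weighted_sum / covered_weight
    coverage = covered_weight / total_weight  # fraction of weight measured
    # ASSUMPTION: a linear penalty, so a row with half its weight missing
    # keeps half of its weighted mean.
    return weighted_mean * coverage
```

Under this assumed penalty, a model measured on only a few benchmarks cannot outrank a fully measured model just by scoring well on those few cells, which matches the intent described above.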

Overall: 46.0
Benchmark cells: 9
Context: 1M
Creator: DeepSeek

DeepSeek V4 Flash (Max) public benchmark scores

Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.

HLE: 34.8
GPQA: 88.1
MATH-500: 57.4
SWE-bench: 79.0
Terminal-Bench: 56.9
LiveCodeBench: 91.6
MCP Atlas: 69.0
Tau2: 95.0

DeepSeek V4 Flash (Max) vs other AI models

Use these comparison links to evaluate DeepSeek V4 Flash (Max) against nearby LLMs by benchmark score, price, context window, and provider.

Related AI and tech coverage

Cached AskClash article matches that can provide release, provider, benchmark, pricing, or market context around this model.

China's DeepSeek to make permanent 75% price cut on flagship V4‑Pro AI model

Huawei’s AI chip sales have benefited from U.S. export controls that prevent Nvidia from selling its most advanced semiconductors in China, although separate curbs on chipmaking equipment exports have limited Huawei’s ability to scale up Ascend production. Chinese artificial intelligence startup DeepSeek will make permanent a 75% price cut on its flagship V4‑Pro artificial intelligence model, keeping prices at a quarter of their original level, the company said in a statement on Saturday.

Flash Attention fa4-v4.0.0.beta13 Release Notes

* [Cute,Bwd,Sm100] fix incorrect calculation of n_block global max for bwd deterministic by @jayhshah in https://github.com/Dao-AILab/flash-attention/pull/2549
* [FA4][hd256] Fix layout of non-contiguous qkv in backward kernel by @wangsiyu in https://github.com/Dao-AILab/flash-attention/pull/2545

The storage refresh that outlives the flash cycle


Last cached leaderboard date: May 22, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.