RWT
9.5 score
Compare MiniMax-M3 vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.
AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.
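The scoring idea described above (per-benchmark percentiles, combined with weights, then penalized for missing coverage) can be sketched as follows. This is only an illustrative sketch, not AskClash's actual formula: the function names, the equal weights, the multiplicative coverage penalty, and every number in `table` are assumptions made up for the example.

```python
from typing import Dict, List, Optional

Table = Dict[str, Dict[str, Optional[float]]]


def percentile_rank(value: float, column: List[float]) -> float:
    """Share of reported values in the column that are <= value, on a 0-100 scale."""
    if not column:
        raise ValueError("column must contain at least one reported value")
    return 100.0 * sum(v <= value for v in column) / len(column)


def composite_score(model: str, table: Table, weights: Dict[str, float]) -> float:
    """Weighted mean percentile over covered benchmarks, scaled by coverage.

    Assumption: coverage is the fraction of total weight for which the model
    has a reported value, and it multiplies the mean percentile.
    """
    weighted_sum = 0.0
    covered_weight = 0.0
    for bench, weight in weights.items():
        value = table[model].get(bench)
        if value is None:
            continue  # missing cell: contributes nothing and lowers coverage
        column = [row[bench] for row in table.values() if row.get(bench) is not None]
        weighted_sum += weight * percentile_rank(value, column)
        covered_weight += weight
    if covered_weight == 0:
        return 0.0
    mean_percentile = weighted_sum / covered_weight
    coverage = covered_weight / sum(weights.values())
    return mean_percentile * coverage


# Hypothetical models and made-up values, for illustration only.
table: Table = {
    "model_a": {"GPQA": 80.0, "SWE-bench": 70.0},
    "model_b": {"GPQA": 90.0, "SWE-bench": None},  # narrow row: one cell only
    "model_c": {"GPQA": 60.0, "SWE-bench": 50.0},
}
weights = {"GPQA": 1.0, "SWE-bench": 1.0}

scores = {name: composite_score(name, table, weights) for name in table}
```

In this toy data, `model_b` has the best value on the only benchmark it reports, yet its composite score ends up below `model_a` because half of the total weight is uncovered. That is the behavior the page describes as keeping narrow rows from dominating better-measured models.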
9.5 score
37.1 score
92.9 score
82.9 score
80.5 score
59.0 score
66.0 score
70.1 score
74.2 score
48.3 score
78.1 score
88.9 score
Use these comparison links to evaluate MiniMax-M3 against nearby LLMs by benchmark score, price, context window, and provider.
Cached AskClash article matches that can provide release, provider, benchmark, pricing, or market context around this model.
MiniMax-M3-VL is the vision-language member of the MiniMax-M3 family. It pairs a CLIP-style vision tower that uses 3D rotary position embeddings with the MiniMax-M3 text backbone. The decoder is a mixed dense/sparse Mixture-of-Experts design with SwiGLU-OAI gated experts and a lightning indexer for block-sparse attention. The model processes images through a Conv3d patch embedding system and includes specialized components for efficient multimodal understanding and generation. The official weights for PP-
* Model fixes: NemotronH MTP + Chunked Prefill (#35447), Qwen3-VL video timestamps (#37439), Qwen3.5 GDN quantized models (#37448), Qwen3Next A_log FP32 (#37810), JAIS ALiBi (#37820), RoBERTa CUDA graph position IDs (#37873), AudioFlamingo3/MusicFlamingo (#37643), Music Flamingo loading (#35535), bge-m3 task selection (#37632), Nemotron Parse loading (#37407), GLM OCR patch merger (#37962), PaddleOCR checkpoint compat (#38232), DeepSeek v3.2 params (#33703), MiniMax NVFP4 weight loading (#37214)
* **DeepSeek-V4 matures across backends**: Following its introduction in v0.22.0, DeepSeek-V4 received another large hardening and optimization pass. Its sparse MLA metadata is now decoupled from DeepSeek-V3.2 (#44699), it gained a TRTLLM-gen attention kernel (#43827), EPLB support for the Mega-MoE (#43339), selective prefix-cache retention for sliding-window KV cache (#43447), and an index-share feature for DSA MTP (#44420). The model was also detached from `torch.compile` (#43746, #43891), its
Last cached leaderboard date: June 1, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.