MiMo V2 Flash

MiMo V2 Flash is Xiaomi's Mixture-of-Experts (MoE) reasoning model with 309B total parameters, of which 15B are active per forward pass. It uses hybrid attention and multi-token prediction for inference efficiency. It supports a 262.1K-token context window and is priced at $0.10 per million input tokens and $0.30 per million output tokens.

Reasoning, Tool Use
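index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'xiaomi/mimo-v2-flash',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}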
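
The per-token prices quoted above can be turned into a rough per-request cost estimate. The following is a minimal sketch, not part of the AI SDK: `estimateCostUsd` and the two constants are hypothetical names introduced for illustration, and the calculation assumes that only the listed input and output prices apply (no cache discounts or other fees).

```typescript
// Prices quoted on this page, in USD per million tokens.
const INPUT_USD_PER_MILLION = 0.1
const OUTPUT_USD_PER_MILLION = 0.3

// Hypothetical helper (not an AI SDK function): estimates the cost of one
// request from its token counts, ignoring cache pricing.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  return inputCost + outputCost
}

// Example: a 200,000-token prompt with a 4,000-token reply.
console.log(estimateCostUsd(200_000, 4_000).toFixed(4)) // prints "0.0212"
```

Under these assumptions, even a prompt that fills most of the context window costs only a few cents, with the input side accounting for most of the total.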

More models by Xiaomi

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | Providers | ZDR | No Training | Release Date |
|  | 1.1M | 2.0s | 64tps | $1.00/M | $3.00/M | $0.2/M |  |  |  | Xiaomi |  |  | 04/22/2026 |
|  | 1.1M | 2.0s | 98tps | $0.40/M | $2.00/M | $0.08/M |  |  |  | Xiaomi |  |  | 04/22/2026 |
|  | 1M | 1.3s | 77tps | $1.00/M | $3.00/M | $0.2/M |  |  |  | Xiaomi |  |  | 03/18/2026 |