Kimi K2 Turbo

Kimi K2 Turbo is Moonshot AI's throughput-oriented K2 variant. It runs the K2 Mixture-of-Experts (MoE) architecture without thinking overhead and is built for streaming interfaces, high-volume pipelines, and agentic workflows where first-token latency drives responsiveness.

Capabilities: Tool Use
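index.ts
import { streamText } from 'ai'

// Start a streaming request to Kimi K2 Turbo.
const result = streamText({
  model: 'moonshotai/kimi-k2-turbo',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}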
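
Because the description above stresses first-token latency, it can be useful to time a stream while consuming it. The following is a minimal sketch, not part of the AI SDK: `timeStream` and `StreamTiming` are hypothetical names, and the helper assumes only that it is handed an async iterable of text chunks (such as `result.textStream` from the snippet above). The clock is injectable so the helper can be checked without a network call.

```typescript
// Hypothetical helper, not an AI SDK API: times any async iterable of text chunks.
interface StreamTiming {
  firstChunkMs: number | null // null when the stream produced no chunks
  totalMs: number
  text: string
}

async function timeStream(
  chunks: AsyncIterable<string>,
  now: () => number = () => Date.now()
): Promise<StreamTiming> {
  const start = now()
  let firstChunkMs: number | null = null
  let text = ''
  for await (const chunk of chunks) {
    // Record the elapsed time only once, when the first chunk arrives.
    if (firstChunkMs === null) firstChunkMs = now() - start
    text += chunk
  }
  return { firstChunkMs, totalMs: now() - start, text }
}
```

Under these assumptions, calling `timeStream(result.textStream)` instead of iterating the stream directly would return the full text together with the two timings. The measured values depend on the provider and network, so they will not necessarily match the latency figures listed on this page.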

More models by Moonshot AI

Context  Latency  Throughput  Input    Output   Cache read  Providers                            Release Date
262K     1.7s     49 tps      $0.95/M  $4.00/M  $0.16/M     Fireworks, Moonshot AI, Novita       04/20/2026
262K     1.0s     54 tps      $0.50/M  $2.80/M  $0.10/M     Bedrock, Fireworks, Moonshot AI, +2  01/26/2026
262K     3.3s     97 tps      $1.15/M  $8.00/M  $0.15/M     Moonshot AI                          11/06/2025
262K     0.6s     23 tps      $0.60/M  $2.50/M  $0.15/M     DeepInfra, Moonshot AI               11/06/2025
131K     1.4s     21 tps      $0.57/M  $2.30/M  -           Novita                               09/05/2025
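
The prices listed above are quoted per million tokens, so a request's cost follows from simple arithmetic. The sketch below illustrates that calculation. `Pricing` and `estimateCostUsd` are hypothetical names introduced here, the token counts are made up for illustration, and cache pricing is ignored.

```typescript
// Hypothetical helper: prices are USD per one million tokens, as listed above.
interface Pricing {
  inputPerM: number
  outputPerM: number
}

function estimateCostUsd(
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number
): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputPerM
  const outputCost = (outputTokens / 1_000_000) * pricing.outputPerM
  return inputCost + outputCost
}

// Example with the $0.60/M input and $2.50/M output listing:
// 200,000 input tokens and 50,000 output tokens (illustrative counts).
const exampleCost = estimateCostUsd(
  { inputPerM: 0.6, outputPerM: 2.5 },
  200_000,
  50_000
)
console.log(exampleCost.toFixed(3)) // 0.12 + 0.125, about $0.245
```

Cached reads are billed at a separate, lower rate in the listings that show one, so a workload with heavy prompt reuse would likely cost less than this estimate suggests.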