Skip to content

GLM 4.6

GLM 4.6 is Z.ai's coding-focused model released September 30, 2025, with enhanced performance on both benchmarks and real-world programming tasks. It features an expanded context window of 204.8K tokens for handling large codebases and complex agent workflows.

ReasoningTool UseImplicit Caching
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'zai/glm-4.6',
prompt: 'Why is the sky blue?'
})

More models by Z.ai

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
205K
1.0s
72tps
$1.40/M$4.40/M
Read:$0.26/M
Write:
deepinfra logo
fireworks logo
novita logo
+1
04/07/2026
200K
0.9s
125tps
$1.20/M$4.00/M
Read:$0.24/M
Write:
zai logo
04/01/2026
203K
0.9s
82tps
$1.20/M$4.00/M
Read:$0.24/M
Write:
zai logo
03/15/2026
203K
0.2s
64tps
$0.80/M$2.56/M
Read:$0.16/M
Write:
bedrock logo
deepinfra logo
fireworks logo
+3
02/12/2026
205K
0.1s
559tps
$2.25/M$2.75/M
Read:$2.25/M
Write:
bedrock logo
cerebras logo
deepinfra logo
+2
12/22/2025
200K
0.1s
128tps
$0.07/M$0.40/M
Read:$0.01/M
Write:
bedrock logo
zai logo