Skip to content

GLM 4.7 Flash

GLM 4.7 Flash is the speed-optimized variant in Z.ai's GLM-4.7 generation, released N/A. It delivers faster inference for high-throughput workloads while retaining the coding, tool usage, and conversational improvements introduced in GLM-4.7.

ReasoningTool Use
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'zai/glm-4.7-flash',
prompt: 'Why is the sky blue?'
})

About GLM 4.7 Flash

GLM 4.7 Flash was released N/A as the middle tier in Z.ai's GLM-4.7 generation, sitting between the full GLM-4.7 and the ultra-fast GLM-4.7-FlashX. It inherits the 4.7 generation's gains in coding assistance, tool usage, multi-step reasoning, and natural conversational tone while trading peak capability for faster inference.

The GLM-4.7 generation focused on closing coding and tool-use gaps with competing models. GLM 4.7 Flash carries those gains forward at a cost-and-latency profile that fits high-volume coding assistance, real-time chat, and production pipelines with strict response time budgets. If the full GLM-4.7 is too slow and GLM-4.7-FlashX strips too much capability, GLM 4.7 Flash is the compromise.

Through AI Gateway, switching between GLM-4.7 tiers requires only changing the model identifier. The API surface and request format stay the same.