GLM 4.7 Flash

GLM 4.7 Flash is the speed-optimized variant in Z.ai's GLM-4.7 generation, released N/A. It delivers faster inference for high-throughput workloads while retaining the coding, tool usage, and conversational improvements introduced in GLM-4.7.

ReasoningTool Use

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.7-flash',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

About GLM 4.7 Flash

GLM 4.7 Flash was released N/A as the middle tier in Z.ai's GLM-4.7 generation, sitting between the full GLM-4.7 and the ultra-fast GLM-4.7-FlashX. It inherits the 4.7 generation's gains in coding assistance, tool usage, multi-step reasoning, and natural conversational tone while trading peak capability for faster inference.

The GLM-4.7 generation focused on closing coding and tool-use gaps with competing models. GLM 4.7 Flash carries those gains forward at a cost-and-latency profile that fits high-volume coding assistance, real-time chat, and production pipelines with strict response time budgets. If the full GLM-4.7 is too slow and GLM-4.7-FlashX strips too much capability, GLM 4.7 Flash is the compromise.

Through AI Gateway, switching between GLM-4.7 tiers requires only changing the model identifier. The API surface and request format stay the same.

AI Cloud

Core Platform

Security

Company

Learn

Open Source

Use Cases

Tools

Users

GLM 4.7 Flash

About GLM 4.7 Flash