DeepSeek V4 Flash
DeepSeek V4 Flash is the efficiency-tier model in DeepSeek's V4 series, dated April 23, 2026. It pairs a hybrid attention architecture with a 1M-token context window and supports reasoning, tool use, and implicit caching.
import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?',
})

[Chart: P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds]
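To make the TTFT metric concrete, here is a minimal client-side sketch of how time to first token and its P50 could be measured for a streamed response. The helpers `measureTtft` and `p50` are hypothetical names introduced for illustration; they are not part of the AI SDK or AI Gateway. How the gateway computes its own figure is not stated here, so the nearest-rank median below is an assumption.

```typescript
// Hypothetical helpers for illustration; not part of the AI SDK or AI Gateway.

// Consume a stream of text chunks and record how long the first chunk took.
// `now` is injectable so the timing can be controlled in tests.
async function measureTtft(
  stream: AsyncIterable<string>,
  now: () => number = () => Date.now(),
): Promise<{ ttftMs: number | null; text: string }> {
  const start = now()
  let ttftMs: number | null = null
  let text = ''
  for await (const chunk of stream) {
    // Only the first chunk sets the TTFT; later chunks just append text.
    if (ttftMs === null) ttftMs = now() - start
    text += chunk
  }
  return { ttftMs, text }
}

// P50 (median) of a set of samples using the nearest-rank method
// (an assumption; other percentile definitions interpolate instead).
function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error('p50 requires at least one sample')
  const sorted = [...samples].sort((a, b) => a - b)
  return sorted[Math.ceil(sorted.length / 2) - 1]
}
```

With the snippet above, the stream could be passed in as `await measureTtft(result.textStream)`, since `textStream` is an async iterable of strings. A measurement taken this way includes client network latency, so it will not necessarily match a server-side figure.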