
DeepSeek V4 Flash

DeepSeek V4 Flash is the efficiency-tier model in DeepSeek's V4 series, released on April 23, 2026. It pairs a hybrid attention architecture with a 1M-token context window and supports reasoning, tool use, and implicit caching.

Reasoning · Tool Use · Implicit Caching
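index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?',
})

// Print the response text as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}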

More models by DeepSeek

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | Providers | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1M | 0.6s | 60 tps | $0.43/M | $0.87/M | $0.0/M | | | | deepinfra, deepseek, fireworks, +1 | | | 04/23/2026 |
| | 164K | 0.3s | 67 tps | $0.28/M | $0.42/M | $0.03/M | | | | bedrock, deepinfra, deepseek, +2 | | | 12/01/2025 |
| | 164K | 0.3s | 83 tps | $0.28/M | $0.42/M | $0.03/M | | | | bedrock, deepinfra, deepseek, +2 | | | 12/01/2025 |
| | 131K | 2.3s | 28 tps | $0.27/M | $1.00/M | $0.14/M | | | | novita | | | 09/22/2025 |
| | 164K | 0.2s | 162 tps | $0.50/M | $1.50/M | $0.13/M | | | | baseten, deepinfra, fireworks, +3 | | | 08/21/2025 |
| | 164K | 1.0s | 120 tps | $0.77/M | $0.77/M | $0.14/M | | | | baseten, novita | | | 12/26/2024 |
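The per-million-token prices in the listing above can be turned into a rough per-request cost. The sketch below is illustrative only: `Pricing`, `estimateCostUSD`, and `examplePricing` are hypothetical names, the figures are copied from the rows dated 12/01/2025 ($0.28/M input, $0.42/M output, $0.03/M cache read), and it assumes that cached input tokens are billed at the cache-read rate instead of the input rate, which the listing does not state explicitly.

```typescript
// Hypothetical helper, not part of any SDK: estimates the cost of one request
// from per-million-token prices.
interface Pricing {
  inputPerM: number;     // USD per 1M uncached input tokens
  outputPerM: number;    // USD per 1M output tokens
  cacheReadPerM: number; // USD per 1M cached input tokens
}

// Prices copied from the rows dated 12/01/2025 in the listing.
const examplePricing: Pricing = {
  inputPerM: 0.28,
  outputPerM: 0.42,
  cacheReadPerM: 0.03,
};

// Assumption: cached input tokens are charged at the cache-read rate
// instead of the regular input rate.
function estimateCostUSD(
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number,
  cachedInputTokens: number = 0,
): number {
  if (cachedInputTokens < 0 || cachedInputTokens > inputTokens) {
    throw new RangeError('cachedInputTokens must be between 0 and inputTokens');
  }
  const uncachedTokens = inputTokens - cachedInputTokens;
  const totalPerMillionUnits =
    uncachedTokens * pricing.inputPerM +
    cachedInputTokens * pricing.cacheReadPerM +
    outputTokens * pricing.outputPerM;
  return totalPerMillionUnits / 1_000_000;
}

// 1M input tokens (half served from cache) plus 200K output tokens.
const cost = estimateCostUSD(examplePricing, 1_000_000, 200_000, 500_000);
console.log(cost.toFixed(4)); // prints "0.2390"
```

Under these assumptions, the same request with no cache hits would cost about $0.364, so the cache-read rate matters most for workloads that resend a long shared prefix.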