Gemini 2.5 Flash

Gemini 2.5 Flash is Google's first fully hybrid reasoning model, letting developers toggle thinking on or off and set thinking budgets to tune the balance between quality, cost, and latency, all on top of the fast, multimodal foundation of 2.0 Flash.

File Input, Reasoning, Tool Use, Vision (Image), Web Search, Implicit Caching
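index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-2.5-flash',
  prompt: 'Why is the sky blue?'
})

// Print the response text as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}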
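The description above mentions toggling thinking and setting a thinking budget. A minimal sketch of how such a setting might be built is shown below. The helper name `thinkingOptions` and its type are hypothetical, and the `google.thinkingConfig` shape (`thinkingBudget`, `includeThoughts`) is an assumption based on the AI SDK Google provider options; check the provider documentation before relying on it. This sketch only accepts non-negative budgets and treats 0 as "thinking off".

```typescript
// Hypothetical helper: builds provider options for a given thinking budget.
// Assumption: the Google provider reads `thinkingConfig.thinkingBudget`
// (in tokens) and that a budget of 0 turns thinking off.
type ThinkingOptions = {
  google: {
    thinkingConfig: { thinkingBudget: number; includeThoughts: boolean };
  };
};

function thinkingOptions(budgetTokens: number): ThinkingOptions {
  if (!Number.isInteger(budgetTokens) || budgetTokens < 0) {
    throw new RangeError('budgetTokens must be a non-negative integer');
  }
  return {
    google: {
      thinkingConfig: {
        thinkingBudget: budgetTokens,
        // Only ask for thought summaries when thinking is enabled.
        includeThoughts: budgetTokens > 0,
      },
    },
  };
}
```

Under these assumptions, the result would be passed to the call in index.ts as `providerOptions: thinkingOptions(1024)`, or `thinkingOptions(0)` to favor latency and cost over reasoning depth.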

More models by Google

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | Providers | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1M | 2.9 s | 331 tps | $1.50/M | $9.00/M | $0.15/M | | $14.00/K + input costs | | Google, Vertex | | | 05/19/2026 |
| | 1M | 0.7 s | 254 tps | $0.25/M | $1.50/M | $0.03/M | | $14.00/K + input costs | | Google, Vertex | | | 03/03/2026 |
| | 1M | 4.1 s | 173 tps | $2.00/M | $12.00/M | $0.2/M | | $14.00/K + input costs | | Google, Vertex | | | 02/19/2026 |
| | 1M | 0.8 s | 182 tps | $0.50/M | $3.00/M | $0.05/M | | $14.00/K + input costs | | Google, Vertex | | | 12/17/2025 |
| | 1M | 0.6 s | 241 tps | $0.10/M | $0.40/M | $0.01/M | | $35.00/K + input costs | | Google, Vertex | | | 06/17/2025 |
| | 1M | 1.9 s | 107 tps | $1.25/M | $10.00/M | $0.13/M | | $35.00/K + input costs | | Google, Vertex | | | 03/20/2025 |
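To make the pricing columns concrete, here is a rough cost estimate for a single request. It is a sketch under stated assumptions: `/M` is read as "per million tokens" and `/K` as "per thousand queries", cached input tokens are assumed to be billed at the cache read rate instead of the input rate, and the "+ input costs" part of web search is assumed to be covered by counting the search result tokens as ordinary input. The `Rates` and `Usage` types and the `estimateCostUsd` function are hypothetical names, not part of any SDK.

```typescript
// Hypothetical types: rates as listed in the table, usage for one request.
interface Rates {
  inputPerM: number;     // USD per million input tokens
  outputPerM: number;    // USD per million output tokens
  cacheReadPerM: number; // USD per million cached input tokens
  webSearchPerK: number; // USD per thousand web search queries
}

interface Usage {
  inputTokens: number;       // uncached input tokens
  cachedInputTokens: number; // input tokens served from cache
  outputTokens: number;
  webSearchQueries: number;
}

// Sums each usage component multiplied by its per-unit rate.
function estimateCostUsd(usage: Usage, rates: Rates): number {
  return (
    (usage.inputTokens / 1_000_000) * rates.inputPerM +
    (usage.cachedInputTokens / 1_000_000) * rates.cacheReadPerM +
    (usage.outputTokens / 1_000_000) * rates.outputPerM +
    (usage.webSearchQueries / 1_000) * rates.webSearchPerK
  );
}

// Rates taken from the table row priced at $0.10/M input and $0.40/M output.
const exampleRates: Rates = {
  inputPerM: 0.10,
  outputPerM: 0.40,
  cacheReadPerM: 0.01,
  webSearchPerK: 35.0,
};
```

With these rates, a request using 200,000 uncached input tokens, 50,000 cached tokens, 10,000 output tokens, and 2 web searches comes to roughly $0.0945, most of which is the two search queries.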