Gemini 2.5 Flash

Gemini 2.5 Flash is Google's first fully hybrid reasoning model, letting developers toggle thinking on or off and set thinking budgets to tune the balance between quality, cost, and latency, all on top of the fast, multimodal foundation of 2.0 Flash.

File InputReasoningTool UseVision (Image)Web SearchImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-2.5-flash',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

More models by Google

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

2.9s

331tps

$1.50/M

$9.00/M

Read:$0.15/M

Write:—

$14.00/K

+ input costs

—

05/19/2026

0.7s

254tps

$0.25/M

$1.50/M

Read:$0.03/M

Write:—

$14.00/K

+ input costs

—

03/03/2026

4.1s

173tps

$2.00/M

$12.00/M

Read:

$0.2/M

Write:

—

$14.00/K

+ input costs

—

02/19/2026

0.8s

182tps

$0.50/M

$3.00/M

Read:

$0.05/M

Write:

—

$14.00/K

+ input costs

—

12/17/2025

0.6s

241tps

$0.10/M

$0.40/M

Read:$0.01/M

Write:—

$35.00/K

+ input costs

—

06/17/2025

1.9s

107tps

$1.25/M

$10.00/M

Read:

$0.13/M

Write:

—

$35.00/K

+ input costs

—

03/20/2025

AI Cloud

Core Platform

Security

Company

Learn

Open Source

Use Cases

Tools

Users

Gemini 2.5 Flash

More models by Google