Gemini 2.5 Pro
Gemini 2.5 Pro is a Pro-tier thinking model from Google, built for complex reasoning, coding, math, and science tasks. It posts strong results on human preference benchmarks and has a context window of 1M tokens.
import { streamText } from 'ai'
const result = streamText({ model: 'google/gemini-2.5-pro', prompt: 'Why is the sky blue?' })
Playground
Try out Gemini 2.5 Pro by Google. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
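A provider preference can be expressed as an ordered list of slugs. The sketch below is only an illustration: `preferProviders` is a hypothetical helper, and both the `gateway.order` option shape and the slugs `vertex` and `google` are assumptions to confirm against the AI Gateway docs before use.

```typescript
// Hypothetical helper: builds an options object carrying a provider preference.
// The `gateway.order` shape and the example slugs are assumptions, not confirmed API.
type GatewayProviderOptions = { gateway: { order: string[] } }

function preferProviders(...slugs: string[]): GatewayProviderOptions {
  // Trim whitespace, then drop blanks and duplicates while keeping priority order.
  const order = slugs
    .map((s) => s.trim())
    .filter((s, i, all) => s.length > 0 && all.indexOf(s) === i)
  return { gateway: { order } }
}

const providerOptions = preferProviders('vertex', 'google', 'vertex')
console.log(JSON.stringify(providerOptions))
```

If the gateway accepts this shape, the resulting object would be passed as `providerOptions` alongside `model` and `prompt` in the `streamText` call.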
P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.
P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
About Gemini 2.5 Pro
Google introduced Gemini 2.5 Pro on March 25, 2025 as the flagship of the Gemini 2.5 thinking model generation. Reasoning is its headline capability: Gemini 2.5 models reason through their thoughts before responding. Google attributes the model's performance to a significantly enhanced base model combined with improved post-training. On reasoning benchmarks, 2.5 Pro posts strong results on math and science (including GPQA and AIME 2025) without majority voting or other cost-increasing test-time techniques. On Humanity's Last Exam, a dataset designed by hundreds of subject matter experts to represent the human frontier of knowledge and reasoning, 2.5 Pro scores 18.8% without tool use.
Coding performance received particular attention. Gemini 2.5 Pro represents a significant leap over the 2.0 generation in creating web apps and agentic code applications, along with code transformation and editing. On SWE-Bench Verified, the industry-standard benchmark for agentic code evaluation, it scores 63.8% with a custom agent setup. It can generate a playable video game from a single-line prompt.
Gemini 2.5 Pro ships with a context window of 1.0M tokens, the largest among Gemini 2.5 models, and supports text, audio, images, video, and entire code repositories as input. Tool use including Google Search and code execution is available.
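As a rough illustration of mixed text and image input, the sketch below builds a single user message from a question and a list of image URLs. The content-part field names (`type`, `text`, `image`) and the `buildImageQuestion` helper are assumptions modeled on common chat message formats; check the SDK documentation for the exact shape it expects.

```typescript
// Content-part shapes are assumptions for illustration, not a confirmed API.
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; image: string } // assumed to hold a URL
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> }

// Hypothetical helper: one text part followed by one image part per URL.
function buildImageQuestion(question: string, imageUrls: string[]): UserMessage {
  const images = imageUrls.map<ImagePart>((url) => ({ type: 'image', image: url }))
  return { role: 'user', content: [{ type: 'text', text: question }, ...images] }
}

const message = buildImageQuestion('What does this diagram show?', [
  'https://example.com/diagram.png', // placeholder URL
])
console.log(message.content.length)
```

Keeping the text part first and the media parts after it is a design choice of this sketch, not a requirement stated by the source.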
What To Consider When Choosing a Provider
- Configuration: Given the context window of 1.0M tokens, applications passing very large inputs should confirm provider-side limits and latency expectations for long-context requests before deploying at scale.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
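For the configuration point above, a pre-flight size check can flag requests that are unlikely to fit before they are sent. This is a minimal sketch under stated assumptions: the 4-characters-per-token ratio is a rough heuristic rather than the model's tokenizer, the helper names are hypothetical, and treating reserved output tokens as part of the same budget is an assumption that may not match a given provider's limits.

```typescript
const CONTEXT_WINDOW_TOKENS = 1000000 // context window stated for Gemini 2.5 Pro
const CHARS_PER_TOKEN = 4 // rough heuristic (assumption); real tokenization varies

// Hypothetical helper: approximate token count from character length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// Hypothetical helper: true if the estimated input plus reserved output fits the window.
function fitsContext(inputs: string[], reservedForOutput: number): boolean {
  const inputTokens = inputs.reduce((sum, t) => sum + estimateTokens(t), 0)
  return inputTokens + reservedForOutput <= CONTEXT_WINDOW_TOKENS
}

const documents = ['first document text', 'second document text']
console.log(fitsContext(documents, 8192))
```

An estimate like this only screens out obviously oversized inputs; provider-side limits and a real token count remain the authority.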
When to Use Gemini 2.5 Pro
Best For
- Advanced coding and software engineering: Building visually compelling web applications, writing agentic code, performing large-scale code transformation and editing across entire repositories
- Complex mathematical and scientific reasoning: Multi-step problems in mathematics, physics, chemistry, or logic that require sustained chain-of-thought reasoning without test-time augmentation
- Research and long-document analysis: Processing entire codebases, academic papers, legal corpora, or research datasets within a context of 1.0M tokens to extract insights, connections, and answers
- Hard benchmark-level tasks: Questions from expert-curated datasets, graduate-level reasoning problems, or tasks at the outer edge of what general-purpose models typically handle
- Agentic applications requiring deep planning: Multi-step workflows where the model must reason across tools, plan sub-tasks, and produce executable or high-accuracy outputs
Consider Alternatives When
- High-volume routine tasks: Translation, classification, and summarization where the reasoning depth of 2.5 Pro adds cost without improving output quality
- Latency-sensitive workloads: Response speed is paramount and 2.5 Flash with thinking enabled already meets your accuracy requirements
- Smaller context windows suffice: Your application does not benefit from the context window of 1.0M tokens, making the pricing premium for Pro's larger capacity unnecessary
- Embedding or retrieval workloads: A dedicated embedding model is architecturally appropriate for these use cases
Conclusion
Gemini 2.5 Pro is purpose-built for the hardest problems: code that requires deep understanding of large repositories, mathematical reasoning at competition level, and research tasks that demand both breadth of knowledge and sustained logical precision. Teams tackling the most demanding inference workloads will find in 2.5 Pro a model whose thinking architecture and context window of 1.0M tokens were designed specifically for that class of challenge.
Frequently Asked Questions
What is Gemini 2.5 Pro's score on LMArena?
Gemini 2.5 Pro ranks highly on LMArena, which measures human preferences across a broad range of tasks. Check the LMArena leaderboard for the latest score, as rankings shift over time.
What coding benchmarks does 2.5 Pro perform strongly on?
Gemini 2.5 Pro scores 63.8% on SWE-Bench Verified with a custom agent setup. SWE-Bench Verified is the industry-standard benchmark for agentic code evaluation. The model also excels at creating web apps, agentic code applications, and code transformation.
How does 2.5 Pro's thinking capability differ from 2.5 Flash's?
Both models reason through problems before responding. Gemini 2.5 Pro is the Pro tier in the Gemini 2.5 family and posts strong results on coding, math, and science benchmarks. Gemini 2.5 Flash provides configurable thinking budgets and sits at the Pareto frontier of cost and performance.
What is Humanity's Last Exam and how does Gemini 2.5 Pro perform on it?
Humanity's Last Exam is a benchmark dataset created by hundreds of subject matter experts to capture the human frontier of knowledge and reasoning. Gemini 2.5 Pro scores 18.8% on this benchmark without tool use.
What is the context window size?
Gemini 2.5 Pro has a context window of 1.0M tokens, the largest among Gemini 2.5 models, enabling it to process entire code repositories, lengthy research datasets, or extensive multi-document inputs in a single pass.
What tool use capabilities does 2.5 Pro have?
Google Search and code execution are available as built-in tools. The model can fetch real-time information, run code, and verify results within a single inference session.
Does 2.5 Pro support multimodal input?
Yes. The model accepts text, audio, images, video, and entire code repositories as input, maintaining the native multimodality that defines the Gemini model family.
Is Gemini 2.5 Pro generally available?
Yes. It launched as an experimental model on March 25, 2025. Google later promoted it to stable general availability as part of the Gemini 2.5 family expansion.