Gemini 3.1 Flash Lite

Gemini 3.1 Flash Lite is the GA release of the efficiency tier in the Gemini 3.1 generation, with improvements in reasoning, multimodal understanding, agentic tool use, and long-context performance over 2.5 Flash Lite, plus four configurable thinking levels and a context window of 1M tokens.

ReasoningTool UseImplicit CachingFile InputVision (Image)Web Search

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemini-3.1-flash-lite',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

More models by Google

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

2.4s

290tps

$1.50/M

$9.00/M

Read:$0.15/M

Write:—

$14.00/K

+ input costs

—

05/19/2026

0.8s

237tps

$0.25/M

$1.50/M

Read:$0.03/M

Write:—

$14.00/K

+ input costs

—

03/03/2026

4.1s

173tps

$2.00/M

$12.00/M

Read:

$0.2/M

Write:

—

$14.00/K

+ input costs

—

02/19/2026

0.6s

179tps

$0.50/M

$3.00/M

Read:

$0.05/M

Write:

—

$14.00/K

+ input costs

—

12/17/2025

0.7s

218tps

$0.10/M

$0.40/M

Read:$0.01/M

Write:—

$35.00/K

+ input costs

—

06/17/2025

0.4s

190tps

$0.30/M

$2.50/M

Read:$0.03/M

Write:—

$35.00/K

+ input costs

—

03/20/2025

AI Cloud

Core Platform

Security

Company

Learn

Open Source

Use Cases

Tools

Users

Gemini 3.1 Flash Lite

More models by Google