GPT 5.4 Mini

GPT 5.4 Mini is the cost-efficient member of the GPT-5.4 family, delivering strong performance in code generation, tool orchestration, and multi-step browser interactions at a price point designed for agentic production workloads.

Reasoning, Tool Use, Vision (Image), File Input, Implicit Caching, Web Search

index.ts

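import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-5.4-mini',
  prompt: 'Why is the sky blue?'
})

// Print the response text as it streams in.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}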

Playground

Try out GPT 5.4 Mini by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
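
As a rough sketch of what setting a provider preference might look like in code, the helper below builds request options that carry an ordered list of provider slugs. The `providerOptions.gateway.order` shape is an assumption modeled on the AI SDK's provider options, `withProviderOrder` is a hypothetical helper, and the slugs `'openai'` and `'azure'` are placeholders; copy the real slugs from this page and confirm the option name in the docs.

```typescript
// Hypothetical helper: builds request options carrying a provider preference.
// The `providerOptions.gateway.order` shape is an assumption; verify it in the docs.
type GatewayRequest = {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

function withProviderOrder(prompt: string, order: string[]): GatewayRequest {
  if (order.length === 0) {
    throw new Error('Provide at least one provider slug')
  }
  return {
    model: 'openai/gpt-5.4-mini',
    prompt,
    // Copy the list so later changes to the caller's array do not leak in.
    providerOptions: { gateway: { order: [...order] } },
  }
}

// Placeholder slugs: replace with the slugs copied from the Providers table.
const request = withProviderOrder('Why is the sky blue?', ['openai', 'azure'])
```

The resulting object is meant to be passed to a call such as `streamText(request)`.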

Provider	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Per Query	Release Date	Legal
	400K	1.3s	165 tps	$0.75/M	$4.50/M	$0.07/M	—	$10.00/K + input costs	—	03/17/2026	Terms • Privacy
	400K	3.5s	120 tps	$0.75/M	$4.50/M	$0.07/M	—	$14/K + input costs	—	03/17/2026	Terms • Privacy
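
To make the listed prices concrete, here is a minimal cost estimator using the per-million-token rates shown for GPT 5.4 Mini ($0.75 input, $4.50 output, $0.07 cache read). It assumes cached input tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate; the `Usage` type and `estimateCostUsd` are hypothetical names, and web search charges are not included.

```typescript
// USD per million tokens, taken from the provider pricing listed on this page.
const PRICES_PER_MILLION = { input: 0.75, output: 4.5, cacheRead: 0.07 }

// Hypothetical usage shape for illustration.
type Usage = {
  inputTokens: number        // total input tokens, cached ones included
  cachedInputTokens: number  // assumed to be a subset of inputTokens
  outputTokens: number
}

function estimateCostUsd(usage: Usage): number {
  if (usage.cachedInputTokens > usage.inputTokens) {
    throw new Error('cachedInputTokens cannot exceed inputTokens')
  }
  const uncachedInput = usage.inputTokens - usage.cachedInputTokens
  const totalPerMillion =
    uncachedInput * PRICES_PER_MILLION.input +
    usage.cachedInputTokens * PRICES_PER_MILLION.cacheRead +
    usage.outputTokens * PRICES_PER_MILLION.output
  return totalPerMillion / 1_000_000
}

// Example: 200K input tokens (150K of them cached) and 4K output tokens.
const exampleCost = estimateCostUsd({
  inputTokens: 200_000,
  cachedInputTokens: 150_000,
  outputTokens: 4_000,
})
```

Under these assumptions the example request comes to roughly $0.066.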

More models by OpenAI

Model	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Per Query	Release Date
		1.0s	67 tps	$5.00/M	$30.00/M	$0.5/M	—	$10.00/K + input costs	—	04/24/2026
	400K	0.4s	15 tps	$0.20/M	$1.25/M	$0.02/M	—	$10.00/K + input costs	—	03/17/2026
	1.1M	1.2s	61 tps	$2.50/M	$15.00/M	$0.25/M	—	$10.00/K + input costs	—	03/05/2026
	128K	0.7s	103 tps	$1.25/M	$10.00/M	$0.13/M	—	$10.00/K + input costs	—	11/12/2025
	400K	5.4s	329 tps	$0.25/M	$2.00/M	$0.03/M	—	$14/K + input costs	—	08/07/2025
	131K	0.1s	522 tps	$0.35/M	$0.75/M	$0.25/M	—		—	08/05/2025

About GPT 5.4 Mini

GPT 5.4 Mini became available on AI Gateway on March 17, 2026 as the cost-efficient variant of the GPT-5.4 model family. It handles code generation, tool orchestration, and multi-step browser interactions more reliably than previous mini-tier models, making it a strong default for agentic tasks.

The model supports verbosity and reasoning level parameters, giving you control over response detail and how much the model reasons before answering. This is useful for tuning the cost-quality tradeoff per request. It's built for sub-agent workflows where multiple smaller models coordinate on parts of a larger task.
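
One way to picture per-request tuning is a small helper that attaches both settings to a request. This is a sketch under assumptions: the option names `reasoningEffort` and `textVerbosity` follow the AI SDK's OpenAI provider options, and whether AI Gateway forwards them under the `openai` key should be confirmed in the documentation. `tunedRequest` and the `Level` type are hypothetical.

```typescript
// Hypothetical helper: attaches reasoning and verbosity settings to a request.
// Option names are assumed from the AI SDK's OpenAI provider; verify before use.
type Level = 'low' | 'medium' | 'high'

type TunedRequest = {
  model: string
  prompt: string
  providerOptions: { openai: { reasoningEffort: Level; textVerbosity: Level } }
}

function tunedRequest(
  prompt: string,
  reasoningEffort: Level,
  textVerbosity: Level,
): TunedRequest {
  return {
    model: 'openai/gpt-5.4-mini',
    prompt,
    providerOptions: { openai: { reasoningEffort, textVerbosity } },
  }
}

// Cheap, terse call for a simple classification step.
const quick = tunedRequest('Label this ticket as bug or feature.', 'low', 'low')

// More deliberate, detailed call for a planning step.
const careful = tunedRequest('Plan the database migration.', 'high', 'medium')
```

Lower settings should reduce output and reasoning tokens, and therefore cost, for steps that do not need depth; higher settings trade cost for quality on harder steps.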

With a context window of 400K tokens and the full API feature set, GPT 5.4 Mini gives production applications the capabilities they need at $0.75 per million input tokens and $4.50 per million output tokens. If you're migrating from GPT-5 mini, it offers measurable improvements in agentic task completion.

What To Consider When Choosing a Provider

Configuration: GPT 5.4 Mini is a strong default for agentic tasks that need to balance capability and cost. It handles code generation, tool orchestration, and multi-step browser interactions more reliably than previous mini-tier models.
Configuration: The model supports verbosity and reasoning level parameters, giving you control over response detail and how much the model reasons before answering.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
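
The authentication point above can be sketched as a helper that picks whichever credential is present and builds a bearer header. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the preference for the API key over the OIDC token, and the `gatewayAuthHeaders` function are all assumptions for illustration, not confirmed by this page.

```typescript
// Environment passed in explicitly so the helper is easy to test.
type Env = Record<string, string | undefined>

// Hypothetical helper: the variable names and precedence are assumptions.
function gatewayAuthHeaders(env: Env): { Authorization: string } {
  // Prefer an explicit API key; fall back to an OIDC token if one is present.
  const token = env.AI_GATEWAY_API_KEY || env.VERCEL_OIDC_TOKEN
  if (!token) {
    throw new Error('Set an AI Gateway API key or provide an OIDC token')
  }
  return { Authorization: `Bearer ${token}` }
}

// No OpenAI credential appears anywhere in the application code.
const headers = gatewayAuthHeaders({ AI_GATEWAY_API_KEY: 'example-key' })
```

In a real application you would pass `process.env` instead of the literal object used here.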

When to Use GPT 5.4 Mini

Best For

Agentic production workloads: Multi-step tasks involving tools, code, and browser interactions at sustainable cost
Code generation: Reliable code output for development tools and agent pipelines
Sub-agent coordination: Smaller model that coordinates on parts of a larger task alongside other agents
Tool orchestration: Calling and composing external APIs and functions in multi-step sequences
Cost-efficient chat: Capable conversational interface at a lower price than full GPT-5.4
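
A minimal sketch of the sub-agent pattern listed above: an orchestrator splits a task into subtasks, runs each one in parallel on the smaller model, and collects the results in order. The model call is injected as a function so the sketch stays independent of any SDK; `CallModel`, `runSubAgents`, and the stub are hypothetical names.

```typescript
// Any function that sends a prompt to a model and resolves with its text.
type CallModel = (model: string, prompt: string) => Promise<string>

// Hypothetical orchestrator: runs every subtask in parallel on the mini model.
async function runSubAgents(
  subtasks: string[],
  callModel: CallModel,
): Promise<string[]> {
  // Promise.all preserves input order, so results[i] belongs to subtasks[i].
  return Promise.all(
    subtasks.map((task) => callModel('openai/gpt-5.4-mini', task)),
  )
}

// Stub standing in for a real model call, for illustration only.
const stubCall: CallModel = async (model, prompt) => `[${model}] ${prompt}`

const pending = runSubAgents(
  ['Summarize the changelog', 'List breaking changes'],
  stubCall,
)
```

In practice `callModel` would wrap a real request, and the orchestrator would pass the collected results to a final step that merges them.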

Consider Alternatives When

Maximum capability: GPT-5.4 or GPT-5.4 pro when the task demands full GPT-5.4 quality
Lowest cost: GPT-5.4 nano for high-volume sub-agent workflows where cost scales with parallel calls
Specialized coding: GPT-5.3 codex for autonomous software engineering in sandboxed environments
Pure reasoning: o3 for chain-of-thought mathematical and scientific reasoning

Conclusion

GPT 5.4 Mini is the default production model in the GPT-5.4 family, balancing agentic capability and cost. For applications on AI Gateway that need reliable tool use and code generation at scale, it's the natural choice.

Frequently Asked Questions

How does GPT 5.4 Mini compare to GPT-5 mini?
It handles code generation, tool orchestration, and multi-step browser interactions more reliably. It also supports verbosity and reasoning level parameters for tunable output.
What context window does GPT 5.4 Mini support?
400K tokens, supporting extended inputs for agentic workflows.
What are the verbosity and reasoning level parameters?
They give you control over response detail and how much the model reasons before answering, letting you tune the cost-quality tradeoff per request.
Is GPT 5.4 Mini suitable for sub-agent workflows?
Yes. It's built for sub-agent architectures where multiple smaller models coordinate on parts of a larger task.
When should I use GPT-5.4 Nano instead?
When cost is the dominant concern and you're running high-volume parallel calls. GPT-5.4 Nano performs close to mini in evaluations at a lower price point.
How does AI Gateway handle authentication for GPT 5.4 Mini?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
What are typical latency characteristics?
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.
