MiniMax M2.5

MiniMax M2.5 is a third-generation agentic model from MiniMax that handles full-stack development across Web, Android, iOS, Windows, and Mac platforms. It supports a context window of 1M tokens, a max output of 196K tokens, and completes tasks about 37% faster than M2.1.

ReasoningTool UseImplicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'minimax/minimax-m2.5',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

More models by MiniMax

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

205K

0.3s

145tps

$0.30/M

$1.20/M

Read:$0.06/M

Write:$0.38/M

—

03/18/2026

205K

1.1s

56tps

$0.60/M

$2.40/M

Read:$0.06/M

Write:$0.38/M

—

03/18/2026

205K

0.9s

54tps

$0.60/M

$2.40/M

Read:$0.03/M

Write:$0.38/M

—

02/12/2026

205K

1.4s

275tps

$0.30/M

$1.20/M

Read:$0.03/M

Write:$0.38/M

—

10/27/2025

205K

0.7s

74tps

$0.30/M

$1.20/M

Read:$0.03/M

Write:$0.38/M

—

10/27/2025

205K

0.8s

52tps

$0.30/M

$2.40/M

Read:$0.03/M

Write:$0.38/M

—

10/27/2025

AI Cloud

Core Platform

Security

Company

Learn

Open Source

Use Cases

Tools

Users

MiniMax M2.5

More models by MiniMax