
Kimi K2.6

Kimi K2.6 is Moonshot AI's natively multimodal flagship, focused on long-horizon coding and design with code. It has a context window of 262.1K tokens and is available through AI Gateway via moonshotai, fireworks, and novita.

Reasoning, Tool Use, Vision (Image), File Input, Implicit Caching
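index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'moonshotai/kimi-k2.6',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}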

Playground

Try out Kimi K2.6 by Moonshot AI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

| Provider | Legal | Context | Latency | Throughput | Input | Output | Cache Read | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Moonshot AI | Terms, Privacy | 262K | 3.4s | 24tps | $0.95/M | $4.00/M | $0.16/M | 04/20/2026 |
| Fireworks | Terms, Privacy | 262K | 5.8s | 38tps | $0.95/M | $4.00/M | $0.16/M | 04/20/2026 |
| Novita AI | Terms, Privacy | 262K | 5.0s | 31tps | $0.95/M | $4.00/M | $0.16/M | 04/20/2026 |

Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Moonshot AI

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Providers | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 262K | 1.3s | 73tps | $0.50/M | $2.80/M | $0.10/M | bedrock, fireworks, moonshotai, +2 | 01/26/2026 |
| | 262K | 2.9s | 93tps | $1.15/M | $8.00/M | $0.15/M | moonshotai | 11/06/2025 |
| | 262K | 0.5s | 22tps | $0.60/M | $2.50/M | $0.15/M | deepinfra, moonshotai | 11/06/2025 |
| | 131K | 1.4s | 23tps | $0.57/M | $2.30/M | | novita | 09/05/2025 |
| | 256K | 0.7s | 61tps | $1.15/M | $8.00/M | $0.15/M | moonshotai | 09/05/2025 |

About Kimi K2.6

Kimi K2.6, released on April 20, 2026, is the natively multimodal successor in the Kimi line. Moonshot AI positions it around three capability areas: long-horizon execution, agentic coding, and design with code.

Long-horizon coding is the headline shift from earlier K2 variants. Kimi K2.6 sustains tool-use chains across thousands of calls in a single session, with reported workloads spanning many hours of continuous execution and iterative optimization across a codebase. The model maintains task state across these extended sessions rather than losing the thread after a few dozen turns.

Native vision input changes how design tasks compose. Kimi K2.6 accepts images directly, so a screenshot or mockup can drive a frontend generation step without a separate vision model in the pipeline. Moonshot AI documents output that includes structured layouts, hero sections, interactive elements, and animations, rather than syntax-level scaffolding alone. Full-stack workflows that pair frontend output with authentication, user interaction, and database operations are part of the documented scope.
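As a minimal sketch of the screenshot-to-frontend step described above, the snippet below builds a single user message that pairs an instruction with an image. The local types mirror the text and image message parts used by the AI SDK; the helper name `buildMockupMessage` and the example URL are hypothetical and exist only for illustration.

```typescript
// Local types that mirror the AI SDK's user-message shape (text and image parts).
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; image: URL }
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> }

// Hypothetical helper: pair a mockup image with a frontend-generation instruction.
function buildMockupMessage(mockupUrl: string, instruction: string): UserMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text: instruction },
      { type: 'image', image: new URL(mockupUrl) },
    ],
  }
}

const messages = [
  buildMockupMessage(
    'https://example.com/mockup.png', // placeholder URL, not a real asset
    'Generate a responsive landing page that matches this mockup.',
  ),
]
```

The resulting array can be passed as `messages` to `streamText` in place of `prompt`, with `model` set to 'moonshotai/kimi-k2.6'.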

Access Kimi K2.6 through AI Gateway by setting the model string to moonshotai/kimi-k2.6. AI Gateway routes across moonshotai, fireworks, and novita with automatic failover, and the observability layer tracks token usage and costs across the long sessions this model is built for.
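To set a provider preference rather than rely on default routing, one option is to pass an ordered list of provider slugs. The sketch below assumes the gateway reads an `order` array under `providerOptions.gateway`; verify the option name against the AI Gateway docs before depending on it. The helper `preferProviders` is hypothetical.

```typescript
// Provider slugs listed for Kimi K2.6 on this page.
type ProviderSlug = 'moonshotai' | 'fireworks' | 'novita'

// Hypothetical helper: build a gateway options object that lists providers
// in preference order, dropping any duplicate slugs.
function preferProviders(...order: ProviderSlug[]) {
  const unique = order.filter((slug, i) => order.indexOf(slug) === i)
  return { gateway: { order: unique } }
}

const providerOptions = preferProviders('fireworks', 'novita', 'fireworks')
console.log(JSON.stringify(providerOptions))
```

The object would be passed as `providerOptions` alongside `model` and `prompt` in the `streamText` call.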

Kimi K2.6 supports a context window of 262.1K tokens and completions up to 262.1K tokens per request. It's available through AI Gateway at $0.95 per million input tokens and $4 per million output tokens.
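For a rough sense of what a request costs at these rates, the sketch below multiplies token counts by the listed prices. It assumes cached input tokens are billed at the $0.16/M cache-read rate shown in the provider table instead of the standard input rate, and that the cached count is a subset of the input count; actual billing may differ.

```typescript
// Rates from this page, in USD per million tokens.
const RATES = { input: 0.95, output: 4.0, cacheRead: 0.16 }

interface Usage {
  inputTokens: number
  outputTokens: number
  cachedInputTokens?: number // assumed to be a subset of inputTokens
}

// Estimate the cost of one request in USD under the assumptions above.
function estimateCostUSD(usage: Usage): number {
  const cached = usage.cachedInputTokens ?? 0
  const uncached = usage.inputTokens - cached
  const total =
    uncached * RATES.input +
    cached * RATES.cacheRead +
    usage.outputTokens * RATES.output
  return total / 1_000_000
}

// 200K input tokens (150K of them cached) and 20K output tokens.
const cost = estimateCostUSD({
  inputTokens: 200_000,
  outputTokens: 20_000,
  cachedInputTokens: 150_000,
})
console.log(cost.toFixed(4)) // prints "0.1515"
```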

What To Consider When Choosing a Provider

  • Configuration: Kimi K2.6 runs the longest agentic sessions in the Kimi family. Plan token budgets around extended tool-use chains and verify your agent harness handles multi-hour execution windows. Vision input is native, so a separate vision model isn't required for design or screenshot tasks.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
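One way to plan token budgets for extended tool-use chains is to track accumulated history against the context window inside the agent harness. The sketch below is an illustration only: the `SessionBudget` class, the 80% compaction threshold, and the reading of "262.1K" as 262,144 tokens are all assumptions, not documented behavior.

```typescript
// Assumption: "262.1K" on this page corresponds to 262,144 tokens (2^18).
const CONTEXT_WINDOW = 262_144

// Hypothetical tracker for conversation history in a long-running session.
class SessionBudget {
  private historyTokens = 0
  private reserveForOutput: number

  constructor(reserveForOutput: number) {
    this.reserveForOutput = reserveForOutput
  }

  // Record tokens added to the conversation history by one step.
  record(stepTokens: number): void {
    this.historyTokens += stepTokens
  }

  // Tokens still available for history before the output reserve is reached.
  remaining(): number {
    return CONTEXT_WINDOW - this.reserveForOutput - this.historyTokens
  }

  // True once history fills the given fraction of the usable window,
  // signalling that older turns should be summarized or truncated.
  shouldCompact(threshold = 0.8): boolean {
    const usable = CONTEXT_WINDOW - this.reserveForOutput
    return this.historyTokens >= usable * threshold
  }
}

const budget = new SessionBudget(16_384)
budget.record(100_000)
console.log(budget.remaining()) // prints 145760
```

A harness would call `record` after each tool-use step with the token counts reported for that step, and compact the history when `shouldCompact` returns true.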

When to Use Kimi K2.6

Best For

  • Long-horizon coding agents: Sessions that run for hours, accumulate thousands of tool calls, and iterate across a full codebase
  • Vision-to-frontend pipelines: Frontend generation from screenshots, mockups, or design references without a separate vision step
  • Full-stack scaffolding: Workflows spanning UI, authentication, user interaction, and database operations from one model
  • Kimi K2.5 upgrade path: Teams that want stronger long-horizon execution and design output than K2.5 provides

Consider Alternatives When

  • Explicit reasoning traces: Kimi K2 Thinking emits chain-of-thought output for tasks that reward visible deliberation
  • Short-horizon throughput: When tasks finish in a few turns, Kimi K2 Turbo runs the K2 MoE without the long-session emphasis
  • Cost-sensitive deployments: Earlier K2 variants may meet your quality bar at lower cost per token
  • Text-only pipelines: A text-only K2 variant is a closer fit when no vision step is needed

Conclusion

Kimi K2.6 extends the Kimi line into long-horizon agentic coding and design-with-code workflows with native vision input. For agents that need to run for hours across a codebase, or for pipelines that turn visual references into working frontends, it's the K2 generation built for those sessions.

Frequently Asked Questions

  • What makes Kimi K2.6 different from Kimi K2.5?

    Long-horizon execution and design output. Moonshot AI documents sustained tool-use chains across thousands of calls and frontend generation with structured layouts, hero sections, and animations. Vision input is native, so screenshots and mockups feed directly into the model without a separate vision step.

  • How long can Kimi K2.6 sustain a single agentic session?

    Moonshot AI reports workloads spanning many hours of continuous execution with thousands of tool calls in a single session. Plan your agent harness around extended runs and budget tokens for the accumulated history.

  • Does Kimi K2.6 accept image inputs directly?

    Yes. Kimi K2.6 is natively multimodal, so screenshots, mockups, and design references go in alongside text prompts. Confirm modality limits on https://platform.kimi.ai/docs/pricing/chat before you build a vision-heavy pipeline.

  • What kind of frontend code does Kimi K2.6 produce?

    Moonshot AI documents output with structured layouts, hero sections, interactive elements, and animations, plus full-stack scaffolding that spans authentication, user interaction, and database operations.

  • How do I switch from an earlier Kimi K2 variant to kimi-k2.6?

    Update the model string in your API call to moonshotai/kimi-k2.6. Authentication, tool-calling format, and the rest of the integration stay the same.

  • How do I use Kimi K2.6 on AI Gateway?

    Use the identifier moonshotai/kimi-k2.6 with the AI SDK or any supported interface like Chat Completions, Responses, or Messages. AI Gateway routes across moonshotai, fireworks, and novita and handles failover automatically.

  • Does Kimi K2.6 support zero data retention?

    Yes. Zero Data Retention is available for this model and is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.
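For the Chat Completions interface mentioned in the questions above, a request can be assembled without the AI SDK. The sketch below only builds the request; the endpoint URL is an assumption about AI Gateway's OpenAI-compatible API and should be confirmed in the docs, and `buildChatRequest` is a hypothetical helper.

```typescript
// Assumption: OpenAI-compatible endpoint for AI Gateway; confirm in the docs.
const GATEWAY_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions'

// Hypothetical helper: build the URL and fetch options for one chat request.
function buildChatRequest(apiKey: string, prompt: string) {
  return {
    url: GATEWAY_URL,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'moonshotai/kimi-k2.6',
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  }
}

const request = buildChatRequest('YOUR_API_KEY', 'Why is the sky blue?')
console.log(request.init.body)
```

Sending it is then a matter of `await fetch(request.url, request.init)` with a real AI Gateway API key in place of the placeholder.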