DeepSeek V3.1
DeepSeek V3.1 is DeepSeek's August 21, 2025 model update introducing hybrid inference with selectable thinking and non-thinking modes in one endpoint. It strengthens tool use and multi-step agent capabilities over DeepSeek-V3.
    import { streamText } from 'ai'

    const result = streamText({
      model: 'deepseek/deepseek-v3.1',
      prompt: 'Why is the sky blue?',
    })

Playground
Try out DeepSeek V3.1 by DeepSeek. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About DeepSeek V3.1
DeepSeek V3.1 was released August 21, 2025. Its central change consolidates thinking and non-thinking inference into one model. Access non-thinking mode via the deepseek-chat API identifier and thinking mode via deepseek-reasoner. Previously these required separate deployments. The dual-mode design lets you route requests to different inference behaviors without maintaining separate integrations, simplifying agent architectures where some steps need reasoning and others don't.
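The per-step routing described above can be sketched as a small lookup. This is a minimal illustration, not DeepSeek's or AI Gateway's API: the step categories and the `pickModeId` helper are hypothetical, and the choice of which steps count as reasoning-heavy is an assumption. Only the two identifiers come from the text.

```typescript
// Hypothetical step categories for an agent pipeline (not from DeepSeek's docs).
type StepKind = 'plan' | 'codegen' | 'parse' | 'classify';

// The two identifiers named in the paragraph above.
type ModeId = 'deepseek-chat' | 'deepseek-reasoner';

// Assumption: planning and code generation benefit from thinking mode,
// while parsing and classification do not.
const REASONING_STEPS: ReadonlySet<StepKind> = new Set<StepKind>(['plan', 'codegen']);

// Maps a pipeline step to the identifier for the matching inference mode.
function pickModeId(step: StepKind): ModeId {
  return REASONING_STEPS.has(step) ? 'deepseek-reasoner' : 'deepseek-chat';
}

// Example: annotate each step of a pipeline with the identifier it should call.
const pipeline: StepKind[] = ['plan', 'parse', 'codegen', 'classify'];
const routedSteps = pipeline.map((step) => ({ step, model: pickModeId(step) }));
```

Keeping the decision in one function means the rest of the pipeline never hard-codes a mode, so the split can be tuned without touching call sites.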
The thinking mode reaches answers in less time than DeepSeek-R1-0528 on equivalent tasks. Strict function calling is available in beta, alongside Anthropic API format compatibility, which expands the range of infrastructure that can route to DeepSeek V3.1 without modification.
DeepSeek V3.1 targets stronger multi-step reasoning for complex search tasks, better performance on SWE-Bench and Terminal-Bench, and a new tokenizer with a refreshed chat template. Current AI Gateway rates appear on this page.
What To Consider When Choosing a Provider
- Configuration: Two usage modes share the same model. Test both thinking and non-thinking paths in your integration to confirm your application correctly interprets response structure under each mode.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
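One way to act on the Configuration point is to normalize responses from both modes into a single shape before the rest of the application sees them. The sketch below assumes thinking-mode responses carry an extra reasoning field next to the final text. The `ModelMessage` shape and its field names are placeholders, not a documented schema, so adjust them to whatever your integration actually receives.

```typescript
// Hypothetical response shape: field names are placeholders, not a documented schema.
interface ModelMessage {
  content: string;
  reasoning?: string | null;
}

interface NormalizedReply {
  answer: string;
  reasoning: string | null;
  usedThinking: boolean;
}

// Collapses both response variants into one shape so downstream code
// does not need to branch on the mode that produced the reply.
function normalizeReply(message: ModelMessage): NormalizedReply {
  // Treat a missing, null, or whitespace-only reasoning field as "no reasoning".
  const reasoning = message.reasoning?.trim() || null;
  return {
    answer: message.content.trim(),
    reasoning,
    usedThinking: reasoning !== null,
  };
}
```

Running the same test prompts through both modes and asserting on the normalized output is a cheap check that neither path breaks response parsing.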
When to Use DeepSeek V3.1
Best For
- Mixed agent pipelines: Combine reasoning-heavy steps (tool planning, code generation) with fast-response steps (parsing, classification) through a single endpoint
- Software engineering automation: SWE-Bench and Terminal-Bench improvements translate to better code generation and execution performance
- Anthropic API compatibility: Existing Anthropic-format integrations route to DeepSeek V3.1 with minimal integration change
- Complex multi-step search: The thinking mode's improved efficiency reduces total response latency for multi-step workflows
- Upgrading from DeepSeek-V3: Backward-compatible API routing plus optional thinking mode
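The Anthropic-compatibility point above can be pictured as changing only the model identifier on an existing Anthropic Messages-style request body. This is a sketch under assumptions: `retargetRequest` is a hypothetical helper, the existing model name is a placeholder, and reusing the gateway slug from the quick-start snippet for Anthropic-format requests is a guess to confirm against the docs.

```typescript
// Shape of an Anthropic Messages-style request body (only the fields used here).
interface MessagesRequest {
  model: string;
  max_tokens: number;
  messages: { role: 'user' | 'assistant'; content: string }[];
}

// Assumption: the slug from the quick-start snippet is also the right
// identifier for Anthropic-format requests; confirm this in the docs.
const DEEPSEEK_MODEL = 'deepseek/deepseek-v3.1';

// Returns a copy of the request that targets a different model, leaving the
// original object and every other field untouched.
function retargetRequest(request: MessagesRequest, model: string): MessagesRequest {
  return { ...request, model };
}

const existingRequest: MessagesRequest = {
  model: 'some-existing-model', // placeholder for whatever the integration uses today
  max_tokens: 512,
  messages: [{ role: 'user', content: 'Summarize this diff.' }],
};

const retargetedRequest = retargetRequest(existingRequest, DEEPSEEK_MODEL);
```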
Consider Alternatives When
- Pure reasoning workloads: DeepSeek-R1 remains the dedicated reasoning specialist
- Multilingual stability is critical: DeepSeek-V3.1 Terminus addresses inconsistent output when responses mix Chinese and English
- Straightforward chat or completion: DeepSeek-V3 may be more cost-efficient for high-volume workloads without hybrid inference needs
Conclusion
DeepSeek V3.1 consolidates thinking and non-thinking modes into a single endpoint, simplifying deployment for reasoning-capable systems. It adds capability over DeepSeek-V3 for agentic and software engineering tasks.
Frequently Asked Questions
What does "hybrid inference" mean for DeepSeek V3.1?
The same model weights support both a thinking mode (extended chain-of-thought) and a non-thinking mode (direct completion). Select the mode by calling deepseek-reasoner for thinking or deepseek-chat for non-thinking. No separate model switch is needed.
Is DeepSeek V3.1's thinking mode faster than DeepSeek-R1?
Yes. DeepSeek-V3.1-Think reaches answers in less time than DeepSeek-R1-0528 on equivalent tasks.
Does DeepSeek V3.1 support the Anthropic API format?
Yes. Existing Anthropic-format integrations can route to DeepSeek V3.1 without additional conversion.
What is strict function calling and is it available in DeepSeek V3.1?
It's in beta for DeepSeek V3.1. Strict function calling requires tool call arguments to match the provided JSON schema exactly.
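To make "match the provided JSON schema exactly" concrete, here is a minimal client-side check in the same spirit. It is not DeepSeek's validator and covers only a small, assumed subset of JSON Schema (flat objects with primitive properties). The `FlatObjectSchema` type, `findStrictViolations` helper, and the weather tool are all hypothetical.

```typescript
// Minimal subset of JSON Schema used for this sketch: flat objects with primitive properties.
type PrimitiveType = 'string' | 'number' | 'boolean';

interface FlatObjectSchema {
  type: 'object';
  properties: Record<string, { type: PrimitiveType }>;
  required: string[];
}

// Own-property check that ignores keys inherited from Object.prototype.
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns a list of human-readable violations; an empty list means the
// arguments match the schema exactly (no missing, extra, or mistyped keys).
function findStrictViolations(
  schema: FlatObjectSchema,
  args: Record<string, unknown>,
): string[] {
  const violations: string[] = [];
  for (const key of schema.required) {
    if (!hasOwn(args, key)) violations.push(`missing required argument "${key}"`);
  }
  for (const key of Object.keys(args)) {
    if (!hasOwn(schema.properties, key)) {
      violations.push(`unexpected argument "${key}"`);
      continue;
    }
    const expected = schema.properties[key].type;
    const actual = typeof args[key];
    if (actual !== expected) {
      violations.push(`argument "${key}" should be ${expected}, got ${actual}`);
    }
  }
  return violations;
}

// Hypothetical tool schema used only for illustration.
const weatherSchema: FlatObjectSchema = {
  type: 'object',
  properties: { city: { type: 'string' }, metric: { type: 'boolean' } },
  required: ['city'],
};
```

A check like this is useful as a guard in tests or logging even when the model enforces the schema, since it surfaces mismatches between the schema you sent and the handler you wrote.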
What is the context window for DeepSeek V3.1?
163.8K tokens for both thinking and non-thinking modes.
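A rough pre-flight check against that limit might look like the sketch below. Two assumptions are baked in: "163.8K" is read as 163,840 tokens, and token counts are estimated at about four characters per token, which is a crude heuristic and not the model's tokenizer. Both helper functions are hypothetical.

```typescript
// Assumption: "163.8K" is read as 163,840 tokens.
const CONTEXT_WINDOW_TOKENS = 163840;

// Rough heuristic only; the model's real tokenizer will give different counts.
const CHARS_PER_TOKEN = 4;

// Estimates the token count of a piece of text from its character length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Checks whether a prompt plus a reserved output budget fits in the window.
function fitsContext(prompt: string, reservedOutputTokens: number): boolean {
  return estimateTokens(prompt) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}
```

Because the estimate is approximate, leave a safety margin in `reservedOutputTokens` rather than filling the window to the last token.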