Qwen 3.6 27B
Qwen 3.6 27B is a native vision-language model from Alibaba's Qwen 3.6 generation. It is built on a hybrid linear-attention plus sparse mixture-of-experts architecture, has a context window of 256K tokens, and improves on agentic coding, math and code reasoning, spatial intelligence, and object detection.
import { streamText } from 'ai'
const result = streamText({ model: 'alibaba/qwen3.6-27b', prompt: 'Why is the sky blue?' })
Playground
Try out Qwen 3.6 27B by Alibaba. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
- Throughput: P50 throughput on live AI Gateway traffic, in tokens per second (TPS).
- Time To First Token: P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds.
- Success Rate: Direct request success rate on AI Gateway and per provider.
Visit the docs for more info on each metric.
About Qwen 3.6 27B
Qwen 3.6 27B, released on April 22, 2026, is a native vision-language model in the Qwen 3.6 generation of Alibaba's Qwen 3 family. It is built on a hybrid architecture that combines linear attention mechanisms with a sparse mixture-of-experts (MoE) framework, a design intended to keep inference efficient at long context while preserving the capability profile of a larger network.
Compared with the prior 3.5-35B-A3B generation, Qwen 3.6 27B brings improvements across several axes. Agentic coding ability is stronger, which matters for pipelines that chain tool calls and multi-step plans. Mathematical and code reasoning have been upgraded for benchmark-style problem solving and real-world programming tasks. Spatial intelligence, object localization, and object detection are sharper, which improves the model's grounding when it must reason about positions of elements within an image.
Native vision-language support means images are treated as first-class inputs alongside text rather than processed through a bolt-on encoder. Qwen 3.6 27B accepts file input and is tagged for reasoning, tool use, and implicit caching, so it can ingest documents and images, decide when to invoke registered tools, and reuse cached prefixes when serving repeated long prompts. The context window of 256K tokens accommodates extended multimodal sessions, document plus image inputs, and long agent traces.
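As a minimal sketch of what an interleaved text-and-image input can look like, the snippet below builds a user message in the content-part shape used by OpenAI-compatible Chat Completions APIs. It assumes the request goes through a Chat Completions style endpoint; the helper name `buildVisionMessage` and the image URL are hypothetical placeholders, not part of any SDK.

```typescript
// Content parts in the OpenAI-compatible Chat Completions shape.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image_url'; image_url: { url: string } };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Hypothetical helper: pairs one question with one or more image URLs.
function buildVisionMessage(question: string, imageUrls: string[]): UserMessage {
  const images: ImagePart[] = imageUrls.map((url) => ({
    type: 'image_url',
    image_url: { url },
  }));
  // Text first, then the images, all inside a single user turn.
  return { role: 'user', content: [{ type: 'text', text: question }, ...images] };
}

const visionMessage = buildVisionMessage(
  'Where is the submit button in this screenshot?',
  ['https://example.com/screenshot.png'], // placeholder URL, replace with a real image
);
```

The resulting object can be placed in the `messages` array of a request. Payload size limits for images vary by provider, so confirm them before sending large files.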
You can integrate Qwen 3.6 27B through AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python. Maximum output is 256K tokens per request.
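For the Chat Completions route, a request might be assembled roughly as follows. This is a sketch under stated assumptions: the endpoint URL is assumed to be AI Gateway's OpenAI-compatible base URL and should be confirmed in the docs, `buildChatRequest` is a hypothetical helper, and the `max_tokens` default is an arbitrary illustration.

```typescript
// Assumption: AI Gateway's OpenAI-compatible endpoint. Confirm the URL in the docs.
const GATEWAY_CHAT_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions';

type ChatRequest = {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
};

// Hypothetical helper: builds the request without sending it.
function buildChatRequest(apiKey: string, prompt: string, maxTokens = 1024): ChatRequest {
  return {
    url: GATEWAY_CHAT_URL,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'alibaba/qwen3.6-27b',
      messages: [{ role: 'user', content: prompt }],
      max_tokens: maxTokens, // caps the completion length for this request
    }),
  };
}

const chatRequest = buildChatRequest('YOUR_API_KEY', 'Why is the sky blue?');
```

To send it, pass the pieces to `fetch`, for example `fetch(chatRequest.url, { method: chatRequest.method, headers: chatRequest.headers, body: chatRequest.body })`, and read `choices[0].message.content` from the JSON response.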
What To Consider When Choosing a Provider
- Configuration: Sparse MoE architectures keep active compute small per token, but providers serve the model through different infrastructure paths. Check the live cost and latency metrics on this page before sizing high-volume workloads, and confirm vision payload limits with your selected provider.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
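The authentication point above can be sketched as a small credential resolver. It assumes that both credential types are sent as a Bearer token in the `Authorization` header, and the preference for an API key over an OIDC token is an illustrative choice, not documented behavior. `authHeader` and `Credentials` are hypothetical names.

```typescript
type Credentials = { apiKey?: string; oidcToken?: string };

// Hypothetical helper: uses the API key when present, otherwise the OIDC token.
// Assumption: either credential is accepted as a Bearer token.
function authHeader(creds: Credentials): { Authorization: string } {
  const token = creds.apiKey || creds.oidcToken;
  if (!token) {
    throw new Error('Provide an AI Gateway API key or an OIDC token');
  }
  return { Authorization: `Bearer ${token}` };
}

const headerFromKey = authHeader({ apiKey: 'YOUR_API_KEY' });
const headerFromOidc = authHeader({ oidcToken: 'YOUR_OIDC_TOKEN' });
```

Failing fast on a missing credential keeps a misconfigured deployment from sending unauthenticated requests.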
When to Use Qwen 3.6 27B
Best For
- Agentic Multimodal Coding: Pipelines that combine reasoning, tool use, and image input within a single session
- Spatial And Object Tasks: Workloads requiring sharper object localization, detection, and 2D grounding
- Math And Code Reasoning: Workloads that value step-by-step accuracy on mathematical and programming problems over raw throughput
- Long-Context Multimodal Sessions: Combined text and image inputs handled within the window of 256K tokens
- Repeated Long Prefixes: Implicit caching reduces cost on shared system prompts and document inputs
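For the repeated-prefix case in the list above, prefix caches generally match on identical leading content, so one common pattern is to put the stable parts of a prompt first and the per-request part last. The sketch below illustrates that ordering only; whether and how a cache hit occurs depends on the provider, and the helper name and system prompt are hypothetical.

```typescript
type Message = { role: 'system' | 'user'; content: string };

// Hypothetical shared system prompt reused across requests.
const SHARED_SYSTEM_PROMPT = 'You are a contracts analyst. Answer only from the document.';

// Hypothetical helper: stable content first, per-request content last,
// so consecutive requests share the longest possible identical prefix.
function buildCacheFriendlyMessages(documentText: string, question: string): Message[] {
  return [
    { role: 'system', content: SHARED_SYSTEM_PROMPT },
    { role: 'user', content: `Document:\n${documentText}` },
    { role: 'user', content: question },
  ];
}

const firstRequest = buildCacheFriendlyMessages('...long contract text...', 'Who are the parties?');
const secondRequest = buildCacheFriendlyMessages('...long contract text...', 'What is the term?');
```

Both requests begin with the same system prompt and document, and differ only in the final message.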
Consider Alternatives When
- Text-Only Workloads: A dedicated text model offers lower cost per token when vision is never used
- Maximum Reasoning Depth: A thinking-mode model is a better match when extended chain-of-thought matters most
- Latency-Critical Pipelines: A smaller, faster vision model serves simple multimodal tasks at lower cost
- Image Or Video Generation: A generation-class model fits tasks that produce pixels rather than read them
Conclusion
Qwen 3.6 27B brings the Qwen 3.6 generation's improvements in agentic coding, reasoning, and spatial intelligence to a hybrid linear-attention plus sparse MoE architecture. Routing through AI Gateway gives you a single integration surface across the Qwen 3.6 line, with provider failover and consolidated billing.
Frequently Asked Questions
What architecture does Qwen 3.6 27B use?
Qwen 3.6 27B is built on a hybrid architecture that integrates linear attention mechanisms with a sparse mixture-of-experts framework. The combination is designed to keep inference efficient at long context while preserving the capability profile of a larger network.
How does Qwen 3.6 27B compare to the prior Qwen 3.5-35B-A3B generation?
Alibaba positions Qwen 3.6 27B as a clear step up from the 3.5-35B-A3B generation, with significantly improved agentic coding, mathematical and code reasoning, spatial intelligence, and object localization and detection performance.
What modalities does Qwen 3.6 27B accept?
Qwen 3.6 27B is a native vision-language model. It accepts interleaved text and images within a single context window of up to 256K tokens, and supports file input alongside text and vision.
Does Qwen 3.6 27B support tool calling?
Yes. Qwen 3.6 27B is tagged for tool use and is tuned for agentic coding workflows. The model can select and invoke registered functions across multi-turn sessions through AI Gateway.
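A minimal sketch of the tool-calling round trip, assuming the OpenAI-compatible Chat Completions format for tool definitions and tool calls. The `get_weather` tool, the `registry`, and `runToolCall` are hypothetical and exist only for illustration; a real application would send `tools` with the request and receive the tool call from the model's response.

```typescript
// Tool definition in the OpenAI-compatible `tools` format (hypothetical tool).
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Look up the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
] as const;

// Shape of one entry in `choices[0].message.tool_calls`.
type ToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
};

// Hypothetical local implementations, keyed by tool name.
const registry: Record<string, (args: Record<string, unknown>) => string> = {
  get_weather: (args) => `Sunny in ${String(args['city'])}`,
};

// Runs one tool call and returns the `tool` message to send back to the model.
function runToolCall(call: ToolCall): { role: 'tool'; tool_call_id: string; content: string } {
  const impl = registry[call.function.name];
  if (!impl) throw new Error(`Unknown tool: ${call.function.name}`);
  const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
  return { role: 'tool', tool_call_id: call.id, content: impl(args) };
}

// Simulated tool call, standing in for one returned by the model.
const toolMessage = runToolCall({
  id: 'call_1',
  type: 'function',
  function: { name: 'get_weather', arguments: '{"city":"Hangzhou"}' },
});
```

Appending `toolMessage` to the conversation and calling the model again lets it use the tool result in its next turn.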
How do I access Qwen 3.6 27B through AI Gateway?
Authenticate with an AI Gateway API key or OIDC token and reference `alibaba/qwen3.6-27b` as the model. You can call Qwen 3.6 27B through AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python.
Does Qwen 3.6 27B support zero data retention?
Not currently. Zero Data Retention is offered on a per-provider basis and is not available for this model. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.
Where can I see live latency and cost data for Qwen 3.6 27B?
This page shows live throughput, time-to-first-token, and pricing metrics for Qwen 3.6 27B measured across real AI Gateway traffic.