Qwen3.5-122B-A10B is a large MoE model for complex reasoning, multi-step planning, and in-depth analysis. Try it free in your browser.
Qwen3.5-122B-A10B is the default model for this page: a large MoE model tuned for harder reasoning, multi-step plans, and detailed answers.
Free to use on this site via OpenRouter.
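If you would rather call it from code, here is a minimal sketch using OpenRouter's OpenAI-compatible endpoint with the official openai Python SDK. The model slug qwen/qwen3.5-122b-a10b is an assumption; check the OpenRouter model list for the published id.

```python
# Minimal sketch: querying the model via OpenRouter's OpenAI-compatible API.
# The model slug below is an assumption; confirm it on the OpenRouter model list.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="qwen/qwen3.5-122b-a10b",  # assumed slug for Qwen3.5-122B-A10B
    messages=[
        {"role": "user", "content": "Break this project into a step-by-step plan: launch a weekly data-quality report."},
    ],
)
print(response.choices[0].message.content)
```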
Qwen3.5-122B-A10B sits between the compact 35B-A3B and the flagship 397B-A17B. It activates 10B parameters per token — enough for significantly deeper reasoning than 35B-A3B — while keeping inference costs well below the flagship. For many production workloads, this model hits the optimal cost-quality balance.
10B active parameters per token deliver substantially deeper reasoning than smaller MoE models.
Strong enough for complex tasks while still cost-effective for API-scale deployment.
Maintains coherence and accuracy across extended outputs and multi-turn conversations.
How Qwen3.5-122B-A10B compares to nearby models in the Qwen family.
Qwen3.5-35B-A3B: Compact MoE model, also the base model behind Qwen3.5-Flash.
Qwen3.5-122B-A10B: Mid-tier MoE model for deeper reasoning and agent tasks.
Qwen3.5-397B-A17B: Flagship open-weight Qwen3.5 model, also the base model behind Qwen3.5-Plus.
Scores are from public model cards and the qwen.ai release page. Hosted models are labeled with their open-weight base.
Updated 2026-04-02

This model shines when tasks require sustained reasoning, detailed analysis, or high-quality structured output.
Break down complex problems into actionable steps with reliable execution plans.
Analyze research papers, financial reports, and technical documentation in depth.
Handle multi-file refactoring, architecture decisions, and complex debugging.
Produce coherent articles, reports, and documentation over thousands of words.
Analyze datasets, explain patterns, and generate insights from structured data.
Power multi-tool agents that need strong reasoning for task orchestration.
Common questions about the large MoE model.
How does 122B-A10B differ from 35B-A3B?
122B-A10B activates more than 3x the parameters per token (10B vs 3B) and draws from a much larger expert pool (122B vs 35B total). This results in noticeably better reasoning, especially on complex multi-step tasks.
When should I use 397B-A17B instead?
Use 397B-A17B when you need the absolute best reasoning quality and are willing to pay higher compute costs. For most production use cases, 122B-A10B provides excellent quality at lower cost.
Can I self-host Qwen3.5-122B-A10B?
Yes, but it requires a multi-GPU setup or a high-VRAM server. Quantized versions reduce the requirements. Cloud deployment via vLLM is the most common production setup.
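For reference, a minimal self-hosting sketch with vLLM's Python API. The Hugging Face repo id and the 4-GPU tensor parallelism are assumptions, sized roughly to the VRAM guidance below.

```python
# Minimal sketch: serving the model with vLLM across 4 GPUs.
# "Qwen/Qwen3.5-122B-A10B" is an assumed repo id; check Hugging Face for the real one.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.5-122B-A10B",
    tensor_parallel_size=4,   # split weights across 4 GPUs
    max_model_len=32768,      # trim the context window to fit memory
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(["Outline a migration plan from REST to gRPC."], params)
print(outputs[0].outputs[0].text)
```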
Is it good for coding?
Yes. 122B-A10B handles complex codebases, multi-file reasoning, and architecture-level decisions well, a significant step up from the dense models on programming tasks.
How much VRAM does it need?
Around 40-60 GB at Q4 quantization. Most users run it on multi-GPU setups or cloud instances with 2-4 GPUs.
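A quick back-of-envelope check on that figure, counting weights only (KV cache and runtime overhead come on top):

```python
# Rough weight-memory estimate for a 122B-parameter MoE model.
# Weights only: the KV cache and serving overhead add more on top.
total_params = 122e9       # all experts stay resident in memory
active_params = 10e9       # only ~10B are used per token
bytes_per_param_q4 = 0.5   # 4-bit quantization is about half a byte per weight

weights_gb = total_params * bytes_per_param_q4 / 1024**3
print(f"Active fraction per token: {active_params / total_params:.1%}")  # ~8.2%
print(f"Q4 weight footprint: ~{weights_gb:.0f} GB")                      # ~57 GB
```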
Is it a good choice for production workloads?
Yes. It offers a strong balance between quality and cost. Many production setups use it as a middle ground between the compact 35B-A3B and the flagship 397B-A17B.
Does it support function calling?
Yes. All Qwen3.5 models support function calling. The 122B-A10B size handles multi-step tool chains more reliably than the smaller variants.
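A minimal sketch of one function-calling round trip over the same OpenAI-compatible endpoint. The get_weather tool is hypothetical, and the model slug is an assumption as above.

```python
# Minimal sketch: one function-calling round trip.
# get_weather is a hypothetical tool; the model slug is assumed as before.
import json
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_OPENROUTER_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="qwen/qwen3.5-122b-a10b",
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)

# The model returns a structured tool call instead of plain text.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```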
What is the context window?
Qwen3.5-122B-A10B supports a native context window of 262,144 tokens, and compatible serving stacks can extend it further.
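Before sending a very long document, it can help to count tokens against that window first. Here is a sketch with the transformers tokenizer, assuming the same repo id as above:

```python
# Minimal sketch: check whether a document fits the 262,144-token window.
# The repo id is an assumption; substitute the published tokenizer name.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-122B-A10B")
document = open("report.txt").read()

n_tokens = len(tokenizer.encode(document))
print(f"{n_tokens} tokens ({n_tokens / 262_144:.0%} of the native window)")
```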