Chat with Qwen 3.5 & Qwen 3.6 — Free, Instant, No Setup

Try the latest Qwen AI models right in your browser. Ask questions, write code, search the web, and think step-by-step — all in one place.

Qwen AI Chat
Online
Pick a model, decide whether your task needs web search or thinking mode, then start with a real prompt.

Starter prompts

Try the models in the browser. Each model page summarizes the key specs, benchmarks, and use cases at a glance.

Models: 8
Context: 262K to 1M
Availability: Open + Hosted
Architectures: Dense + MoE

Qwen Model Benchmark Reference

Public benchmark scores across the Qwen model family. Hosted models (Flash, Plus) are labeled with the open-weight base model they reference.

Qwen3.5-9B

Light dense model for quick prompts and lightweight coding.

Updated 2026-04-02
MMLU-Pro
82.5
GPQA / GPQA-family
81.7
LiveCodeBench v6
65.6

Qwen3.5-27B

Balanced dense model with better reasoning and coding depth.

Updated 2026-04-02
MMLU-Pro
86.1
GPQA / GPQA-family
85.5
LiveCodeBench v6
80.7

Qwen3.5-35B-A3B

Compact MoE model, also the base model behind Qwen3.5-Flash.

Updated 2026-04-02
MMLU-Pro
85.3
GPQA / GPQA-family
84.2
LiveCodeBench v6
74.6

Qwen3.5-Flash

Hosted

Hosted version built on Qwen3.5-35B-A3B with additional tooling and a 1M context window.

Scores reference the Qwen3.5-35B-A3B base model.

Updated 2026-04-02
MMLU-Pro
85.3
GPQA / GPQA-family
84.2
LiveCodeBench v6
74.6

Qwen3.5-122B-A10B

Mid-tier MoE model for deeper reasoning and agent tasks.

Updated 2026-04-02
MMLU-Pro
86.7
GPQA / GPQA-family
86.6
LiveCodeBench v6
78.9

Qwen3.5-397B-A17B

Flagship open-weight Qwen3.5 model, also the base model behind Qwen3.5-Plus.

Updated 2026-04-02
MMLU-Pro
87.8
GPQA / GPQA-family
88.4
LiveCodeBench v6
83.6

Qwen3.5-Plus

Hosted

Hosted version built on Qwen3.5-397B-A17B with additional tooling and a 1M context window.

Scores reference the Qwen3.5-397B-A17B base model.

Updated 2026-04-02
MMLU-Pro
87.8
GPQA / GPQA-family
88.4
LiveCodeBench v6
83.6

Qwen3.6-Plus

New

Current Qwen 3.6 hosted release with agentic coding, stronger tool use, and multimodal reasoning.

1M default context window with preserve_thinking support.

Updated 2026-04-02
MMLU-Pro
88.5
GPQA / GPQA-family
90.4
LiveCodeBench v6
87.1

Key Features Of Qwen Family

Qwen 3.5 and Qwen 3.6 together cover a wide range of tasks — from lightweight local models to hosted agents with 1M context.

Dense and MoE Architectures

Qwen 3.5 offers dense models (9B, 27B) for low-latency tasks and MoE models (35B-A3B, 122B-A10B, 397B-A17B) for deeper reasoning. Pick the trade-off that fits your workload.

Up to 1M Context Window

Open Qwen 3.5 models support a native 262K-token context. Hosted models like Flash, Plus, and Qwen3.6-Plus extend this to 1M by default.

Thinking Mode

Enable step-by-step reasoning for complex tasks — debugging, comparisons, multi-step plans — where a quick answer usually falls short.

Qwen 3.6: Agentic Workflows

Qwen3.6-Plus adds agentic coding, stronger tool calling, and multimodal reasoning on top of the Qwen 3.5 foundation. Best for multi-step tasks that need the model to plan and act.
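The agentic pattern described above can be sketched as a plain tool-calling loop: the model proposes a tool call, the host executes it, and the result is fed back until the model produces a final answer. This is a toy illustration with a stubbed model and a made-up `add` tool, not the actual Qwen3.6-Plus API:

```python
# Minimal sketch of an agentic tool-calling loop. The "model" here
# is a stub; real use would call a hosted API that emits tool calls.

TOOLS = {"add": lambda a, b: a + b}  # hypothetical tool registry

def stub_model(history):
    """Pretend model: request one tool call, then answer with its result."""
    tool_msgs = [m for m in history if m["role"] == "tool"]
    if not tool_msgs:
        return {"tool": "add", "args": (2, 3)}
    return {"answer": f"The sum is {tool_msgs[-1]['content']}."}

def run_agent(model, question, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = model(history)
        if "answer" in step:                          # model is done
            return step["answer"]
        result = TOOLS[step["tool"]](*step["args"])   # host executes the tool
        history.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not finish")

print(run_agent(stub_model, "What is 2 + 3?"))  # → The sum is 5.
```

The loop, not the model, owns tool execution: that separation is what lets a host sandbox tools, cap steps, and log every action.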

Open Weights + Hosted APIs

Qwen 3.5 open-weight models are licensed under Apache 2.0 and run anywhere. Hosted models (Flash, Plus, Qwen3.6-Plus) add built-in tools and managed infrastructure.

Multilingual and Multimodal

Strong performance across 100+ languages. Qwen3.6-Plus also handles images and documents alongside text in the same conversation.

FAQ

Frequently Asked Questions

Common questions about the Qwen model family and how to use them on this site.

1

What is Qwen 3.5?

Qwen 3.5 is Alibaba Cloud's open-weight model family with dense and MoE variants from 9B to 397B parameters. This site also hosts Flash, Plus, and the newer Qwen3.6-Plus.

2

Which Qwen model should I start with?

For everyday tasks, try Qwen3.5-9B or Flash for speed. For harder reasoning or coding, use Qwen3.5-122B-A10B or 397B-A17B. For the strongest all-rounder, pick Qwen3.5-Plus or Qwen3.6-Plus.

3

Is it free to use?

Yes. Every account gets 5 free credits to try the site. When those credits run out, you can top up or subscribe to keep using higher-cost models, web search, and thinking mode. Open-weight Qwen 3.5 releases remain Apache 2.0 if you want to self-host them.

4

What is the difference between Qwen 3.5 and Qwen 3.6?

Qwen 3.6 builds on the Qwen 3.5 foundation. Qwen3.6-Plus adds agentic coding, stronger tool calling, multimodal reasoning, and a 1M default context window. Qwen 3.5 has more model sizes and open weights.

5

Can I run Qwen 3.5 locally?

Yes. The open-weight Qwen 3.5 models work with Ollama, vLLM, llama.cpp, and Hugging Face Transformers. Hardware requirements depend on the model size and quantization level.
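As a rough setup sketch, a local Ollama run might look like the commands below. The model tag `qwen3.5:9b` is an assumption for illustration; check the Ollama model library for the real name before running:

```sh
# Pull a quantized Qwen model and chat with it via Ollama.
# NOTE: "qwen3.5:9b" is a hypothetical tag used for illustration.
ollama pull qwen3.5:9b
ollama run qwen3.5:9b "Explain the difference between dense and MoE models."
```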

6

What context length does Qwen support?

Open Qwen 3.5 models support a native 262K-token context, extensible to roughly 1M with compatible frameworks. Hosted models (Flash, Plus, Qwen3.6-Plus) ship with 1M context by default.
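To see what those limits mean in practice, here is a back-of-the-envelope Python sketch. The roughly 4-characters-per-token ratio is a crude assumption; real counts depend on the actual tokenizer:

```python
# Rough check of whether a document fits a context window, using
# the heuristic of ~4 characters per token (an assumption, not a
# tokenizer). The limits mirror the figures quoted above.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits(text: str, window: int) -> bool:
    return approx_tokens(text) <= window

doc = "x" * 2_000_000  # ~500K estimated tokens
print(fits(doc, 262_144))    # False: overflows the open-model window
print(fits(doc, 1_000_000))  # True: fits the hosted 1M window
```

For anything that matters, count tokens with the model's own tokenizer rather than a character heuristic.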

7

What is thinking mode?

Thinking mode lets the model reason step-by-step before answering. It improves accuracy on complex tasks like debugging, math, and multi-step analysis. You can toggle it on or off in the chat.

8

What is the difference between Dense and MoE models?

Dense models (9B, 27B) use all parameters for every token, which keeps behavior simple and predictable. MoE models (35B-A3B, 122B-A10B, 397B-A17B) activate only a small subset of experts per token, so they combine large total capacity with a lower compute cost per token.
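The top-k routing idea behind MoE can be shown in a few lines of Python. This is a didactic toy, not Qwen's actual gating code:

```python
# Toy illustration of MoE routing: each token is sent to only the
# top-k experts, so most parameters stay idle on any given token.

def route(scores, k=2):
    """Pick the k experts with the highest gating scores for one token."""
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return ranked[:k]

def moe_forward(token, experts, gate, k=2):
    """Run only the selected experts and mix outputs by gate weight."""
    scores = gate(token)
    chosen = route(scores, k)
    total = sum(scores[i] for i in chosen)
    return sum(scores[i] / total * experts[i](token) for i in chosen)

# Example: 4 tiny "experts" (plain functions); the gate favors two.
experts = [lambda x, m=m: x * m for m in (1, 2, 3, 4)]
gate = lambda x: [0.1, 0.6, 0.2, 0.1]
print(moe_forward(10.0, experts, gate, k=2))  # only 2 of 4 experts run
```

With k=2 of 4 experts active, only half the expert parameters touch each token; real MoE layers apply the same idea at billions-of-parameters scale.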

9

Can Qwen 3.5 write code?

Yes. All Qwen 3.5 models handle coding tasks. For simple code, Qwen3.5-9B or Flash work well. For complex multi-file projects, Qwen3.5-Plus or Qwen3.6-Plus are better choices.

10

Does Qwen support web search?

On this site, you can enable web search in the chat to let the model pull in live information before answering. This is useful for current events, documentation lookups, and fact-checking.

Latest Qwen Guides

Guides, comparisons, and setup notes for Qwen 3.5, Qwen 3.6, Ollama, Hugging Face, vLLM, GGUF, and more.

Qwen 3.5 API: Access Qwen Models via OpenRouter and Alibaba Cloud

How to access Qwen 3.5 through APIs including OpenRouter and Alibaba Cloud DashScope. Covers API keys, Python and curl examples, pricing, and model IDs for qwen 3.5 api integration.

Apr 3, 2026
QQ-Chat Team
Qwen 3.5 Benchmark Results: How It Compares Across Tasks

A breakdown of Qwen 3.5 benchmark results across reasoning, coding, math, and multilingual tasks — with comparisons to GPT-4o, Claude, and Llama.

Apr 3, 2026
QQ-Chat Team
Qwen 3.5 for Coding: Best Models, Tips, and Examples

Which Qwen 3.5 model is best for coding? A practical guide to code generation, debugging, and IDE workflows with the Qwen 3.5 family.

Apr 3, 2026
QQ-Chat Team
Qwen 3.5 GGUF: Download and Run Quantized Models Locally

How to download and run Qwen 3.5 GGUF files for local inference with llama.cpp. Covers quantization levels, where to find GGUF files, setup instructions, and quality vs performance tradeoffs.

Apr 3, 2026
QQ-Chat Team
Qwen 3.5 on Hugging Face: Download, Deploy, and Chat

How to find, download, and run Qwen 3.5 models from Hugging Face. Covers model cards, transformers integration, inference examples, and variant comparison for qwen 3.5 huggingface users.

Apr 3, 2026
QQ-Chat Team
Run Qwen 3.5 Locally: Complete Setup Guide

Everything you need to run Qwen 3.5 locally: hardware requirements by model size, setup with Ollama, vLLM, llama.cpp, and Transformers, plus performance optimization tips.

Apr 3, 2026
QQ-Chat Team
Qwen 3.5 Uncensored: What You Should Know

An honest guide to qwen3.5 uncensored models: what uncensored really means, where community fine-tunes come from, safety considerations, and how to approach them responsibly.

Apr 3, 2026
QQ-Chat Team
Fine-Tune Qwen 3.5 with Unsloth: Step-by-Step Guide

A practical guide to fine-tuning Qwen 3.5 with Unsloth, covering installation, LoRA and QLoRA setup, training configuration, and exporting your fine-tuned model.

Apr 3, 2026
QQ-Chat Team
How to Run Qwen 3.5 with vLLM: Setup Guide

A complete guide to running Qwen 3.5 models with vLLM for high-throughput inference. Covers installation, serving, model variants, and performance tuning for vllm qwen3.5 deployments.

Apr 3, 2026
QQ-Chat Team
Qwen 3.5 vs Qwen 3.6: What Changed and Which to Choose

A detailed comparison of Qwen 3.5 vs Qwen 3.6 covering key differences, feature upgrades, context window changes, and practical guidance on which version fits your workflow.

Apr 3, 2026
QQ-Chat Team
Qwen3.6-Plus API: How to Access and Integrate Qwen 3.6

How to use the Qwen3.6-Plus API — endpoints, request format, tool calling, and integration tips for developers building with Qwen 3.6.

Apr 3, 2026
QQ-Chat Team
Qwen3.6-Plus for Coding: When It Beats Qwen3.5-Plus

A practical look at where Qwen3.6-Plus feels better for coding than Qwen3.5-Plus, and where the older model is still enough.

Apr 3, 2026
QQ-Chat Team
Qwen3.6-Plus 1M Context Window: What It Changes in Practice

A practical guide to Qwen3.6-Plus's 1M context window, what it helps with, and what long context still does not solve.

Apr 3, 2026
QQ-Chat Team
Qwen3.6-Plus: Features, Use Cases, and How It Compares to Qwen 3.5

What Qwen3.6-Plus brings to the table — agentic coding, 1M context, multimodal reasoning — and when to pick it over Qwen 3.5 models.

Apr 3, 2026
QQ-Chat Team
Qwen3.5 Ollama: When Local Runs Make Sense and When the Browser Is Easier

A practical first pass on qwen3.5 ollama: what people usually mean, how to decide between local and hosted use, and which Qwen page to open next.

Apr 2, 2026
QQ-Chat Team