
Qwen: Qwen3 235B A22B Thinking 2507

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144 tokens of context. This "thinking-only" variant strengthens structured logical reasoning, mathematics, science, and long-form generation, and posts strong benchmark results on AIME, SuperGPQA, LiveCodeBench, and MMLU-Redux. Reasoning mode is always on: the chat template injects the opening <think> tag, so completions contain only the closing </think>, and the model is designed for long outputs (up to 81,920 tokens) in challenging domains. It is instruction-tuned and excels at step-by-step reasoning, tool use, agentic workflows, and multilingual tasks. This release is the most capable open-source variant in the Qwen3-235B series, surpassing many closed models in structured reasoning use cases.
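
For concreteness, here is a minimal sketch of querying the model through an OpenAI-compatible chat API and splitting the reasoning trace from the final answer. The base URL, API-key environment variable, and prompt are illustrative assumptions, not part of this listing; only the model ID comes from this page.

```python
import os
from openai import OpenAI

# Assumed OpenAI-compatible endpoint; swap in your provider's base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumption
    api_key=os.environ["OPENROUTER_API_KEY"],  # hypothetical env var
)

response = client.chat.completions.create(
    model="qwen/qwen3-235b-a22b-thinking-2507",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=81_920,  # the model supports long reasoning completions
)

text = response.choices[0].message.content
# The chat template adds the opening <think> tag server-side, so the
# completion contains only the closing </think>; split on it.
reasoning, sep, answer = text.partition("</think>")
print(answer.strip() if sep else text.strip())
```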

Input Cost: $0.11 per 1M tokens
Output Cost: $0.60 per 1M tokens
Context Window: 262,144 tokens
Developer ID: qwen/qwen3-235b-a22b-thinking-2507
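
Pricing is linear in tokens, so per-request cost is easy to estimate from the listed rates. A quick illustrative calculation (the token counts below are hypothetical, not from this page):

```python
# Listed rates, converted to dollars per token.
INPUT_RATE = 0.11 / 1_000_000   # $0.11 per 1M input tokens
OUTPUT_RATE = 0.60 / 1_000_000  # $0.60 per 1M output tokens

input_tokens = 10_000    # hypothetical prompt size
output_tokens = 30_000   # hypothetical completion, reasoning tokens included

cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"${cost:.4f}")  # -> $0.0191
```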

Related Models

Qwen: Qwen VL Max ($0.80/1M, 131,072 ctx)

Qwen VL Max is a visual understanding model with 7500 tokens context length. It excels in ...
Qwen: Qwen2.5 Coder 7B Instruct ($0.03/1M, 32,768 ctx)

Qwen2.5-Coder-7B-Instruct is a 7B parameter instruction-tuned language model optimized for...
Qwen: Qwen-Max ($1.60/1M, 32,768 ctx)

Qwen-Max, based on Qwen2.5, provides the best inference performance among [Qwen models](/q...