Best AI Models for Long Context

google

$0.13/1M

Google: Gemma 4 26B A4B

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google Deep...

📝 262,144 ctx Compare →

google

$0.14/1M

Google: Gemma 4 31B

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and...

📝 262,144 ctx Compare →

qwen

Free/1M

Qwen: Qwen3.6 Plus (free)

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention wit...

📝 1,000,000 ctx Compare →

z-ai

$1.20/1M

Z.ai: GLM 5V Turbo

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-...

📝 202,752 ctx Compare →

arcee-ai

$0.22/1M

Arcee AI: Trinity Large Thinking

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI...

📝 262,144 ctx Compare →

x-ai

$2.00/1M

xAI: Grok 4.20 Multi-Agent

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-...

📝 2,000,000 ctx Compare →

x-ai

$2.00/1M

xAI: Grok 4.20

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool call...

📝 2,000,000 ctx Compare →

google

Free/1M

Google: Lyria 3 Pro Preview

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music genera...

📝 1,048,576 ctx Compare →

google

Free/1M

Google: Lyria 3 Clip Preview

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music...

📝 1,048,576 ctx Compare →

kwaipilot

$0.30/1M

Kwaipilot: KAT-Coder-Pro V2

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, des...

📝 256,000 ctx Compare →

xiaomi

$0.40/1M

Xiaomi: MiMo-V2-Omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audi...

📝 262,144 ctx Compare →

xiaomi

$1.00/1M

Xiaomi: MiMo-V2-Pro

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and ...

📝 1,048,576 ctx Compare →

minimax

$0.30/1M

MiniMax: MiniMax M2.7

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world...

📝 204,800 ctx Compare →

openai

$0.20/1M

OpenAI: GPT-5.4 Nano

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, opt...

📝 400,000 ctx Compare →

openai

$0.75/1M

OpenAI: GPT-5.4 Mini

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model opt...

📝 400,000 ctx Compare →

mistralai

$0.15/1M

Mistral: Mistral Small 4

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabi...

📝 262,144 ctx Compare →

z-ai

$1.20/1M

Z.ai: GLM 5 Turbo

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in...

📝 202,752 ctx Compare →

nvidia

Free/1M

NVIDIA: Nemotron 3 Super (free)

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B par...

📝 262,144 ctx Compare →

nvidia

$0.10/1M

NVIDIA: Nemotron 3 Super

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B par...

📝 262,144 ctx Compare →

bytedance-seed

$0.25/1M

ByteDance Seed: Seed-2.0-Lite

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong m...

📝 262,144 ctx Compare →

qwen

$0.05/1M

Qwen: Qwen3.5-9B

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver s...

📝 256,000 ctx Compare →

openai

$30.00/1M

OpenAI: GPT-5.4 Pro

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture wi...

📝 1,050,000 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-5.4

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a singl...

📝 1,050,000 ctx Compare →

inception

$0.25/1M

Inception: Mercury 2

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM)...

📝 128,000 ctx Compare →

openai

$1.75/1M

OpenAI: GPT-5.3 Chat

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations s...

📝 128,000 ctx Compare →

google

$0.25/1M

Google: Gemini 3.1 Flash Lite Preview

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume ...

📝 1,048,576 ctx Compare →

bytedance-seed

$0.10/1M

ByteDance Seed: Seed-2.0-Mini

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, e...

📝 262,144 ctx Compare →

qwen

$0.16/1M

Qwen: Qwen3.5-35B-A3B

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid archit...

📝 262,144 ctx Compare →

qwen

$0.20/1M

Qwen: Qwen3.5-27B

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechani...

📝 262,144 ctx Compare →

qwen

$0.26/1M

Qwen: Qwen3.5-122B-A10B

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that ...

📝 262,144 ctx Compare →

qwen

$0.07/1M

Qwen: Qwen3.5-Flash

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that in...

📝 1,000,000 ctx Compare →

google

$2.00/1M

Google: Gemini 3.1 Pro Preview Custom Tools

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool sele...

📝 1,048,576 ctx Compare →

openai

$1.75/1M

OpenAI: GPT-5.3-Codex

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier sof...

📝 400,000 ctx Compare →

aion-labs

$0.80/1M

AionLabs: Aion-2.0

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytellin...

📝 131,072 ctx Compare →

google

$2.00/1M

Google: Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced softwar...

📝 1,048,576 ctx Compare →

anthropic

$3.00/1M

Anthropic: Claude Sonnet 4.6

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance a...

📝 1,000,000 ctx Compare →

qwen

$0.26/1M

Qwen: Qwen3.5 Plus 2026-02-15

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture t...

📝 1,000,000 ctx Compare →

qwen

$0.39/1M

Qwen: Qwen3.5 397B A17B

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architectur...

📝 262,144 ctx Compare →

minimax

Free/1M

MiniMax: MiniMax M2.5 (free)

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained ...

📝 196,608 ctx Compare →

minimax

$0.12/1M

MiniMax: MiniMax M2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained ...

📝 196,608 ctx Compare →

qwen

$0.78/1M

Qwen: Qwen3 Max Thinking

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-...

📝 262,144 ctx Compare →

anthropic

$5.00/1M

Anthropic: Claude Opus 4.6

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. ...

📝 1,000,000 ctx Compare →

qwen

$0.12/1M

Qwen: Qwen3 Coder Next

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and l...

📝 262,144 ctx Compare →

openrouter

Free/1M

Free Models Router

The simplest way to get free inference. openrouter/free is a router that selects free mode...

📝 200,000 ctx Compare →

stepfun

Free/1M

StepFun: Step 3.5 Flash (free)

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse M...

📝 256,000 ctx Compare →

stepfun

$0.10/1M

StepFun: Step 3.5 Flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse M...

📝 262,144 ctx Compare →

arcee-ai

Free/1M

Arcee AI: Trinity Large Preview (free)

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as ...

📝 131,000 ctx Compare →

moonshotai

$0.38/1M

MoonshotAI: Kimi K2.5

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual cod...

📝 262,144 ctx Compare →

upstage

$0.15/1M

Upstage: Solar Pro 3

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total...

📝 128,000 ctx Compare →

writer

$0.60/1M

Writer: Palmyra X5

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agen...

📝 1,040,000 ctx Compare →

openai

$2.50/1M

OpenAI: GPT Audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot fe...

📝 128,000 ctx Compare →

openai

$0.60/1M

OpenAI: GPT Audio Mini

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for m...

📝 128,000 ctx Compare →

z-ai

$0.06/1M

Z.ai: GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and...

📝 202,752 ctx Compare →

openai

$1.75/1M

OpenAI: GPT-5.2-Codex

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering a...

📝 400,000 ctx Compare →

bytedance-seed

$0.08/1M

ByteDance Seed: Seed 1.6 Flash

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporti...

📝 262,144 ctx Compare →

bytedance-seed

$0.25/1M

ByteDance Seed: Seed 1.6

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates m...

📝 262,144 ctx Compare →

minimax

$0.27/1M

MiniMax: MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding,...

📝 196,608 ctx Compare →

z-ai

$0.39/1M

Z.ai: GLM 4.7

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced p...

📝 202,752 ctx Compare →

google

$0.50/1M

Google: Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic wor...

📝 1,048,576 ctx Compare →

xiaomi

$0.09/1M

Xiaomi: MiMo-V2-Flash

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mix...

📝 262,144 ctx Compare →

nvidia

Free/1M

NVIDIA: Nemotron 3 Nano 30B A3B (free)

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with the highest compute efficien...

📝 256,000 ctx Compare →

nvidia

$0.05/1M

NVIDIA: Nemotron 3 Nano 30B A3B

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with the highest compute efficien...

📝 262,144 ctx Compare →

openai

$1.75/1M

OpenAI: GPT-5.2 Chat

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized fo...

📝 128,000 ctx Compare →

openai

$21.00/1M

OpenAI: GPT-5.2 Pro

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic codi...

📝 400,000 ctx Compare →

openai

$1.75/1M

OpenAI: GPT-5.2

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic ...

📝 400,000 ctx Compare →

mistralai

$0.40/1M

Mistral: Devstral 2 2512

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic c...

📝 262,144 ctx Compare →

relace

$1.00/1M

Relace: Relace Search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a co...

📝 256,000 ctx Compare →

z-ai

$0.30/1M

Z.ai: GLM 4.6V

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and l...

📝 131,072 ctx Compare →

nex-agi

$0.14/1M

Nex AGI: DeepSeek V3.1 Nex N1

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model...

📝 131,072 ctx Compare →

openrouter

Variable/1M

Body Builder (beta)

Transform your natural language requests into structured OpenRouter API request objects. D...

📝 128,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5.1-Codex-Max

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, hi...

📝 400,000 ctx Compare →

amazon

$0.30/1M

Amazon: Nova 2 Lite

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can proc...

📝 1,000,000 ctx Compare →

mistralai

$0.20/1M

Mistral: Ministral 3 14B 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities ...

📝 262,144 ctx Compare →

mistralai

$0.15/1M

Mistral: Ministral 3 8B 2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny l...

📝 262,144 ctx Compare →

mistralai

$0.10/1M

Mistral: Ministral 3 3B 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny...

📝 131,072 ctx Compare →

mistralai

$0.50/1M

Mistral: Mistral Large 3 2512

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture...

📝 262,144 ctx Compare →

arcee-ai

Free/1M

Arcee AI: Trinity Mini (free)

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featu...

📝 131,072 ctx Compare →

arcee-ai

$0.05/1M

Arcee AI: Trinity Mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featu...

📝 131,072 ctx Compare →

deepseek

$0.40/1M

DeepSeek: DeepSeek V3.2 Speciale

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum re...

📝 163,840 ctx Compare →

deepseek

$0.26/1M

DeepSeek: DeepSeek V3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficienc...

📝 163,840 ctx Compare →

prime-intellect

$0.20/1M

Prime Intellect: INTELLECT-3

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GL...

📝 131,072 ctx Compare →

anthropic

$5.00/1M

Anthropic: Claude Opus 4.5

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software e...

📝 200,000 ctx Compare →

x-ai

$0.20/1M

xAI: Grok 4.1 Fast

Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases...

📝 2,000,000 ctx Compare →

deepcogito

$1.25/1M

Deep Cogito: Cogito v2.1 671B

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching perfor...

📝 128,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5.1

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-...

📝 400,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5.1 Chat

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for...

📝 128,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5.1-Codex

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and c...

📝 400,000 ctx Compare →

openai

$0.25/1M

OpenAI: GPT-5.1-Codex-Mini

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex.

📝 400,000 ctx Compare →

moonshotai

$0.47/1M

MoonshotAI: Kimi K2 Thinking

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending ...

📝 131,072 ctx Compare →

amazon

$2.50/1M

Amazon: Nova Premier 1.0

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reason...

📝 1,000,000 ctx Compare →

perplexity

$3.00/1M

Perplexity: Sonar Pro Search

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity...

📝 200,000 ctx Compare →

openai

$0.08/1M

OpenAI: gpt-oss-safeguard-20b

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This...

📝 131,072 ctx Compare →

nvidia

Free/1M

NVIDIA: Nemotron Nano 12B 2 VL (free)

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model design...

📝 128,000 ctx Compare →

nvidia

$0.20/1M

NVIDIA: Nemotron Nano 12B 2 VL

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model design...

📝 131,072 ctx Compare →

minimax

$0.26/1M

MiniMax: MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end cod...

📝 196,608 ctx Compare →

qwen

$0.10/1M

Qwen: Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-...

📝 131,072 ctx Compare →

ibm-granite

$0.02/1M

IBM: Granite 4.0 Micro

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models ar...

📝 131,000 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-5 Image Mini

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini]...

📝 400,000 ctx Compare →

anthropic

$1.00/1M

Anthropic: Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-fronti...

📝 200,000 ctx Compare →

qwen

$0.12/1M

Qwen: Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal mode...

📝 131,072 ctx Compare →

qwen

$0.08/1M

Qwen: Qwen3 VL 8B Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built...

📝 131,072 ctx Compare →

openai

$10.00/1M

OpenAI: GPT-5 Image

[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state...

📝 400,000 ctx Compare →

openai

$10.00/1M

OpenAI: o3 Deep Research

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex,...

📝 200,000 ctx Compare →

openai

$2.00/1M

OpenAI: o4 Mini Deep Research

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for ...

📝 200,000 ctx Compare →

nvidia

$0.10/1M

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model...

📝 131,072 ctx Compare →

baidu

$0.07/1M

Baidu: ERNIE 4.5 21B A3B Thinking

ERNIE-4.5-21B-A3B-Thinking is Baidu's upgraded lightweight MoE model, refined to boost rea...

📝 131,072 ctx Compare →

qwen

$0.13/1M

Qwen: Qwen3 VL 30B A3B Thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with v...

📝 131,072 ctx Compare →

qwen

$0.13/1M

Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with v...

📝 131,072 ctx Compare →

openai

$15.00/1M

OpenAI: GPT-5 Pro

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, cod...

📝 400,000 ctx Compare →

z-ai

$0.39/1M

Z.ai: GLM 4.6

Compared with GLM-4.5, this generation brings several key improvements: Longer context win...

📝 204,800 ctx Compare →

anthropic

$3.00/1M

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-...

📝 1,000,000 ctx Compare →

deepseek

$0.27/1M

DeepSeek: DeepSeek V3.2 Exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an inter...

📝 163,840 ctx Compare →

thedrummer

$0.30/1M

TheDrummer: Cydonia 24B V4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, pro...

📝 131,072 ctx Compare →

relace

$0.85/1M

Relace: Relace Apply 3

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight ...

📝 256,000 ctx Compare →

google

$0.10/1M

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized...

📝 1,048,576 ctx Compare →

qwen

$0.26/1M

Qwen: Qwen3 VL 235B A22B Thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with...

📝 131,072 ctx Compare →

qwen

$0.20/1M

Qwen: Qwen3 VL 235B A22B Instruct

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text ge...

📝 262,144 ctx Compare →

qwen

$0.78/1M

Qwen: Qwen3 Max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in ...

📝 262,144 ctx Compare →

qwen

$0.65/1M

Qwen: Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B...

📝 1,000,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5 Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and codin...

📝 400,000 ctx Compare →

deepseek

$0.21/1M

DeepSeek: DeepSeek V3.1 Terminus

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that ...

📝 163,840 ctx Compare →

x-ai

$0.20/1M

xAI: Grok 4 Fast

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token cont...

📝 2,000,000 ctx Compare →

alibaba

$0.09/1M

Tongyi DeepResearch 30B A3B

Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 bi...

📝 131,072 ctx Compare →

qwen

$0.20/1M

Qwen: Qwen3 Coder Flash

Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 ...

📝 1,000,000 ctx Compare →

qwen

$0.10/1M

Qwen: Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that ou...

📝 131,072 ctx Compare →

qwen

Free/1M

Qwen: Qwen3 Next 80B A3B Instruct (free)

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series op...

📝 262,144 ctx Compare →

qwen

$0.09/1M

Qwen: Qwen3 Next 80B A3B Instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series op...

📝 262,144 ctx Compare →

meituan

$0.20/1M

Meituan: LongCat Flash Chat

LongCat-Flash-Chat is a large-scale Mixture-of-Experts (MoE) model with 560B total paramet...

📝 131,072 ctx Compare →

qwen

$0.26/1M

Qwen: Qwen Plus 0728 (thinking)

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoni...

📝 1,000,000 ctx Compare →

qwen

$0.26/1M

Qwen: Qwen Plus 0728

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoni...

📝 1,000,000 ctx Compare →

nvidia

Free/1M

NVIDIA: Nemotron Nano 9B V2 (free)

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA,...

📝 128,000 ctx Compare →

nvidia

$0.04/1M

NVIDIA: Nemotron Nano 9B V2

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA,...

📝 131,072 ctx Compare →

moonshotai

$0.40/1M

MoonshotAI: Kimi K2 0905

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-...

📝 131,072 ctx Compare →

qwen

$0.08/1M

Qwen: Qwen3 30B A3B Thinking 2507

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimize...

📝 131,072 ctx Compare →

x-ai

$0.20/1M

xAI: Grok Code Fast 1

Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding....

📝 256,000 ctx Compare →

nousresearch

$0.13/1M

Nous: Hermes 4 70B

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. ...

📝 131,072 ctx Compare →

nousresearch

$1.00/1M

Nous: Hermes 4 405B

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nou...

📝 131,072 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement ...

📝 128,000 ctx Compare →

mistralai

$0.40/1M

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance ...

📝 131,072 ctx Compare →

ai21

$2.00/1M

AI21: Jamba Large 1.7

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in gro...

📝 256,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations ...

📝 128,000 ctx Compare →

openai

$1.25/1M

OpenAI: GPT-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code qu...

📝 400,000 ctx Compare →

openai

$0.25/1M

OpenAI: GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning task...

📝 400,000 ctx Compare →

openai

$0.05/1M

OpenAI: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for develope...

📝 400,000 ctx Compare →

openai

Free/1M

OpenAI: gpt-oss-120b (free)

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model fro...

📝 131,072 ctx Compare →

openai

$0.04/1M

OpenAI: gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model fro...

📝 131,072 ctx Compare →

openai

Free/1M

OpenAI: gpt-oss-20b (free)

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 ...

📝 131,072 ctx Compare →

openai

$0.03/1M

OpenAI: gpt-oss-20b

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 ...

📝 131,072 ctx Compare →

anthropic

$15.00/1M

Anthropic: Claude Opus 4.1

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved p...

📝 200,000 ctx Compare →

mistralai

$0.30/1M

Mistral: Codestral 2508

Mistral's cutting-edge language model for coding, released at the end of July 2025. Codestral spec...

📝 256,000 ctx Compare →

qwen

$0.07/1M

Qwen: Qwen3 Coder 30B A3B Instruct

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 ...

📝 160,000 ctx Compare →

qwen

$0.09/1M

Qwen: Qwen3 30B A3B Instruct 2507

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qw...

📝 262,144 ctx Compare →

z-ai

$0.60/1M

Z.ai: GLM 4.5

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based application...

📝 131,072 ctx Compare →

z-ai

Free/1M

Z.ai: GLM 4.5 Air (free)

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-b...

📝 131,072 ctx Compare →

z-ai

$0.13/1M

Z.ai: GLM 4.5 Air

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-b...

📝 131,072 ctx Compare →

qwen

$0.15/1M

Qwen: Qwen3 235B A22B Thinking 2507

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) ...

📝 131,072 ctx Compare →

z-ai

$0.10/1M

Z.ai: GLM 4 32B

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform comple...

📝 128,000 ctx Compare →

qwen

Free/1M

Qwen: Qwen3 Coder 480B A35B (free)

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model develop...

📝 262,000 ctx Compare →

qwen

$0.22/1M

Qwen: Qwen3 Coder 480B A35B

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model develop...

📝 262,144 ctx Compare →

bytedance

$0.10/1M

ByteDance: UI-TARS 7B

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, in...

📝 128,000 ctx Compare →

google

$0.10/1M

Google: Gemini 2.5 Flash Lite

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized...

📝 1,048,576 ctx Compare →

qwen

$0.07/1M

Qwen: Qwen3 235B A22B Instruct 2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts lang...

📝 262,144 ctx Compare →

switchpoint

$0.85/1M

Switchpoint Router

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI f...

📝 131,072 ctx Compare →

moonshotai

$0.57/1M

MoonshotAI: Kimi K2 0711

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moo...

📝 131,072 ctx Compare →

mistralai

$0.40/1M

Mistral: Devstral Medium

Devstral Medium is a high-performance code generation and agentic reasoning model develope...

📝 131,072 ctx Compare →

mistralai

$0.10/1M

Mistral: Devstral Small 1.1

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering ...

📝 131,072 ctx Compare →

x-ai

$3.00/1M

xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel to...

📝 256,000 ctx Compare →

tencent

$0.14/1M

Tencent: Hunyuan A13B Instruct

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed b...

📝 131,072 ctx Compare →

tngtech

$0.30/1M

TNG: DeepSeek R1T2 Chimera

DeepSeek-TNG-R1T2-Chimera is the second-generation Chimera model from TNG Tech. It is a 67...

📝 163,840 ctx Compare →

morph

$0.90/1M

Morph: Morph V3 Large

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accur...

📝 262,144 ctx Compare →

inception

$0.25/1M

Inception: Mercury

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discre...

📝 128,000 ctx Compare →

mistralai

$0.08/1M

Mistral: Mistral Small 3.2 24B

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimiz...

📝 128,000 ctx Compare →

minimax

$0.40/1M

MiniMax: MiniMax M1

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and...

📝 1,000,000 ctx Compare →

google

$0.30/1M

Google: Gemini 2.5 Flash

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for a...

📝 1,048,576 ctx Compare →

google

$1.25/1M

Google: Gemini 2.5 Pro

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, co...

📝 1,048,576 ctx Compare →

openai

$20.00/1M

OpenAI: o3 Pro

The o-series of models are trained with reinforcement learning to think before they answer...

📝 200,000 ctx Compare →

x-ai

$0.30/1M

xAI: Grok 3 Mini

A lightweight model that thinks before responding. Fast, smart, and great for logic-based ...

📝 131,072 ctx Compare →

x-ai

$3.00/1M

xAI: Grok 3

Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise u...

📝 131,072 ctx Compare →

google

$1.25/1M

Google: Gemini 2.5 Pro Preview 06-05

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, co...

📝 1,048,576 ctx Compare →

deepseek

$0.45/1M

DeepSeek: R1 0528

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par wi...

📝 163,840 ctx Compare →

anthropic

$15.00/1M

Anthropic: Claude Opus 4

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bring...

📝 200,000 ctx Compare →

anthropic

$3.00/1M

Anthropic: Claude Sonnet 4

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, ex...

📝 200,000 ctx Compare →

mistralai

$0.40/1M

Mistral: Mistral Medium 3

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver...

📝 131,072 ctx Compare →

google

$1.25/1M

Google: Gemini 2.5 Pro Preview 05-06

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, co...

📝 1,048,576 ctx Compare →

arcee-ai

$0.18/1M

Arcee AI: Spotlight

Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL ...

📝 131,072 ctx Compare →

arcee-ai

$0.90/1M

Arcee AI: Maestro Reasoning

Maestro Reasoning is Arcee's flagship analysis model: a 32 B‑parameter derivative of Qwe...

📝 131,072 ctx Compare →

arcee-ai

$0.75/1M

Arcee AI: Virtuoso Large

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to ...

📝 131,072 ctx Compare →

inception

$0.25/1M

Inception: Mercury Coder

Mercury Coder is the first diffusion large language model (dLLM). Applying a breakthrough ...

📝 128,000 ctx Compare →

meta-llama

$0.18/1M

Meta: Llama Guard 4 12B

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for conte...

📝 163,840 ctx Compare →

qwen

$0.46/1M

Qwen: Qwen3 235B A22B

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, acti...

📝 131,072 ctx Compare →

openai

$1.10/1M

OpenAI: o4 Mini High

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort ...

📝 200,000 ctx Compare →

openai

$2.00/1M

OpenAI: o3

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, s...

📝 200,000 ctx Compare →

openai

$1.10/1M

OpenAI: o4 Mini

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-effi...

📝 200,000 ctx Compare →

openai

$2.00/1M

OpenAI: GPT-4.1

GPT-4.1 is a flagship large language model optimized for advanced instruction following, r...

📝 1,047,576 ctx Compare →

openai

$0.40/1M

OpenAI: GPT-4.1 Mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substa...

📝 1,047,576 ctx Compare →

openai

$0.10/1M

OpenAI: GPT-4.1 Nano

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the...

📝 1,047,576 ctx Compare →

x-ai

$0.30/1M

xAI: Grok 3 Mini Beta

Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that gener...

📝 131,072 ctx Compare →

x-ai

$3.00/1M

xAI: Grok 3 Beta

Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise u...

📝 131,072 ctx Compare →

nvidia

$0.60/1M

NVIDIA: Llama 3.1 Nemotron Ultra 253B v1

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced re...

📝 131,072 ctx Compare →

meta-llama

$0.15/1M

Meta: Llama 4 Maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Met...

📝 1,048,576 ctx Compare →

meta-llama

$0.08/1M

Meta: Llama 4 Scout

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by...

📝 327,680 ctx Compare →

qwen

$0.20/1M

Qwen: Qwen2.5 VL 32B Instruct

Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement lear...

📝 128,000 ctx Compare →

deepseek

$0.20/1M

DeepSeek: DeepSeek V3 0324

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the fl...

📝 163,840 ctx Compare →

openai

$150.00/1M

OpenAI: o1-pro

The o1 series of models are trained with reinforcement learning to think before they answe...

📝 200,000 ctx Compare →

mistralai

$0.03/1M

Mistral: Mistral Small 3.1 24B

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring...

📝 131,072 ctx Compare →

allenai

$0.05/1M

AllenAI: Olmo 2 32B Instruct

OLMo-2 32B Instruct is a supervised instruction-finetuned variant of the OLMo-2 32B March ...

📝 128,000 ctx Compare →

google

$0.04/1M

Google: Gemma 3 4B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It ha...

📝 131,072 ctx Compare →

google

$0.04/1M

Google: Gemma 3 12B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It ha...

📝 131,072 ctx Compare →

cohere

$2.50/1M

Cohere: Command A

Command A is an open-weights 111B parameter model with a 256k context window focused on de...

📝 256,000 ctx Compare →

openai

$0.15/1M

OpenAI: GPT-4o-mini Search Preview

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It i...

📝 128,000 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-4o Search Preview

GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trai...

📝 128,000 ctx Compare →

google

Free/1M

Google: Gemma 3 27B (free)

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It ha...

📝 131,072 ctx Compare →

google

$0.08/1M

Google: Gemma 3 27B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It ha...

📝 131,072 ctx Compare →

perplexity

$2.00/1M

Perplexity: Sonar Reasoning Pro

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://doc...

📝 128,000 ctx Compare →

perplexity

$3.00/1M

Perplexity: Sonar Pro

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://doc...

📝 200,000 ctx Compare →

perplexity

$2.00/1M

Perplexity: Sonar Deep Research

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthes...

📝 128,000 ctx Compare →

qwen

$0.15/1M

Qwen: QwQ 32B

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tune...

📝 131,072 ctx Compare →

google

$0.08/1M

Google: Gemini 2.0 Flash Lite

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to...

📝 1,048,576 ctx Compare →

anthropic

$3.00/1M

Anthropic: Claude 3.7 Sonnet

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and...

📝 200,000 ctx Compare →

anthropic

$3.00/1M

Anthropic: Claude 3.7 Sonnet (thinking)

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and...

📝 200,000 ctx Compare →

meta-llama

$0.02/1M

Llama Guard 3 8B

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classifica...

📝 131,072 ctx Compare →

openai

$1.10/1M

OpenAI: o3 Mini High

OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort ...

📝 200,000 ctx Compare →

google

$0.10/1M

Google: Gemini 2.0 Flash

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) compared to [Gem...

📝 1,048,576 ctx Compare →

qwen

$0.14/1M

Qwen: Qwen VL Plus

Qwen's Enhanced Large Visual Language Model. Significantly upgraded for detailed recogniti...

📝 131,072 ctx Compare →

aion-labs

$4.00/1M

AionLabs: Aion-1.0

Aion-1.0 is a multi-model system designed for high performance across various tasks, inclu...

📝 131,072 ctx Compare →

aion-labs

$0.70/1M

AionLabs: Aion-1.0-Mini

Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designe...

📝 131,072 ctx Compare →

qwen

$0.52/1M

Qwen: Qwen VL Max

Qwen VL Max is a visual understanding model with 7500 tokens context length. It excels in ...

📝 131,072 ctx Compare →

qwen

$0.03/1M

Qwen: Qwen-Turbo

Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost,...

📝 131,072 ctx Compare →

qwen

$0.26/1M

Qwen: Qwen-Plus

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced ...

📝 1,000,000 ctx Compare →

openai

$1.10/1M

OpenAI: o3 Mini

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, part...

📝 200,000 ctx Compare →

deepseek

$0.70/1M

DeepSeek: R1 Distill Llama 70B

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-...

📝 131,072 ctx Compare →

minimax

$0.20/1M

MiniMax: MiniMax-01

MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image u...

📝 1,000,192 ctx Compare →

deepseek

$0.32/1M

DeepSeek: DeepSeek V3

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction foll...

📝 163,840 ctx Compare →

sao10k

$0.65/1M

Sao10K: Llama 3.3 Euryale 70B

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/s...

📝 131,072 ctx Compare →

openai

$15.00/1M

OpenAI: o1

The latest and strongest model family from OpenAI, o1 is designed to spend more time think...

📝 200,000 ctx Compare →

cohere

$0.04/1M

Cohere: Command R7B (12-2024)

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in Decemb...

📝 128,000 ctx Compare →

meta-llama

$0.10/1M

Meta: Llama 3.3 70B Instruct

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction...

📝 131,072 ctx Compare →

amazon

$0.06/1M

Amazon: Nova Lite 1.0

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuses on fast ...

📝 300,000 ctx Compare →

amazon

$0.04/1M

Amazon: Nova Micro 1.0

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in t...

📝 128,000 ctx Compare →

amazon

$0.80/1M

Amazon: Nova Pro 1.0

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combi...

📝 300,000 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-4o (2024-11-20)

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more na...

📝 128,000 ctx Compare →

mistralai

$2.00/1M

Mistral Large 2411

Mistral Large 2 2411 is an update of [Mistral Large 2](/mistralai/mistral-large) released ...

📝 131,072 ctx Compare →

mistralai

$2.00/1M

Mistral Large 2407

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a ...

📝 131,072 ctx Compare →

mistralai

$2.00/1M

Mistral: Pixtral Large 2411

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral ...

📝 131,072 ctx Compare →

anthropic

$0.80/1M

Anthropic: Claude 3.5 Haiku

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool...

📝 200,000 ctx Compare →

nvidia

$1.20/1M

NVIDIA: Llama 3.1 Nemotron 70B Instruct

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and us...

📝 131,072 ctx Compare →

meta-llama

Free/1M

Meta: Llama 3.2 3B Instruct (free)

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for adv...

📝 131,072 ctx Compare →

meta-llama

$0.05/1M

Meta: Llama 3.2 11B Vision Instruct

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle ...

📝 131,072 ctx Compare →

cohere

$2.50/1M

Cohere: Command R+ (08-2024)

command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) wit...

📝 128,000 ctx Compare →

cohere

$0.15/1M

Cohere: Command R (08-2024)

command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved ...

📝 128,000 ctx Compare →

sao10k

$0.85/1M

Sao10K: Llama 3.1 Euryale 70B v2.2

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi....

📝 131,072 ctx Compare →

nousresearch

$0.30/1M

Nous: Hermes 3 70B Instruct

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nou...

📝 131,072 ctx Compare →

nousresearch

Free/1M

Nous: Hermes 3 405B Instruct (free)

Hermes 3 is a generalist language model with many improvements over Hermes 2, including ad...

📝 131,072 ctx Compare →

nousresearch

$1.00/1M

Nous: Hermes 3 405B Instruct

Hermes 3 is a generalist language model with many improvements over Hermes 2, including ad...

📝 131,072 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-4o (2024-08-06)

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with t...

📝 128,000 ctx Compare →

meta-llama

$0.40/1M

Meta: Llama 3.1 70B Instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This ...

📝 131,072 ctx Compare →

mistralai

$0.02/1M

Mistral: Mistral Nemo

A 12B parameter model with a 128k token context length built by Mistral in collaboration w...

📝 131,072 ctx Compare →

openai

$0.15/1M

OpenAI: GPT-4o-mini (2024-07-18)

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting...

📝 128,000 ctx Compare →

openai

$0.15/1M

OpenAI: GPT-4o-mini

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting...

📝 128,000 ctx Compare →

openai

$5.00/1M

OpenAI: GPT-4o (2024-05-13)

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs...

📝 128,000 ctx Compare →

openai

$2.50/1M

OpenAI: GPT-4o

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs...

📝 128,000 ctx Compare →

openai

$6.00/1M

OpenAI: GPT-4o (extended)

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs...

📝 128,000 ctx Compare →

openai

$10.00/1M

OpenAI: GPT-4 Turbo

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mo...

📝 128,000 ctx Compare →

anthropic

$0.25/1M

Anthropic: Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsivene...

📝 200,000 ctx Compare →

mistralai

$2.00/1M

Mistral Large

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's ...

📝 128,000 ctx Compare →

openai

$10.00/1M

OpenAI: GPT-4 Turbo Preview

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outpu...

📝 128,000 ctx Compare →

openrouter

Variable/1M

Auto Router

Your prompt will be processed by a meta-model and routed to one of dozens of models (see b...

📝 2,000,000 ctx Compare →

openai

$10.00/1M

OpenAI: GPT-4 Turbo (older v1106)

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mo...

📝 128,000 ctx Compare →

Long Context Models

Google: Gemma 4 26B A4B

Google: Gemma 4 31B

Qwen: Qwen3.6 Plus (free)

Z.ai: GLM 5V Turbo

Arcee AI: Trinity Large Thinking

xAI: Grok 4.20 Multi-Agent

xAI: Grok 4.20

Google: Lyria 3 Pro Preview

Google: Lyria 3 Clip Preview

Kwaipilot: KAT-Coder-Pro V2

Xiaomi: MiMo-V2-Omni

Xiaomi: MiMo-V2-Pro

MiniMax: MiniMax M2.7

OpenAI: GPT-5.4 Nano

OpenAI: GPT-5.4 Mini

Mistral: Mistral Small 4

Z.ai: GLM 5 Turbo

NVIDIA: Nemotron 3 Super (free)

NVIDIA: Nemotron 3 Super

ByteDance Seed: Seed-2.0-Lite

Qwen: Qwen3.5-9B

OpenAI: GPT-5.4 Pro

OpenAI: GPT-5.4

Inception: Mercury 2

OpenAI: GPT-5.3 Chat

Google: Gemini 3.1 Flash Lite Preview

ByteDance Seed: Seed-2.0-Mini

Qwen: Qwen3.5-35B-A3B

Qwen: Qwen3.5-27B

Qwen: Qwen3.5-122B-A10B

Qwen: Qwen3.5-Flash

Google: Gemini 3.1 Pro Preview Custom Tools

OpenAI: GPT-5.3-Codex

AionLabs: Aion-2.0

Google: Gemini 3.1 Pro Preview

Anthropic: Claude Sonnet 4.6

Qwen: Qwen3.5 Plus 2026-02-15

Qwen: Qwen3.5 397B A17B

MiniMax: MiniMax M2.5 (free)

MiniMax: MiniMax M2.5

Qwen: Qwen3 Max Thinking

Anthropic: Claude Opus 4.6

Qwen: Qwen3 Coder Next

Free Models Router

StepFun: Step 3.5 Flash (free)

StepFun: Step 3.5 Flash

Arcee AI: Trinity Large Preview (free)

MoonshotAI: Kimi K2.5

Upstage: Solar Pro 3

Writer: Palmyra X5

OpenAI: GPT Audio

OpenAI: GPT Audio Mini

Z.ai: GLM 4.7 Flash

OpenAI: GPT-5.2-Codex

ByteDance Seed: Seed 1.6 Flash

ByteDance Seed: Seed 1.6

MiniMax: MiniMax M2.1

Z.ai: GLM 4.7

Google: Gemini 3 Flash Preview

Xiaomi: MiMo-V2-Flash

NVIDIA: Nemotron 3 Nano 30B A3B (free)

NVIDIA: Nemotron 3 Nano 30B A3B

OpenAI: GPT-5.2 Chat

OpenAI: GPT-5.2 Pro

OpenAI: GPT-5.2

Mistral: Devstral 2 2512

Relace: Relace Search

Z.ai: GLM 4.6V

Nex AGI: DeepSeek V3.1 Nex N1

Body Builder (beta)

OpenAI: GPT-5.1-Codex-Max

Amazon: Nova 2 Lite

Mistral: Ministral 3 14B 2512

Mistral: Ministral 3 8B 2512

Mistral: Ministral 3 3B 2512

Mistral: Mistral Large 3 2512

Arcee AI: Trinity Mini (free)

Arcee AI: Trinity Mini

DeepSeek: DeepSeek V3.2 Speciale