Long Context Models
Writer: Palmyra X5
Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agen...
OpenAI: GPT Audio
The gpt-audio model is OpenAI's first generally available audio model. The new snapshot fe...
OpenAI: GPT Audio Mini
A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for m...
Z.AI: GLM 4.7 Flash
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and...
OpenAI: GPT-5.2-Codex
GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering a...
ByteDance Seed: Seed 1.6 Flash
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporti...
ByteDance Seed: Seed 1.6
Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates m...
MiniMax: MiniMax M2.1
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding,...
Z.AI: GLM 4.7
GLM-4.7 is Z.AI's latest flagship model, featuring upgrades in two key areas: enhanced p...
Google: Gemini 3 Flash Preview
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic wor...
Xiaomi: MiMo-V2-Flash
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mix...
NVIDIA: Nemotron 3 Nano 30B A3B (free)
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with the highest compute efficien...
NVIDIA: Nemotron 3 Nano 30B A3B
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with the highest compute efficien...
OpenAI: GPT-5.2 Chat
GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized fo...
OpenAI: GPT-5.2 Pro
GPT-5.2 Pro is OpenAI's most advanced model, offering major improvements in agentic codi...
OpenAI: GPT-5.2
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic ...
Mistral: Devstral 2 2512 (free)
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic c...
Mistral: Devstral 2 2512
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic c...
Relace: Relace Search
The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a co...
Z.AI: GLM 4.6V
GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and l...
Nex AGI: DeepSeek V3.1 Nex N1
DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series, a post-trained model...
Body Builder (beta)
Transform your natural language requests into structured OpenRouter API request objects. D...
OpenAI: GPT-5.1-Codex-Max
GPT-5.1-Codex-Max is OpenAI's latest agentic coding model, designed for long-running, hi...
Amazon: Nova 2 Lite
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can proc...
Mistral: Ministral 3 14B 2512
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities ...
Mistral: Ministral 3 8B 2512
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny l...
Mistral: Ministral 3 3B 2512
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny...
Mistral: Mistral Large 3 2512
Mistral Large 3 2512 is Mistral's most capable model to date, featuring a sparse mixture...
Arcee AI: Trinity Mini (free)
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featu...
Arcee AI: Trinity Mini
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featu...
DeepSeek: DeepSeek V3.2 Speciale
DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum re...
DeepSeek: DeepSeek V3.2
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficienc...
Prime Intellect: INTELLECT-3
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GL...
TNG: R1T Chimera (free)
TNG-R1T-Chimera is an experimental LLM with a penchant for creative storytelling and charact...
TNG: R1T Chimera
TNG-R1T-Chimera is an experimental LLM with a penchant for creative storytelling and charact...
Anthropic: Claude Opus 4.5
Claude Opus 4.5 is Anthropic's frontier reasoning model optimized for complex software e...
xAI: Grok 4.1 Fast
Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases...
Google: Gemini 3 Pro Preview
Gemini 3 Pro is Google's flagship frontier model for high-precision multimodal reasoning...
Deep Cogito: Cogito v2.1 671B
Cogito v2.1 671B MoE represents one of the strongest open models globally, matching perfor...
OpenAI: GPT-5.1
GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-...
OpenAI: GPT-5.1 Chat
GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for...
OpenAI: GPT-5.1-Codex
GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and c...
OpenAI: GPT-5.1-Codex-Mini
GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex.
Kwaipilot: KAT-Coder-Pro V1
KAT-Coder-Pro V1 is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. ...
MoonshotAI: Kimi K2 Thinking
Kimi K2 Thinking is Moonshot AI's most advanced open reasoning model to date, extending ...
Amazon: Nova Premier 1.0
Amazon Nova Premier is the most capable of Amazon's multimodal models for complex reason...
Perplexity: Sonar Pro Search
Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity...
OpenAI: gpt-oss-safeguard-20b
gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This...
NVIDIA: Nemotron Nano 12B 2 VL (free)
NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model design...
NVIDIA: Nemotron Nano 12B 2 VL
NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model design...
MiniMax: MiniMax M2
MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end cod...
Qwen: Qwen3 VL 32B Instruct
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-...
IBM: Granite 4.0 Micro
Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models ar...
OpenAI: GPT-5 Image Mini
GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini]...
Anthropic: Claude Haiku 4.5
Claude Haiku 4.5 is Anthropic's fastest and most efficient model, delivering near-fronti...
Qwen: Qwen3 VL 8B Thinking
Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal mode...
Qwen: Qwen3 VL 8B Instruct
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built...
OpenAI: GPT-5 Image
[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state...
OpenAI: o3 Deep Research
o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex,...
OpenAI: o4 Mini Deep Research
o4-mini-deep-research is OpenAI's faster, more affordable deep research model, ideal for ...
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model...
Baidu: ERNIE 4.5 21B A3B Thinking
ERNIE-4.5-21B-A3B-Thinking is Baidu's upgraded lightweight MoE model, refined to boost rea...
Qwen: Qwen3 VL 30B A3B Thinking
Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with v...
Qwen: Qwen3 VL 30B A3B Instruct
Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with v...
OpenAI: GPT-5 Pro
GPT-5 Pro is OpenAI's most advanced model, offering major improvements in reasoning, cod...
Z.AI: GLM 4.6
Compared with GLM-4.5, this generation brings several key improvements: Longer context wi...
Z.AI: GLM 4.6 (exacto)
Compared with GLM-4.5, this generation brings several key improvements: Longer context wi...
Anthropic: Claude Sonnet 4.5
Claude Sonnet 4.5 is Anthropic's most advanced Sonnet model to date, optimized for real-...
DeepSeek: DeepSeek V3.2 Exp
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an inter...
TheDrummer: Cydonia 24B V4.1
Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, pro...
Relace: Relace Apply 3
Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight ...
Google: Gemini 2.5 Flash Preview 09-2025
Gemini 2.5 Flash Preview September 2025 Checkpoint is Google's state-of-the-art workhorse ...
Google: Gemini 2.5 Flash Lite Preview 09-2025
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized...
Qwen: Qwen3 VL 235B A22B Thinking
Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with...
Qwen: Qwen3 VL 235B A22B Instruct
Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text ge...
Qwen: Qwen3 Max
Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in ...
Qwen: Qwen3 Coder Plus
Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B...
OpenAI: GPT-5 Codex
GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and codin...
DeepSeek: DeepSeek V3.1 Terminus (exacto)
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that ...
DeepSeek: DeepSeek V3.1 Terminus
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that ...
xAI: Grok 4 Fast
Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token cont...
Tongyi DeepResearch 30B A3B
Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 bi...
Qwen: Qwen3 Coder Flash
Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 ...
Qwen: Qwen3 Next 80B A3B Thinking
Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that ou...
Qwen: Qwen3 Next 80B A3B Instruct (free)
Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series op...
Qwen: Qwen3 Next 80B A3B Instruct
Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series op...
Meituan: LongCat Flash Chat
LongCat-Flash-Chat is a large-scale Mixture-of-Experts (MoE) model with 560B total paramet...
Qwen: Qwen Plus 0728
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoni...
Qwen: Qwen Plus 0728 (thinking)
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoni...
NVIDIA: Nemotron Nano 9B V2 (free)
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA,...
NVIDIA: Nemotron Nano 9B V2
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA,...
MoonshotAI: Kimi K2 0905
Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-...
MoonshotAI: Kimi K2 0905 (exacto)
Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-...
xAI: Grok Code Fast 1
Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding....
Nous: Hermes 4 70B
Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. ...
Nous: Hermes 4 405B
Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nou...
OpenAI: GPT-4o Audio
The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement ...
Mistral: Mistral Medium 3.1
Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance ...
AI21: Jamba Mini 1.7
Jamba Mini 1.7 is a compact and efficient member of the Jamba open model family, incorpora...
AI21: Jamba Large 1.7
Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in gro...
OpenAI: GPT-5 Chat
GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations ...
OpenAI: GPT-5
GPT-5 is OpenAI's most advanced model, offering major improvements in reasoning, code qu...
OpenAI: GPT-5 Mini
GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning task...
OpenAI: GPT-5 Nano
GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for develope...
OpenAI: gpt-oss-120b (free)
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model fro...
OpenAI: gpt-oss-120b
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model fro...
OpenAI: gpt-oss-120b (exacto)
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model fro...
OpenAI: gpt-oss-20b (free)
gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 ...
OpenAI: gpt-oss-20b
gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 ...
Anthropic: Claude Opus 4.1
Claude Opus 4.1 is an updated version of Anthropic's flagship model, offering improved p...
Mistral: Codestral 2508
Mistral's cutting-edge language model for coding released end of July 2025. Codestral spec...
Qwen: Qwen3 Coder 30B A3B Instruct
Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 ...
Qwen: Qwen3 30B A3B Instruct 2507
Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qw...
Z.AI: GLM 4.5
GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based application...
Z.AI: GLM 4.5 Air (free)
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-b...
Z.AI: GLM 4.5 Air
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-b...
Qwen: Qwen3 235B A22B Thinking 2507
Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) ...
Z.AI: GLM 4 32B
GLM 4 32B is a cost-effective foundation language model. It can efficiently perform compl...
Qwen: Qwen3 Coder 480B A35B (free)
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model develop...
Qwen: Qwen3 Coder 480B A35B
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model develop...
Qwen: Qwen3 Coder 480B A35B (exacto)
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model develop...
ByteDance: UI-TARS 7B
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, in...
Google: Gemini 2.5 Flash Lite
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized...
Qwen: Qwen3 235B A22B Instruct 2507
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts lang...
Switchpoint Router
Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI f...
MoonshotAI: Kimi K2 0711
Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moo...
Mistral: Devstral Medium
Devstral Medium is a high-performance code generation and agentic reasoning model develope...
Mistral: Devstral Small 1.1
Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering ...
xAI: Grok 4
Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel to...
Tencent: Hunyuan A13B Instruct
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed b...
TNG: DeepSeek R1T2 Chimera (free)
DeepSeek-TNG-R1T2-Chimera is the second-generation Chimera model from TNG Tech. It is a 67...
TNG: DeepSeek R1T2 Chimera
DeepSeek-TNG-R1T2-Chimera is the second-generation Chimera model from TNG Tech. It is a 67...
Morph: Morph V3 Large
Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accur...
Inception: Mercury
Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discre...
Mistral: Mistral Small 3.2 24B
Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimiz...
MiniMax: MiniMax M1
MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and...
Google: Gemini 2.5 Flash
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for a...
Google: Gemini 2.5 Pro
Gemini 2.5 Pro is Google's state-of-the-art AI model designed for advanced reasoning, co...
MoonshotAI: Kimi Dev 72B
Kimi-Dev-72B is an open-source large language model fine-tuned for software engineering an...
OpenAI: o3 Pro
The o-series of models are trained with reinforcement learning to think before they answer...
xAI: Grok 3 Mini
A lightweight model that thinks before responding. Fast, smart, and great for logic-based ...
xAI: Grok 3
Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise u...
Google: Gemini 2.5 Pro Preview 06-05
Gemini 2.5 Pro is Google's state-of-the-art AI model designed for advanced reasoning, co...
DeepSeek: R1 0528 (free)
May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par wi...
DeepSeek: R1 0528
May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par wi...
Anthropic: Claude Opus 4
Claude Opus 4 is benchmarked as the world's best coding model, at time of release, bring...
Anthropic: Claude Sonnet 4
Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, ex...
Mistral: Mistral Medium 3
Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver...
Google: Gemini 2.5 Pro Preview 05-06
Gemini 2.5 Pro is Google's state-of-the-art AI model designed for advanced reasoning, co...
Arcee AI: Spotlight
Spotlight is a 7-billion-parameter vision-language model derived from Qwen 2.5-V...
Arcee AI: Maestro Reasoning
Maestro Reasoning is Arcee's flagship analysis model: a 32B-parameter derivative of Q...
Arcee AI: Virtuoso Large
Virtuoso-Large is Arcee's top-tier general-purpose LLM at 72B parameters, tuned t...
Inception: Mercury Coder
Mercury Coder is the first diffusion large language model (dLLM). Applying a breakthrough ...
Meta: Llama Guard 4 12B
Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for conte...
TNG: DeepSeek R1T Chimera (free)
DeepSeek-R1T-Chimera is created by merging DeepSeek-R1 and DeepSeek-V3 (0324), combining t...
TNG: DeepSeek R1T Chimera
DeepSeek-R1T-Chimera is created by merging DeepSeek-R1 and DeepSeek-V3 (0324), combining t...
OpenAI: o4 Mini High
OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort ...
OpenAI: o3
o3 is a well-rounded and powerful model across domains. It sets a new standard for math, s...
OpenAI: o4 Mini
OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-effi...
OpenAI: GPT-4.1
GPT-4.1 is a flagship large language model optimized for advanced instruction following, r...
OpenAI: GPT-4.1 Mini
GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substa...
OpenAI: GPT-4.1 Nano
For tasks that demand low latency, GPT-4.1 nano is the fastest and cheapest model in the...
xAI: Grok 3 Mini Beta
Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that gener...
xAI: Grok 3 Beta
Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise u...
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced re...
Meta: Llama 4 Maverick
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Met...
Meta: Llama 4 Scout
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by...
DeepSeek: DeepSeek V3 0324
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the fl...
OpenAI: o1-pro
The o1 series of models are trained with reinforcement learning to think before they answe...
Mistral: Mistral Small 3.1 24B (free)
Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring...
Mistral: Mistral Small 3.1 24B
Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring...
AllenAI: Olmo 2 32B Instruct
OLMo-2 32B Instruct is a supervised instruction-finetuned variant of the OLMo-2 32B March ...
Google: Gemma 3 12B
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It ha...
Cohere: Command A
Command A is an open-weights 111B parameter model with a 256k context window focused on de...
OpenAI: GPT-4o-mini Search Preview
GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It i...
OpenAI: GPT-4o Search Preview
GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trai...
Google: Gemma 3 27B (free)
Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It ha...
Perplexity: Sonar Reasoning Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://doc...
Perplexity: Sonar Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://doc...
Perplexity: Sonar Deep Research
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthes...
Google: Gemini 2.0 Flash Lite
Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to...
Anthropic: Claude 3.7 Sonnet (thinking)
Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and...
Anthropic: Claude 3.7 Sonnet
Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and...
Llama Guard 3 8B
Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classifica...
OpenAI: o3 Mini High
OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort ...
Google: Gemini 2.0 Flash
Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gem...
AionLabs: Aion-1.0
Aion-1.0 is a multi-model system designed for high performance across various tasks, inclu...
AionLabs: Aion-1.0-Mini
Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designe...
Qwen: Qwen VL Max
Qwen VL Max is a visual understanding model with a 7500-token context length. It excels in ...
Qwen: Qwen-Turbo
Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost,...
Qwen: Qwen-Plus
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced ...
OpenAI: o3 Mini
OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, part...
DeepSeek: R1 Distill Llama 70B
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-...
MiniMax: MiniMax-01
MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image u...
DeepSeek: DeepSeek V3
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction foll...
Sao10K: Llama 3.3 Euryale 70B
Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/s...
OpenAI: o1
The latest and strongest model family from OpenAI, o1 is designed to spend more time think...
Cohere: Command R7B (12-2024)
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in Decemb...
Google: Gemini 2.0 Flash Experimental (free)
Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gem...
Meta: Llama 3.3 70B Instruct (free)
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction...
Meta: Llama 3.3 70B Instruct
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction...
Amazon: Nova Lite 1.0
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon focused on fast ...
Amazon: Nova Micro 1.0
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in t...
Amazon: Nova Pro 1.0
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combi...
OpenAI: GPT-4o (2024-11-20)
The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more na...
Mistral Large 2411
Mistral Large 2 2411 is an update of [Mistral Large 2](/mistralai/mistral-large) released ...
Mistral Large 2407
This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a ...
Mistral: Pixtral Large 2411
Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral ...
Anthropic: Claude 3.5 Haiku
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool...
Anthropic: Claude 3.5 Sonnet
New Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, a...
Mistral: Ministral 8B
Ministral 8B is an 8B parameter model featuring a unique interleaved sliding-window attent...
Mistral: Ministral 3B
Ministral 3B is a 3B parameter model optimized for on-device and edge computing. It excels...
NVIDIA: Llama 3.1 Nemotron 70B Instruct
NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and us...
Meta: Llama 3.2 3B Instruct (free)
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for adv...
Meta: Llama 3.2 3B Instruct
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for adv...
Meta: Llama 3.2 11B Vision Instruct
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle ...
Cohere: Command R (08-2024)
command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved ...
Cohere: Command R+ (08-2024)
command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) wit...
Nous: Hermes 3 405B Instruct (free)
Hermes 3 is a generalist language model with many improvements over Hermes 2, including ad...
Nous: Hermes 3 405B Instruct
Hermes 3 is a generalist language model with many improvements over Hermes 2, including ad...
OpenAI: ChatGPT-4o
OpenAI ChatGPT 4o is continually updated by OpenAI to point to the current version of GPT-...
OpenAI: GPT-4o (2024-08-06)
The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with t...
Meta: Llama 3.1 405B Instruct (free)
The highly anticipated 400B class of Llama3 is here! Clocking in at 128k context with impr...
Meta: Llama 3.1 70B Instruct
Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This ...
Mistral: Mistral Nemo
A 12B parameter model with a 128k token context length built by Mistral in collaboration w...
OpenAI: GPT-4o-mini (2024-07-18)
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting...
OpenAI: GPT-4o-mini
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting...
OpenAI: GPT-4o (2024-05-13)
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs...
OpenAI: GPT-4o
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs...
OpenAI: GPT-4o (extended)
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs...
OpenAI: GPT-4 Turbo
The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mo...
Anthropic: Claude 3 Haiku
Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsivene...
Mistral Large
This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's ...
OpenAI: GPT-4 Turbo Preview
The preview GPT-4 model with improved instruction following, JSON mode, reproducible outpu...
Auto Router
Your prompt will be processed by a meta-model and routed to one of dozens of models (see b...
OpenAI: GPT-4 Turbo (older v1106)
The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mo...