nvidia

NVIDIA: Llama 3.1 Nemotron Ultra 253B v1

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural...

Input Cost: $0.60 per 1M tokens
Output Cost: $1.80 per 1M tokens
Context Window: 131,072 tokens
Developer ID: nvidia/llama-3.1-nemotron-ultra-253b-v1
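
As a rough illustration of how the per-token prices listed above translate into a per-request cost, here is a minimal sketch. The helper names `estimate_cost` and `fits_context` are hypothetical and not part of any provider API, and the sketch assumes that the context window bounds prompt and completion tokens combined, which the listing does not state.

```python
# Figures taken from the listing above (USD per 1M tokens, window in tokens).
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 1.80
CONTEXT_WINDOW = 131_072


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check the request against the window.

    Assumption: prompt and completion tokens share the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_WINDOW


# Example: a 100,000-token prompt with a 2,000-token completion.
cost = estimate_cost(100_000, 2_000)
print(f"${cost:.4f}")  # $0.0636
```

Because output tokens are priced at three times the input rate, long completions dominate the bill much sooner than long prompts do.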

Related Models

nvidia
Free/1M

NVIDIA: Nemotron 3 Super (free)

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B par...

πŸ“ 262,144 ctx Compare →
nvidia
$0.05/1M

NVIDIA: Nemotron 3 Nano 30B A3B

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficien...

πŸ“ 262,144 ctx Compare →
nvidia
Free/1M

NVIDIA: Nemotron 3 Nano 30B A3B (free)

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficien...

πŸ“ 256,000 ctx Compare →