xiaomi

Xiaomi: MiMo-V2-Omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

Input Cost: $0.40 per 1M tokens

Output Cost: $2.00 per 1M tokens

Context Window: 262,144 tokens
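
As a rough illustration of how the listed rates translate into a per-request cost, the sketch below applies the input and output prices and the context window shown above. It assumes billing is linear in token count and that input and output tokens both count toward the context window; neither assumption is confirmed by this listing. The function names `estimate_cost_usd` and `fits_context` are hypothetical helpers, not part of any provider API.

```python
# Rates and limit copied from the listing above.
INPUT_USD_PER_MILLION = 0.40
OUTPUT_USD_PER_MILLION = 2.00
CONTEXT_WINDOW_TOKENS = 262_144


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming simple linear per-token billing."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: input and output tokens share one context window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    cost = estimate_cost_usd(200_000, 4_000)
    print(f"Estimated cost: ${cost:.4f}")      # Estimated cost: $0.0880
    print("Fits context:", fits_context(200_000, 4_000))
```

Under these assumptions, a request with 200,000 input tokens and 4,000 output tokens comes to about $0.088, most of it from the input side despite the lower input rate.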

Developer ID: xiaomi/mimo-v2-omni

Related Models

Xiaomi: MiMo-V2-Pro
Provider: xiaomi
Price: $1.00/1M
Context: 1,048,576 tokens
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and ...

Xiaomi: MiMo-V2-Flash
Provider: xiaomi
Price: $0.09/1M
Context: 262,144 tokens
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mix...

Google: Gemma 4 31B
Provider: google
Price: $0.14/1M
Context: 262,144 tokens
Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and...