xiaomi

Xiaomi: MiMo-V2-Omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

Input Cost: $0.40 per 1M tokens
Output Cost: $2.00 per 1M tokens
Context Window: 262,144 tokens
Developer ID: xiaomi/mimo-v2-omni
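
As a rough illustration of how the listed rates translate into a per-request cost, here is a minimal Python sketch. It assumes billing is linear in token count at the listed prices and that input and output tokens share the 262,144-token context window; neither assumption is confirmed by this listing, and how image, video, or audio inputs are counted as tokens is not stated here. The function names `estimate_cost` and `fits_context` are hypothetical helpers, not part of any provider API.

```python
from decimal import Decimal

# Listed rates for xiaomi/mimo-v2-omni, in USD per 1M tokens.
INPUT_USD_PER_M = Decimal("0.40")
OUTPUT_USD_PER_M = Decimal("2.00")
CONTEXT_WINDOW = 262_144  # tokens

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (assumes simple linear billing)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / ONE_MILLION


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the window.

    Assumption: input and output tokens count against the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    # Example: 100,000 input tokens and 5,000 output tokens.
    print(estimate_cost(100_000, 5_000))
    print(fits_context(100_000, 5_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error when summed across many requests.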

Related Models

xiaomi
$1.00/1M

Xiaomi: MiMo-V2-Pro

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and ...

1,048,576 ctx
xiaomi
$0.09/1M

Xiaomi: MiMo-V2-Flash

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mix...

262,144 ctx
google
$0.14/1M

Google: Gemma 4 31B

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and...

262,144 ctx