Xiaomi: MiMo-V2-Omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability (visual grounding, multi-step planning, tool use, and code execution), making it well suited for complex real-world tasks that span modalities. It has a 256K context window.

Context: 262144 tokens
Modality: text+image+audio+video->text
License: proprietary
