Kimodo

NVIDIA

Kimodo is a kinematic motion diffusion model from NVIDIA's Spatial Intelligence Lab that generates high-quality 3D human and humanoid-robot motions from text prompts and kinematic constraints (keyframes, joint positions, waypoints, paths). It is trained on 700 hours of optical motion-capture (mocap) data and uses a two-stage transformer denoiser that separates root and body prediction. It supports the SOMA, Unitree G1, and SMPL-X skeletons.

Modality

text->motion

License

Apache 2.0

Open source

Yes
