2025-1-7

Cosmos World Foundation Model Platform for Physical AI

NVIDIA

Abstract

Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained world foundation models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society, we make our platform open-source and our models open-weight with permissive licenses available via NVIDIA Cosmos.

1. Introduction

Physical AI is an AI system equipped with sensors and actuators: the sensors allow it to observe the world, and the actuators allow it to interact with and modify the world. It holds the promise of freeing human workers from physical tasks that are dangerous, laborious, or tedious. While several fields of AI have advanced significantly thanks to data and compute scaling in the recent decade, Physical AI only inches forward. This is largely because scaling training data for Physical AI is much more challenging, as the desired data must contain sequences of interleaved observations and actions. These actions perturb the physical world and may cause severe damage to the system and the world. This is especially true when the AI is still in its infancy when exploratory actions are essential. A World Foundation Model (WFM), a digital twin of the physical world that a Physical AI can safely interact with, has been a long-sought