蒋宇东-AniSoraQCon.pdf (37 pages)
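AniSora: Applications of Animation Video Generation Technology
Yudong Jiang (蒋宇东)

Contents
01 Problems and challenges in animation video generation
02 The AniSora technical framework
03 Deployment challenges and solutions
04 The future of intelligent animation video creation

01 Problems and challenges in animation video generation

Background: the development of video generation technology
[Timeline figure. Date markers: 2024.02, 2024.03, 2024.06, 2024.07, 2024.08, 2024.09, 2024.10, 2024.12, 2025.02, 2025.07, 2025.09. Entries, in the order they appear: U-Net, Open-Sora 0.7B, Open-Sora-Plan, Kling 1.0, Gen-3, CogVideoX-5B, Hailuo AI, Sora, Sora Turbo, Google Veo2, Kling1.6, Hunyuan-13B, WanX2.1-14B, Kling2.0, Jimeng2.0, AniSora 1.0, AniSora 2.0, AniSora 3.0, WanX2.2.]

Key video generation architectures: DiT
Key video generation architectures: MMDiT

Core technical problems in animation video generation
Animation domain: diverse art styles; real-world physics vs. animation physics; building benchmarks specific to animation.
Consistency: character consistency; scene consistency.
Controllability: multimodal guided control; motion magnitude, camera movement, motion regions, and so on.
Long video generation: quality × information density × duration; AI Director.

02 AniSora technical architecture

Overview of the AniSora technical architecture
Dataset construction: 10M high-quality animation clips.
Model training: unified training across multiple controls.
Benchmark construction: evaluation along 6 dimensions.
Reinforcement learning tuning: post-training tuning with reinforcement learning.

AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era. Yudong Jiang, Baohan Xu et al. Accepted by IJCAI'25.
Aligning Anime Video Generation with Human Feedback. Yidi Wu, Bingwen Zhu et al. Under review.
GitHub: bilibili/Index-anisora, SOTA Animation Video Generator, 2.1K stars.

Data Pipeline
[Figure. Labels: Raw Clips; Caption; Aesthetics; OCR; Optical Flow; AniSora DataSet. Panels: Distribution of Resolution and Duration; Caption Length and Frames.]

Dataset & Benchmark
(The formula symbols did not survive extraction. The notation below, including the placeholder similarity function sim, is reconstructed from the accompanying definitions.)

Visual Smoothness
$S_{\text{smooth}} = R\big(E_v(f_1, \ldots, f_N)\big)$
where $f_i$ and $N$ denote a single frame and the total number of frames, $R$ denotes the regression head, and $E_v$ denotes the vision encoder.

Visual Motion
The magnitude of primary motion in anime videos.
$S_{\text{motion}} = \operatorname{sim}\big(M(V),\, M(T_m)\big)$
where $M$ denotes the fine-tuned action model, $V$ represents the generated video, and $T_m$ denotes the designed motion prompt.

Visual Appeal
The fundamental quality of video generation.
$S_{\text{appeal}} = A\big(E(K(v_0, v_1, \ldots, v_n))\big)$
where $K$, $E$, and $A$ denote the key frame extraction method, the feature encoder method, and the aesthetic evaluation method, and $n$ denotes the number of keyframes.

Text-Video Consistency
$S_{\text{tv}} = R\big(E_v(V),\, E_t(T)\big)$
where $R$ denotes the regression head, and $E_v$, $E_t$ denote the vision and text encoders, respectively.

Image-Video Consistency = (), () where denote the part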