Report preview

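TsinghuaNLP: 2025 MiniCPM-V 4.5 Technical Report: Deconstructing the Recipe for Building a New Generation of Efficient On-Device Multimodal Models (English version) (26 pages).pdf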

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data and Training Recipes

Tianyu Yu, Zefan Wang, Chongyi Wang, Fuwei Huang, Wenshuo Ma, Zhihui He, Tianchi Cai, Weize Chen, Yuxiang Huang, Yuanqian Zhao, Bokai Xu, Junbo Cui, Yingjing Xu, Liqing Ruan, Luoyuan Zhang, Hanyu Liu, Jingkun Tang, Hongyuan Liu, Qining Guo, Wenhao Hu, Bingxiang He, Jie Zhou, Jie Cai, Ji Qi, Zonghao Guo, Chi Chen, Guoyang Zeng, Yuxuan Li, Ganqu Cui, Ning Ding, Xu Han, Yuan Yao, Zhiyuan Liu, Maosong Sun

MiniCPM-V Team, OpenBMB

MiniCPM-V 4.5 Code
MiniCPM-V 4.5 Model

Abstract

Multimodal Large Language Models (MLLMs) are undergoing rapid progress and represent the frontier of AI development. However, their training and inference efficiency have emerged as a core bottleneck in making MLLMs more accessible and scalable. To address the challenges, we present MiniCPM-V 4.5, an 8B parameter model designed for high efficiency and strong performance. We introduce three core improvements in model architecture, data strategy and training method: a unified 3D-Resampler model architecture for highly compact encoding over images and videos, a unified learning paradigm for document knowledge and text recognition without heavy data engineering, and a hybrid reinforcement learning strategy for proficiency in both short and long reasoning modes. Comprehensive experimental results in OpenCompass evaluation show that MiniCPM-V 4.5 surpasses widely used proprietary models such as GPT-4o-latest, and significantly larger open-source models such as Qwen2.5-VL 72B. Notably, the strong performance is achieved with remarkable efficiency. For example, on the widely adopted VideoMME benchmark, MiniCPM-V 4.5 achieves state-of-the-art performance among models under 30B size, using just 46.7% GPU memory cost and 8.7% inference time of Qwen2.5-VL 7B.

1 Introduction

Multimodal Large Language Models (MLLMs) [1, 2, 3, 4, 5, 6, 7] are
