Efficient Deployment of Large Language Models on Resource-Constrained Edge Computing Platforms.pdf


2025-05-01

PDF, 44 pages, 12.78 MB


Efficient Deployment of Large Language Models on Resource Constrained Edge Computing Platforms
Yiyu Shi, Ph.D.
Professor, Dept. of Computer Science and Engineering
Site Director, NSF I/UCRC on Alternative and Sustainable Intelligent Computing
University of Notre Dame
yshi4@nd.edu
2/16/25

The Success of Large Language Models
Chemistry, Medicine, Math, Business Analytics
Hosted on Cluster
"As models scale, they approach or surpass task-specific baselines, showing promise as universal systems for natural language understanding" - By Scaling Law from OpenAI

LLM is powerful, but...
Vision: An LLM hosted on a cluster can achieve many tasks, but is compromised by certain concerns:
Offline: Internet is unavailable/unstable, but real-time reaction is required (suicide detection, auto-drive)
Data Privacy: Medical history, personal information
AI Centralization (Fairness): Only large corps can own models, data, and computational resources (clusters)
Customization: The LLM needs to adapt to users with distinct situations

Edge-based LLM can be a solution
"Data in local"
"Model weights in local"
"Free from Internet"
"Customize the LLM via local data"
An LLM deployed on the edge device can avoid these concerns. Microsoft's Phi model has successfully demonstrated the power of edge-friendly LLMs.

Gap Between LLM and Edge Devices
LLMs are growing much faster than edge devices are being upgraded.
Challenges:
Computation complexity
Memory capacity
Energy efficiency

A Successful Edge LLM should be able to
Tradeoff: Use resources wisely among model weights and user data during training/inference
Personalization: Generate user-preferred/related responses
Robustness: Continuously growing performance over experience; handle out-of-distribution scenarios

Build Up Efficient LLM on Edge Devices
Edge LLM
