
Efficient and Scalable GenAI on Kubernetes with AI Blueprints [PAN1798].pdf

Uploaded by: Fl****zo | ID: 971019 | 2025-11-08 | 17 pages | 1.17 MB

Compute-Efficient and Scaled GenAI on Kubernetes with OCI AI Blueprints
Vishnu Kammari, Principal Product Manager, OCI
Dennis Kennetz, Sr. Machine Learning Engineer, OCI

Agenda
1. Enterprise Pain-Points
2. Best Practices
3. OCI Solutions & Demo
4. Customer Stories
5. Panel Discussion

Enterprises self-host LLMs on GPUs for a variety of reasons.

Copyright 2025, Oracle and/or its affiliates | Confidential: Internal/Restricted/Highly Restricted

Security & Compliance
- Keep sensitive data in-house.
- Meet regulatory or contractual obligations (e.g. healthcare, public sector).

Customization & Control
- Fine-tune models with proprietary data.
- Avoid API rate limits.
- Control over model upgrades.

Performance & Cost Efficiency
- Deploy close to enterprise data sources.
- Minimize latency.
- Reduce per-token costs at scale.

When enterprises self-host LLMs, driving compute-efficiency and scale introduces 3 key challenges.
- Software and Framework Choices
- Integration, MLOps, and Infra Monitoring
- Onboarding & Infra Choices

Challenge #1: Enterprises spend months right-sizing and configuring infrastructure to ensure performance and compliance.

Onboarding & Infra Choices
- Optimize network setup and integrate storage (e.g. local NVMe, object storage, and Oracle network file storage service integration with tiering) to minimize latency
- Estimate the right number and size of GPUs (e.g. H100 vs H200 for inference workloads)
- Auto-provision RDMA networking for clustered GPU nodes (e.g. Llama-405B on 2 H100 nodes)
- Configure secure GPU access and compliance (e.g. IAM policies, network security rules)
- Install GPU drivers, CUDA, and other libraries while avoiding compatibility/performance issues

Challenge #2: Ente…
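
The GPU sizing item under Challenge #1 (e.g. Llama-405B on 2 H100 nodes) can be sanity-checked with back-of-envelope arithmetic. The sketch below is not from the slides: it assumes 16-bit weights (2 bytes per parameter), a 20% memory allowance for KV cache and activations, and 8 GPUs per node, all of which are illustrative assumptions. The per-GPU memory figures (80 GB for H100, 141 GB for H200) are the published capacities of those parts.

```python
import math

# Published per-GPU memory capacities in GB.
GPU_MEMORY_GB = {"H100": 80, "H200": 141}


def gpus_needed(params_billion, bytes_per_param, gpu, overhead=0.2):
    """Estimate the GPU count needed to hold a model for inference.

    overhead is an assumed fractional allowance for KV cache and
    activations on top of the weights; it is not a measured value.
    """
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    weights_gb = params_billion * bytes_per_param
    total_gb = weights_gb * (1 + overhead)
    return math.ceil(total_gb / GPU_MEMORY_GB[gpu])


def nodes_needed(num_gpus, gpus_per_node=8):
    """Round a GPU count up to whole nodes (8 GPUs per node is assumed)."""
    return math.ceil(num_gpus / gpus_per_node)


if __name__ == "__main__":
    for gpu in ("H100", "H200"):
        n = gpus_needed(405, 2, gpu)  # 405B parameters, 16-bit weights
        print(gpu, "GPUs:", n, "nodes:", nodes_needed(n))
```

Under these assumptions a 405B-parameter model lands on two 8-GPU H100 nodes, which matches the slide's example, while the larger H200 memory would fit it on a single node. Real deployments usually round up to a full node and should be validated with actual benchmarks.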

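Based on the report, its main content can be summarized as follows:
- **Challenges of enterprise self-hosted LLMs**: Enterprises that self-host LLMs face challenges in software selection, model and data handling, and performance monitoring.
- **The OCI AI Blueprints solution**: OCI AI Blueprints provides infrastructure automation, a single pane of glass for observability, pre-configured blueprints, and automated deployment to address these challenges.
- **Key points**:
  - Reasons enterprises self-host LLMs include security, compliance, customization, and performance and cost efficiency.
  - Three major challenges: infrastructure configuration, software framework choices, and performance monitoring.
  - OCI AI Blueprints offers solutions covering network optimization, GPU selection, software frameworks, model compatibility, MLOps, and monitoring tools.
  - Core data: for example, with OCI AI Blueprints, enterprises can shorten LLM inference benchmarking from months to weeks.
  - Case studies: an eCommerce enterprise uses OCI AI Blueprints for LLM inference benchmarking and efficient model serving; Etisalat uses Llama Stack and OCI AI Blueprints to deploy a customer support chatbot.

"Enterprise self-hosted LLMs: challenges and solutions" "Secrets of efficient AI deployment" "AI Blueprints power enterprise intelligence upgrades"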