AIM2201-S
Scaling instantly to 1000 GPUs for serverless AI inference (sponsored by Modal)
Erik Bernhardson, Modal
© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Modal is AI infrastructure
- Based in NYC, SF, Stockholm
- We power customers such as Suno, Lovable, and S

Modal is an infrastructure platform that makes it easy for developers to build and scale AI applications.

Why Modal?
Today's infrastructure is not built for AI applications.

Old infrastructure doesn't work for AI

Traditional backend workloads:
- Lots of small requests
- I/O-bound workloads on CPUs
- Fairly predictable workloads
- Infrequent deploys
- Infinite cloud capacity

AI workloads are different:
- Highly iterative
- Compute-bound workloads on GPUs
- Lots of capacity constraints
- High cost of over-provisioning
- Need to run globally

Modal is purpose-built for AI
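The "high cost of over-provisioning" point can be made concrete with a little arithmetic. The sketch below is not from the talk: the hourly demand curve is invented purely for illustration, and the function names are hypothetical. It compares the GPU-hours paid for by a fleet sized for peak demand all day against capacity that follows demand hour by hour.

```python
# Hypothetical hourly demand, in GPUs needed, over one day (24 values).
# These numbers are illustrative only; they are not from the talk.
demand = [2, 1, 1, 0, 0, 1, 3, 8, 15, 22, 30, 40,
          38, 35, 30, 28, 25, 20, 14, 10, 6, 4, 3, 2]


def fixed_gpu_hours(demand):
    """GPU-hours paid when the fleet is sized for the peak all day."""
    return max(demand) * len(demand)


def autoscaled_gpu_hours(demand):
    """GPU-hours paid when capacity follows demand hour by hour."""
    return sum(demand)


def utilization(demand):
    """Fraction of the peak-sized fleet that does useful work."""
    return autoscaled_gpu_hours(demand) / fixed_gpu_hours(demand)


if __name__ == "__main__":
    print(f"peak-provisioned: {fixed_gpu_hours(demand)} GPU-hours")
    print(f"demand-following: {autoscaled_gpu_hours(demand)} GPU-hours")
    print(f"utilization of fixed fleet: {utilization(demand):.0%}")
```

Under this made-up curve, the peak-sized fleet sits idle most of the day. Real workloads are spikier and scaling is never perfectly instantaneous, so treat the gap as an upper bound on savings rather than a prediction.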
Modal powers a wide range of workloads, for many different users, in many different verticals.

Modal is for people who run their own models or custom workflows, or who need a high degree of control over the code.

Modal is not an "AI API". Think of Modal as Kubernetes or AWS Lambda, but:
- Focused on AI/ML use cases
- Awesome developer experience
- Fully managed

This fl