NVIDIA MULTI-INSTANCE GPU (MIG)
Best Practices for Deep Learning
张雪萌, 杨值

#page#

AGENDA
- Introduction to MIG (Multi-Instance GPU)
- MIG management
- Kubernetes support for MIG
- MIG for deep learning: Training

#page#

MOTIVATION
Why we use Multi-Instance GPU (MIG)
- Why? To maximize GPU utilization.
- When? If your application cannot fully utilize a single GPU.
- How? Use MIG to run multiple workloads in parallel on a single A100 GPU: one GPU can serve a single user with multiple applications, or multiple users (see the sketch below).
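A minimal sketch of this pattern, assuming MIG is already enabled on the A100; the MIG device UUIDs and script names below are placeholders (list the real UUIDs with `nvidia-smi -L`). Each workload is pinned to its own MIG instance through CUDA_VISIBLE_DEVICES, so the processes run in parallel on one physical GPU.

```python
# Minimal sketch: pin each job to its own MIG instance via CUDA_VISIBLE_DEVICES.
# The MIG UUIDs and scripts below are placeholders; substitute the UUIDs
# reported by `nvidia-smi -L` and your own workloads.
import os
import subprocess

jobs = [
    ("MIG-11111111-2222-3333-4444-555555555555", ["python", "train_resnet.py"]),
    ("MIG-66666666-7777-8888-9999-000000000000", ["python", "bert_inference.py"]),
]

procs = []
for mig_uuid, cmd in jobs:
    # Each child process only sees the one MIG device named in CUDA_VISIBLE_DEVICES.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=mig_uuid)
    procs.append(subprocess.Popen(cmd, env=env))

for p in procs:
    p.wait()
```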
#page#

MULTI-INSTANCE GPU (MIG)
Optimize GPU utilization, expand access to more users with guaranteed quality of service.

[Figure: up to seven users (USER0-USER6), each mapped to their own GPU Instance on a single A100.]

- Up to 7 GPU Instances in a single A100: dedicated SMs, memory, L2 cache, and bandwidth for hardware QoS and isolation.
- Simultaneous workload execution with guaranteed quality of service: all MIG instances run in parallel with predictable throughput and latency.
- Right-sized GPU allocation: different sized MIG instances based on target workloads.
- Diverse deployment environments: supported on bare metal, Docker, Kubernetes, and virtualized environments (see the Kubernetes sketch below).
- MIG User Guide: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html
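A minimal sketch of the Kubernetes path, assuming the NVIDIA device plugin is deployed with the mixed MIG strategy so that each MIG profile is exposed as its own extended resource (e.g. nvidia.com/mig-1g.5gb); the pod name, image, and command are placeholders.

```python
# Minimal sketch: request a single 1g.5gb MIG slice for a training pod,
# assuming the NVIDIA device plugin exposes MIG profiles as extended resources
# (mixed strategy). Pod name, image, and command are placeholders.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-training-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="nvcr.io/nvidia/pytorch:21.05-py3",
                command=["python", "train.py"],
                resources=client.V1ResourceRequirements(
                    # One 1g.5gb MIG slice instead of a whole A100.
                    limits={"nvidia.com/mig-1g.5gb": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```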
#page#

MIG USE CASES
Workloads that need guaranteed QoS with low-latency response.

Single User: Multiple Apps
- A single user running multiple GPU-based applications.
- E.g. multiple inference jobs (e.g. batch size = 1); Jupyter notebooks for model exploration.

Single Tenant: Multiple Users
- Support internal user groups.
- E.g. shared resource usage.

Multiple Tenant: Multiple Users (Single or Multiple Apps)
- Ability to provide separate GPU Instances to different customers.

#page#

GPU INSTANCE (GI)
- Hardware partition of a GPU: Streaming Multiprocessors (SMs), memory, bandwidth.
- Applications run simultaneously in all GPU Instances.
- Ability to provide separate GPU Instances to small/medium/large sized workloads.
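A minimal sketch of inspecting these hardware partitions from Python, assuming the nvidia-ml-py (pynvml) package and an A100 with MIG mode enabled; it enumerates the MIG devices carved out of GPU 0 and prints the memory each one owns.

```python
# Minimal sketch: list the MIG devices on GPU 0 and the memory partition of each,
# assuming nvidia-ml-py (pynvml) is installed and MIG mode is enabled.
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

current_mode, _pending_mode = pynvml.nvmlDeviceGetMigMode(gpu)
if current_mode == pynvml.NVML_DEVICE_MIG_ENABLE:
    for i in range(pynvml.nvmlDeviceGetMaxMigDeviceCount(gpu)):
        try:
            mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(gpu, i)
        except pynvml.NVMLError_NotFound:
            continue  # no MIG device created at this index
        mem = pynvml.nvmlDeviceGetMemoryInfo(mig)
        print(pynvml.nvmlDeviceGetUUID(mig), f"{mem.total / 2**30:.1f} GiB")
else:
    print("MIG mode is not enabled on GPU 0")

pynvml.nvmlShutdown()
```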