Enabling Macroheterogeneity through a "System-of-Systems" Approach.pdf

Uploaded by: 明**** | ID: 1011986 | 2025-12-21 | 15 pages | 1.61 MB

Samantika Sury, Larry Kaplan, Matt Turner, Eric Borch, Bob Wisniewski, Aalap Tripathy, Tushar Krishna*
Hewlett Packard Enterprise, *Georgia Institute of Technology
Enabling Macroheterogeneity through a "System-of-Systems" Approach
SERVER: AI HW SW CO-DESIGN / NIC / HPC

Macroheterogeneity enables building a system-of-systems

Data center macroheterogeneity
- Contains many loosely-coupled systems
- HPC, AI, general cloud, private cloud, and I/O as separate components with estate-like fabric across data center
- Cloud-like technologies for multi-tenancy, virtualization, and containerization

System macroheterogeneity
- Single system with main compute partition and specialized partitions tightly connected by a high-performance data and network fabric
- Couple mod-sim with AI and/or quantum accelerators to enable workflows and provide a pathway to extensible and modular systems
- High-performance fabric as a unified network across system partitions

[Figure labels: Macroheterogeneity; Main system of many nodes; Specialized accelerator(s); CPU partition for non-accelerated codes; Data partition; UEC-based network connects partitions; System; Private/Science Cloud; Data center]

Potential future system architecture framework for Hybrid HPC: HPC+AI+Quantum
- SW- and HW-defined partitions that work together as one system
- Provide a way to integrate new architecture at scale or run workflow across the system
- Tight coupling of partitions so data can be shared without needing to write to disk
- Shared system, network (e.g. HPE Slingshot), and resource management
- Challenge is to enable a tightly-coupled system of systems with high utilization

General use case categories (details later):
- Inner kernels being replaced by surrogates with inference queries within mod/sim
- AI steering of mod/sim: often in an ensemble environment; need to couple and sync a large number of jobs
- Use mod/sim to ge…

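Based on the report, the main content can be summarized as follows:
- **Macroheterogeneity**: achieved through a "system-of-systems" approach, which allows a data center to be built from multiple loosely coupled systems such as HPC, AI, and cloud.
- **System architecture**: a main compute partition and specialized partitions, connected by a high-performance data and network fabric.
- **Use cases**: climate modeling and combustion simulation, using different hardware partitions to improve efficiency and accuracy.
- **Challenge**: enabling a tightly coupled system of systems with high utilization.
- **Tools and technologies**: network and data sharing, software management, job scheduling, and performance analysis tools.
- **Simulation frameworks**: for example AstraSim, used to simulate different hardware partitions.
- **Future directions**: integrating new architectures and optimizing data sharing and performance analysis.

Key data:
- The GPU partition is 1500x faster than the CPU partition on LES.
- A scalable, modular, open-source system-level simulation framework is needed.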
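
The first use case category listed in the slides, inner kernels replaced by surrogates that are queried from within mod/sim, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and does not come from the report: `exact_kernel`, `SurrogateKernel`, and `run` are hypothetical names, the "surrogate" is a simple lookup table with linear interpolation rather than a trained AI model, and both kernels run in one process rather than on separate partitions.

```python
import bisect
import math


def exact_kernel(x: float) -> float:
    """Hypothetical stand-in for an expensive inner kernel of a simulation."""
    return math.exp(-x)


class SurrogateKernel:
    """Toy surrogate: piecewise-linear table built from exact kernel samples.

    In the setting the slides describe, this role would be played by an
    inference query to a model; the table is only a placeholder for that.
    """

    def __init__(self, lo: float, hi: float, n: int) -> None:
        self.xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
        self.ys = [exact_kernel(x) for x in self.xs]
        self.fallbacks = 0  # queries that fell outside the sampled range

    def __call__(self, x: float) -> float:
        if x < self.xs[0] or x > self.xs[-1]:
            # Outside the sampled range: fall back to the exact kernel.
            self.fallbacks += 1
            return exact_kernel(x)
        # Index of the right-hand sample of the interval containing x.
        i = min(bisect.bisect_right(self.xs, x), len(self.xs) - 1)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def run(kernel, steps: int = 10, dt: float = 0.1, x0: float = 0.0) -> float:
    """Explicit Euler loop for dx/dt = kernel(x); the kernel is pluggable."""
    x = x0
    for _ in range(steps):
        x += dt * kernel(x)
    return x


if __name__ == "__main__":
    surrogate = SurrogateKernel(lo=0.0, hi=1.0, n=11)
    print("exact    :", run(exact_kernel))
    print("surrogate:", run(surrogate))
    print("fallbacks:", surrogate.fallbacks)
```

The time-stepping loop takes the kernel as a parameter, so swapping the exact kernel for the surrogate does not change the simulation code. The fallback counter is one simple way to see how often queries leave the range the surrogate was built for.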