Zipper: Latency-Tolerant Optimizations for High-Performance Buses
Shibo Chen, Hailun Zhang, Todd Austin
University of Michigan, Ann Arbor; University of Wisconsin, Madison

Background · Motivation · Case Studies · Challenges · Zipper · Evaluation · Conclusions

Compute Offload Overhead
CPU/Host:
- Connect to the accelerator
- Store input data to shared memory
- Launch kernel through MMIO
- Read the results
Accelerator:
- Fetch operands
- Start computation
- Write back results & notify the host
[Diagram: CPU/Host, Shared Memory, and Accelerator. Labels: Memory Access, Host-Accelerator Communication, Compute Process. Two overheads of 500 ns each.]
1000 ns overhead for round-trip latency

Equation for Computing Offload Trade-offs
P/O: ratio of raw time saved over offload overhead
P/O = (raw time saved) / (offload overhead)
- P/O > 1: Beneficial to offload
- P/O ≤ 1: Not beneficial to offload

The Death Zone of Compute Offload
[Figure: axis labels "P/O" and "Locality/Parallelism"; series "Unoptimized"; locality/parallelism levels None, Low, Medium, High; regions "Beneficial to Offload" and "Do NOT Offload". Annotations: Intel 8087, first floating-point coprocessor, 200 cycles; 1000 ns.]

More Forgiving Trade-Offs with Bus Optimizations
[Figure: axis labels "Performance Gain (ns)" and "Locality/Parallelism"; series "Unoptimized" and "Zipper"; locality/parallelism levels None, Low, Medium, High; regions "New Design Space", "Beneficial to Offload", and "Do NOT Offload".]

Case Studies
Case Study #1: Sequestered Encryption Enclave + VIP-Bench
- Supports RISC-like instructions
- Computes on encrypted operands
- Runs privacy-focused algorithms
Case Study #2: Posit Hardware Kernel + NAS Parallel Benchmark
- Posit is an alternative to IEEE 754 floating point
- Supports arithmetic operations
- Runs scientific applications
[Diagrams: Host and Enclave (key); Host and Posit Kernel.]
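The P/O trade-off from the earlier slides can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' model. It assumes that "raw time saved" means host execution time minus accelerator execution time, and that the offload overhead is the 1000 ns round trip (two 500 ns overheads) shown in the overhead slide. The function names and the workload timings in the usage lines are hypothetical.

```python
# Assumption: round-trip offload overhead is the sum of the two 500 ns
# overheads shown on the "Compute Offload Overhead" slide.
ROUND_TRIP_OVERHEAD_NS = 500 + 500


def po_ratio(host_time_ns, accel_time_ns, overhead_ns=ROUND_TRIP_OVERHEAD_NS):
    """Return P/O: raw time saved divided by offload overhead.

    Assumption: raw time saved = host time - accelerator time.
    """
    if overhead_ns <= 0:
        raise ValueError("overhead_ns must be positive")
    return (host_time_ns - accel_time_ns) / overhead_ns


def should_offload(host_time_ns, accel_time_ns, overhead_ns=ROUND_TRIP_OVERHEAD_NS):
    """Offloading is beneficial only when P/O is strictly greater than 1."""
    return po_ratio(host_time_ns, accel_time_ns, overhead_ns) > 1.0


# Hypothetical workloads (timings are made up for illustration).
print(po_ratio(3000, 500), should_offload(3000, 500))  # saves 2500 ns
print(po_ratio(1200, 400), should_offload(1200, 400))  # saves 800 ns
print(po_ratio(1500, 500), should_offload(1500, 500))  # saves exactly 1000 ns
```

Under these assumptions, a kernel must save more than the full round-trip overhead before offloading pays off. Lowering the effective overhead (the `overhead_ns` argument) widens the set of kernels that clear the P/O > 1 threshold, which is the effect the bus-optimization slide describes as a new design space.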