Towards Efficient Data Parallelism on Spatial CGRA via Constraint Satisfaction and Graph Coloring

Presenter: Xuchen Gao
Institution: Fudan University, C
Email:
Authors: Yuan Dai, Xuchen Gao*, et al.

Contents
  Introduction
  Problem Formulation
  Proposed Methodology
  Experiment
  Conclusion

Background
The Workflow of Coarse-Grained Reconfigurable Architecture (CGRA)
  Application code:
    for (int i = 0; i < n; ++i) c[i] = a[i] + b[i];
  Compiler: DFG Generation, Scheduling, Mapping
  An example of CGRA: Memory, Switches, Processing Elements (PE); arrays a[0:n], b[0:n], c[0:n]

Motivation: Motivation Example
  Motivating example code
  DFG of example code
  Invalid scheme with =2 and =32
  Valid scheme with =4 and =16

Definitions
  Affine Address Access
  Access Pattern
  Control Step Access Pattern
  q() memory accesses at cycle
  Memory Partition
    (): bank function
    (): offset function
  Iteration Distance

Target
  Access II
  Iteration Count Vector
  Given an -level loop in the iteration domain with affine memory accesses on the same array =11, find the and such that:
    Minimize in & hold the equation

End-to-End Framework: Hardware
  The SoC supports parallel memory access
  Enhanced IOB to access different banks during execution
  Specific mechanism for loading/storing data from/to external memory

End-to-End Framework: Software
  Software flow with the proposed memory partition algorithm

CSP formulation
  Constraint Satisfaction Problem (CSP)
  Iteration Constraint (pre-mapping): Before mapping, all the accesses in the are regarded in the same iteration
  Bank Const