Yuxuan Pan, The University of Tokyo

A Coarse- and Fine-Grained LUT Segmentation Method Enabling Single FPGA Implementation of Wired-Logic DNN Processor
Yuxuan Pan, Dongzhu Li, Mototsugu Hamada, Atsutake Kosuge
The University of Tokyo

2/6 Wired-Logic Architecture Using NNN
FPGAs are reconfigurable for AI tasks but less energy-efficient.
Wired-logic architecture using a Non-linear Neural Network (NNN):
  1. Binarize weights to +1/-1
  2. Prune unnecessary synapses
  3. Learn the non-linear activation function at each neuron individually
Low latency and high energy efficiency are achieved by eliminating memory access.
[Figure: Trainable Non-Linear Function]

3/6 Challenge of Processing Long-Bit-Width Data
The implementation of Non-Linear Functions (NLFs) consumes significant LUT resources.
The sharp increase in LUT requirements makes it impossible to implement scaled NNN models with long-bit-width data on a single FPGA.

4/6 Coarse- and Fine-Grained LUT Segmentation
Segment the input bits of the NLF into high-order and low-order parts, handled at coarse and fine granularity respectively.
Output switching is realized by monitoring the upper bits of the input signal.

5/6 Redundant Bits to Restore Accuracy
When the high-order bits have meaningful values, the information in the low-order bits is lost, which reduces accuracy on complex tasks.
Redundant bits are merged into the segmented input to improve accuracy.
[Figure panels: w/o redundant bits; w/ 2 redundant bits]

6/6 Comparison with FPGA Implementations
Keyword Spotting (GSCD-10/20): 93% saving in NLF LUTs, 1.5% accuracy drop.
15-20x higher energy efficiency than Ref. 1.

1 Mazumder, A. N., & Mohsenin, T. M., arXiv preprint arXiv:2202.02361, 2022.
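
The segmentation idea from the "Coarse-and Fine-Grained LUT Segmentation" and "Redundant Bits to Restore Accuracy" slides can be illustrated with a small software model. This is only a sketch of one plausible reading of the slides, not the authors' implementation: it assumes unsigned inputs, a fine table indexed by the low-order bits that is used while the upper bits are all zero, and a coarse table indexed by the high-order bits (optionally widened by a few redundant low-order bits) that is used otherwise. The names `build_segmented_luts`, `lookup`, `max_error`, the toy function `nlf`, and the 8-bit/4-bit split are all hypothetical choices made for illustration.

```python
import math

def build_segmented_luts(f, n_bits, lo_bits, redundant=0):
    """Build a fine table over the low-order bits and a coarse table over the
    high-order bits plus `redundant` bits borrowed from the low-order part.
    (Assumed scheme for illustration; unsigned inputs only.)"""
    assert 0 < lo_bits < n_bits and 0 <= redundant <= lo_bits
    shift = lo_bits - redundant  # number of low bits dropped on the coarse path
    fine = [f(x) for x in range(1 << lo_bits)]
    # Each coarse entry represents a bucket of 2**shift inputs; store f at the
    # bucket midpoint as its representative value.
    half = (1 << shift) >> 1
    coarse = [f((i << shift) + half) for i in range(1 << (n_bits - shift))]
    return fine, coarse, shift

def lookup(x, fine, coarse, lo_bits, shift):
    """Select the output by monitoring the upper bits of the input."""
    if x >> lo_bits == 0:      # upper bits are all zero: low bits carry everything
        return fine[x]
    return coarse[x >> shift]  # upper bits are set: low bits are (partly) dropped

def max_error(f, n_bits, lo_bits, redundant):
    """Worst-case absolute error of the segmented lookup over all inputs."""
    fine, coarse, shift = build_segmented_luts(f, n_bits, lo_bits, redundant)
    return max(abs(f(x) - lookup(x, fine, coarse, lo_bits, shift))
               for x in range(1 << n_bits))

def nlf(x):
    """Toy monotone non-linear function standing in for a trained NLF."""
    return math.tanh((x - 128) / 32)

N_BITS, LO_BITS = 8, 4
full_entries = 1 << N_BITS                                   # 256 entries
fine, coarse0, shift0 = build_segmented_luts(nlf, N_BITS, LO_BITS, 0)
_, coarse2, shift2 = build_segmented_luts(nlf, N_BITS, LO_BITS, 2)
entries_r0 = len(fine) + len(coarse0)                        # 16 + 16 = 32
entries_r2 = len(fine) + len(coarse2)                        # 16 + 64 = 80
err_r0 = max_error(nlf, N_BITS, LO_BITS, 0)
err_r2 = max_error(nlf, N_BITS, LO_BITS, 2)
```

In this toy model a single table would need 256 entries, while the segmented tables need 32 entries without redundant bits and 80 entries with two redundant bits, and `err_r2` comes out smaller than `err_r0`. That mirrors the trade-off described on the slides: segmentation shrinks the table, and redundant bits give back some of the precision lost on the coarse path. Table entries in software do not map one-to-one onto FPGA LUT primitives, so these counts should not be compared with the 93% figure reported on the results slide.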