Macronix Proprietary, Personal Data (D), ECLAB

In-Storage Read-Centric Seed Location Filtering Using 3D-NAND Flash for Genome Sequence Analysis

You-Kai Zheng, Ming-Liang Wei, Hsiang-Yun Cheng, Chia-Lin Yang, Ming-Hsiang Tsai, Chia-Chun Chien, Yuan-Hao Zhong, Po-Hao Tseng, Hsiang-Pang Li

National Taiwan University, Taipei, Taiwan
Academia Sinica, Taipei, Taiwan
Macronix International Co., Ltd., Hsinchu, Taiwan


Outline

Background
Motivation
Proposed Design
Evaluation
Conclusion


Applications of Genomic Sequencing

Outbreak Tracking
Personalized Medicine
Detection of Genetic Disorders
Phylogenetic Tree


Background: Genomic Sequence Analysis Pipeline

DNA Sample -> Denature + Amplify -> Amplified DNA Fragments -> Base Calling -> DNA Reads -> Read Mapping

Read mapping: map the DNA reads produced by the sequencing machine to the reference sequences.

[Figure: example DNA reads (ACGTAGGGGGCATCATACGCATCCTTAGAACCTTTGGGAACTACT) from the sequencing machine being mapped onto a reference sequence.]


Background: Read Mapping Process

Seeding -> Pre-alignment Filtering -> Sequence Alignment

Seeding: quickly find the possible mapped positions of a read in the reference sequences (example: hash table).
Filtering: remove unrelated seeds.
Sequence Alignment: fine alignment process (dynamic programming) that maps the read to the reference base pair by base pair.

[Figure: a read (ATC.GGA) aligned against a reference (AT.CGCA).]


Background: Prior Q-Gram

Seeding -> Pre-alignment Filtering (Q-gram Filter + Chaining) -> Sequence Alignment

Convert each sequence into a vector, and check similarity between sequences through the inner product of their vectors.
Use K-length sequences (K-mers) as features to generate the binary vector.

[Figure: binary case. Sequences such as AAAAAAAAATTTTTTTTTTG... are vectorized into binary vectors (1101...) using K-mers such as AAAAATTTTT; the dot products are labeled Dot Output 1, Dot Output 2, Dot Output 3, with labels "th".]
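The Q-gram idea on this slide (vectorize each sequence with K-mers, then compare by inner product) can be illustrated with a minimal Python sketch. This is not the authors' in-storage design: the function names, the choice of k = 3, the example sequences, and the threshold of 3 are all hypothetical, and the sketch assumes the binary vector has one entry per possible K-mer over A, C, G, T, set to 1 when that K-mer occurs in the sequence.

```python
from itertools import product


def kmer_index(k):
    """Assign a vector position to every possible k-mer over ACGT."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def binary_vector(seq, k, index):
    """Binary q-gram vector: entry is 1 if that k-mer occurs in seq."""
    vec = [0] * len(index)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in index:  # skip k-mers with symbols outside ACGT
            vec[index[kmer]] = 1
    return vec


def dot(a, b):
    """Inner product; for binary vectors this counts shared distinct k-mers."""
    return sum(x * y for x, y in zip(a, b))


def passes_filter(read, ref_window, k, threshold, index):
    """Keep a candidate location only if the dot output reaches the threshold."""
    score = dot(binary_vector(read, k, index),
                binary_vector(ref_window, k, index))
    return score >= threshold, score


# Hypothetical example values, chosen only for illustration.
K = 3
THRESHOLD = 3
INDEX = kmer_index(K)
read = "AAAATTTT"
near_window = "AAAATTTG"  # differs from the read in one base
far_window = "CCCCGGGG"   # shares no 3-mer with the read

print(passes_filter(read, near_window, K, THRESHOLD, INDEX))
print(passes_filter(read, far_window, K, THRESHOLD, INDEX))
```

With these example values, the near window shares four distinct 3-mers with the read and is kept, while the far window shares none and is filtered out before the more expensive alignment step.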