Model Batch Inference on Massive Data: Exploration and Practice of Efficient, Automated, and Intelligent Solutions
Yaping Zhou
eBay

CONTENTS
01 Background Introduction
02 Batch Inference Scaling
03 Batch Inference Workflow Auto-generation and Scheduling
04 Summary and Outlook

PART ONE
Background Introduction

Background: Risk Model Batch Inference Challenge
[Figure: Listing Quality Model Inference Workflow. Nodes: driver set; NLP driver set; Fess(odl)prci features; NLP model 1 to NLP model 5; assemble feature; lgb model 1 to lgb model 4; Search; Ads; Risk]
High Data Volume: 300-1000 M daily, 1200 features
Model Complexity: 4 LightGBM + 5 NLP models; high dimension
NLP inference duration: 130 hours, but the e2e target duration is 1 day

Background: Status Analysis and Goals
Design and validate models with small datasets
Use dev environments (notebooks)
Inference scaling, optimize inference stacks
Workflow auto-generation, one-click deployment & cross-platform scheduling
From Prototype to Production
Focus on mo