A1, Zhou Yaping (周亚平): Batch Model Inference on Massive Data: New Strategies for Efficiency, Stability, and Cross-Platform Scheduling (PDF, 40 pages)
Model Batch Inference on Massive Data: Exploration and Practice of Efficient, Automated and Intelligent Solutions
Yaping Zhou
eBay

CONTENTS
01 Background Introduction
02 Batch Inference Scaling
03 Batch Inference Workflow Auto-generation and Scheduling
04 Summary and Outlook

PART ONE
Background Introduction

Background: Risk Model Batch Inference Challenge
[Diagram: Listing Quality Model Inference Workflow. Nodes: driver set; NLP driver set; Fess(odl)prci features; NLP model 1 to NLP model 5; assemble feature; lgb model 1 to lgb model 4; Search, Ads, Risk]
High Data Volume: daily 300 to 1000 M, 1200 features
Model Complexity: 4 LightGBM + 5 NLP models; high dimension
NLP inference duration: 130 hours. But e2e target duration: 1 day

Background: Status Analysis and Goals
Design and validate models with small datasets
Use dev environments (notebooks)
Inference scaling, optimize inference stacks
Workflow auto-generation, one-click deployment & cross-platform scheduling
From Prototype to Production
Focus on mo