Using GPUs for Machine Learning Tasks in Apache Flink

郭肠泽, Alibaba / Developer
陈成超, Alibaba / Developer

01 Background

Apache Flink
- A unified batch and stream processing engine.
- Widely used in recommendation, ETL, real-time risk management, and other scenarios.

Machine learning workflow
[Figure: pipeline stages and the framework used for each: Analysis (Flink), Feature Engineering (Flink), Model Training (TensorFlow), Model Inference (TensorFlow), Model Serving (TensorFlow Serving)]

Problems
- Users do feature engineering, model training, and model prediction with two frameworks.
- Distributed programs usually run in clusters, but distributed training with TensorFlow is not user-friendly there, because the IP addresses and ports must be determined first.

Goal
[Figure: target pipeline: Analysis (Flink), Feature Engineering (Flink), Model Training (Flink), Model Inference (Flink), Model Serving (TensorFlow Serving)]

What we need?
- Add GPU support in Flink.
- Integrate Flink with TensorFlow to run machine learning tasks, with GPU support.

02 External Resource Framework

Motivation
- Supporting machine learning scenarios is one of Flink's roadmap targets.
- Machine learning models are often the performance bottleneck of the entire machine learning task. We need to accelerate this part, especially for streaming jobs.
- GPUs are widely used as accelerators in machine learning workloads.

External Resource Framework
- Supports requesting various types of resources from the underlying resource management systems.
- Supplies the information needed for using these resources to the operators.
- Different resource types can be supported in a pluggable manner. Flink provides a first-party GPU plugin.

How to use
- Enable the Exte