Scaling MLOps to Retrain 50k Weekly Models in Parallel Using UDFs
Kaleb A. Lowe, PhD

Ranks by data.ai: Capturing 360 Mobile Performance
COPYRIGHT 2023 | DATA.AI

Data.ai is the premier provider of mobile marketplace data and ecosystem insights.

Mobile App KPIs
ML Model
App Ranks
Download Estimate
2024 Databricks Inc. All rights reserved

One of data.ai's cornerstone products is our best-in-class downloads estimates. Downloads estimates are produced using Ranks by data.ai, among other things. Ranks by data.ai is an ML model that uses our understanding and quantification of the mobile ecosystem to rank app performance. These ranks and downloads estimates allow our customers to benchmark their performance against their competitors.

What is MLOps?
Machine Learning Operations

According to Databricks: MLOps is “...focused on streamlining the process of taking machine learning models to production, and then maintaining and monitoring them.”

MLOps provides the benefits of:
1. Efficiency
2. Risk reduction
3. Scalability

MLOps processes are agnostic to the ML problem or even industry; lessons learned by scaling MLOps are applicable across the board.

Scaling MLOps at data.ai Presents Challenges

The Ranks by data.ai model has many features, but is itself a simple enough model. The technical difficulty is in combinatorics: we have to scale to accommodate 175 countries, multiple metrics per country, sub-models, etc.

This product requires managing model training, storage, inference, etc. every week for more than 50 thousand individual models.

How do we approach model development, training, and maintenance for 50k