InsightPilot: Towards LLM-Empowered Automated Data Exploration
Rui Ding (丁锐), Microsoft

Speaker
Rui Ding, Principal Researcher, Microsoft

Rui Ding is a Principal Researcher in Microsoft's Data, Knowledge, and Intelligence (DKI) team. His work centers on insights in data analysis, which are essential for understanding data and making effective decisions in business and everyday life. His research focuses on two themes. The first is how to turn the concept of an insight into a computable data entity, which is the foundational problem of insight discovery (that is, detection and mining). The second is the explainability of data analysis and the important role causality plays in it, which is also the key to making insights explainable, reliable, and generalizable. His research results have been published mainly at conferences such as SIGMOD and SIGKDD. In addition, insight-related research has been transferred into a series of Microsoft products as data analysis features, including QuickInsights in Power BI, Analyze Data in Excel, and Forms Insights.

CONTENTS
1. Insight-Based Exploratory Data Analysis
2. InsightPilot: LLM-Empowered Automated Data Analysis Paradigm

PART 01
Insight-Based Exploratory Data Analysis

Outline
- The concept and formulation of insight
- The analysis space established from insight
- MetaInsight: Enriching the intension of insight
- XInsight: Expanding the extension of insight

What is Exploratory Data Analysis
Exploratory Data Analysis (EDA) is a process of analyzing data to summarize its main characteristics, for the purpose of
- Gaining knowledge from data
- Facilitating further in-depth data analysis

Importance of Interesting Pattern Discovery
- Increasingly popular in the era of big data
- Value of discovering interesting data patterns
  - Data understanding
  - Knowledge discovery
  - Further drill-down data analysis
- Common practice
  - Exploratory data analysis
  - Visual/interactive data analysis

Challenges of Interesting Pattern Discovery
- Existing work
  - Mainly focuses on dealing with individual types of interesting patterns
  - Lacks a unified formulation of "interesting patterns"
  - Lacks general mining frameworks
- The user's target is often broad or vague
  - Insights are hidden amongst subspaces across different semantic levels, e.g., Ford vs. Ford SUV vs. Ford SUV China
  - Insights are hidden amongst different measure columns, e.g., price, volume, revenue

Key problems to be solved
- Insight Definition: What are the general abstraction and tangible form of interesting patterns?
- Insi