1、2024 Databricks Inc.All rights reserved1OPTIMIZING OPTIMIZING MERGE MERGE PERFORMANCE PERFORMANCE using Liquid Clusteringusing Liquid ClusteringBart SamwelBart SamwelJune 11,2024June 11,20242024 Databricks Inc.All rights reservedBart SamwelBart SamwelPrincipal Software Engineer Databricks Amsterdam2
2、ABOUT MEABOUT MELook whos talking!Look whos talking!2024 Databricks Inc.All rights reserved2024 Databricks Inc.All rights reserved3WHATWHATISISA AMERGE?MERGE?2024 Databricks Inc.All rights reservedMERGE is the workhorse of ETL.Incremental ETL(e.g.UPSERTs)Change Data CaptureChange Data Capture(CDC)re
3、playe.g.APPLY CHANGES INTODelta Live TablesMaterialized Views4What is a MERGE?What is a MERGE?And why do I care?And why do I care?2024 Databricks Inc.All rights reservedMERGE INTO MyTargetTable AS tUSING MySourceTable AS sON s.ColX=t.ColX AND s.ColY=t.ColYWHEN MATCHED AND s.action=updateTHEN UPDATE
4、SET*WHEN MATCHED AND s.action=deleteTHEN DELETEWHEN NOT MATCHEDTHEN INSERT*5MERGE ExampleMERGE Example2024 Databricks Inc.All rights reservedMERGE INTO MyTargetTable AS tINTO MyTargetTable AS tUSING MySourceTable AS sUSING MySourceTable AS sON s.ColX=t.ColX AND s.ColY=t.ColYWHEN MATCHED AND s.Action
5、=updateTHEN UPDATE SET*WHEN MATCHED AND s.Action=deleteTHEN DELETEWHEN NOT MATCHEDTHEN INSERT*6MERGE ExampleMERGE ExampleColXColYValZ1320Foo4737Bar6380BazMyTargetTableColXColYValZAction4737Pebblesupdate6380Bazdelete7787BambamMySourceTable2024 Databricks Inc.All rights reservedMERGE INTO MyTargetTabl
6、e AS tUSING MySourceTable AS sON s.ColX=t.ColX AND s.ColY=t.ColYs.ColX=t.ColX AND s.ColY=t.ColYWHEN MATCHED AND s.Action=updateTHEN UPDATE SET*WHEN MATCHED AND s.Action=deleteTHEN DELETEWHEN NOT MATCHEDTHEN INSERT*7MERGE ExampleMERGE ExampleColXColYValZ1320Foo4737Bar6380BazMyTargetTableColXColYValZA