HTAP Made Simple: Fallacies and Pitfalls
Rossi Sun
2023/02/07

About Me
Rossi Sun (孙若曦)
- Tech lead, Compute Arch. & Engine team, PingCAP
- Was: Tech lead, SQL on Hadoop team, Transwarp
- Was: Tech lead, GPU Arch Infra team, NVIDIA
- Over 10 years of experience in infrastructure: system-level software, database kernel & big data, GPU techniques

HTAP
- HTAP: Hybrid Transactional/Analytical Processing
- Coined by Gartner in 2014
- Real-time analytics: analytics see the exact operational data
- Simplified data infrastructure (no data movement)
- Going mainstream: 20% of database buyers cited HTAP capabilities; 40% marked HTAP as a top-3 factor

TiDB Architecture 1.0

Pitfall: Non-Scalable AP
- Auto-sharding
- Distributed transaction
- TP scales well
- AP: we can do join and aggregation, but only on a single node
- SELECT COUNT(*) FROM t0 JOIN t1 ON t0.c1 = t1.c1 WHERE t0.c2 = xxx AND t1.c2 = yyy

Introducing TiSpark (Since 2.0)

Introducing TiSpark (Cont.)

Fallacy: One Size Fits All
- The "Hybrid" what?
- Row store performs poorly for AP

Row-oriented layout:
id    name   age
0962  Jane   30
7658  John   45
3589  Jim    20
5523  Susan  52

Column-oriented layout:
id:    0962  7658  3589  5523
name:  Jane  John  Jim   Susan
age:   30    45    20    52

Fallacy: One Size Fits All (Cont.)
- TP suffers severe interference from AP
[Diagram: OLTP and OLAP]

Introducing Columnar Store (POC)

Pitfall: Unbounded Staleness

Pitfall: Complexity

Embracing Raft (Since 4.0)

Introducing MPP (Since 5.0)

Introducing MPP (Cont.)

Fallacy: Consistency & Freshness
- Use MVCC + learner read (Raft) to guarantee data consistency and freshness
- Just like the serializable isolation level
- Ideal, but at the cost of performance

Fallacy: Consistency & Freshness (Cont.)
[Diagram: the Raft Learner holds MVCC version TS:10 = $100; Raft Leader is at 42; a read at Timestamp:11 asks balance=?]

Fallacy: Consistency & Freshness (Cont.)
[Diagram: the Raft Learner holds MVCC versions TS:10 = $100, TS:11 = $200, TS:12 = $150; Raft Leader is at 44; the read at Timestamp:11 returns balance=$200]

Stale Read (Coming in 6.6)
- Bounded staleness
- Data integrity
- Gets 30% more QPS in our internal POC
[Diagram: the Raft Learner holds MVCC version TS:10 = $100, TS:11 = ?; Raft Leader is at 42; read at Timestamp:11; Staleness]
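The balance example on the Consistency & Freshness and Stale Read slides can be sketched in a few lines of Python. This is a toy model under stated assumptions, not TiDB's implementation: the `Learner` class, `learner_read`, `stale_read`, `safe_ts`, and `max_staleness` are hypothetical names, the numbers 42 to 44 are read here as Raft log indices (an assumption about the diagrams), and "waiting for replication" is simulated by applying the leader's log entries directly.

```python
from bisect import bisect_right


class Learner:
    """Toy Raft learner holding MVCC versions of a single key (hypothetical model)."""

    def __init__(self):
        self.versions = []      # (commit_ts, value), kept sorted by commit_ts
        self.applied_index = 0  # last Raft log index applied locally

    def apply(self, index, commit_ts, value):
        self.versions.append((commit_ts, value))
        self.versions.sort()
        self.applied_index = index

    @property
    def safe_ts(self):
        # Toy assumption: data up to the newest applied commit_ts is complete.
        return self.versions[-1][0] if self.versions else 0

    def snapshot_read(self, read_ts):
        """Return the newest value with commit_ts <= read_ts, or None."""
        pos = bisect_right([ts for ts, _ in self.versions], read_ts)
        return self.versions[pos - 1][1] if pos else None


def learner_read(learner, leader_log, leader_commit_index, read_ts):
    """Fresh read: catch up to the leader's commit index, then read the snapshot."""
    for index, commit_ts, value in leader_log:
        if learner.applied_index < index <= leader_commit_index:
            # Stands in for waiting until replication delivers this entry.
            learner.apply(index, commit_ts, value)
    return learner.snapshot_read(read_ts)


def stale_read(learner, read_ts, max_staleness):
    """Stale read: no waiting; read at a timestamp the learner already covers."""
    effective_ts = min(read_ts, learner.safe_ts)
    if read_ts - effective_ts > max_staleness:
        raise RuntimeError("learner is staler than the allowed bound")
    return learner.snapshot_read(effective_ts)


# Leader log as (raft_index, commit_ts, balance), following the slide values.
LEADER_LOG = [(42, 10, 100), (43, 11, 200), (44, 12, 150)]

learner = Learner()
learner.apply(42, 10, 100)                                  # learner lags at index 42
stale_balance = stale_read(learner, 11, max_staleness=5)    # served at TS 10, no wait
fresh_balance = learner_read(learner, LEADER_LOG, 44, 11)   # catches up, reads at TS 11
```

In this sketch the stale read returns 100 immediately from the TS:10 version, while the learner read first catches up to index 44 and then returns 200 for Timestamp:11. Both results are consistent snapshots; the stale one trades freshness, within the `max_staleness` bound, for not having to wait on replication.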