lecture 17 joins - github pages · 2017-12-13 · how do i choose the right graduate program?...
TRANSCRIPT
Lecture17:Joins
Lecture17
GraduateSchoolInformationPanel
ShouldIattendgraduateschoolinCS?
Thursday,Nov9@3:00PM1240CS
HowdoIprepareacompetitiveapplication?
JoinusforaliveQ&AwithCSfaculty,graduatestudents,anda
graduateschooladmissionscoordinator!
HowdoIchoosetherightgraduateprogram?
Lecture17:Joins
Lecture17
Today’sLecture
1. Recap:Select,Project
2. Joins
3. JoinsandBufferManagement
4
Lecture17
1.Recap
5
Lecture17
Lecture17
LogicalPlan=How
SELECT P.buyerFROM Purchase P, Person QWHERE P.buyer=Q.nameAND Q.city=‘Madison’
SELECTSELECTcity=‘Madison’
JOINbuyer=name
PROJECTonbuyer
Purchase Person
Lecture17
PhysicalPlan=What
SELECT P.buyerFROM Purchase P, Person QWHERE P.buyer=Q.nameAND Q.city=‘Madison’
TableScan IndexScan
NestedLoopJoin
Hash-basedProject
Purchase Person
Lecture16
SelectOperator
accesspath =waytoretrievetuplesfromatable• FileScan• scantheentirefile• I/Ocost:O(N),whereN=#pages
• IndexScan:• useanindexavailableonsomepredicate• I/Ocost:itvariesdependingontheindex
Lecture16
IndexMatching
• Wesaythatanindexmatches aselectionpredicateiftheindexcanbeusedtoevaluateit• Consideraconjunction-onlyselection.Anindexmatches(partof)apredicateif• Hash:onlyequalityoperation&thepredicateincludesall indexattributes
• B+Tree:theattributesareaprefixofthesearchkey(anyopsarepossible)
Lecture16
ChoosingtheRightIndex
• Selectivity ofanaccesspath=fraction ofdatapagesthatneedtoberetrieved• Wewanttochoosethemostselectivepath!• Estimatingtheselectivityofanaccesspathisahardproblem
Lecture17
Projection
Simplecase:SELECT R.a, R.d• scanthefileandforeachtupleoutputR.a,R.d
Hardcase:SELECT DISTINCT R.a, R.d• projectouttheattributes• eliminateduplicatetuples(thisisthedifficultpart!)
Lecture17
Projection:Sort-based
Wecanimproveuponthenaïvealgorithmbymodifyingthesortingalgorithm:
1. InPass0 ofsorting,projectouttheattributes
2. Insubsequentpasses,eliminatetheduplicateswhilemergingtheruns
Lecture17
Projection:Hash-based
2-phasealgorithm:
• partitioning• projectoutattributesandsplittheinputintoB-1partitionsusingahashfunctionh
• duplicateelimination• readeachpartitionintomemoryanduseanin-memoryhashtable(withadifferent hashfunction)toremoveduplicates
2.Joins
14
Lecture17
Whatyouwilllearnaboutinthissection
1. RECAP:Joins
2. NestedLoopJoin(NLJ)
3. BlockNestedLoopJoin(BNLJ)
4. IndexNestedLoopJoin(INLJ)
15
Lecture17
1.NestedLoopJoins
16
Lecture17
Whatyouwilllearnaboutinthissection
1. RECAP:Joins
2. NestedLoopJoin(NLJ)
3. BlockNestedLoopJoin(BNLJ)
4. IndexNestedLoopJoin(INLJ)
17
Lecture17
RECAP:Joins
Lecture17>Joins
19
Joins:Example
Example: Returnsallpairsoftuplesr ∈ 𝑅, 𝑠 ∈ 𝑆suchthat𝑟. 𝐴 = 𝑠. 𝐴
A D
3 7
2 2
2 3
A B C
1 0 1
2 3 4
2 5 2
3 1 1
R SA B C D
2 3 4 2
𝐑 ⋈ 𝑺
Lecture17>Joins
SELECT R.A,B,C,DFROM R, SWHERE R.A = S.A
20
Joins:Example
Example: Returnsallpairsoftuplesr ∈ 𝑅, 𝑠 ∈ 𝑆suchthat𝑟. 𝐴 = 𝑠. 𝐴
A D
3 7
2 2
2 3
A B C
1 0 1
2 3 4
2 5 2
3 1 1
R SA B C D
2 3 4 2
2 3 4 3
𝐑 ⋈ 𝑺
Lecture17>Joins
SELECT R.A,B,C,DFROM R, SWHERE R.A = S.A
21
Joins:Example
Example: Returnsallpairsoftuplesr ∈ 𝑅, 𝑠 ∈ 𝑆suchthat𝑟. 𝐴 = 𝑠. 𝐴
A D
3 7
2 2
2 3
A B C
1 0 1
2 3 4
2 5 2
3 1 1
R SA B C D
2 3 4 2
2 3 4 3
2 5 2 2
𝐑 ⋈ 𝑺
Lecture17>Joins
SELECT R.A,B,C,DFROM R, SWHERE R.A = S.A
22
Joins:Example
Example: Returnsallpairsoftuplesr ∈ 𝑅, 𝑠 ∈ 𝑆suchthat𝑟. 𝐴 = 𝑠. 𝐴
A D
3 7
2 2
2 3
A B C
1 0 1
2 3 4
2 5 2
3 1 1
R SA B C D
2 3 4 2
2 3 4 3
2 5 2 2
2 5 2 3
𝐑 ⋈ 𝑺
Lecture17>Joins
SELECT R.A,B,C,DFROM R, SWHERE R.A = S.A
23
Joins:Example
Example: Returnsallpairsoftuplesr ∈ 𝑅, 𝑠 ∈ 𝑆suchthat𝑟. 𝐴 = 𝑠. 𝐴
A D
3 7
2 2
2 3
A B C
1 0 1
2 3 4
2 5 2
3 1 1
R SA B C D
2 3 4 2
2 3 4 3
2 5 2 2
2 5 2 3
3 1 1 7
𝐑 ⋈ 𝑺
Lecture17>Joins
SELECT R.A,B,C,DFROM R, SWHERE R.A = S.A
24
Semantically:ASubsetoftheCrossProduct
SELECT R.A,B,C,DFROM R, SWHERE R.A = S.A
Example: Returnsallpairsoftuplesr ∈ 𝑅, 𝑠 ∈ 𝑆suchthat𝑟. 𝐴 = 𝑠. 𝐴
A D
3 7
2 2
2 3
A B C
1 0 1
2 3 4
2 5 2
3 1 1
R S A B C D
2 3 4 2
2 3 4 3
2 5 2 2
2 5 2 3
3 1 1 7
×CrossProduct
Filterbyconditions(r.A =s.A)
… Canweactuallyimplementajoininthisway?
𝐑 ⋈ 𝑺
Lecture17>Joins
Notes
• Wewrite𝐑 ⋈ 𝑺 tomeanjoinRandSbyreturningalltuplepairswhereallsharedattributesareequal
• Wewrite𝐑 ⋈ 𝑺 onA tomeanjoinRandSbyreturningalltuplepairswhereattribute(s)Aareequal
• Forsimplicity,we’llconsiderjoinsontwotables andwithequalityconstraints(“equijoins”)
Howeverjoinscanmerge>2tables,andsomealgorithmsdosupportnon-equalityconstraints!
Lecture17>Joins
26
NestedLoopJoins
Lecture17>NLJ
Notes• Weareagainconsidering“IOaware”algorithms:careaboutdiskIO
• GivenarelationR,let:• T(R)=#oftuplesinR• P(R)=#ofpagesinR
• Notealsothatweomitceilingsincalculations…goodexercisetoputbackin!
Lecture17>NLJ
Recallthatweread/writeentirepageswithdiskIO
NestedLoopJoin(NLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for r in R:for s in S:if r[A] == s[A]:yield (r,s)
Lecture17>NLJ
NestedLoopJoin(NLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for r in R:for s in S:if r[A] == s[A]:yield (r,s)
Lecture17>NLJ
P(R)
1. LoopoverthetuplesinR
NotethatourIOcostisbasedonthenumberofpagesloaded,notthenumberoftuples!
Cost:
NestedLoopJoin(NLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for r in R:for s in S:if r[A] == s[A]:yield (r,s)
Lecture17>NLJ
P(R)+T(R)*P(S)
HavetoreadallofSfromdiskforeverytupleinR!
1. LoopoverthetuplesinR
2. ForeverytupleinR,loopoverallthetuplesinS
Cost:
NestedLoopJoin(NLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for r in R:for s in S:if r[A] == s[A]:yield (r,s)
Lecture17>NLJ
P(R)+T(R)*P(S)
NotethatNLJcanhandlethingsotherthanequalityconstraints…justcheckintheifstatement!
1. LoopoverthetuplesinR
2. ForeverytupleinR,loopoverallthetuplesinS
3. Checkagainstjoinconditions
Cost:
NestedLoopJoin(NLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for r in R:for s in S:if r[A] == s[A]:yield (r,s)
Lecture17>NLJ
P(R)+T(R)*P(S)+OUT
1. LoopoverthetuplesinR
2. ForeverytupleinR,loopoverallthetuplesinS
3. Checkagainstjoinconditions
4. Writeout(topage,thenwhenpagefull,todisk)
Cost:
WhatwouldOUTbeifourjoinconditionistrivial(ifTRUE)?
OUT couldbebiggerthanP(R)*P(S)…butusuallynotthatbad
NestedLoopJoin(NLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for r in R:for s in S:if r[A] == s[A]:yield (r,s)
Lecture17>NLJ
P(R)+T(R)*P(S)+OUT
WhatifR(“outer”)andS(“inner”)switched?
Cost:
P(S)+T(S)*P(R)+OUT
Outervs.innerselectionmakesahugedifference-DBMSneedstoknowwhichrelationissmaller!
IO-AwareApproach
Lecture17>BNLJ
BlockNestedLoopJoin(BNLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for each B-1 pages pr of R:
for page ps of S:for each tuple r in pr:
for each tuple s in ps:if r[A] == s[A]:
yield (r,s)
Lecture17>BNLJ
P(𝑅)
GivenB+1pagesofmemory
1. LoadinB-1pagesofRatatime(leaving1pageeachfreeforS&output)
Cost:
Note:Therecouldbesomespeeduphereduetothefactthatwe’rereadinginmultiplepagessequentiallyhoweverwe’llignorethishere!
BlockNestedLoopJoin(BNLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for each B-1 pages pr of R:
for page ps of S:for each tuple r in pr:
for each tuple s in ps:if r[A] == s[A]:
yield (r,s)
Lecture17>BNLJ
P 𝑅 +𝑃 𝑅𝐵 − 1𝑃(𝑆)
GivenB+1pagesofmemory
Note:Fastertoiterateoverthesmaller relationfirst!
1. LoadinB-1pagesofRatatime(leaving1pageeachfreeforS&output)
2. Foreach(B-1)-pagesegmentofR,loadeachpageofS
Cost:
BlockNestedLoopJoin(BNLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for each B-1 pages pr of R:
for page ps of S:for each tuple r in pr:
for each tuple s in ps:if r[A] == s[A]:
yield (r,s)
Lecture17>BNLJ
GivenB+1pagesofmemory
1. LoadinB-1pagesofRatatime(leaving1pageeachfreeforS&output)
2. Foreach(B-1)-pagesegmentofR,loadeachpageofS
3. Checkagainstthejoinconditions
BNLJcanalsohandlenon-equalityconstraints
Cost:
P 𝑅 +𝑃 𝑅𝐵 − 1𝑃(𝑆)
BlockNestedLoopJoin(BNLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:for each B-1 pages pr of R:
for page ps of S:for each tuple r in pr:
for each tuple s in ps:if r[A] == s[A]:
yield (r,s)
Lecture17>BNLJ
P 𝑅 +; <=>?
𝑃(𝑆) +OUT
GivenB+1pagesofmemory
1. LoadinB-1pagesofRatatime(leaving1pageeachfreeforS&output)
2. Foreach(B-1)-pagesegmentofR,loadeachpageofS
3. Checkagainstthejoinconditions
4. Writeout
Cost:
Again,OUT couldbebiggerthanP(R)*P(S)…butusuallynotthatbad
BNLJvs.NLJ:BenefitsofIOAware
• InBNLJ,byloadinglargerchunksofR,weminimizethenumberoffulldiskreads ofS• WeonlyreadallofSfromdiskforevery(B-1)-pagesegmentofR!• Stillthefullcross-product,butmoredoneonlyinmemory
P 𝑅 +; <=>?
𝑃(𝑆) +OUTP(R)+T(R)*P(S)+OUTNLJ BNLJ
BNLJisfasterbyroughly(=>?)@(<);(<)
!
Lecture17>BNLJ
BNLJvs.NLJ:BenefitsofIOAware
• Example:• R:500pages• S:1000pages• 100tuples/page• Wehave12pagesofmemory(B=11)
• NLJ:Cost=500+50,000*1000 =50MillionIOs~=140hours
• BNLJ:Cost=500+ABB∗?BBB?B
=50Thousand IOs~=0.14hours
Lecture17>BNLJ
Averyrealdifferencefromasmallchangeinthealgorithm!
IgnoringOUThere…
SmarterthanCross-Products
Lecture17>INLJ
SmarterthanCross-Products:FromQuadratictoNearlyLinear• Alljoinsthatcomputethefullcross-product havesomequadraticterm• Forexamplewesaw:
• Nowwe’llseesome(nearly)linearjoins:• ~O(P(R)+P(S)+OUT),whereagainOUT couldbequadraticbutisusuallybetter
P 𝑅 +𝑷 𝑹=>?
𝑷(𝑺) +OUT
P(R)+T(R)P(S)+OUTNLJ
BNLJ
Wegetthisgainbytakingadvantageofstructure- movingtoequalityconstraints(“equijoin”)only!
Lecture17>INLJ
IndexNestedLoopJoin(INLJ)
Compute R ⋈ 𝑆𝑜𝑛𝐴:Given index idx on S.A:
for r in R:s in idx(r[A]):
yield r,s
Lecture17>INLJ
P(R)+T(R)*L+OUT
àWecanuseanindex (e.g.B+Tree)toavoiddoingthefullcross-product!
whereListheIOcosttoaccessallthedistinctvaluesintheindex;assumingthesefitononepage,L~3 isgoodest.
Cost:
3.JoinsandMemory
44
Lecture17
Whatyouwilllearnaboutinthissection
1. Sort-MergeJoin(SMJ)
2. HashJoin(HJ)
3. SMJvs.HJ
45
Lecture17
Sort-MergeJoin(SMJ)
46
Lecture17
Whatyouwilllearnaboutinthissection
1. Sort-MergeJoin
2. “Backup”&TotalCost
3. Optimizations
47
Lecture17
SortMergeJoin(SMJ):BasicProcedure
TocomputeR ⋈ 𝑆𝑜𝑛𝐴:
1. SortR,SonAusingexternalmergesort
2. Scan sortedfilesand“merge”
3. [Mayneedto“backup”- seenextsubsection]
NotethatifR,SarealreadysortedonA,SMJwillbeawesome!
Lecture17>SMJ
Notethatweareonlyconsideringequalityjoinconditionshere
SMJExample:R ⋈ 𝑆𝑜𝑛𝐴with3pagebuffer
• Forsimplicity:Leteachpagebeonetuple,andletthefirstvaluebeA
DiskMainMemory
BufferR (5,b) (3,j)(0,a)
S (7,f) (0,j)(3,g)
WeshowthefileHEAD,whichisthenextvaluetoberead!
Lecture17>SMJ
SMJExample:R ⋈ 𝑆𝑜𝑛𝐴with3pagebuffer
1.SorttherelationsR,Sonthejoinkey(firstvalue)
DiskMainMemory
BufferR (5,b) (3,j)(0,a)
S (7,f) (0,j)(3,g)
(3,j) (5,b)(0,a)
(3,g) (7,f)(0,j)
Lecture17>SMJ
SMJExample:R ⋈ 𝑆𝑜𝑛𝐴with3pagebuffer
2.Scanand“merge”onjoinkey!
DiskMainMemory
BufferR
S (3,g) (7,f)
(3,j) (5,b)
Output
(0,j)
(0,a)(0,a)
(0,j)
Lecture17>SMJ
SMJExample:R ⋈ 𝑆𝑜𝑛𝐴with3pagebuffer
2.Scanand“merge”onjoinkey!
DiskMainMemory
BufferR
S (3,g) (7,f)
(3,j) (5,b)
Output
(0,j)(0,a)
(0,a)
(0,j)(0,a,j)
Lecture17>SMJ
SMJExample:R ⋈ 𝑆𝑜𝑛𝐴with3pagebuffer
2.Scanand“merge”onjoinkey!
DiskMainMemory
BufferR
S (3,g) (7,f)
(3,j) (5,b)
Output
(0,a)
(0,j)
(0,a,j)
(3,j,g)
(3,j)
(3,g)
(5,b)
(7,f)
Lecture17>SMJ
SMJExample:R ⋈ 𝑆𝑜𝑛𝐴with3pagebuffer
2.Done!
DiskMainMemory
BufferR
S 3,g 7,f
3,j 5,b
Output
(0,a)
(0,j)
(0,a,j)
(3,j)
(3,g)
(3,j,g)
(5,b)
(7,f)
Lecture17>SMJ
Whathappenswithduplicatejoinkeys?
Lecture17>Backup
MultipletupleswithSameJoinKey:“Backup”
1.Startwithsortedrelations,andbeginscan/merge…
DiskMainMemory
BufferR
S 3,g 7,f
3,j 5,b
Output
(0,j)
(0,g)
(0,b)
(7,f)
(0,a)
(0,j)
(0,a)
(0,j)
Lecture17>Backup
1.Startwithsortedrelations,andbeginscan/merge…
DiskMainMemory
BufferR
S 3,g 7,f
3,j 5,b
Output
(0,j)
(0,g)
(0,b)
(7,f)
(0,a)
(0,a)(0,j)
(0,j) (0,a,j)
MultipletupleswithSameJoinKey:“Backup”
Lecture17>Backup
1.Startwithsortedrelations,andbeginscan/merge…
DiskMainMemory
BufferR
S (0,g) 7,f
(0,j) 5,b
Output
(0,b)
(7,f)
(0,a)
(0,a)(0,j)
(0,a,j)
(0,a,g)
MultipletupleswithSameJoinKey:“Backup”
(0,g)
(0,j)
Lecture17>Backup
1.Startwithsortedrelations,andbeginscan/merge…
DiskMainMemory
BufferR
S 0,g 7,f
0,j 5,b
Output
(0,j) (0,b)
(7,f)
(0,a)
(0,a,j)
(0,g)
(0,a,g)
(0,j)
Haveto“backup”inthescanofSandreadtuplewe’vealreadyread!
(0,j)
MultipletupleswithSameJoinKey:“Backup”
(0,j)
Lecture17>Backup
Backup
• Atbest,nobackupà scantakesP(R)+P(S)reads• Forex:ifnoduplicatevaluesinjoinattribute
• Atworst(e.g.fullbackupeachtime),scancouldtakeP(R)*P(S)reads!• Forex:ifallduplicatevaluesinjoinattribute,i.e.alltuplesinRandShavethesamevalueforthejoinattribute
• Roughly:ForeachpageofR,we’llhavetobackupandreadeachpageofS…
• Oftennotthatbadhowever,pluswecan:• Leavemoredatainbuffer(forlargerbuffers)• Can“zig-zag”(seeanimation)
Lecture17>Backup
SMJ:Totalcost
• CostofSMJiscostofsorting RandS…
• Plusthecostofscanning:~P(R)+P(S)• Becauseofbackup:inworstcaseP(R)*P(S);butthiswouldbeveryunlikely
• Plusthecostofwritingout:~P(R)+P(S)butinworstcaseT(R)*T(S)
~Sort(P(R))+Sort(P(S))+P(R)+P(S) +OUT
Recall:Sort(N)≈ 2𝑁 log=𝑵
𝟐(𝑩R𝟏)+ 1
Note:thisisusingrepacking,whereweestimatethatwecancreateinitialrunsoflength~2(B+1)
Lecture17>Backup
SMJvs.BNLJ:SteelCageMatch
• Ifwehave100bufferpages,P(R)= 1000pagesandP(S)=500pages:• Sortbothintwopasses:2*2*1000+2*2*500=6,000IOs• Mergephase1000+500=1,500IOs• =7,500IOs+OUT
WhatisBNLJ?• 500+1000* ABB
TU=6,500IOs+OUT
• But,ifwehave35bufferpages?• SortMergehassamebehavior(still2passes)• BNLJ?15,500IOs+OUT!
SMJis~linearvs.BNLJisquadratic…Butit’sallaboutthememory.
Lecture17>Backup
ASimpleOptimization:MergesMerged!
• SMJiscomposedofasortphaseandamergephase
• Duringthesortphase,runpassesofexternalmergesortonRandS• Supposeatsomepoint,RandShave<=B(sorted)runsintotal
• Wecoulddotwomerges(foreachofR&S)atthispoint,completethesortphase,andstartthemergephase…
• OR,wecouldcombinethem:doone B-waymergeandcompletethejoin!
GivenB+1bufferpages
Lecture17>Backup
Merge/JoinPhase
SortPhase(Ext.MergeSort)
Un-OptimizedSMJ
SR
Split&sortSplit&sort
MergeMerge
MergeMerge
GivenB+1bufferpages
Joinedoutputfilecreated!
Unsortedinputrelations
Lecture17>Backup
Merge/JoinPhase
SortPhase(Ext.MergeSort)
SimpleSMJOptimization
SR
Split&sortSplit&sort
MergeMerge
GivenB+1bufferpages
Joinedoutputfilecreated!
Unsortedinputrelations
<=Btotalruns
B-WayMerge/Join
Lecture17>Backup
SimpleSMJOptimization
• Now,onthislastpass,weonlydoP(R)+P(S)IOstocompletethejoin!
• IfwecaninitiallysplitRandSintoBtotalrunseachoflengthapprox.<=2(B+1),assumingrepackingletsuscreateinitialrunsof~2(B+1)- thenweonlyneed3(P(R)+P(S))+OUT forSMJ!• 2R/Wperpagetosortrunsinmemory,1RperpagetoB-waymerge/join!
• Howmuchmemoryforthistohappen?• ; < R;(V)
=≤ 2 𝐵 + 1 ⇒ ~P R + P S ≤ 2𝐵Z
• Thus,𝐦𝐚𝐱{𝐏 𝐑 , 𝐏 𝐒 } ≤ 𝑩𝟐 isanapproximatesufficientcondition
GivenB+1bufferpages
IfthelargerofR,Shas<=B2 pages,thenSMJcosts3(P(R)+P(S))+OUT!
Lecture17>Backup
TakeawaypointsfromSMJ
Ifinputalreadysortedonjoinkey,skipthesorts.• SMJisbasicallylinear.• Nastybutunlikelycase:Manyduplicatejoinkeys.
SMJneedstosortbothrelations• Ifmax{P(R),P(S)}<B2 thencostis3(P(R)+P(S))+OUT
Lecture17>Summary
HashJoin(HJ)
68
Lecture17
Whatyouwilllearnaboutinthissection
1. HashJoin
2. Memoryrequirements
69
Lecture17
Recall:Hashing
• Magicofhashing:• AhashfunctionhB mapsinto[0,B-1]• Andmapsnearlyuniformly
• Ahashcollision iswhenx!=ybuthB(x)=hB(y)• Notehoweverthatitwillnever occurthatx=ybuthB(x)!=hB(y)
• WehashonanattributeA,soourhasfunctionishB(t)hastheformhB(t.A).• Collisionsmaybemorefrequent.
Lecture17>HJ
HashJoin:High-levelprocedure
TocomputeR ⋈ 𝑆𝑜𝑛𝐴:
1. PartitionPhase:Usingone(shared)hashfunctionhB,partitionRandSintoB buckets
2. MatchingPhase:Takepairsofbucketswhosetupleshavethesamevaluesforh,andjointhese1. UseBNLJhere;orhashagainà eitherway,operatingonsmallpartitionsso
fast!
Noteagainthatweareonlyconsideringequalityconstraintshere
Wedecompose theproblemusinghB,thencompletethejoin
Lecture17>HJ
HashJoin:High-levelprocedure
1.PartitionPhase:Usingone(shared)hashfunctionhB,partitionRandSintoB buckets
Disk
R
S
(3,j)(0,j)
(0,a)(0,a)
(3,b)
(5,b)(0,a)(0,j)
Disk
R1
S1
hB
S2
R2
Moredetailinasecond…
(0,a)(0,a)
(0,j)
(3,j)(3,b)
(0,a)(0,j)
(5,b)(5,b)
Noteournewconvention:pageseachhavetwotuples(oneperrow)
Lecture17>HJ
HashJoin:High-levelprocedure
Disk
R
S
(3,j)(0,j)
(0,a)(0,a)
(3,b)
(5,b)(0,a)(0,j)
Disk
R1
S1
hB
S2
R2
(0,a)(0,a)
(0,j)
(0,a)(0,j)
(5,b)(5,b)
Joinmatchingbuckets
2.MatchingPhase:TakepairsofbucketswhosetupleshavethesamevaluesforhB,andjointhese
Lecture17>HJ
(3,j)(3,b)
HashJoin:High-levelprocedure
Disk
R
S
(3,j)(0,j)
(0,a)(0,a)
(3,b)
(5,b)(0,a)(0,j)
Disk
R1
S1
hB
S2
R2
(0,a)(0,a)
(0,j)
(0,a)(0,j)
(5,b)(5,b)
Don’thavetojointheothers!E.g.(S1 andR2)!
2.MatchingPhase:TakepairsofbucketswhosetupleshavethesamevaluesforhB,andjointhese
Lecture17>HJ
(3,j)(3,b)
HashJoinPhase1:Partitioning
Goal:Foreachrelation,partitionrelationintobuckets suchthatifhB(t.A)=hB(t’.A)theyareinthesamebucket
GivenB+1bufferpages,wepartitionintoBbuckets:• WeuseBbufferpagesforoutput(oneforeachbucket),and1forinput
• The“dual”ofsorting.• Foreachtupletininput,copytobufferpageforhB(t.A)• Whenpagefillsup,flushtodisk.
Lecture17>HJ
Howbigaretheresultingbuckets?
• GivenNinputpages,wepartitionintoBbuckets:• à Ideallyourbucketsareeachofsize~N/Bpages
• Whathappensiftherearehashcollisions?• Bucketscouldbe>N/B• We’lldoseveralpasses…
• Whathappensifthereareduplicatejoinkeys?• Nothingwecandohere…couldhavesomeskew insizeofthebuckets
GivenB+1buffer pages
Lecture17>HJ
Howbigdowewant theresultingbuckets?
• Ideally,ourbucketswouldbeofsize≤ 𝑩− 𝟏 pages• 1 forinputpage, 1foroutputpage,B-1 foreachbucket
• Recall:IfwewanttojoinabucketfromRandonefromS,wecandoBNLJinlineartimeifforoneofthem(wlog sayR),𝑷(𝑹) ≤ 𝑩 − 𝟏!• Andmoregenerally,beingabletofitbucketinmemoryisadvantageous
• Wecankeeppartitioningbucketsthatare>B-1pages,untiltheyare≤ 𝑩− 𝟏 pages• Usinganewhashkeywhichwillsplitthem… We’llcalleachofthese
a“pass”again…
GivenB+1buffer pages
RecallforBNLJ:
P 𝑅 +𝑃 𝑅 𝑃(𝑆)𝐵 − 1
Lecture17>HJ
WepartitionintoB=2 bucketsusinghashfunctionh2 sothatwecanhaveonebufferpageforeachpartition(andoneforinput)
HashJoinPhase1:Partitioning
Disk
R
(3,j)(0,j)
GivenB+1=3buffer pages
(5,b) (5,a)(0,j)
(0,a)(3,a)
Forsimplicity,we’lllookatpartitioningoneofthetworelations- wejustdothesamefortheotherrelation!
Recall:ourgoalwillbetogetB=2buckets ofsize<=B-1à 1pageeach
Lecture17>HJ
1.WereadpagesfromRintothe“input”pageofthebuffer…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(3,j)(0,j)
(5,b) (5,a)(0,j)
(0,a)(3,a)
Lecture17>HJ
2.Thenweusehashfunctionh2 tosortintothebuckets,whicheachhaveonepageinthebuffer
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(3,a)
h2(0)=0
(0,a)(3,a)
(0,a)
(3,j)(0,j)
(5,b) (5,a)(0,j)
Lecture17>HJ
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(3,a)
h2(3)=1
(0,a) (3,a)
(3,j)(0,j)
(5,b) (5,a)(0,j)
2.Thenweusehashfunctionh2 tosortintothebuckets,whicheachhaveonepageinthebuffer
Lecture17>HJ
3.Werepeatuntilthebufferbucketpagesarefull…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(0,a) (3,a)
(3,j)(0,j)
(5,b) (5,a)(0,j)
Lecture17>HJ
3.Werepeatuntilthebufferbucketpagesarefull…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(0,a) (3,a)(0,j)
(5,b) (5,a)(0,j)
h2(3)=1
(3,j)(0,j)
(3,a)(3,j)
Lecture17>HJ
3.Werepeatuntilthebufferbucketpagesarefull…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(0,a) (3,a)(0,j)
(5,b) (5,a)(0,j)
h2(0)=0
(3,a)(3,j)
(0,a)(0,j)
Lecture17>HJ
3.Werepeatuntilthebufferbucketpagesarefull…thenflushtodisk
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(5,b) (5,a)(0,j)
B0
B1
(3,a)(3,j)
(0,a)(0,j)
Lecture17>HJ
3.Werepeatuntilthebufferbucketpagesarefull…thenflushtodisk
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(5,b)
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(5,a)(0,j)
Lecture17>HJ
Notethatcollisionscanoccur!
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(5,b)
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j)
h2(5)=1
Collision!!!
(5,a)(0,j)
(5,a)
Lecture17>HJ
h2(5)=h2(3)=1
Finishthispass…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
(5,b)
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j)
h2(0)=0
(5,a)(0,j)
Lecture17>HJ
Finishthispass…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(5,a)(0,j)
(5,b)
Lecture17>HJ
Finishthispass…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(5,a)(0,j)(5,b)
h2(5)=1
(5,a)(5,b)
Lecture17>HJ
h2(5)=h2(3)=1
Collision!!!
Finishthispass…
MainMemory
Buffer
HashJoinPhase1:Partitioning
Inputpage
0 1
Output(bucket)pages
Disk
R
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j) (5,a)(5,b)
Lecture17>HJ
HashJoinPhase1:Partitioning
Disk
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j)
(5,a)(5,b)
WewantedbucketsofsizeB-1=1…howeverwegotlargeronesdueto:
(1)Duplicatejoinkeys
(2)Hashcollisions
Lecture17>HJ
HashJoinPhase1:Partitioning
Disk
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j)
(5,a)(5,b)
Totakecareoflargerbucketscausedby(2)hashcollisions,wecanjustdoanotherpass!
Whathashfunctionshouldweuse?
Doanotherpasswithadifferenthashfunction,h’2,ideallysuchthat:
h’2(3)!=h’2(5)
Lecture17>HJ
HashJoinPhase1:Partitioning
Disk
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j)
Totakecareoflargerbucketscausedby(2)hashcollisions,wecanjustdoanotherpass!
Whathashfunctionshouldweuse?
Doanotherpasswithadifferenthashfunction,h’2,ideallysuchthat:
h’2(3)!=h’2(5)
B2(5,a)(5,b)
Lecture17>HJ
HashJoinPhase1:Partitioning
Disk
GivenB+1=3buffer pages
B0
B1
(0,a)(0,j)
(3,a)(3,j)
(0,j)
Whataboutduplicatejoinkeys?Unfortunatelythisisaproblem…butusuallynotahugeone.
B2(5,a)(5,b)
Wecallthisunevennessinthebucketsizeskew
Lecture17>HJ
NowthatwehavepartitionedRandS…
Lecture17>HJ
HashJoinPhase2:Matching
• Now,wejustjoinpairsofbucketsfromRandSthathavethesamehashvaluetocompletethejoin!
Disk
R
S
(3,j)(0,j)
(0,a)(0,a)
(3,b)
(5,b)(0,a)(0,j)
Disk
R1
S1
hB
S2
R2
(0,a)(0,a)
(0,j)
(0,a)(0,j)
(5,b)(5,b)
Joinmatchingbuckets
Lecture17>HJ
(3,j)(3,b)
HashJoinPhase2:Matching
• Notethatsincex=yà h(x)=h(y),weonlyneedtoconsiderpairsofbuckets(onefromR,onefromS)thathavethesamehashfunctionvalue
• Ifourbucketsare~𝑩 − 𝟏 pages, canjoineachsuchpairusingBNLJinlineartime;recall(withP(R)=B-1):
BNLJCost: P 𝑅 +; < ;(V)=>?
= 𝑃(𝑅) + (=>?);(V)=>?
=P(R)+P(S)
Joiningthepairsofbucketsislinear!(Aslongassmallerbucket<=B-1pages)
Lecture17>HJ
HashJoinPhase2:Matching
h(1)=0
h(1)=0
h(2)=0
h(3)=1
h(3)=1
h(4)=1
h(5)=2
h(6)=2
h(6)=2
h(1) h(1) h(2) h(2) h(3) h(4) h(5) h(6) h(6)
R.Ahashedvalues
S.Ahashedvalues
R ⋈ 𝑆𝑜𝑛𝐴
Lecture17>HJ
HashJoinPhase2:Matching
h(1)=0
h(1)=0
h(2)=0
h(3)=1
h(3)=1
h(4)=1
h(5)=2
h(6)=2
h(6)=2
h(1) h(1) h(2) h(2) h(3) h(4) h(5) h(6) h(6)
R.Ahashedvalues
S.Ahashedvalues
R ⋈ 𝑆𝑜𝑛𝐴
Toperformthejoin,weideallyjustneedtoexplorethedarkblueregions
=thetupleswithsamevaluesofthejoinkeyA
A=1
A=2
A=3
A=4A=5
A=6
Lecture17>HJ
HashJoinPhase2:Matching
h(1)=0
h(1)=0
h(2)=0
h(3)=1
h(3)=1
h(4)=1
h(5)=2
h(6)=2
h(6)=2
h(1) h(1) h(2) h(2) h(3) h(4) h(5) h(6) h(6)
R.Ahashedvalues
S.Ahashedvalues
R ⋈ 𝑆𝑜𝑛𝐴
WithajoinalgorithmlikeBNLJthatdoesn’ttakeadvantageofequijoinstructure,we’dhavetoexplorethiswholegrid!
Lecture17>HJ
HashJoinPhase2:Matching
h(1)=0
h(1)=0
h(2)=0
h(3)=1
h(3)=1
h(4)=1
h(5)=2
h(6)=2
h(6)=2
h(1) h(1) h(2) h(2) h(3) h(4) h(5) h(6) h(6)
R.Ahashedvalues
S.Ahashedvalues
R ⋈ 𝑆𝑜𝑛𝐴h(A)=0
h(A)=1
h(A)=2
WithHJ,weonlyexploretheblueregions
=thetupleswithsamevaluesofh(A)!
WecanapplyBNLJtoeachoftheseregions
Lecture17>HJ
HashJoinPhase2:Matching
R.Ahashedvalues
S.Ahashedvalues
R ⋈ 𝑆𝑜𝑛𝐴h'(A)=0
h'(A)=2
AnalternativetoapplyingBNLJ:
Wecouldalsohashagain,andkeepdoingpassesinmemorytoreducefurther!
h'(A)=1
h'(A)=3 h'(A)
=4
h'(A)=5
Lecture17>HJ
HowmuchmemorydoweneedforHJ?
• GivenB+1bufferpages
• Suppose(reasonably)thatwecanpartitionintoBbucketsin2passes:• ForR,wegetBbucketsofsize~P(R)/B• Tojointhesebucketsinlineartime,weneedthesebucketstofitinB-1pages,sowehave:
+WLOG:AssumeP(R)<=P(S)
𝐵 − 1 ≥𝑃 𝑅𝐵 ⇒ ~𝑩𝟐 ≥ 𝑷(𝑹)
Quadraticrelationshipbetweensmallerrelation’ssize&memory!
Lecture17>Memoryrequirements
HashJoinSummary
• Givenenoughbufferpagesasonpreviousslide…
• Partitioning requiresreading+writingeachpageofR,S• à 2(P(R)+P(S))IOs
• Matching (withBNLJ)requiresreadingeachpageofR,S• à P(R)+P(S)IOs
• Writingoutresults couldbeasbadasP(R)*P(S)…butprobablyclosertoP(R)+P(S)
HJtakes~3(P(R)+P(S))+OUT IOs!
Lecture17>Memoryrequirements
SMJvs.HJ
Lecture17
Sort-Mergev.HashJoin
• Givenenoughmemory,bothSMJandHJhaveperformance:
• “Enough”memory=
• SMJ:B2 >max{P(R),P(S)}
• HJ:B2 >min{P(R),P(S)}
HashJoinsuperiorifrelationsizesdiffergreatly.Why?
~3(P(R)+P(S))+OUT
Lecture17
FurtherComparisonsofHashandSortJoins
• HashJoinsarehighlyparallelizable.
• Sort-Mergelesssensitivetodataskew andresultissorted
Lecture17
Summary
• SawIO-awarejoinalgorithms• Massivedifference
• Memorysizeskeyinhashversussortjoin• HashJoin=Littledog(dependsonsmallerrelation)
• Skewisalsoamajorfactor
Lecture17