cse 373: data structures and algorithms · hash tables: review •a data-structure for the...
![Page 1: CSE 373: Data Structures and Algorithms · Hash Tables: Review •A data-structure for the dictionary ADT •Average case O(1) find, insert, and delete (when under some often-reasonable](https://reader036.vdocuments.us/reader036/viewer/2022063018/5fdcf3745ba22426b2586f82/html5/thumbnails/1.jpg)
Instructor: Lilian de Greef
Quarter: Summer 2017
CSE 373: Data Structures and Algorithms
Lecture 7: Hash Table Collisions
Today
• Announcements
• Hash Table Collisions
• Collision Resolution Schemes
  • Separate Chaining
  • Open Addressing / Probing
    • Linear Probing
    • Quadratic Probing
    • Double Hashing
• Rehashing
Announcements
• Reminder: homework 2 due tomorrow
• Homework 3: Hash Tables
  • Will be out tomorrow night
  • Pair-programming opportunity! (work with a partner)
  • Ideas for finding partner: before/after class, section, Piazza
• Pair-programming: write code together
  • 2 people, 1 keyboard
  • One is the “navigator,” the other the “driver”
  • Regularly switch off to spend equal time in both roles
  • Side note: our brains tend to edit out when we make typos
  • Need to be in same physical space for entire assignment, so partner and plan accordingly!
Review: Hash Tables & Collisions
Hash Tables: Review
• A data structure for the dictionary ADT
• Average case O(1) find, insert, and delete (when under some often-reasonable assumptions)
• An array storing (key, value) pairs
• Use hash value and table size to calculate array index
• Hash value calculated from key using hash function

find, insert, or delete (key, value)
apply hash function h(key) = hash value
index = hash value % table size
array[index] = (key, value)
if collision, apply collision resolution
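The flow above can be sketched in a few lines of Python. This is only an illustration, assuming Python's built-in hash() stands in for h(key); the names table_index and table_size are hypothetical and do not come from the slides.

```python
def table_index(key, table_size):
    """Map a key to an array index: index = h(key) % table_size."""
    hash_value = hash(key)          # assumption: built-in hash() plays the role of h(key)
    return hash_value % table_size  # Python's % is non-negative for a positive table_size


# Small non-negative integers hash to themselves in CPython,
# so keys 12 and 22 land on the same index in a table of size 10.
print(table_index(12, 10))
print(table_index(22, 10))  # same index as 12: a collision that must be resolved
```

Two different keys producing the same index is exactly the situation the collision resolution schemes later in the lecture deal with.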
Hash Table Collisions: Review
• Collision:
• We try to avoid them by
• Unfortunately, collisions are unavoidable in practice
  • Number of possible keys >> table size
  • No perfect hash function & table-index combo
Collision Resolution Schemes: your ideas
Separate Chaining
One of several collision resolution schemes
Separate Chaining
All keys that map to the same table location (aka “bucket”) are kept in a list (“chain”).
Example: insert 10, 22, 107, 12, 42 and TableSize = 10 (for illustrative purposes, we’re inserting hash values)
Table indices: 0 1 2 3 4 5 6 7 8 9
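The example can be worked through with a small Python sketch. It assumes each chain is a Python list and that new values are appended at the end of the chain (the slides do not say whether the lecture inserts at the front or the back); chain_insert is a hypothetical helper name.

```python
def chain_insert(buckets, hash_value):
    """Add a hash value to the chain of the bucket it maps to."""
    buckets[hash_value % len(buckets)].append(hash_value)


buckets = [[] for _ in range(10)]   # TableSize = 10, one empty chain per bucket
for hash_value in [10, 22, 107, 12, 42]:
    chain_insert(buckets, hash_value)

# Print every bucket with its chain; 22, 12 and 42 share bucket 2.
for index, chain in enumerate(buckets):
    print(index, chain)
```

With front insertion instead of appending, the chain in bucket 2 would simply hold the same three values in the reverse order.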
Separate Chaining: Worst-Case
What’s the worst-case scenario for find?
What’s the worst-case running time for find?
But only with really bad luck or really bad hash function
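One way to make the worst case concrete: if every key lands in the same bucket, the table degenerates into a single list and find has to scan it. The sketch below forces that situation with a deliberately bad, hypothetical hash function (bad_hash); the comparison-counting find is likewise an illustration and not code from the lecture.

```python
def bad_hash(key):
    """Hypothetical worst-case hash function: every key gets the same hash value."""
    return 0


def find(buckets, key, hash_function):
    """Return (number of comparisons made, whether the key was found)."""
    chain = buckets[hash_function(key) % len(buckets)]
    for comparisons, stored in enumerate(chain, start=1):
        if stored == key:
            return comparisons, True
    return len(chain), False


buckets = [[] for _ in range(10)]
for key in range(100):
    buckets[bad_hash(key) % len(buckets)].append(key)

print(find(buckets, 99, bad_hash))   # the last key in the chain: 100 comparisons
print(find(buckets, -1, bad_hash))   # an absent key also costs 100 comparisons
```

With n keys in one chain, the number of comparisons grows linearly with n, which is far from the average-case O(1) promised under reasonable assumptions.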
Separate Chaining: Further Analysis
• How can find become slow when we have a good hash function?
• How can we reduce its likelihood?
Rigorous Analysis: Load Factor
Definition: The load factor (λ) of a hash table with N elements is
$\lambda = \frac{N}{\text{TableSize}}$
Under separate chaining, the average number of elements per bucket is _____
For a random find, on average
• an unsuccessful find compares against _______ items
• a successful find compares against _______ items
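The definition can be checked numerically. The sketch below assumes the chains from the earlier separate chaining example (10, 22, 107, 12, 42 in a table of size 10, appended in insertion order); load_factor is a hypothetical helper name.

```python
def load_factor(buckets):
    """Load factor = N / TableSize, where N is the number of stored elements."""
    n = sum(len(chain) for chain in buckets)
    return n / len(buckets)


# Chains after inserting 10, 22, 107, 12, 42 into a table of size 10.
buckets = [[10], [], [22, 12, 42], [], [], [], [], [107], [], []]

lam = load_factor(buckets)
average_chain_length = sum(len(chain) for chain in buckets) / len(buckets)
print(lam, average_chain_length)   # the same quantity, N / TableSize, computed twice
```

Because the mean chain length is N / TableSize by construction, it always equals the load factor, no matter how unevenly the individual chains are filled.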
Rigorous Analysis: Load Factor
Definition: The load factor (λ) of a hash table with N elements is
$\lambda = \frac{N}{\text{TableSize}}$
To choose a good load factor, what are our goals?
So for separate chaining, a good load factor is
Open Addressing / Probing
Another family of collision resolution schemes
Idea: use empty space in the table
• If h(key) is already full,
  • try (h(key) + 1) % TableSize. If full,
  • try (h(key) + 2) % TableSize. If full,
  • try (h(key) + 3) % TableSize. If full…
• Example: insert 38, 19, 8, 109, 10
Table indices: 0 1 2 3 4 5 6 7 8 9
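The example can be traced with a short Python sketch. It assumes h(key) = key (the slide inserts plain integers), uses None for an empty slot, and the name linear_probe_insert is hypothetical.

```python
def linear_probe_insert(table, key):
    """Insert key at the first free slot in (h(key) + i) % TableSize, i = 0, 1, 2, ..."""
    table_size = len(table)
    for i in range(table_size):
        index = (key + i) % table_size   # assumption: h(key) = key, as in the example
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("table is full")


table = [None] * 10
for key in [38, 19, 8, 109, 10]:
    linear_probe_insert(table, key)

print(table)   # [8, 109, 10, None, None, None, None, None, 38, 19]
```

Keys 8, 109 and 10 all wrap around past the end of the array, which is why the run of occupied slots continues from index 9 into indices 0 to 2.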
Open Addressing Terminology
Trying the next spot is called ______ (also called ______)
• We just did ______: ith probe was (h(key) + i) % TableSize
• In general have some f and use (h(key) + f(i)) % TableSize
Dictionary Operations with Open Addressing
insert finds an open table position using a probe function
What about find?
What about delete?
• Note: delete with separate chaining is plain-old list-remove
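One common way to answer both questions is lazy deletion: find follows the same probe sequence as insert and stops at a never-used slot, while delete leaves a marker behind instead of emptying the slot. The sketch below is an assumption about how this could look, not necessarily the approach taken in lecture; it uses linear probing with h(key) = key, and DELETED, probe_find and probe_delete are hypothetical names.

```python
DELETED = object()   # hypothetical tombstone marker for lazily deleted slots


def probe_find(table, key):
    """Return the index holding key, or None. Stops at the first never-used slot."""
    table_size = len(table)
    for i in range(table_size):
        index = (key + i) % table_size   # linear probing with h(key) = key (assumption)
        if table[index] is None:
            return None                  # never-used slot: key cannot be further along
        if table[index] is not DELETED and table[index] == key:
            return index
    return None


def probe_delete(table, key):
    """Lazy deletion: leave a tombstone so later probe sequences are not cut short."""
    index = probe_find(table, key)
    if index is not None:
        table[index] = DELETED
    return index is not None


table = [8, 109, 10, None, None, None, None, None, 38, 19]
probe_delete(table, 19)
print(probe_find(table, 109))   # 1: still reachable past the tombstone at index 9
```

If delete reset the slot to None instead, the search for 109 would stop at index 9 and wrongly report the key as missing.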
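Practice: The keys 12, 18, 13, 2, 3, 23, 5 and 15 are inserted into an initially empty hash table of length 10 using open addressing with hash function h(k) = k mod 10 and linear probing. What is the resultant hash table?

| Index | (A) | (B) | (C) | (D) |
|---|---|---|---|---|
| 0 | | | | |
| 1 | | | | |
| 2 | 2 | 12 | 12 | 12, 2 |
| 3 | 23 | 13 | 13 | 13, 3, 23 |
| 4 | | | 2 | |
| 5 | 15 | 5 | 3 | 5, 15 |
| 6 | | | 23 | |
| 7 | | | 5 | |
| 8 | 18 | 18 | 18 | 18 |
| 9 | | | 15 | |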
Open Addressing: Linear Probing
• Quick to compute! :)
• But mostly a bad idea. Why?
(Primary) Clustering
Linear probing tends to produce clusters, which lead to long probing sequences
• Called
• Saw this starting in our example
[R. Sedgewick]
Analysis of Linear Probing
• For any λ < 1, linear probing will find an empty slot
  • It is “safe” in this sense: no infinite loop unless table is full
• Non-trivial facts we won’t prove: Average # of probes given λ (in the limit as TableSize → ∞)
  • Unsuccessful search: $\frac{1}{2}\left(1 + \frac{1}{(1-\lambda)^2}\right)$
  • Successful search: $\frac{1}{2}\left(1 + \frac{1}{1-\lambda}\right)$
• This is pretty bad: need to leave sufficient empty space in the table to get decent performance (see chart)
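To get a feel for the numbers, the two averages on this slide, read as ½(1 + 1/(1 − λ)²) for an unsuccessful search and ½(1 + 1/(1 − λ)) for a successful one, can be evaluated at a few load factors. The function names below are hypothetical, and the values are large-table estimates, not measurements.

```python
def expected_probes_unsuccessful(load_factor):
    """Large-table estimate for linear probing: 1/2 * (1 + 1 / (1 - load_factor)^2)."""
    return 0.5 * (1 + 1 / (1 - load_factor) ** 2)


def expected_probes_successful(load_factor):
    """Large-table estimate for linear probing: 1/2 * (1 + 1 / (1 - load_factor))."""
    return 0.5 * (1 + 1 / (1 - load_factor))


# Tabulate both estimates; they are only defined for load factors below 1.
for load_factor in [0.25, 0.5, 0.75, 0.9]:
    print(load_factor,
          round(expected_probes_unsuccessful(load_factor), 2),
          round(expected_probes_successful(load_factor), 2))
```

At λ = 0.5 the estimates are 2.5 and 1.5 probes; at λ = 0.9 the unsuccessful case climbs to about 50.5 probes, which illustrates why the table needs spare room.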
Analysis: Linear Probing
• Linear-probing performance degrades rapidly as table gets full (Formula assumes “large table” but point remains)
• By comparison, chaining performance is linear in λ and has no trouble with λ > 1
[Figure: two charts titled “Linear Probing”, each plotting Average # of Probes against Load Factor (0.00 to 1.00) for the series “linear probing found” and “linear probing not found”; the first chart’s vertical axis runs from 0 to 20, the second from 0 to 350]
Any ideas for alternatives?
Open Addressing: Quadratic Probing
• We can avoid primary clustering by changing the probe function (h(key) + f(i)) % TableSize
• A common technique is quadratic probing: f(i) = i²
• So probe sequence is:
  • 0th probe: h(key) % TableSize
  • 1st probe:
  • 2nd probe:
  • 3rd probe:
  • …
  • ith probe: (h(key) + i²) % TableSize
• Intuition: Probes quickly “leave the neighborhood”
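The intuition can be seen by listing the first few probe indices for both choices of f. The sketch below is only an illustration: the hash value 7 and the table size 101 are made-up numbers, and probe_sequence is a hypothetical helper.

```python
def probe_sequence(hash_value, table_size, f, count):
    """First `count` indices visited: (h(key) + f(i)) % TableSize for i = 0, 1, ..."""
    return [(hash_value + f(i)) % table_size for i in range(count)]


linear = probe_sequence(7, 101, lambda i: i, 5)          # f(i) = i
quadratic = probe_sequence(7, 101, lambda i: i * i, 5)   # f(i) = i^2

print(linear)     # [7, 8, 9, 10, 11]: consecutive slots
print(quadratic)  # [7, 8, 11, 16, 23]: offsets 0, 1, 4, 9, 16 from the home slot
```

The quadratic offsets grow quickly, so a key that collides in a crowded region moves away from it after only a few probes.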
Quadratic Probing Example #1
TableSize = 10
Insert: 89, 18, 49, 58, 79
ith probe: (h(key) + i²) % TableSize
Table indices: 0 1 2 3 4 5 6 7 8 9
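This example can be traced with a short Python sketch, reading the insert list as 89, 18, 49, 58, 79 and assuming h(key) = key. The name quadratic_probe_insert is hypothetical, and capping the loop at TableSize probes is a choice made here for the sketch.

```python
def quadratic_probe_insert(table, key):
    """Insert key at the first free slot in (h(key) + i^2) % TableSize."""
    table_size = len(table)
    for i in range(table_size):
        index = (key + i * i) % table_size   # assumption: h(key) = key
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("no free slot found within TableSize probes")


table = [None] * 10
for key in [89, 18, 49, 58, 79]:
    quadratic_probe_insert(table, key)

print(table)   # [49, None, 58, 79, None, None, None, None, 18, 89]
```

Key 58 collides at indices 8 and 9 and then jumps to index 2 on its third probe, skipping past the occupied slot at index 0.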
Quadratic Probing Example #2
TableSize = 7
Insert:
• 76 (76 % 7 = 6)
• 40 (40 % 7 = 5)
• 48 (48 % 7 = 6)
• 5 (5 % 7 = 5)
• 55 (55 % 7 = 6)
• 47 (47 % 7 = 5)
ith probe: (h(key) + i²) % TableSize
Table indices: 0 1 2 3 4 5 6
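Working this example through in code suggests that the last key never finds a slot even though the table still has empty positions. The sketch below assumes h(key) = key and gives up after a fixed number of probes; try_quadratic_insert and max_probes are hypothetical names, and the limit of 50 is arbitrary.

```python
def try_quadratic_insert(table, key, max_probes):
    """Try (h(key) + i^2) % TableSize for i < max_probes; return the index or None."""
    table_size = len(table)
    for i in range(max_probes):
        index = (key + i * i) % table_size   # assumption: h(key) = key
        if table[index] is None:
            table[index] = key
            return index
    return None   # gave up: every probed slot was occupied


table = [None] * 7
placed = {key: try_quadratic_insert(table, key, max_probes=50)
          for key in [76, 40, 48, 5, 55, 47]}

print(table)        # [48, None, 5, 55, None, 40, 76]
print(placed[47])   # None: 47 only ever probes slots 5, 6, 2 and 0
```

The squares i² take only the values 0, 1, 2 and 4 modulo 7, so the probe sequence for 47 cycles through four occupied slots and never reaches the free slots at indices 1 and 4.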