![Page 1: 308-203A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000](https://reader035.vdocuments.us/reader035/viewer/2022062222/5697bfbd1a28abf838ca22a1/html5/thumbnails/1.jpg)
308-203A Introduction to Computing II
Lecture 11: Hashtables
Fall Session 2000
Dictionary
An abstract class that defines data structures which support:
• void put(Object key, Object value)
• Object get(Object key)
• void remove(Object key)
Implementations for Dictionary?
If we use an unsorted linked list:
• put: O(1)
• get: O(n)
• remove: O(n)
Naïve solution: must search all possibilities
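As a rough illustration of these costs, here is a minimal sketch of a dictionary backed by an unsorted linked list. The names `SimpleDictionary` and `ListDictionary` are hypothetical (they are not from the lecture or the JDK), and the sketch assumes keys are non-null and that `put` is never called with a key that is already stored, which is what lets it skip the search.

```java
// SimpleDictionary and ListDictionary are hypothetical names for this sketch.
abstract class SimpleDictionary {
    abstract void put(Object key, Object value);
    abstract Object get(Object key);
    abstract void remove(Object key);
}

class ListDictionary extends SimpleDictionary {
    private static final class Node {
        final Object key;
        final Object value;
        Node next;

        Node(Object key, Object value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node head;

    // O(1): prepend without searching (assumes the key is not already stored).
    @Override
    void put(Object key, Object value) {
        head = new Node(key, value, head);
    }

    // O(n): walk the list until the key is found.
    @Override
    Object get(Object key) {
        for (Node n = head; n != null; n = n.next) {
            if (n.key.equals(key)) return n.value;
        }
        return null;
    }

    // O(n): find the first node with this key, then unlink it.
    @Override
    void remove(Object key) {
        Node prev = null;
        for (Node n = head; n != null; prev = n, n = n.next) {
            if (n.key.equals(key)) {
                if (prev == null) head = n.next; else prev.next = n.next;
                return;
            }
        }
    }
}
```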
Implementations for Dictionary?
If we use a binary search tree (assume depth = d):
• put: O(d)
• get: O(d)
• remove: O(d)
Good when the tree is balanced (d = O(log n)), but an unbalanced tree can have d as large as n…
Implementations for Dictionary?
If we use a heap:
• put: O(log n)
• get: O(n)
• remove: O(n)
Insert is easy, but finding arbitrary elements is hard…
Implementations for Dictionary?
If we use a sorted array:
• put: O(n)
• get: O(log n)
• remove: O(n)
Binary search is easy, but lots of copying is needed
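The trade-off can be sketched as follows. This is only an illustration under simplifying assumptions: keys are `int`s rather than arbitrary objects, and `SortedArrayDictionary` is a hypothetical name. `get` uses binary search, while `put` and `remove` shift array elements with `System.arraycopy`, which is where the O(n) copying comes from.

```java
import java.util.Arrays;

// Hypothetical sketch: int keys kept in sorted order in parallel arrays.
class SortedArrayDictionary {
    private int[] keys = new int[8];
    private Object[] values = new Object[8];
    private int size = 0;

    // Binary search over the used prefix: index of key if present,
    // otherwise -(insertionPoint) - 1.
    private int find(int key) {
        return Arrays.binarySearch(keys, 0, size, key);
    }

    // O(log n)
    Object get(int key) {
        int i = find(key);
        return i >= 0 ? values[i] : null;
    }

    // O(n): everything after the insertion point is shifted right.
    void put(int key, Object value) {
        int i = find(key);
        if (i >= 0) {
            values[i] = value;
            return;
        }
        int at = -i - 1;
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, 2 * size);
            values = Arrays.copyOf(values, 2 * size);
        }
        System.arraycopy(keys, at, keys, at + 1, size - at);
        System.arraycopy(values, at, values, at + 1, size - at);
        keys[at] = key;
        values[at] = value;
        size++;
    }

    // O(n): everything after the removed entry is shifted left.
    void remove(int key) {
        int i = find(key);
        if (i < 0) return;
        System.arraycopy(keys, i + 1, keys, i, size - i - 1);
        System.arraycopy(values, i + 1, values, i, size - i - 1);
        size--;
        values[size] = null;
    }
}
```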
Implementations for Dictionary?
If we use an array with enough space for every possible key (not realistic):
• put: O(1)
• get: O(1)
• remove: O(1)
All operations are quick and easy, but this requires enormous (i.e. infinite) memory
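For a small, bounded key range the idea is easy to write down. The sketch below assumes keys are integers in the range 0 to universeSize - 1; `DirectAddressTable` is a hypothetical name. The array needs one slot per possible key, which is exactly what makes the approach unrealistic for large key spaces.

```java
// Hypothetical sketch: one array slot per possible key.
class DirectAddressTable {
    private final Object[] slots;

    // universeSize is the number of distinct keys that may ever occur.
    DirectAddressTable(int universeSize) {
        slots = new Object[universeSize];
    }

    void put(int key, Object value) { slots[key] = value; }   // O(1)

    Object get(int key) { return slots[key]; }                // O(1)

    void remove(int key) { slots[key] = null; }               // O(1)
}
```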
Hashtables
We can try to patch this “perfect solution” so that it is feasible.
The “Perfect” Solution
If we had an infinitely large array in which each key had its own slot, every access would be O(1) [and we would waste a lot of space on null pointers]
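[Figure: an array indexed 1, 2, 3, 4, …, j-1, j, j+1, j+2, …, with the entry for key 3 stored in slot 3 and the entry for key j stored in slot j]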
Hash Function
Definition: A hash function is a function which maps keys to a finite range of integers, called hashcodes:
f: keys → { 0, 1, …, m-1 }
Example
Let the keys be non-negative integers: { 0, 1, … }
Let the hash function be f(x) = x mod 7
For the keys (4, 15, 26):
• f(4) = 4
• f(15) = 1
• f(26) = 5
Example
With keys that are non-negative integers and the hash function f(x) = x mod 7, the keys (4, 15, 26) hash to f(4) = 4, f(15) = 1 and f(26) = 5.
They fit in an array of size 7:

| slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|------|---|---|---|---|---|---|---|
| key  |   | 15 |   |   | 4 | 26 |   |
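The example can be reproduced with a few lines of Java. This is a minimal sketch; `ModHashDemo`, `M` and `hash` are names chosen for illustration, and it assumes non-negative keys with no collisions (as in this example).

```java
import java.util.Arrays;

// Hypothetical demo of f(x) = x mod 7 for non-negative integer keys.
class ModHashDemo {
    static final int M = 7; // number of slots

    static int hash(int key) {
        return key % M; // valid for non-negative keys
    }

    public static void main(String[] args) {
        int[] keys = {4, 15, 26};
        Integer[] table = new Integer[M];
        for (int k : keys) {
            table[hash(k)] = k; // no collisions among these three keys
        }
        // prints [null, 15, null, null, 4, 26, null]
        System.out.println(Arrays.toString(table));
    }
}
```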
Collisions
Problem: when two or more keys hash to the same slot, a collision occurs.
Open-Addressing
• A simple way to handle collisions
• When a collision occurs, look for an empty slot elsewhere
• Some elements may end up in the slot corresponding to a different hashcode
Linear Probing
Find an alternative slot after a collision by stepping sequentially through the slots, for example:

| slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|------|---|---|---|---|---|---|---|
| key  |   | 15 |   |   | 4 | 26 |   |

Insert 18: f(18) = 18 mod 7 = 4
Collision in slot 4!
Linear Probing
Insert 18: f(18) = 18 mod 7 = 4, but slot 4 is occupied.

| slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|------|---|---|---|---|---|---|---|
| key  |   | 15 |   |   | 4 | 26 |   |

Step to the next slot: slot 5 is also taken
Linear Probing
Insert 18: f(18) = 18 mod 7 = 4; slots 4 and 5 are occupied.
Slot 6 is free, so 18 is stored there:

| slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|------|---|---|---|---|---|---|---|
| key  |   | 15 |   |   | 4 | 26 | 18 |
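The probing steps above can be sketched in Java as follows. `LinearProbingTable` is a hypothetical class that stores non-negative `int` keys only; it wraps around at the end of the array and deliberately omits deletion, which in open addressing needs extra care (for example, marker entries) that the slides do not cover.

```java
// Hypothetical sketch of open addressing with linear probing.
class LinearProbingTable {
    private final Integer[] slots;

    LinearProbingTable(int m) {
        slots = new Integer[m];
    }

    private int hash(int key) {
        return key % slots.length; // assumes non-negative keys
    }

    // Stores the key and returns the slot index where it ended up.
    int insert(int key) {
        int start = hash(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length; // step sequentially, wrapping around
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
        }
        throw new IllegalStateException("table is full");
    }

    // Follows the same probe sequence; an empty slot means the key is absent.
    boolean contains(int key) {
        int start = hash(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length;
            if (slots[i] == null) return false;
            if (slots[i] == key) return true;
        }
        return false;
    }
}
```

With a table of size 7, inserting 4, 15 and 26 and then 18 reproduces the walk above: 18 hashes to slot 4, skips the occupied slots 4 and 5, and lands in slot 6.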
Disadvantages
• In open-addressing, the table can fill up: with n elements and m slots, we must have n < m for an insertion to succeed
• Linear probing leads to “primary clustering”: a run of filled slots is more likely to receive further collisions, so runs tend to grow
• Although best-case access is O(1), worst-case access is O(m)
Chaining
A (Better) Solution to Collisions:
Use the flexibility of the linked list, but only when needed, i.e. within a single slot where collisions may occur.
Example (chaining)
| slot  | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|-------|---|---|---|---|---|---|---|
| chain |   | 15 |   |   | 4 | 26 |   |

Insert 39 into the previous hashtable:
Example (chaining)
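f(39) = 39 mod 7 = 4: collision. 39 is linked into the chain of slot 4:

| slot  | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|-------|---|---|---|---|---|---|---|
| chain |   | 15 |   |   | 4 → 39 | 26 |   |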
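A minimal sketch of chaining, assuming non-negative `int` keys and using `java.util.LinkedList` for the per-slot chains. `ChainedTable` and `chainAt` are hypothetical names introduced for this illustration; new keys are appended to the end of the chain, matching the picture above.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of a hashtable that resolves collisions by chaining.
class ChainedTable {
    private final List<LinkedList<Integer>> slots = new ArrayList<>();

    ChainedTable(int m) {
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<>()); // one (initially empty) chain per slot
        }
    }

    private int hash(int key) {
        return key % slots.size(); // assumes non-negative keys
    }

    void insert(int key) {
        LinkedList<Integer> chain = slots.get(hash(key));
        if (!chain.contains(key)) {
            chain.addLast(key);
        }
    }

    boolean contains(int key) {
        return slots.get(hash(key)).contains(key);
    }

    void remove(int key) {
        // Integer.valueOf selects remove(Object) rather than remove(int index).
        slots.get(hash(key)).remove(Integer.valueOf(key));
    }

    // Exposed only so the example can inspect a chain.
    List<Integer> chainAt(int slot) {
        return slots.get(slot);
    }
}
```

After inserting 15, 4, 26 and 39 into a table with 7 slots, `chainAt(4)` holds 4 followed by 39.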
Worst-Case
If all elements hash to the same entry, the table degenerates into a single linked list.
Therefore put, get and remove are O(n) in the worst case.
Best-Case
[Figure: slots 0 to 6, with the elements distributed equally among the slots]
Best Case
Definition: The load factor α for a hashtable with n elements hashed into m slots is the average number of elements per slot:
α = n / m
Best Case
If every slot contains α = n / m elements (uniformly distributed hashing):
put, get and remove are O(α)
Best Case
If every slot contains α = n / m elements (uniformly distributed hashing), put, get and remove are O(α).
If the number of slots is allowed to grow in proportion to the number of elements, i.e. m = Θ(n):
α = n / m = n / Θ(n) = Θ(1)
so put, get and remove are O(1)
Average-Case
A more realistic analysis involves determining the statistics of the data and how well it will be hashed.
Example: hashing Olympic years by f(x) = x mod 4 would be a bad idea: the Games are held every four years, so every key would hash to the same slot.
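A quick experiment makes the point concrete. The sketch below counts how many keys land in each slot for a given table size; `OlympicHashDemo` and `slotCounts` are hypothetical names, and the sample years (1980 to 2000 in steps of four) are chosen only for illustration.

```java
import java.util.Arrays;

// Hypothetical demo: how evenly does f(x) = x mod m spread a set of keys?
class OlympicHashDemo {
    // Returns, for each slot, the number of keys that hash to it.
    static int[] slotCounts(int[] keys, int m) {
        int[] counts = new int[m];
        for (int k : keys) {
            counts[k % m]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] years = {1980, 1984, 1988, 1992, 1996, 2000};
        // m = 4: every year is a multiple of 4, so all keys share slot 0.
        System.out.println(Arrays.toString(slotCounts(years, 4))); // [6, 0, 0, 0]
        // m = 7: the same keys spread over six different slots.
        System.out.println(Arrays.toString(slotCounts(years, 7))); // [1, 1, 0, 1, 1, 1, 1]
    }
}
```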
Java Hashtable Class
• Constructor:
Hashtable(int initialCapacity, float loadFactor)
• Default: initialCapacity = 101, loadFactor = 0.75f (JDK 1.4 and later use a default initialCapacity of 11)
• Collision resolution with chaining
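A short usage sketch of the two-argument constructor with the values quoted above. `HashtableDemo` and `build` are hypothetical names, and the generic type parameters are a later addition to the language than this lecture; the raw `Hashtable` works the same way with `Object` keys and values.

```java
import java.util.Hashtable;

// Hypothetical usage example for java.util.Hashtable.
class HashtableDemo {
    static Hashtable<String, Integer> build() {
        // Explicit initial capacity and load factor.
        Hashtable<String, Integer> ages = new Hashtable<>(101, 0.75f);
        ages.put("alice", 30);
        ages.put("bob", 25);
        ages.remove("bob");
        return ages;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> ages = build();
        System.out.println(ages.get("alice")); // 30
        System.out.println(ages.get("bob"));   // null, because bob was removed
    }
}
```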
Java Hashtable Class
Keys can be objects of any class provided the following are appropriately defined:
• hashCode(): defined in java.lang.Object
• equals(): assumed to be defined for the keys
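As an illustration of what "appropriately defined" can look like, here is a sketch of a key class that overrides both methods consistently, so that two equal keys always produce the same hashcode. `GridPoint` is a hypothetical class, and the multiplier 31 is just one common choice for combining fields.

```java
// Hypothetical key class with consistent equals() and hashCode().
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Equal points must return equal hashcodes, so the hash is
    // computed from exactly the fields that equals() compares.
    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}
```

With these definitions, a value stored in a `Hashtable` under `new GridPoint(1, 2)` can be retrieved with a different but equal `GridPoint(1, 2)` instance. Without the `hashCode()` override, the two instances would usually hash to different slots and the lookup would fail.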
Java Hashtable Class
Hashtables grow multiplicatively:
• put() checks whether the hashtable contains more than (loadFactor × m) elements and, if so, grows the table: m → 2m + 1
• Hashtables only grow, never shrink, no matter how many elements you delete
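The growth rule can be simulated without touching the real class. The sketch below follows the rule as stated on this slide (grow when the element count exceeds loadFactor × m); it is not the JDK source, and `GrowthDemo` and `finalCapacity` are hypothetical names.

```java
// Hypothetical simulation of the m -> 2m + 1 growth rule described above.
class GrowthDemo {
    // Returns the capacity after the given number of insertions of distinct keys.
    static int finalCapacity(int initialCapacity, float loadFactor, int inserts) {
        int m = initialCapacity;
        for (int count = 1; count <= inserts; count++) {
            if (count > loadFactor * m) {
                m = 2 * m + 1; // multiplicative growth
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Capacities visited on the way to 1000 elements: 101, 203, 407, 815, 1631
        System.out.println(finalCapacity(101, 0.75f, 1000)); // 1631
    }
}
```

Because the capacity roughly doubles each time, only a handful of resizes are needed even for large tables, and the load factor stays bounded by the chosen threshold.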
Java Hashtable Class
Other Features:
elements() returns an enumeration of all the values in the table.
This works by keeping references into the table rather than by copying the table itself.
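A small usage sketch of `elements()`, which returns a `java.util.Enumeration` over the stored values. `ElementsDemo` and `sumValues` are hypothetical names used only for this example.

```java
import java.util.Enumeration;
import java.util.Hashtable;

// Hypothetical example: iterate over all values with elements().
class ElementsDemo {
    static int sumValues(Hashtable<String, Integer> table) {
        int sum = 0;
        Enumeration<Integer> e = table.elements();
        while (e.hasMoreElements()) {
            sum += e.nextElement();
        }
        return sum;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        table.put("c", 3);
        System.out.println(sumValues(table)); // 6
    }
}
```

The order in which the values are enumerated depends on the internal slot layout, so the example only computes an order-independent sum.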
Any questions?