Hash Tables (Java)
Algonquin College
Created by Rex Woollard
Linear Search
• O(n)
• Suitable for very short lists
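The O(n) cost comes from examining each element in turn until the key is found. A minimal sketch, assuming an unsorted `int` array; the class and method names are hypothetical and not from the slides:

```java
// Hypothetical helper class; not part of the original slides.
final class LinearSearchSketch {
    // Returns the index of target in data, or -1 if it is absent.
    // Every element may be examined, so the cost grows as O(n).
    static int indexOf(int[] data, int target) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3};
        System.out.println(indexOf(data, 19)); // prints 2
        System.out.println(indexOf(data, 5));  // prints -1
    }
}
```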
Binary Search
• O(log₂ n)
• Suitable for very large data sets
• Array organization
• Best if few insertions / deletions
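Binary search halves the remaining range on every comparison, which is where the O(log₂ n) bound comes from. It needs a sorted array, which is why frequent insertions and deletions are costly. A minimal sketch with hypothetical names:

```java
// Hypothetical helper class; not part of the original slides.
final class BinarySearchSketch {
    // Returns the index of target in the sorted array, or -1 if it is absent.
    static int indexOf(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids overflow of low + high
            if (sorted[mid] == target) {
                return mid;
            }
            if (sorted[mid] < target) {
                low = mid + 1;   // discard the lower half
            } else {
                high = mid - 1;  // discard the upper half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {3, 7, 19, 42, 58};
        System.out.println(indexOf(sorted, 42)); // prints 3
        System.out.println(indexOf(sorted, 5));  // prints -1
    }
}
```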
Overview: Hash Function and Searching 1
• Array organization
• Access array elements with index
• Convert key into index
Overview: Collisions with Hashing: Insertion
• Multiple objects have same index
• Need mechanism to find opening
Overview: Hashing Methods
• Many methods to generate index
Direct Hashing
• Array organization
• Access array elements with index
• Convert record key into corresponding index
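With direct hashing the key itself determines the slot, so there is one slot per possible key and no collisions. A minimal sketch, assuming hypothetical records keyed by day of month (1 to 31); the class and its methods are illustrative only:

```java
// Hypothetical example; the day-of-month key is an assumption for illustration.
final class DirectHashSketch {
    // One slot per possible key: keys 1..31 map to indexes 0..30.
    private final String[] slots = new String[31];

    void put(int dayOfMonth, String record) {
        slots[dayOfMonth - 1] = record; // the key directly yields the index
    }

    String get(int dayOfMonth) {
        return slots[dayOfMonth - 1];   // null if nothing was stored for that day
    }

    public static void main(String[] args) {
        DirectHashSketch table = new DirectHashSketch();
        table.put(15, "mid-month report");
        System.out.println(table.get(15)); // prints mid-month report
        System.out.println(table.get(16)); // prints null
    }
}
```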
Hash Function
• Key value too large for direct hashing
• Limited slots in hash table
• Hash function should generate an even distribution
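One common way to squeeze a large key into a limited number of slots is to take the remainder after dividing by the table size. A minimal sketch; the class name and the table size of 307 are assumptions for illustration, not values from the slides:

```java
// Hypothetical helper; modulo division is one of several possible hash functions.
final class ModuloHashSketch {
    // Maps any long key to an index in the range 0..tableSize-1.
    // floorMod keeps the result non-negative even for negative keys.
    static int indexFor(long key, int tableSize) {
        return (int) Math.floorMod(key, (long) tableSize);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(121267L, 307)); // prints 2
        System.out.println(indexFor(-1L, 307));     // prints 306
    }
}
```

A prime table size is often chosen because it tends to spread patterned keys more evenly, though that choice is a rule of thumb rather than something stated on the slide.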
Hash Function: Folding
• Use leading and trailing digits
• Add them to middle digits
• Leading and trailing digits could be reversed
• Discard resulting overflow
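The folding steps above can be sketched as follows, assuming a nine-digit key split into three three-digit parts and a three-digit address. The key width, method names, and the sample key 123456789 are assumptions for illustration:

```java
// Hypothetical sketch of folding for a 9-digit key and a 3-digit address.
final class FoldingHashSketch {
    // Adds the leading and trailing parts to the middle part as they are.
    static int foldShift(int key) {
        int leading = key / 1_000_000;       // first three digits
        int middle = (key / 1_000) % 1_000;  // middle three digits
        int trailing = key % 1_000;          // last three digits
        return (leading + middle + trailing) % 1_000; // discard the overflow digit
    }

    // Same as foldShift, but the leading and trailing parts are reversed first.
    static int foldBoundary(int key) {
        int leading = reverse3(key / 1_000_000);
        int middle = (key / 1_000) % 1_000;
        int trailing = reverse3(key % 1_000);
        return (leading + middle + trailing) % 1_000;
    }

    // Reverses a value treated as exactly three decimal digits (123 -> 321).
    private static int reverse3(int part) {
        return (part % 10) * 100 + ((part / 10) % 10) * 10 + part / 100;
    }

    public static void main(String[] args) {
        System.out.println(foldShift(123456789));    // 123 + 456 + 789 = 1368, prints 368
        System.out.println(foldBoundary(123456789)); // 321 + 456 + 987 = 1764, prints 764
    }
}
```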
Hash Function: Rotation
• Keys are clustered
• Move least significant
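The last bullet on this slide is cut off, so the following is only one reading of it. Assuming rotation means moving the least significant digit of the key to the most significant position, clustered keys such as 600101 and 600102 are spread apart before another hashing method is applied. The names and sample keys are hypothetical:

```java
// Hypothetical sketch; the exact rotation used on the slide is not recoverable.
final class RotationHashSketch {
    // Moves the last decimal digit of key to the front.
    // digitCount is the fixed width of the key in decimal digits.
    static int rotateRight(int key, int digitCount) {
        int last = key % 10;   // least significant digit
        int rest = key / 10;   // remaining leading digits
        int weight = 1;
        for (int i = 1; i < digitCount; i++) {
            weight *= 10;      // place value of the most significant digit
        }
        return last * weight + rest;
    }

    public static void main(String[] args) {
        System.out.println(rotateRight(600101, 6)); // prints 160010
        System.out.println(rotateRight(600102, 6)); // prints 260010
    }
}
```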
Hash Function: Dealing with Strings 1
Character  Code    Character  Code
A          65      a          97
B          66      b          98
C          67      c          99
:          :       :          :
Z          90      z          122
• Many keys are strings of characters
• Must convert to index
• Perform arithmetic on character values (integer)
Hash Function: Using Modulus
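// Sums the character codes of the key, then reduces the sum modulo the table size.
// nTableSize is not declared on the slide; it is assumed to be a static int field
// of the enclosing class holding the number of slots in the hash table.
public static int createPoorHashCode(String sKey) {
    int nSum = 0;
    int nIndex = 0;
    while (nIndex < sKey.length())
        nSum = nSum + sKey.charAt(nIndex++);  // char is promoted to its integer code
    return nSum % nTableSize;
}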
Hash Function: Need Better Distribution
Henderson
Henry
Hill
Hobart
Hopkins
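The sum used by createPoorHashCode ignores character position, so any two keys built from the same letters collide, and short keys produce sums in a narrow range. One common remedy, offered here as a sketch rather than as the method the slides go on to use, is to multiply the running value by a constant before adding each character. The multiplier 31, the class name, and passing the table size as a parameter are assumptions:

```java
// Hypothetical comparison of a sum-based hash with a position-sensitive one.
final class StringHashSketch {
    // Same idea as createPoorHashCode, with the table size passed in.
    static int poorHash(String key, int tableSize) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i);
        }
        return sum % tableSize;
    }

    // Weights each character by its position using a multiplier of 31.
    static int positionalHash(String key, int tableSize) {
        int hash = 0;
        for (int i = 0; i < key.length(); i++) {
            hash = 31 * hash + key.charAt(i); // int overflow is harmless here
        }
        return Math.floorMod(hash, tableSize); // keeps the index non-negative
    }

    public static void main(String[] args) {
        System.out.println(poorHash("ab", 101));       // prints 94
        System.out.println(poorHash("ba", 101));       // prints 94 (collision)
        System.out.println(positionalHash("ab", 101)); // prints 75
        System.out.println(positionalHash("ba", 101)); // prints 4
    }
}
```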
Collision Resolution: Overview
Open Addressing: Linear Probe
• Collision: search for next available slot
• Might be adjacent
• Might be some distance if there is clustering
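A linear probe can be sketched as a loop that starts at the home index and steps forward one slot at a time, wrapping around at the end of the array. This is a minimal illustration with hypothetical names; it stores only keys, uses String.hashCode as the hash function, and does not handle deletion:

```java
// Hypothetical open-addressing table using a linear probe.
final class LinearProbeSketch {
    private final String[] slots;

    LinearProbeSketch(int capacity) {
        slots = new String[capacity];
    }

    private int home(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Returns the index where the key was stored, or -1 if the table is full.
    int insert(String key) {
        int index = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[index] == null || slots[index].equals(key)) {
                slots[index] = key;
                return index;
            }
            index = (index + 1) % slots.length; // try the adjacent slot, wrapping around
        }
        return -1;
    }

    boolean contains(String key) {
        int index = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[index] == null) {
                return false; // an empty slot ends the probe sequence
            }
            if (slots[index].equals(key)) {
                return true;
            }
            index = (index + 1) % slots.length;
        }
        return false;
    }
}
```

In Java the strings "Aa" and "BB" share the same hashCode, so inserting both into a table of this kind places the second one in the slot adjacent to the first.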
Open Addressing: Pseudorandom
Buckets
• Each index location can hold several records
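A bucket can be pictured as a small fixed-size array at each index. The sketch below is illustrative only: the names are hypothetical, it stores keys without checking for duplicates, and it simply reports failure when a bucket overflows, where a real table would need a further strategy:

```java
// Hypothetical bucket-based table; each index holds up to bucketSize keys.
final class BucketTableSketch {
    private final String[][] buckets;

    BucketTableSketch(int tableSize, int bucketSize) {
        buckets = new String[tableSize][bucketSize];
    }

    private String[] bucketFor(String key) {
        return buckets[Math.floorMod(key.hashCode(), buckets.length)];
    }

    // Returns true if stored, false if the bucket for this key is already full.
    boolean insert(String key) {
        String[] bucket = bucketFor(key);
        for (int i = 0; i < bucket.length; i++) {
            if (bucket[i] == null) {
                bucket[i] = key;
                return true;
            }
        }
        return false; // bucket overflow
    }

    boolean contains(String key) {
        for (String stored : bucketFor(key)) {
            if (key.equals(stored)) {
                return true;
            }
        }
        return false;
    }
}
```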
Chaining with Linked Lists
[Figure: a reference to the HashTable object; the HashTable object holds a reference to an array of references to List objects; each List object links ListNode objects, and each ListNode refers to a Data object. New ListNode objects are shown for data with the keys Requiem and WaterMusic.]
• Hash table of references to linked lists
• All collisions reside at the same index
• Lists can grow to accommodate collisions
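The structure in the figure can be approximated with the standard library. This sketch uses java.util.LinkedList in place of the slide's own List and ListNode classes, stores only keys rather than Data objects, and uses hypothetical names throughout:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical chained hash table; java.util.LinkedList stands in for the
// List and ListNode classes shown on the slide.
final class ChainedTableSketch {
    private final List<LinkedList<String>> chains = new ArrayList<>();

    ChainedTableSketch(int capacity) {
        for (int i = 0; i < capacity; i++) {
            chains.add(new LinkedList<>());
        }
    }

    private LinkedList<String> chainFor(String key) {
        return chains.get(Math.floorMod(key.hashCode(), chains.size()));
    }

    // Colliding keys are added to the same list, which grows as needed.
    void insert(String key) {
        LinkedList<String> chain = chainFor(key);
        if (!chain.contains(key)) {
            chain.addFirst(key);
        }
    }

    boolean contains(String key) {
        return chainFor(key).contains(key);
    }

    int chainLength(String key) {
        return chainFor(key).size();
    }

    public static void main(String[] args) {
        ChainedTableSketch table = new ChainedTableSketch(1); // one slot forces a collision
        table.insert("Requiem");
        table.insert("WaterMusic");
        System.out.println(table.chainLength("Requiem")); // prints 2
    }
}
```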
Big-O: Open Addressing
80% Load Factor
[Figure: Open Addressing: Unsuccessful Search. Number of comparisons (0 to 100) versus table size. Series: Linear, Quadratic.]
[Figure: Open Addressing: Successful Search. Number of comparisons (0 to 100) versus table size. Series: Linear, Quadratic.]
Big-O: Open Addressing
Variable Load Factor
[Figure: Open Addressing: Successful Search. Number of comparisons (0 to 100) versus % load factor (0 to 100, table size 1000). Series: Linear, Quadratic.]
[Figure: Open Addressing: Unsuccessful Search. Number of comparisons (0 to 100) versus % load factor (0 to 100, table size 1000). Series: Linear, Quadratic.]
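For reference, the classical textbook approximations for the expected number of probes with linear probing depend only on the load factor and not on the table size. They are not taken from these slides, and the plotted measurements may differ from them. Here α is the load factor (occupied slots divided by table size), with 0 ≤ α < 1:

```latex
% Classical approximations for linear probing (attributed to Knuth).
\begin{aligned}
C_{\text{successful}}(\alpha)   &\approx \tfrac{1}{2}\left(1 + \frac{1}{1-\alpha}\right) \\
C_{\text{unsuccessful}}(\alpha) &\approx \tfrac{1}{2}\left(1 + \frac{1}{(1-\alpha)^{2}}\right)
\end{aligned}
% At \alpha = 0.8: C_successful is about 3 and C_unsuccessful is about 13.
```

Both expressions grow without bound as α approaches 1, which matches the general shape one would expect from a variable load factor plot.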
Big-O: Chaining
Variable Load Factor
[Figure: Chaining: Search. Number of comparisons versus % load factor (0 to 200, table size 1000). Series: Failure, Success. Shown with a vertical scale of 1 to 3 and again with a vertical scale of 0 to 100.]
Big-O: Chaining
100% Load Factor
[Figure: Chaining: Search. Number of comparisons versus table size (load factor 100%). Series: Failure, Success. Shown with a vertical scale of 1 to 3 and again with a vertical scale of 0 to 100.]
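As a point of comparison, the usual textbook estimates for chaining, assuming keys are spread evenly over the chains, are linear in the load factor. They are not taken from these slides. Here α is the number of stored records divided by the number of slots, and it may exceed 1:

```latex
% Common approximations for separate chaining, counting key comparisons only.
C_{\text{successful}}(\alpha) \approx 1 + \frac{\alpha}{2},
\qquad
C_{\text{unsuccessful}}(\alpha) \approx \alpha
% At \alpha = 2 (a 200% load factor) both are about 2.
```

Some texts count the initial slot access as well and quote 1 + α for an unsuccessful search; either way the cost grows slowly with the load factor.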
Chaining versus Open Addressing
[Figure: Chaining: Unsuccessful Search. Number of comparisons (0 to 100) versus % load factor (0 to 200, table size 1000). Series: Failure, Success.]
At a 200% load factor, performance remains excellent.
[Figure: Open Addressing: Unsuccessful Search. Number of comparisons (0 to 100) versus % load factor (0 to 100, table size 1000). Series: Linear, Quadratic.]
At a 100% load factor, performance is disastrous.