CIS 280
Hashing and Hash Tables
Calendar
• Today: Hashing and Hash Tables
• Wednesday: Calculus due, work
• Friday: Cuckoo hashing and linear hashing
• Monday: Hashing wrapup
• Wednesday: Work
• Friday: Review, final project due
• Wednesday, December 16, 8am: Final exam
Calculus
• Questions?
• Work an example: x^2 + 3*x - 4
Hashing
Basic idea: create an integer “digest” that identifies a complex value.
• Speeds up equality check (check hash codes first) – example: MD5
• Use as a pseudo array index to create a hash table
An array maps well-organized data (integer ranges) to values
A hash table maps irregular data to values
Two Different Problems
• Hash functions: mapping data to integers
• Hash tables: mapping hash keys to values in a sparse way (similar to sparse arrays).
http://en.wikipedia.org/wiki/Hash_function
A hash is a sort of fingerprint for a complex object.
Hash Functions
Goals:
• Quick to compute (linear time in the size of the object)
• Random behavior: closely related objects should map to unrelated keys
We’ll simplify by only hashing strings, though any structure can be hashed.
String Hashing
Input: a string of characters (bytes), str
Output: a number between 0 and n-1
Use a local variable, state, to hold a “state” that condenses the string, and F, a combining function:
int state = 0;
for (int i = 0; i < str.length(); i++) {
    state = F((int) str.charAt(i), state);
}
return state % n;
Choosing F
• Bad choice (checksum): + (why?)
• Better choices: functions involving bitwise operators (they run fast!), prime numbers (why?)
• Assume arithmetic operators don’t overflow (Java does calculations mod the word size!)
• Example: F(v, s) = v*31+s
• There are a lot of ways to choose F. Check Wikipedia for some examples.
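The "why?" questions on this slide can be explored with a small experiment. This is a minimal sketch, not the course's own code: the class name FChoices and the constants SUM and TIMES_31 are hypothetical. It compares plain addition with a combiner that multiplies the state by 31 before adding the character, which is the form Java's own String.hashCode uses. Note that if v is the character and s is the state, as in the String Hashing loop, then v*31+s only scales the checksum by 31, so the sketch uses s*31+v instead.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical demo class; the names here are illustrative, not from the slides.
final class FChoices {
    // Same loop as the String Hashing slide: fold each character into the state with f.
    static int hash(String str, int n, IntBinaryOperator f) {
        int state = 0;
        for (int i = 0; i < str.length(); i++) {
            state = f.applyAsInt((int) str.charAt(i), state);
        }
        // floorMod keeps the result in 0..n-1 even if state wrapped to a negative int.
        return Math.floorMod(state, n);
    }

    // Checksum: plain addition ignores character order (v = character, s = state).
    static final IntBinaryOperator SUM = (v, s) -> v + s;

    // Multiplying the state by a prime before adding makes character position matter.
    static final IntBinaryOperator TIMES_31 = (v, s) -> s * 31 + v;

    public static void main(String[] args) {
        int n = 1009; // table size, chosen arbitrarily for this demo
        // "ab" and "ba" contain the same characters in a different order.
        System.out.println(hash("ab", n, SUM) == hash("ba", n, SUM));           // prints true
        System.out.println(hash("ab", n, TIMES_31) == hash("ba", n, TIMES_31)); // prints false
    }
}
```

With addition, every rearrangement of the same characters lands on the same key, which is why a checksum makes a poor hash function.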
Example
Suppose A = 1, B = 2, …, what is the hash of:
• “ABC”
• “ACB”
• “CB”
where F(c, s) = 5 * c + s?
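One way to check the exercise is to run the String Hashing loop with this F. The sketch below is illustrative only: the class and method names are hypothetical, and it stops before the final "% n" step because the slide does not give a table size.

```java
// Hypothetical worked example; class and method names are illustrative.
final class LetterHashExample {
    // Maps 'A' -> 1, 'B' -> 2, ... as the slide supposes.
    static int code(char ch) {
        return ch - 'A' + 1;
    }

    // Folds the string with F(c, s) = 5 * c + s.
    // The final "% n" reduction is omitted: no table size is given on the slide.
    static int state(String str) {
        int s = 0;
        for (int i = 0; i < str.length(); i++) {
            s = 5 * code(str.charAt(i)) + s;
        }
        return s;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"ABC", "ACB", "CB"}) {
            System.out.println(w + " -> " + state(w));
        }
        // prints:
        // ABC -> 30
        // ACB -> 30
        // CB -> 25
    }
}
```

“ABC” and “ACB” collide because this F adds the same multiple of each character regardless of position, so the order of the characters is lost.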
Hash Tables
The following interface describes a hash table:
public interface HashTable<Key, Value> extends Iterable<Value> {
    public Value get(Key k); // null if not found
    public void set(Key k, Value v);
    public int size();
}
Hash Tables
Let’s simplify by assuming that the Key is a string:
public interface HashTable<Value> extends Iterable<Value> {
    public Value get(String s);
    public void set(String s, Value v);
    public int size();
}
Caching the Hash
public class Hashed<Value> {
    public Value value;
    public int hash;
}
Note that we can remove the Value if we only hash strings.
How would you compare hashed values for equality?
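One possible answer, sketched under assumptions: compare the cached hash codes first, and fall back to comparing the values only when the hashes match. The method name sameAs, the constructor, and the final fields are hypothetical additions to the slide's Hashed class, and the sketch assumes both hashes were produced by the same hash function.

```java
import java.util.Objects;

// Sketch of the slide's Hashed<Value> with an equality check added.
// The constructor and the method name sameAs are hypothetical.
final class Hashed<Value> {
    public final Value value;
    public final int hash;

    Hashed(Value value, int hash) {
        this.value = value;
        this.hash = hash;
    }

    // Cheap integer comparison first; only matching hashes need the full comparison.
    boolean sameAs(Hashed<Value> other) {
        if (this.hash != other.hash) {
            return false; // different hashes imply different values
        }
        // Equal hashes may still be a collision, so the values must be compared too.
        return Objects.equals(this.value, other.value);
    }
}
```

The full comparison is still required when the hashes agree, because two different values can share a hash code.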
Factoring Out The Hasher
public interface HashFunction {
    public HashedString hash(String s);
}
Every hash table needs to keep a hash function around.
We can create families of such functions by parameterizing them.
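As an illustration of a parameterized family, the sketch below builds one hash function per (multiplier, table size) pair. The slides do not define HashedString, so its shape here (the string plus its cached hash) is an assumption, and the class name MultiplierHash is hypothetical.

```java
// HashedString is not defined on the slides; this shape (string plus cached hash) is an assumption.
final class HashedString {
    public final String value;
    public final int hash;

    HashedString(String value, int hash) {
        this.value = value;
        this.hash = hash;
    }
}

// Interface as given on the slide.
interface HashFunction {
    HashedString hash(String s);
}

// Hypothetical family: each (multiplier, n) pair yields a different hash function.
final class MultiplierHash implements HashFunction {
    private final int multiplier;
    private final int n;

    MultiplierHash(int multiplier, int n) {
        this.multiplier = multiplier;
        this.n = n;
    }

    @Override
    public HashedString hash(String s) {
        int state = 0;
        for (int i = 0; i < s.length(); i++) {
            // Scale the running state, then add the next character.
            state = state * multiplier + s.charAt(i);
        }
        // floorMod keeps the key in 0..n-1 even if state wrapped to a negative int.
        return new HashedString(s, Math.floorMod(state, n));
    }
}
```

A table could then be constructed with, for example, new MultiplierHash(31, 1009) or new MultiplierHash(37, 1009), and it would keep that function around for every lookup.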
A Naïve Hashtable
Instead of using hash codes to organize the table, we’ll use direct searches.
public class SimpleHashTable<Value> implements HashTable<Value> {
    public ArrayList<Pair<HashedString, Value>> values = new ArrayList<Pair<HashedString, Value>>();
    public HashFunction f;
    public SimpleHashTable(HashFunction f) { this.f = f; }
    public Value get(String s) {