Post on 29-Dec-2015

Hash Functions and the HashMap Class

A Brief Overview

On Green Marble

John W. Benning

What are hash functions?

• (Online definition): A hash function (or hash algorithm) is a way of creating a small digital "fingerprint" from any kind of data. The function chops and mixes the data to create the fingerprint, often called a hash value or hash sum. The hash value is commonly represented as a short string of random-looking letters and numbers.

Hashing Fundamentals

• A fundamental property of all hash functions is that if two hashes (according to the same hash function) are different, then the two inputs are different in some way. On the other hand, the equality of two hash values suggests, but does not guarantee, the equality of the two inputs. If a hash value is calculated for a piece of data, and then one bit of that data is changed, a hash function with a strong mixing property usually produces a completely different hash value.
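Both directions of this property can be checked with Java's built-in String.hashCode(). The sketch below is only an illustration: the class name HashEquality and the helper sameHash are made up for this example, and the strings "Aa" and "BB" are a well-known pair that happen to share a hash code.

```java
class HashEquality {
    // Hypothetical helper: true when both strings have the same String.hashCode().
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Equal hash values do not guarantee equal inputs.
        System.out.println(sameHash("Aa", "BB")); // true
        System.out.println("Aa".equals("BB"));    // false

        // Different hash values can only come from different inputs.
        if (!sameHash("fox", "fax")) {
            System.out.println("fox".equals("fax")); // false
        }
    }
}
```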

Applications of Hash Functions

• Cryptography- Cryptographic hashing assumes the existence of an adversary who can deliberately try to find inputs with the same hash value. A well-designed cryptographic hash function is a "one-way" operation: there is no practical way to calculate a data input that will result in a desired hash value, so a hash is also very difficult to forge.

• Hash Tables- enable fast lookup of a data record given its key. (Note: Keys are not usually secret as in cryptography, but both are used to "unlock" or access information.) For example, keys in an English dictionary would be English words, and their associated records would contain definitions. In this case, the hash function must map alphabetic strings to indexes for the hash table's internal array.
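As a rough sketch of the cryptography case, the standard java.security.MessageDigest class can compute a SHA-256 hash. SHA-256 is chosen here only as an example of a cryptographic hash function; the class name Sha256Demo and the helper sha256Hex are assumptions made for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha256Demo {
    // Hypothetical helper: SHA-256 digest of the text as a lowercase hex string.
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two inputs that differ by one word give unrelated-looking digests.
        System.out.println(sha256Hex("The red fox runs across the ice"));
        System.out.println(sha256Hex("The red fox walks across the ice"));
    }
}
```

Each digest is 64 hex characters long no matter how long the input is, which is the "fingerprint" idea from the definition.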

Applications of Hash Functions (cont.)

• Error Detection- Using a hash function to detect errors in transmission is straightforward. The hash function is computed for the data at the sender, and the value of this hash is sent with the data. The hash function is computed again at the receiving end, and if the hash values do not match, an error has occurred at some point during the transmission. This is called a redundancy check.

• Audio Identification- A hash function can be used to find out whether an audio file, such as an MP3, matches one of a list of known items.
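The redundancy check described above can be sketched with the standard java.util.zip.CRC32 class. CRC-32 is just one possible checksum; the class name ChecksumDemo, the helper checksum, and the simulated "transmission" are assumptions for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class ChecksumDemo {
    // Hypothetical helper: CRC-32 checksum of a byte array.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Sender: compute the checksum and send it along with the data.
        byte[] sent = "The red fox runs across the ice".getBytes(StandardCharsets.UTF_8);
        long sentChecksum = checksum(sent);

        // Receiver, case 1: data arrived intact, so the checksums match.
        byte[] intact = sent.clone();
        System.out.println(checksum(intact) == sentChecksum); // true

        // Receiver, case 2: one bit was flipped in transit, so they differ.
        byte[] damaged = sent.clone();
        damaged[4] ^= 0x01;
        System.out.println(checksum(damaged) == sentChecksum); // false
    }
}
```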

Hash Tables

• We will be looking at using hash algorithms to fill hash tables for data lookup. Below is an illustration of the use of a hash algorithm.

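[Figure: three inputs ("Fox", "The red fox runs across the ice", and "The red fox walks across the ice") each pass through a hash function to produce a hash sum. The hash sums shown are 322598, 598746, and 639857. Columns: Input, Hash Function, Hash Sum.]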

Hash Tables

• Like arrays, hash tables provide constant-time O(1) lookup on average, regardless of the number of items in the table. However, the rare worst-case lookup time can be as bad as O(n). Compared to other associative array data structures, hash tables are most useful when large numbers of records of data are to be stored.

Choosing a hash function

• A good hash function is essential for good hash table performance. Hash collisions are generally resolved by some form of linear search, so if a hash function tends to produce similar values, slow searches will result.

• Because a good hash function can be hard to design, or computationally expensive to execute, much research has been devoted to collision resolution strategies that mitigate poor hashing performance. However, none of them is as effective as using a good hash function in the first place.

• It is desirable to use the same hash function for arrays of any conceivable size. To do this, the index into the hash table's array is generally calculated in two steps:
- A generic hash value (hash sum) is calculated, which fills a natural machine integer.
- This value is reduced to a valid array index by finding its modulus* with the array's size.

*modulus: here, the remainder left over when the hash sum is divided by the array size. For a non-negative hash sum it always lies between 0 and the array size minus 1, so it is a valid index.
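The two-step index calculation and the linear search used to resolve collisions can be sketched together in a very small hash table. This is a teaching sketch only, not how java.util.HashMap is implemented: the class TinyHashTable and its methods are invented for this example, and Math.floorMod is used in place of the plain % operator because hashCode() can be negative in Java.

```java
import java.util.ArrayList;
import java.util.List;

class TinyHashTable {
    // Each bucket holds {key, value} pairs whose keys reduced to the same index.
    private final List<List<String[]>> buckets = new ArrayList<>();

    TinyHashTable(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    int indexFor(String key) {
        int hashSum = key.hashCode();                  // step 1: generic hash value
        return Math.floorMod(hashSum, buckets.size()); // step 2: reduce to a valid index
    }

    void put(String key, String value) {
        List<String[]> bucket = buckets.get(indexFor(key));
        for (String[] entry : bucket) {                // linear search within the bucket
            if (entry[0].equals(key)) {
                entry[1] = value;                      // key already present: overwrite
                return;
            }
        }
        bucket.add(new String[] {key, value});
    }

    String get(String key) {
        for (String[] entry : buckets.get(indexFor(key))) {
            if (entry[0].equals(key)) {
                return entry[1];
            }
        }
        return null;                                   // no mapping for this key
    }

    public static void main(String[] args) {
        TinyHashTable table = new TinyHashTable(16);
        table.put("Aa", "first");
        table.put("BB", "second");                     // collides with "Aa"
        System.out.println(table.indexFor("Aa") == table.indexFor("BB")); // true
        System.out.println(table.get("Aa"));           // first
        System.out.println(table.get("BB"));           // second
    }
}
```

The more keys land in the same bucket, the longer the inner loops run, which is why a hash function that produces similar values leads to slow searches.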

HashMap

• Luckily, Java has a HashMap class.

• So we can create hash tables without writing our own hash functions. If you do want to write your own, you may be able to improve on Java's hash algorithm.
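In Java, the usual place to supply your own hash function is the hashCode() method of the key class, which HashMap calls on every key, together with a matching equals(). The sketch below assumes a made-up key class, Word, that ignores letter case; the class names and the particular mixing formula are illustrative choices, not a recommendation.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical key class: a word compared without regard to letter case.
final class Word {
    private final String text;

    Word(String text) {
        this.text = text.toLowerCase(Locale.ROOT); // normalize once, up front
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Word && ((Word) other).text.equals(text);
    }

    // Hand-written hash function: equal Words must return equal hash codes.
    @Override
    public int hashCode() {
        int h = 7;
        for (int i = 0; i < text.length(); i++) {
            h = 31 * h + text.charAt(i);
        }
        return h;
    }
}

class WordDemo {
    public static void main(String[] args) {
        Map<Word, String> dictionary = new HashMap<>();
        dictionary.put(new Word("Fox"), "a small wild canine");
        // A different Word object with the same letters finds the same entry.
        System.out.println(dictionary.get(new Word("FOX"))); // a small wild canine
    }
}
```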

HashMap

• Constructors

HashMap()
          Constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).

HashMap(int initialCapacity)
          Constructs an empty HashMap with the specified initial capacity and the default load factor (0.75).

HashMap(int initialCapacity, float loadFactor)
          Constructs an empty HashMap with the specified initial capacity and load factor.

HashMap(Map m)
          Constructs a new HashMap with the same mappings as the specified Map.

*The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the capacity is roughly doubled by calling the rehash method.
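The four constructors and the load-factor rule can be tried out as follows. The helper resizeThreshold is not part of the HashMap API; it is a made-up function that simply applies the "load factor times capacity" rule quoted above, and the class name ConstructorDemo is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class ConstructorDemo {
    // Hypothetical helper: the product of capacity and load factor, i.e. the
    // entry count beyond which the documented rule says the capacity grows.
    static int resizeThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        Map<String, Integer> defaults = new HashMap<>();      // capacity 16, load factor 0.75
        Map<String, Integer> sized = new HashMap<>(64);       // capacity hint 64, load factor 0.75
        Map<String, Integer> tuned = new HashMap<>(64, 0.5f); // capacity hint 64, load factor 0.5

        sized.put("five", 5);
        Map<String, Integer> copy = new HashMap<>(sized);     // same mappings as 'sized'

        System.out.println(defaults.isEmpty() && tuned.isEmpty()); // true
        System.out.println(copy.equals(sized));                    // true
        System.out.println(resizeThreshold(16, 0.75f));            // 12
        System.out.println(resizeThreshold(64, 0.5f));             // 32
    }
}
```

With the defaults, then, the capacity is increased once the map holds more than 12 entries.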

Method Summary

void clear()
          Removes all mappings from this map.

Object clone()
          Returns a shallow copy of this HashMap instance: the keys and values themselves are not cloned.

boolean containsKey(Object key)
          Returns true if this map contains a mapping for the specified key.

boolean containsValue(Object value)
          Returns true if this map maps one or more keys to the specified value.

Set entrySet()
          Returns a collection view of the mappings contained in this map.

Object get(Object key)
          Returns the value to which the specified key is mapped in this map, or null if the map contains no mapping for this key.

boolean isEmpty()
          Returns true if this map contains no key-value mappings.

Set keySet()
          Returns a set view of the keys contained in this map.

Object put(Object key, Object value)
          Associates the specified value with the specified key in this map.

void putAll(Map m)
          Copies all of the mappings from the specified map to this map. These mappings will replace any mappings that this map had for any of the keys currently in the specified map.

Object remove(Object key)
          Removes the mapping for this key from this map if present.

int size()
          Returns the number of key-value mappings in this map.

Collection values()
          Returns a collection view of the values contained in this map.
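A short tour of several of these methods is sketched below. It uses the generic form of HashMap (with type parameters), and the class name MethodTour and the helper sampleMap are invented for the example. Note that put and remove also return the value previously stored under the key, or null if there was none.

```java
import java.util.HashMap;
import java.util.Map;

class MethodTour {
    // Hypothetical helper: builds a small map of number names to values.
    static Map<String, Integer> sampleMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("five", 5);
        m.put("six", 6);
        m.put("ten", 10);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = sampleMap();

        System.out.println(m.put("five", 50)); // 5: put returns the old value
        System.out.println(m.remove("six"));   // 6: remove returns the removed value
        System.out.println(m.get("six"));      // null: no mapping any more

        Map<String, Integer> extra = new HashMap<>();
        extra.put("ten", 100);
        extra.put("one", 1);
        m.putAll(extra);                       // replaces "ten", adds "one"
        System.out.println(m.get("ten"));      // 100
        System.out.println(m.size());          // 3

        int sum = 0;
        for (Map.Entry<String, Integer> entry : m.entrySet()) {
            sum += entry.getValue();           // iteration order is not guaranteed
        }
        System.out.println(sum);               // 151

        m.clear();
        System.out.println(m.isEmpty());       // true
    }
}
```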

 

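Simple use of HashMap

import java.util.HashMap;

public class hashM {
    public static void main(String[] args) {
        // Map number names (keys) to their integer values
        HashMap<String, Integer> k = new HashMap<>();
        k.put("five", 5);
        k.put("six", 6);
        k.put("ten", 10);

        System.out.println(k.containsKey("six"));    // is "six" a key?
        System.out.println(k.containsKey("eleven")); // is "eleven" a key?
        System.out.println(k.containsValue(6));      // is 6 stored as a value?
        System.out.println(k.get("ten"));            // value stored under "ten"
        System.out.println(k.size());                // number of mappings
    }
}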

*output

true
false
true
10
3

HashMap

• For full details, see the Java HashMap page online.