Data Structures Queues and Stacks

Upload: dawson-butt

Post on 14-Dec-2015


Queues

Can receive multiple requests from multiple sources
◦ How do we service these requests?
    First-come, first-served processing
    Priority-based processing
◦ Buffering of requests, as they might arrive faster than they can be processed

You could always use a List, associating an integer value with each item and appending the item to the List using the Add() method
◦ Inefficient

Queue

Two jobs added to List

Why List is problematic

Job 1 is processed and slot becomes available

Job 3 grabs first available slot, and Job 4 gets the next available slot

nextJobPost keeps track of the “next” job to be processed in the List

Why List is problematic

The List will continue to grow, even if jobs are processed right away
◦ The default is to double the size when the List requires additional “slots”

No reclaiming of the already used slots is done with Lists

If you do reclaim the “used” slots in the List, then your first-come, first-served processing scheme will not work

A List represents a linear order

When adding an item, once the last slot is used, the “next” position will “wrap around” to the 0th item in the array/list
◦ A modulus operation is used to “wrap around”

What happens if all slots are filled, and you still need another one?
◦ Resize the circular array

This is done in the Queue class

A “circular” List
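
The wrap-around described above can be sketched roughly as follows. This is a minimal illustration, not the actual Queue implementation: the class name CircularBuffer, its fields, and the fixed capacity are assumptions made for the example, and it throws where a real queue would resize.

```csharp
using System;

// Hypothetical fixed-capacity circular buffer (illustration only).
public class CircularBuffer
{
    private readonly string[] slots;
    private int head;   // index of the next item to remove
    private int tail;   // index where the next item is stored
    private int count;  // number of items currently stored

    public CircularBuffer(int capacity)
    {
        slots = new string[capacity];
    }

    public int Count { get { return count; } }

    public void Add(string item)
    {
        if (count == slots.Length)
            throw new InvalidOperationException("Buffer is full; a real queue would resize here.");
        slots[tail] = item;
        tail = (tail + 1) % slots.Length;   // wrap around to slot 0 after the last slot
        count++;
    }

    public string Remove()
    {
        if (count == 0)
            throw new InvalidOperationException("Buffer is empty.");
        string item = slots[head];
        slots[head] = null;                 // free the used slot so it can be reused
        head = (head + 1) % slots.Length;   // the head index wraps around as well
        count--;
        return item;
    }
}
```

With a capacity of 3, adding Job 1 to Job 3, removing Job 1, and then adding Job 4 stores Job 4 in slot 0, yet the jobs still come out in first-come, first-served order.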

Add / remove buffer items
◦ First-come, first-served (FIFO)
◦ Manages space utilization
◦ Uses Generics
    Type-safe

Methods
◦ Enqueue()
    Adds an element at the “tail” index
    If there is not enough space, a default growth factor of 2.0 is used to resize
    The constructor of the non-generic System.Collections.Queue class can specify another growth factor
◦ Dequeue()
    Returns the current element from the “head” index
    Sets the “head” element to null and increments the “head” index
◦ Peek()
    Lets you see the head element without dequeuing it or incrementing the head index
◦ Contains()
    Determines whether a specific item exists in the Queue
◦ ToArray()
    Returns an array containing the Queue’s elements

Queue
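
As a small usage sketch of the methods listed above, the following uses the generic Queue<T> from System.Collections.Generic. The class name QueueDemo, the method ProcessJobs, and the job names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class QueueDemo
{
    // Enqueues three hypothetical jobs and services them in FIFO order.
    public static List<string> ProcessJobs()
    {
        var jobs = new Queue<string>();
        jobs.Enqueue("Job 1");
        jobs.Enqueue("Job 2");
        jobs.Enqueue("Job 3");

        var log = new List<string>();
        log.Add("Peek: " + jobs.Peek());                        // head element, not removed
        log.Add("Contains Job 2: " + jobs.Contains("Job 2"));   // membership test
        while (jobs.Count > 0)
            log.Add("Dequeue: " + jobs.Dequeue());              // removes from the head
        return log;
    }
}
```

Peek() reports Job 1 without removing it, so the first Dequeue() also returns Job 1.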

LIFO structure

Backed by an internal array, as is the Queue

Methods
◦ Push()
    Adds an item to the top of the stack
◦ Pop()
    Removes and returns the item on the “top” of the stack

Size is increased as required (same growth factor as the Queue)

The call stack used by the CLR is an example of this structure
◦ When calling a function, Push its information onto the stack
◦ When returning from that routine, Pop it from the stack and expose the routine to which it returns control

Stacks
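
The call-stack behaviour can be imitated with the generic Stack<T> class. This is only a sketch: the class name StackDemo, the method Unwind, and the routine names are hypothetical.

```csharp
using System.Collections.Generic;

public static class StackDemo
{
    // Pushes hypothetical routine names, then pops them in LIFO order.
    public static List<string> Unwind()
    {
        var calls = new Stack<string>();
        calls.Push("Main");
        calls.Push("LoadEmployees");
        calls.Push("ParseRecord");

        var order = new List<string>();
        while (calls.Count > 0)
            order.Add(calls.Pop());   // the most recently pushed routine returns first
        return order;
    }
}
```

The routines come back in the reverse of the order in which they were pushed: ParseRecord, LoadEmployees, Main.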

Problem: we often don’t know the “position” of an element within an array
◦ Potentially we process all elements before finding the one we need

Reduce the lookup time to O(1)
◦ Build an array capable of holding all SS#’s
◦ Each element would hold a record, with the SS# as the “key”
◦ Waste
    10^9 possible values, but you only have 1,000 employees
    Utilization would be 0.0001% of the array

Hashing allows us to “compress” this ordinal indexing

Hashtables

Use the last 4 digits (or 3, or 5) of the SS#
◦ Mathematical transformation (mapping) of a nine-digit value to a four-digit value
◦ Array ranges from 0000 to 9999

Constant lookup time, O(1)

Better utilization of space

Hash table
◦ Array which uses hashing to compress the indexers

Hash function
◦ Function which performs the hashing

Hashtables

H(x) = last four digits of x

Collisions
◦ When multiple inputs to a hash function result in identical outputs
    10^5 possible SS#’s end in “0000”, and all of them collide
◦ A collision results in an attempt to store into a “slot” already occupied by a prior hash result

Hashing
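
A minimal sketch of H(x), assuming the nine-digit SS# is held as an int (the class and method names are hypothetical):

```csharp
public static class SsnHash
{
    // H(x) = last four digits of x, for a nine-digit SS# stored as an int.
    public static int LastFourDigits(int ssn)
    {
        return ssn % 10000;   // maps 10^9 possible keys onto slots 0000 to 9999
    }
}
```

For example, 111-22-3333 and 999-99-3333 both map to slot 3333, which is exactly the kind of collision described above.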

Collision frequency is directly correlated to the hash function
◦ Using the SS# assumes that the last four digits are uniformly distributed
    If year of birth or geographical location alters the distribution, collisions increase
◦ Collision avoidance is the selection of an appropriate hashing algorithm
◦ Collision resolution is locating another slot in the hashtable for entry placement

Collision avoidance / resolution

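Linear probing
◦ If a collision occurs in slot i, proceed to the next slot (i+1), then i+2, and so on, until an available slot is found
    Hash values: Alice = 1234, Bob = 1234, Cal = 1237, Danny = 1235, Edward = 1235
    Insert Alice: slot 1234
    Insert Bob: slot 1234 is taken, so slot 1235
    Insert Cal: slot 1237
    Insert Danny: slot 1235 is taken, so slot 1236
    Insert Edward: slots 1235, 1236, and 1237 are taken, so slot 1238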

Collision resolution
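
Linear probing can be sketched as below. This is an illustration under assumptions, not the Hashtable implementation: the class LinearProbingTable is hypothetical, it has 10,000 slots (0000 to 9999), and the caller passes in the hash value directly.

```csharp
using System;

// Hypothetical open-addressing table with 10,000 slots (illustration only).
public class LinearProbingTable
{
    private readonly string[] slots = new string[10000];

    // Stores name starting at slot hash; returns the slot actually used.
    public int Insert(string name, int hash)
    {
        for (int probe = 0; probe < slots.Length; probe++)
        {
            int slot = (hash + probe) % slots.Length;   // i, i+1, i+2, ... with wrap-around
            if (slots[slot] == null)
            {
                slots[slot] = name;
                return slot;
            }
        }
        throw new InvalidOperationException("Table is full.");
    }

    // Returns the slot holding name, or -1 once an empty slot is reached.
    public int Find(string name, int hash)
    {
        for (int probe = 0; probe < slots.Length; probe++)
        {
            int slot = (hash + probe) % slots.Length;
            if (slots[slot] == null) return -1;     // empty slot: the name is not in the table
            if (slots[slot] == name) return slot;
        }
        return -1;
    }
}
```

Inserting Alice (1234), Bob (1234), Cal (1237), Danny (1235), and Edward (1235) in that order places them in slots 1234, 1235, 1237, 1236, and 1238, which shows how values start to cluster.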

Searching
◦ Start at the hash location, and then perform a linear search from there until the value is located
    When/if you reach an empty slot, your search value is NOT in the hashtable

Linear probing is not a very good resolution strategy
◦ Leads to clustering of values
    Ideally you’d like a uniform distribution of values

Quadratic probing
◦ Slot s is taken
    Probe s+1^2, then s-1^2, then s+2^2, then s-2^2, and so on
    Can still lead to clustering

Collision resolution

Rehashing
◦ Used by the .NET Framework Hashtable class
◦ Adding an item to the table
    Provide the item and a unique key to access the item
    Item and key can be of any type
◦ Retrieving an item
    Index the Hashtable by key

Collision resolution

// Note the use of the ContainsKey() method, which returns a Boolean
using System;
using System.Collections;

public class HashtableDemo
{
    private static Hashtable employees = new Hashtable();

    public static void Main()
    {
        // Add some values to the Hashtable, indexed by a string key
        employees.Add("111-22-3333", "Scott");
        employees.Add("222-33-4444", "Sam");
        employees.Add("333-44-5555", "Jisun");

        // Access a particular key
        if (employees.ContainsKey("111-22-3333"))
        {
            string empName = (string) employees["111-22-3333"];
            Console.WriteLine("Employee 111-22-3333's name is: " + empName);
        }
        else
            Console.WriteLine("Employee 111-22-3333 is not in the hash table...");
    }
}

Hashtable Code Example

// Step through all items in the Hashtable
foreach (string key in employees.Keys)
    Console.WriteLine("Value at employees[\"" + key + "\"] = " + employees[key].ToString());

The order of insertion and the order of keys are not necessarily the same
◦ The order depends on the slot each key was stored in
    Which depends on the hash value of the key
    And on the collision resolution used
◦ The output from the above code is:

Value at employees["333-44-5555"] = Jisun
Value at employees["111-22-3333"] = Scott
Value at employees["222-33-4444"] = Sam

Hashtable Code Example

Function returns an ordinal value
◦ Slot # for the key
◦ Function can accept a key of any type
◦ GetHashCode()
    Any object can be represented as an integer hash code (not guaranteed to be unique)

Hashtable Class: Hash Function

Rehashing (double hashing)
◦ Set of hash functions H1 … Hn
◦ H1 is initially used
    If there is a collision, then H2 is used, and so on
    The functions differ by a multiplicative factor
◦ Each slot in the hash table is visited exactly once when hashsize probes are made
    For a given key, Hi and Hj cannot hash to the same slot in the table
    This works if the result of (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1))) and hashsize are “relatively prime”
        They share no common factors
        Guaranteed to be relatively prime if hashsize is a prime number
◦ Better collision avoidance than linear or quadratic probing

Hashtable Class: Collision Resolution
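
The claim that every slot is visited exactly once can be checked with a small sketch. It assumes that probe k lands at (hash + k * increment) % hashsize, where increment is the expression quoted above, and that the hash code is non-negative. The class and method names are hypothetical, and this is not the Hashtable source code.

```csharp
using System.Collections.Generic;

public static class DoubleHashingDemo
{
    // Returns the slots probed, in order, for a non-negative hash code.
    // Assumption: probe k lands at (hash + k * increment) % hashSize.
    public static List<int> ProbeSequence(int hash, int hashSize)
    {
        // The increment is always between 1 and hashSize - 1
        int increment = 1 + (((hash >> 5) + 1) % (hashSize - 1));
        var slots = new List<int>();
        for (int k = 0; k < hashSize; k++)
            slots.Add((int)(((long)hash + (long)k * increment) % hashSize));
        return slots;
    }
}
```

With hash = 1234 and a prime hashSize of 11, the increment is 10 and the probes run 2, 1, 0, 10, 9, and so on, reaching all 11 slots without repeating one.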

Hashtable class
◦ loadFactor (set through the constructor)
    Maximum ratio of items in the Hashtable to the total slots in the table
    At 0.5, at most half the slots can be used, and the other half must remain empty
    Values range from 0.1 to 1.0
    Microsoft applies a default “scaling factor” of 72%
    If you pass 1.0 as the loadFactor, it’s still only 0.72 behind the scenes
    This is done for performance reasons

Hashing: Load Factors

Hashtable class
◦ Add() method
    Performs a check against the loadFactor
    If exceeded, the Hashtable is expanded
◦ Expansion
    Slot count is approximately doubled
        From the current prime number to a larger prime number
    The hash value depends on the total number of slots
    All values in the table need to be rehashed when the table expands
    Occurs behind the scenes during the Add() method

Hashing

loadFactor
◦ Affects the size of the hash table and the number of probes required on a collision
    High load factor: denser hash table, but more collisions
    Expected number of probes needed when a collision happens: 1/(1 - loadFactor)
    The default 0.72 loadFactor results in about 3.6 probes per collision on average
    Does not vary with the number of items in the table
    Asymptotic access time is O(1)
        Much more desirable than the O(n) search time for an array

Hashtable
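
The probe estimate is easy to evaluate for a few load factors. The helper below is hypothetical and only computes the formula 1/(1 - loadFactor) quoted above.

```csharp
public static class LoadFactorDemo
{
    // Expected number of probes on a collision: 1 / (1 - loadFactor).
    public static double ExpectedProbes(double loadFactor)
    {
        return 1.0 / (1.0 - loadFactor);
    }
}
```

A load factor of 0.5 gives 2 probes, 0.72 gives roughly 3.57, and 0.9 gives 10, so a denser table is paid for with longer probe sequences.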

Hashtable is a “loosely-typed” structure
◦ Developer can add keys and values of any type to the table
    Generics allow us to have type-safe implementations of a class

Dictionary class is a “type-safe” class
◦ Types the keys and the values
◦ You must specify the types for keys/values when creating the Dictionary instance
◦ Once created, you can add and remove items, just like the Hashtable

Dictionary Class
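
For comparison with the Hashtable example, here is a sketch of the same employee lookup using Dictionary<string, string>. The class DictionaryDemo and the method Lookup are made up for the example.

```csharp
using System.Collections.Generic;

public static class DictionaryDemo
{
    // Returns the name stored under key, or null when the key is absent.
    public static string Lookup(string key)
    {
        // Key and value types are fixed when the Dictionary is created
        var employees = new Dictionary<string, string>();
        employees.Add("111-22-3333", "Scott");
        employees.Add("222-33-4444", "Sam");
        employees.Add("333-44-5555", "Jisun");

        string name;
        // No cast is needed: the value is already typed as string
        return employees.TryGetValue(key, out name) ? name : null;
    }
}
```

Because the types are declared up front, adding a key or value of the wrong type is rejected at compile time rather than surfacing as a failed cast at run time.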

Collision resolution
◦ Different from the Hashtable
◦ Chaining is used
    A secondary data structure is used for the collisions
◦ Each slot in the Dictionary contains a list of elements (its bucket)
    A collision prepends the element to the bucket’s list

Dictionary class

8 buckets (example)
◦ An Employee object is added to the bucket that its key hashes to
    If the bucket is already occupied, the item is prepended

Searching and removing items from a chained hashtable
◦ Time is proportional to the total number of items divided by the number of buckets
    O(n/m)
    n = total elements
    m = total buckets
◦ The Dictionary class is implemented so that n = m at all times, which makes these operations O(1)

Dictionary class
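
Chaining with 8 buckets can be sketched as follows. This is not the Dictionary implementation: the class ChainedTable is hypothetical, keys are assumed to be non-negative integers, and the bucket is chosen with a plain modulus.

```csharp
using System.Collections.Generic;

// Hypothetical chained hash table with 8 buckets (illustration only).
public class ChainedTable
{
    private readonly LinkedList<KeyValuePair<int, string>>[] buckets =
        new LinkedList<KeyValuePair<int, string>>[8];

    public void Add(int key, string value)
    {
        int b = key % buckets.Length;   // assumes non-negative keys
        if (buckets[b] == null)
            buckets[b] = new LinkedList<KeyValuePair<int, string>>();
        // A collision simply prepends the new element to the bucket's list
        buckets[b].AddFirst(new KeyValuePair<int, string>(key, value));
    }

    public string Find(int key)
    {
        var bucket = buckets[key % buckets.Length];
        if (bucket == null) return null;
        foreach (var entry in bucket)   // scans only the one bucket the key hashes to
            if (entry.Key == key) return entry.Value;
        return null;
    }
}
```

Keys 3 and 11 both hash to bucket 3, so the second one is prepended to the same list, and a lookup walks only that bucket rather than the whole table.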