
Page 1: Dictionaries CS 105. 10/02/05 L7: Dictionaries Slide 2 Copyright 2005, by the authors of these slides, and Ateneo de Manila University. All rights reserved

Dictionaries

CS 105


Definition

The Dictionary Data Structure
- structure that facilitates searching
- objects are stored with search keys
  - insertion of an object must include a key
  - searching requires a key and returns the key-object pair
  - removal also requires a key

Need an Entry interface/class
- Entry encapsulates the key-object pair (just like with priority queues)
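
A minimal sketch of what such an Entry class might look like. The binary search pseudocode in these slides calls `getKey()`; the constructor and `getValue()` are assumed names added here for illustration only.

```java
// Sketch of an Entry: an immutable key-object pair.
// getKey() appears in the slides; getValue() and the constructor are assumptions.
final class Entry {
    private final int key;       // search key (ints, as the slides assume)
    private final Object value;  // the stored object

    Entry(int key, Object value) {
        this.key = key;
        this.value = value;
    }

    int getKey() { return key; }

    Object getValue() { return value; }
}
```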


Sample Applications

An actual dictionary
- key: word
- object: word record (definition, pronunciation, etc.)

Record keeping applications
- Bank account records (key: account number, object: holder and bank account info)
- Student records (key: id number, object: student info)


Dictionary Interface

    public interface Dictionary {
        public int size();
        public boolean isEmpty();
        public Entry insert( int key, Object value )
            throws DuplicateKeyException;
        public Entry find( int key );    // return null if not found
        public Entry remove( int key );  // return null if not found
    }


Dictionary details/variations

Key types
- For simplicity, we assume that the keys are ints
- But the keys can be any kind of object as long as they can be ordered (e.g., strings with alphabetical ordering)

Duplicate entries (entries with the same key) may be allowed
- Our textbook calls the data structure that does not allow duplicates a Map, while a Dictionary allows duplicates
- For purposes of this discussion, we assume that dictionaries do not allow duplicates


Dictionary Implementations

- Unordered list (section 8.3.1)
- Ordered table (section 8.3.3)
- Binary search tree (section 9.1)


Unordered list

Strategy: store the entries in the order that they arrive
- O( 1 ) insert operation
- Can use an array, ArrayList, or linked list

Find operation requires scanning the list until a matching key value is found
- Scanning implies an O( n ) operation

Remove operation is similar to the find operation
- Entries need to be adjusted if using array/ArrayList
- O( n ) operation
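
The unordered-list strategy can be sketched as follows, assuming an ArrayList as the backing store. The class name and the nested `Entry` are illustrative, not from the slides. To keep insert at O( 1 ), this sketch appends without checking for a duplicate key; enforcing the no-duplicates rule would require a scan first.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a dictionary backed by an unordered ArrayList (names are hypothetical).
class UnorderedListDictionary {
    // Minimal key-object pair; stands in for the Entry class the slides mention.
    static final class Entry {
        final int key;
        final Object value;
        Entry(int key, Object value) { this.key = key; this.value = value; }
    }

    private final List<Entry> entries = new ArrayList<>();

    int size() { return entries.size(); }

    // Append at the end: amortized O(1). No duplicate check is done here.
    Entry insert(int key, Object value) {
        Entry e = new Entry(key, value);
        entries.add(e);
        return e;
    }

    // Linear scan: O(n). Returns null if not found.
    Entry find(int key) {
        for (Entry e : entries) {
            if (e.key == key) return e;
        }
        return null;
    }

    // Scan for the key, then remove by index; the ArrayList shifts later entries: O(n).
    Entry remove(int key) {
        for (int i = 0; i < entries.size(); i++) {
            if (entries.get(i).key == key) return entries.remove(i);
        }
        return null;
    }
}
```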


Ordered table

Idea: if the list is ordered by key, searching is simpler/easier

Just like for priority queues, insertion is slightly more complex
- Need to search for the proper position of the element, then shift the entries after it -> O( n )

Find: don’t do a linear scan; instead, do a binary search
- Note: use an array/ArrayList, not a linked list
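
A minimal sketch of ordered insertion, assuming the table is an ArrayList of int keys (values are left out to keep it short). The class and method names are made up for illustration. Both the search for the position and the shift performed by `add(pos, key)` are linear in the worst case.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: keep a list of keys sorted on every insert (names are hypothetical).
class OrderedTableInsert {
    // Inserts key at its sorted position. Returns false if the key is already present.
    static boolean insert(List<Integer> keys, int key) {
        int pos = 0;
        // Walk forward until the first key that is not smaller than the new one.
        while (pos < keys.size() && keys.get(pos) < key) {
            pos++;
        }
        if (pos < keys.size() && keys.get(pos) == key) {
            return false;  // duplicate key
        }
        keys.add(pos, key);  // shifts all later keys one slot to the right
        return true;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        insert(keys, 5);
        insert(keys, 2);
        insert(keys, 9);
        System.out.println(keys);  // prints [2, 5, 9]
    }
}
```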


Binary search

- Take advantage of the fact that the elements are ordered
- Compare the target key with the middle element to cut the search space in half
- Repeat the process until the element is found or the search space becomes empty
- Arithmetic on array indexes makes it easy to compute the middle position
  - Middle of S[low] and S[high] is S[(low+high)/2]
  - Not possible with linked lists


Binary Search Algorithm

Algorithm BinarySearch( S, k, low, high )
  // S: array of Entries, k: target key
  if low > high then
    return null   // not found
  else
    mid <- (low+high)/2
    e <- S[mid]
    if k = e.getKey() then
      return e
    else if k < e.getKey() then
      return BinarySearch( S, k, low, mid-1 )
    else
      return BinarySearch( S, k, mid+1, high )

Initial call: BinarySearch( S, someKey, 0, size-1 );
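
The pseudocode translates almost line for line into Java. The following is a sketch under a few assumptions: the nested `Entry` class and the `table` helper are illustrative, the sample keys are made up, and the midpoint is computed as `low + (high - low) / 2`, which gives the same value as `(low+high)/2` but cannot overflow for very large arrays.

```java
// Sketch of the recursive binary search from the slide (helper names are hypothetical).
class BinarySearchDemo {
    // Minimal entry with an int key, standing in for the slides' Entry.
    static final class Entry {
        private final int key;
        Entry(int key) { this.key = key; }
        int getKey() { return key; }
    }

    // Searches s[low..high], which must be sorted by key. Returns null if k is absent.
    static Entry binarySearch(Entry[] s, int k, int low, int high) {
        if (low > high) {
            return null;  // empty search space: not found
        }
        int mid = low + (high - low) / 2;
        Entry e = s[mid];
        if (k == e.getKey()) {
            return e;
        } else if (k < e.getKey()) {
            return binarySearch(s, k, low, mid - 1);   // continue in the left half
        } else {
            return binarySearch(s, k, mid + 1, high);  // continue in the right half
        }
    }

    // Helper: wraps an already sorted array of keys in entries.
    static Entry[] table(int[] sortedKeys) {
        Entry[] s = new Entry[sortedKeys.length];
        for (int i = 0; i < sortedKeys.length; i++) {
            s[i] = new Entry(sortedKeys[i]);
        }
        return s;
    }

    public static void main(String[] args) {
        Entry[] s = table(new int[] {3, 8, 15, 22, 40});
        System.out.println(binarySearch(s, 22, 0, s.length - 1) != null);  // prints true
        System.out.println(binarySearch(s, 5, 0, s.length - 1) != null);   // prints false
    }
}
```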


Binary Search Algorithm

Example: find(22), with mid = (low+high)/2

index:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
S:       2  4  5  7  8  9 12 14 17 19 22 25 27 28 33 37

Step 1: low = 0, high = 15, mid = 7    (S[7] = 14 < 22, so search the right half)
Step 2: low = 8, high = 15, mid = 11   (S[11] = 25 > 22, so search the left half)
Step 3: low = 8, high = 10, mid = 9    (S[9] = 19 < 22, so search the right half)
Step 4: low = mid = high = 10          (S[10] = 22, found)


Time complexity of binary search

- Search space reduces by half until it becomes 1
  - n -> n/2 -> n/4 -> … -> 1
  - log n steps
- Find operation using binary search is O( log n )
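
To make the halving argument concrete, this small sketch counts how many times n can be halved (with integer division) before it reaches 1. The class and method names are made up for illustration.

```java
// Sketch: count the halving steps n -> n/2 -> n/4 -> ... -> 1.
class HalvingSteps {
    static int steps(int n) {
        int count = 0;
        while (n > 1) {
            n /= 2;   // the search space shrinks by half
            count++;
        }
        return count;  // equals floor(log2 n) for n >= 1
    }

    public static void main(String[] args) {
        System.out.println(steps(16));    // prints 4
        System.out.println(steps(1000));  // prints 9
    }
}
```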


Time complexity

Operation        insert()      find()        remove()
Unsorted List    O( 1 )        O( n )        O( n )
Ordered Table    O( n )        O( log n )    O( n )


Binary Search Tree (BST)

Strategy: store entries as nodes in a tree such that an inorder traversal of the entries would list them in increasing order

Search, remove, and insert are all O( log n ) operations
- All operations require a search that mimics binary search: go to left or right subtree depending on target key value


Traversing a BST

Insert, remove, and find operations all require a key

First step involves checking for a matching key in the tree
- Start with the root, go to left or right child depending on key value
- Repeat the process until key is found or a null child is encountered (not found)
- For insert operation, duplicate key error occurs if key already exists

Operation is proportional to height of tree (usually O( log n ))
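
The traversal can be sketched in Java as below. This is an illustration under assumptions, not the course's implementation: the class and `Node` names are made up, and `IllegalArgumentException` stands in for the `DuplicateKeyException` named in the Dictionary interface.

```java
// Sketch of BST find and insert by walking down from the root (names are hypothetical).
class BstSketch {
    static final class Node {
        final int key;
        final Object value;
        Node left, right;
        Node(int key, Object value) { this.key = key; this.value = value; }
    }

    Node root;

    // Go left or right by key until the key is found or a null child is reached.
    Node find(int key) {
        Node cur = root;
        while (cur != null && cur.key != key) {
            cur = (key < cur.key) ? cur.left : cur.right;
        }
        return cur;  // null means not found
    }

    // Same walk as find; the new node is attached where the null child was.
    Node insert(int key, Object value) {
        Node fresh = new Node(key, value);
        if (root == null) {
            root = fresh;
            return fresh;
        }
        Node cur = root;
        while (true) {
            if (key == cur.key) {
                // Stand-in for the slides' DuplicateKeyException.
                throw new IllegalArgumentException("duplicate key " + key);
            }
            if (key < cur.key) {
                if (cur.left == null) { cur.left = fresh; return fresh; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = fresh; return fresh; }
                cur = cur.right;
            }
        }
    }
}
```

Each loop iteration moves one level down the tree, so the cost of both operations is bounded by the tree's height.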


Insertion in BST (insert 78)

[Figure: a BST holding the keys 44, 17, 88, 32, 28, 29, 65, 97, 54, 82, 76, 80, with root 44. The search for 78 follows the highlighted path 44, 88, 65, 82, 76, 80; since 78 is not found, it is added as a new node at the end of that path.]


Removal from BST (Ex. 1): Remove 32

[Figure: the same tree, now including 78. The search for 32 follows the highlighted path 44, 17, 32, and the figure labels nodes w and z at the removal point. In the final tree 32 is gone; 28 and 29 remain below 17.]


Removal from BST (Ex. 2): Remove 65

[Figure: the tree from the insertion example, including 78. The search for 65 follows 44, 88, 65; the node holding 65 is labeled w, and the figure labels two further nodes y and x in its right subtree (82, 76, 80). In the final tree 76 appears where 65 was, with 80 and 78 remaining below 82.]
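
The two removal examples correspond to the two cases of a common BST deletion scheme, sketched below. This is one standard approach, assumed here rather than taken from the slides: a node with at most one child is replaced by that child (as when removing 32), and a node with two children takes over the smallest key of its right subtree, which is then removed from that subtree (as when removing 65). All names are illustrative.

```java
// Sketch of BST removal using the in-order successor (names are hypothetical).
class BstRemove {
    static final class Node {
        int key;           // not final: the two-children case overwrites it
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain recursive insert, used only to build a tree; duplicates are ignored.
    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Removes key from the subtree rooted at n and returns the new subtree root.
    static Node remove(Node n, int key) {
        if (n == null) return null;               // key not present
        if (key < n.key) {
            n.left = remove(n.left, key);
        } else if (key > n.key) {
            n.right = remove(n.right, key);
        } else if (n.left == null) {
            return n.right;                       // zero or one child: splice out
        } else if (n.right == null) {
            return n.left;
        } else {
            Node succ = n.right;                  // two children: find the successor,
            while (succ.left != null) succ = succ.left;  // leftmost in right subtree
            n.key = succ.key;                     // copy its key up
            n.right = remove(n.right, succ.key);  // and remove it from below
        }
        return n;
    }
}
```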


Time complexity for BSTs

O( log n ) operations not guaranteed since resulting tree is not necessarily “balanced”

If tree is excessively skewed, operations would be O( n ) since the structure degenerates to a list

Tree could be periodically reordered to prevent skewedness
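
The effect of skew is easy to reproduce. In this sketch (names and sample keys are made up), the same seven keys produce a tree of height 7 when inserted in sorted order, but height 3 when inserted in an order that happens to keep the tree balanced. Height here counts the nodes on the longest root-to-leaf path.

```java
// Sketch: compare BST height for sorted versus mixed insertion order.
class SkewDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain unbalanced BST insert; duplicates are ignored.
    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Number of nodes on the longest path from n down to a leaf.
    static int height(Node n) {
        return (n == null) ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int heightAfterInserting(int[] keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        System.out.println(heightAfterInserting(new int[] {1, 2, 3, 4, 5, 6, 7}));  // prints 7
        System.out.println(heightAfterInserting(new int[] {4, 2, 6, 1, 3, 5, 7}));  // prints 3
    }
}
```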


Time complexity (average case)

Operation        insert()      find()        remove()
Unsorted List    O( 1 )        O( n )        O( n )
Ordered Table    O( n )        O( log n )    O( n )
BST              O( log n )    O( log n )    O( log n )


Time complexity (worst case)

Operation        insert()      find()        remove()
Unsorted List    O( 1 )        O( n )        O( n )
Ordered Table    O( n )        O( log n )    O( n )
BST              O( n )        O( n )        O( n )


About BSTs

AVL tree: BST that “self-balances”
- Ensures that after every operation, at every node, the difference between the left subtree height and the right subtree height is at most 1
- O( log n ) operation is guaranteed

Many efficient searching methods are variants of binary search trees
- Database indexes are B-trees (number of children > 2, but the same principles apply)
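
For a ready-made balanced tree in Java, the standard library's `java.util.TreeMap` can serve as a dictionary. Note that it is a red-black tree rather than an AVL tree, though it gives the same O( log n ) guarantee for lookup, insertion, and removal. Unlike the Dictionary interface in these slides, `put` replaces the value for an existing key instead of raising a duplicate-key error. The student names below are made up.

```java
import java.util.TreeMap;

// Sketch: TreeMap used as a dictionary of student records keyed by id number.
class TreeMapDictionaryDemo {
    // Builds a small sample dictionary (ids and names are invented).
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> students = new TreeMap<>();
        students.put(44, "Ana");
        students.put(17, "Ben");
        students.put(88, "Cara");
        return students;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> students = sample();
        System.out.println(students.get(17));   // find: prints Ben
        System.out.println(students.get(50));   // not found: prints null
        students.remove(44);                    // remove by key
        System.out.println(students.keySet());  // keys in increasing order: prints [17, 88]
    }
}
```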