unit 6c - carnegie mellon school of computer science15110-s13/unit06ptc-handout.pdf · unit 6c...

18
1 UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University 1 Last Lecture Hash tables Using hash function to map keys of different data types to array indices Constant search time if the load factor is small Associative arrays in Ruby 15110 Principles of Computing, Carnegie Mellon University 2

Upload: lamphuc

Post on 27-Jun-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

1

UNIT 6COrganizing Data: Trees and Graphs

15110 Principles of Computing, Carnegie Mellon University

1

Last Lecture

• Hash tables

– Using hash function to map keys of different data types to array indices

– Constant search time if the load factor is small

• Associative arrays in Ruby

15110 Principles of Computing, Carnegie Mellon University

2

Page 2: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

2

This Lecture

• Non‐linear data structures

– Trees, specifically, binary trees

– Graphs

15110 Principles of Computing, Carnegie Mellon University

3

Trees

15110 Principles of Computing, Carnegie Mellon University

4

Page 3: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

3

Trees

• A tree is a hierarchical data structure.– Every tree has a node called the root.

– Each node can have 1 or more nodes as children.

– A node that has no children is called a leaf.

• A common tree in computing is a binary tree.– A binary tree consists of nodes that have at most 2 

children.

– A complete binary tree has the maximum number of nodes on each of its levels.

• Applications: data compression, file storage, game trees

15110 Principles of Computing, Carnegie Mellon University

5

Binary Tree

15110 Principles of Computing, Carnegie Mellon University

6

84

41 96

24

37

50

13

98

Which one is the root?Which ones are the leaves?Is this a complete binary tree?What is the height of this tree?

Page 4: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

4

Binary Tree

15110 Principles of Computing, Carnegie Mellon University

7

84

41 96

24

37

50

13

98

The root has the data value 84.There are 4 leaves in this binary tree: 13, 37, 50, 98.This binary tree is not complete.This binary tree has height 3.

Binary Trees: Implementation

• One common implementation of binary trees uses nodes like a linked list does.

– Instead of having a “next” pointer, each node has a “left” pointer and a “right” pointer. 

15110 Principles of Computing, Carnegie Mellon University

8

45

31 70

19 38 86

Level 1

Level 2

Level 3

Page 5: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

5

Using Nested Arrays

• Languages like Ruby do not let programmers manipulate pointers explicitly. 

• We could use Ruby arrays to implement binary trees. For example:

nil stands for an empty tree

Arrows point to subtrees15110 Principles of Computing,

Carnegie Mellon University 9

45

31 70

19 38 86

Level 1

Level 2

Level 3

[left,45,right]

[[left,31,right],45,[left,70,right]]

[[[nil,19,nil],31,[nil,38,nil]],45,

[nil,70,[nil,86,nil]]]

Using One Dimensional Arrays

• We could also use a flat  array.

15110 Principles of Computing, Carnegie Mellon University

10

45 31 70 19 38 86

Level 1 Level 2 Level 3

45

31 70

19 38 86

Level 1

Level 2

Level 3

Page 6: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

6

Dynamic Set Operations

• Insert

• Delete

• Search

• Find min/max

• ...

15110 Principles of Computing, Carnegie Mellon University

11

Binary Search Tree (BST)

• A binary search tree (BST) is a binary tree such that 

– All nodes to the left of any node have data values less than that node

– All nodes to the right of any node have data values greater than that node

15110 Principles of Computing, Carnegie Mellon University

12

Page 7: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

7

Inserting into a BST

• For each data value that you wish to insert into the binary search tree:

– Start at the root and compare the new data value with the root. 

– If it is less, move down left. If it is greater, move down right. 

– Repeat on the child of the root until you end up in a position that has no node. 

– Insert a new node at this empty position.

15110 Principles of Computing, Carnegie Mellon University

13

Example

• Insert: 84, 41, 96, 24, 37, 50, 13, 98

15110 Principles of Computing, Carnegie Mellon University

14

84

41 96

24

37

50

13

98

Page 8: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

8

Using a BST

• How would you search for an element in a BST?

15110 Principles of Computing, Carnegie Mellon University

15

84

41 96

24

37

50

13

98

Searching a BST

• For the key (data value) that you wish to search

– Start at the root and compare the key with the root. If equal, key found. 

– Otherwise

• If it is less, move down left. If it is greater, move down right. Repeat search on the child of the root.

• If there is no non‐empty subtree to move to , then key not found.

15110 Principles of Computing, Carnegie Mellon University

16

Page 9: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

9

Exercises

• How you would find the minimum and maximum elements in a BST?

• What would be output if we walked the tree in left‐node‐right order?

15110 Principles of Computing, Carnegie Mellon University

17

Max‐Heaps

• A max‐heap is a binary tree such that 

– The largest data value is in the root

– For every node in the max‐heap, its children contain smaller data.

– The max‐heap is an almost‐complete binary tree.

• An almost‐complete binary tree is a binary tree such that every level of the tree has the maximum number of nodes possible except possibly the last level, where its nodes are attached as far left as possible.

15110 Principles of Computing, Carnegie Mellon University

18

Page 10: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

10

These are not heaps! Why?

15110 Principles of Computing, Carnegie Mellon University

19

84

41 56

24

7

10

13

38

68

53 26

72 30

Adding data to a Max‐Heap

15110 Principles of Computing, Carnegie Mellon University

20

10

6210

62

84

41 56

713

381324

62

10

41

Insert new data into nextavailable tree positionso the tree remains(almost) complete.

Exchange data with itsparent(s) (if necessary)until the tree is restored to a heap.

1. insert here

2.

3.

Page 11: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

11

Building a Max‐Heap

• To build a max‐heap, just insert each element one at a time into the heap using the previous algorithm.

15110 Principles of Computing, Carnegie Mellon University

21

Removing Data from a Max‐Heap

15110 Principles of Computing, Carnegie Mellon University

22

10

84

62 56

713

381324

Remove root (maximum value)and replace it with "last" nodeof the lowest level.

Exchange data with its larger child (if necessary) until the tree is restored to a heap.

41

10

10

62

41

10

1. move last node to root

2.

3.

Page 12: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

12

BSTs vs. Max‐Heaps

• Which tree is designed for easier searching?

• Which tree is designed for retrieving the maximum value quickly?

• A heap is guaranteed to be “balanced”(complete or almost‐complete). What about a BST? 

15110 Principles of Computing, Carnegie Mellon University

23

BSTs vs Max‐Heaps

• BST with n elements

– Insert and Search:

• worst case O(log n) if tree is “balanced”

• worst case O(n) in general since tree could have one node per level

• Max‐Heap with n elements

– Insert and Remove‐Max

• worst case O(log n) since tree is always “balanced”

– Find‐Max

• worst case O(1) since max is always at the root15110 Principles of Computing,

Carnegie Mellon University 24

Page 13: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

13

Graphs

15110 Principles of Computing, Carnegie Mellon University

25

15110 Principles of Computing, Carnegie Mellon University

26

Page 14: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

14

Graphs

• A graph is a data structure that consists of a set of vertices and a set of edges connecting pairs of the vertices.

– A graph doesn’t have a root, per se.

– A vertex can be connected to any number of other vertices using edges.

– An edge may be bidirectional or directed (one‐way).

– An edge may have a weight on it that indicates a cost for traveling over that edge in the graph.

• Applications: computer networks, transportation systems, social relationships

15110 Principles of Computing, Carnegie Mellon University

27

Undirected and Directed Graphs

15110 Principles of Computing, Carnegie Mellon University

28

1

0

3

26

4

53

7

0 1 2 3

0 0 6 7 5

1 6 0 4 ∞

2 7 4 0 3

3 5 ∞ 3 0

0 1 2 3

0 0 6 7 5

1 ∞ 0 4 ∞

2 2 ∞ 0 3

3 ∞ ∞ 9 0

1

0

3

26

4

53

7

2

9

from

to

from

to

Page 15: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

15

Graphs in Ruby

15110 Principles of Computing, Carnegie Mellon University

29

1

0

3

26

4

53

7

0 1 2 3

0 0 6 7 5

1 6 0 4 ∞

2 7 4 0 3

3 5 ∞ 3 0

inf = 1.0/0.0

graph = [ [ 0, 6, 7, 5 ],

[ 6, 0, 4, inf ],[ 7, 4, 0, 3],[ 5, inf, 3, 0] ]

from

to

An Undirected Weighted Graph

15110 Principles of Computing, Carnegie Mellon University

30

0 1 2 3 4 5 6

0 0 10 ∞ 8 7 ∞ ∞

1 10 0 12 7 ∞ ∞ ∞

2 ∞ 12 0 6 ∞ 7 5

3 8 7 6 0 9 4 ∞

4 7 ∞ ∞ 9 0 ∞ 11

5 ∞ ∞ 7 4 ∞ 0 3

6 ∞ ∞ 5 ∞ 11 3 0

0

1

3

2

6

4

5

12

6

45

98

10

7 11

3

7

7

0 1 2 3 4 5 6Pitt. Erie Will. S.C. Harr. Scr. Phil.

vertices edges

from

to

Page 16: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

16

Original Graph

15110 Principles of Computing, Carnegie Mellon University

31

Pitt

Erie

S.C.

Will.

Phil.

Harr.

Scr.

12

64

59

810

7 11

3

7

7

A Minimal Spanning Tree

15110 Principles of Computing, Carnegie Mellon University

32

Pitt

Erie

S.C.

Will.

Phil.

Harr.

Scr.4

58

7

3

7

The minimum total cost to connect all vertices using edges from the original graph without using cycles. (minimum total cost = 34)

Page 17: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

17

Original Graph

15110 Principles of Computing, Carnegie Mellon University

33

Pitt

Erie

S.C.

Will.

Phil.

Harr.

Scr.

12

64

59

810

7 11

3

7

7

Shortest Paths from Pittsburgh

15110 Principles of Computing, Carnegie Mellon University

34

Pitt

Erie

S.C.

Will.

Phil.

Harr.

Scr.64

810

7

3

1014

8

12

157

The total costs of the shortest path from Pittsburgh to every other location using only edges from the original graph.

Page 18: UNIT 6C - Carnegie Mellon School of Computer Science15110-s13/Unit06PtC-handout.pdf · UNIT 6C Organizing Data: Trees and Graphs 15110 Principles of Computing, Carnegie Mellon University

18

Graph Algorithms

• There are algorithms to compute the minimal spanning tree of a graph and the shortest paths for a graph.

• There are algorithms for other graph operations:

– If a graph represents a set of pipes and the number represent the maximum flow through each pipe, then we can determine the maximum amount of water that can flow through the pipes assuming one vertex is a “source” (water coming into the system) and one vertex is a “sink” (water leaving the system)

– Many more graph algorithms... very useful to solve real life problems.

15110 Principles of Computing, Carnegie Mellon University

35