This document is copyright (C) Stanford Computer Science and Nick Troccoli, licensed under Creative Commons Attribution 2.5 License. All rights reserved. Based on slides created by Keith Schwarz, Julie Zelenski, Jerry Cain, Eric Roberts, Mehran Sahami, Stuart Reges, Cynthia Lee, Marty Stepp, Ashley Taylor and others.

CS 106X, Lecture 22
Graphs; BFS; DFS

reading: Programming Abstractions in C++, Chapter 18


Plan For Today
• Recap: Graphs
• Practice: Twitter Influence
• Depth-First Search (DFS)
• Announcements
• Breadth-First Search (BFS)

Graphs

A graph consists of a set of nodes connected by edges.

Graphs can model:
- Sites and links on the web
- Disease outbreaks
- Social networks
- Geographies
- Task and dependency graphs
- and more…

Nodes:
- degree: number of edges connected to the node
- in-degree (directed graphs): number of edges coming into the node
- out-degree (directed graphs): number of edges going out of the node

Paths:
- A path is a sequence of nodes/edges from one node to another.
- Node x is reachable from node y if a path exists from y to x.
- A cycle is a path that starts and ends at the same node.
- A loop is an edge that connects a node to itself.
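The degree definitions above can be illustrated with a small sketch. It assumes a directed graph stored as a plain list of (from, to) pairs; the names EdgeList, outDegrees and inDegrees are made up for this example and are not part of any course library.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumption for this sketch: a directed graph is a list of (from, to) pairs.
using EdgeList = std::vector<std::pair<std::string, std::string>>;

// Out-degree of every node: the number of edges that start at it.
std::map<std::string, int> outDegrees(const EdgeList& edges) {
    std::map<std::string, int> result;
    for (const auto& edge : edges) {
        result[edge.first]++;   // the edge leaves its "from" node
        result[edge.second];    // make sure the "to" node is listed, even with out-degree 0
    }
    return result;
}

// In-degree of every node: the number of edges that end at it.
std::map<std::string, int> inDegrees(const EdgeList& edges) {
    std::map<std::string, int> result;
    for (const auto& edge : edges) {
        result[edge.second]++;  // the edge arrives at its "to" node
        result[edge.first];     // make sure the "from" node is listed, even with in-degree 0
    }
    return result;
}
```

With the edges a -> b, a -> c, b -> c and the loop c -> c, node a has out-degree 2 and in-degree 0, while node c has in-degree 3 because the loop counts as an incoming edge.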


Graph Properties
• A graph is connected if every node is reachable from every other node.
• A graph is complete if every node has a direct edge to every other node.
• A graph is acyclic if it does not contain any cycles.
• A graph is directed if its edges have direction, or undirected if its edges do not have direction (i.e., they are bidirectional).

In summary, a graph can be:
• Connected or unconnected
• Acyclic
• Directed or undirected
• Weighted or unweighted
• Complete
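Two of these properties are easy to check directly from an adjacency structure. The sketch below assumes the graph is a map from each node to the set of nodes it has an edge to, with every node present as a key; Graph, isComplete and isUndirected are hypothetical names for this example.

```cpp
#include <map>
#include <set>
#include <string>

// Assumption for this sketch: every node is a key, mapped to its out-neighbors.
using Graph = std::map<std::string, std::set<std::string>>;

// Complete: every node has a direct edge to every other node.
bool isComplete(const Graph& g) {
    for (const auto& from : g) {
        for (const auto& to : g) {
            if (from.first != to.first && from.second.count(to.first) == 0) {
                return false;  // missing edge from.first -> to.first
            }
        }
    }
    return true;
}

// Undirected (stored as a directed graph): every edge u -> v has a matching v -> u.
bool isUndirected(const Graph& g) {
    for (const auto& entry : g) {
        for (const std::string& v : entry.second) {
            auto it = g.find(v);
            if (it == g.end() || it->second.count(entry.first) == 0) {
                return false;  // edge entry.first -> v has no reverse edge
            }
        }
    }
    return true;
}
```

A triangle in which every pair of nodes is linked in both directions passes both checks; a one-way chain a -> b -> c fails both.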


Twitter Influence
• Twitter lets a user follow another user to see their posts.
• Following is directional (e.g. I can follow you but you don't have to follow me back).
• Let's define being influential as having a high number of followers-of-followers.
  – Reasoning: it doesn't just matter how many people follow you, but whether the people who follow you reach a large audience.
• Write a function mostInfluential that reads a file of Twitter relationships and outputs the most influential user.
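One possible shape for mostInfluential is sketched below. This is not the lecture's solution: it uses standard library containers rather than BasicGraph, and it assumes the input is a stream of whitespace-separated "follower followee" pairs (the slide does not specify the file format). It also assumes that a user's influence is the number of distinct followers of their followers, not counting the user themselves, and that ties go to the alphabetically first name.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <set>
#include <sstream>  // only needed by callers that read from a string
#include <string>

// Returns the user with the most distinct followers-of-followers,
// or "" if the input contains no relationships.
std::string mostInfluential(std::istream& input) {
    // followers[u] = the set of users who follow u (edges point follower -> followee)
    std::map<std::string, std::set<std::string>> followers;
    std::string follower, followee;
    while (input >> follower >> followee) {
        followers[followee].insert(follower);
        followers[follower];  // make sure every user has an entry
    }

    std::string best;
    std::size_t bestCount = 0;
    for (const auto& entry : followers) {
        const std::string& user = entry.first;
        std::set<std::string> reach;  // distinct followers of this user's followers
        for (const std::string& f : entry.second) {
            const std::set<std::string>& secondLevel = followers.at(f);
            reach.insert(secondLevel.begin(), secondLevel.end());
        }
        reach.erase(user);  // assumption: a user does not count toward their own influence
        if (best.empty() || reach.size() > bestCount) {
            best = user;
            bestCount = reach.size();
        }
    }
    return best;
}
```

To try it without a file, wrap sample text in a std::istringstream and pass that; a std::ifstream works the same way for real input.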

https://about.twitter.com/en_us/company/brand-resources.html

BasicGraph members
#include "basicgraph.h" // a directed, weighted graph

g.addEdge(v1, v2);        adds an edge between two vertexes
g.addVertex(name);        adds a vertex to the graph
g.clear();                removes all vertexes/edges from the graph
g.getEdgeSet()
g.getEdgeSet(v)           returns all edges, or all edges that start at v, as a Set of pointers
g.getNeighbors(v)         returns a set of all vertices that v has an edge to
g.getVertex(name)         returns pointer to vertex with the given name
g.getVertexSet()          returns a set of all vertexes
g.isNeighbor(v1, v2)      returns true if there is an edge from vertex v1 to v2
g.isEmpty()               returns true if graph contains no vertexes/edges
g.removeEdge(v1, v2);     removes an edge from the graph
g.removeVertex(name);     removes a vertex from the graph
g.size()                  returns the number of vertexes in the graph
g.toString()              returns a string such as "{a, b, c, a -> b}"


Searching for paths
• Searching for a path from one vertex to another:
  – Sometimes, we just want any path (or want to know there is a path).
  – Sometimes, we want to minimize path length (# of edges).
  – Sometimes, we want to minimize path cost (sum of edge weights).
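The difference between path length and path cost can be made concrete with a short sketch. It assumes edge weights are stored in a map keyed by (from, to) pairs; Weights, pathLength and pathCost are hypothetical names, and returning -1 for a missing edge is a choice made for this example only.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumption for this sketch: weights[(from, to)] is the weight of the edge from -> to.
using Weights = std::map<std::pair<std::string, std::string>, int>;

// Path length: the number of edges on the path.
int pathLength(const std::vector<std::string>& path) {
    return path.empty() ? 0 : static_cast<int>(path.size()) - 1;
}

// Path cost: the sum of the edge weights along the path.
// Returns -1 if some consecutive pair of vertices has no edge.
int pathCost(const std::vector<std::string>& path, const Weights& weights) {
    int total = 0;
    for (std::size_t i = 0; i + 1 < path.size(); i++) {
        auto it = weights.find(std::make_pair(path[i], path[i + 1]));
        if (it == weights.end()) {
            return -1;
        }
        total += it->second;
    }
    return total;
}
```

With made-up fares SFO -> LAX = 60, LAX -> DFW = 120 and SFO -> DFW = 250, the direct route has length 1 and cost 250, while the route through LAX has length 2 and cost 180. The shortest path by edge count is not the cheapest one here.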

[Figure: a weighted graph of airports (ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL) whose edges are labeled with fares ranging from $50 to $500.]

Finding Paths
• Easiest way: Depth-First Search (DFS)
  – Recursive backtracking!
• Finds a path between two nodes if it exists
  – Or can find all the nodes reachable from a node
    • Where can I travel to starting in San Francisco?
    • If all my friends (and their friends, and so on) share my post, how many will eventually see it?

Depth-first search (18.4)
• depth-first search (DFS): Finds a path between two vertices by exploring each possible path as far as possible before backtracking.
  – Often implemented recursively.
  – Many graph algorithms involve visiting or marking vertices.
• DFS from a to h (assuming A-Z order) visits, in order: a, b, e, f, c, i, d, g, h
  – path found: {a, d, g, h}

[Figure: example graph with vertices a through i]

DFS

Mark current as visited
Explore all the unvisited nodes from this node

[Figure: step-by-step DFS animation on a graph with nodes A through I]

DFS Details
• In an n-node, m-edge graph, takes O(m + n) time with an adjacency list
  – Visit each edge once, visit each node at most once
• Pseudocode:
    dfs from v1:
        mark v1 as visited.
        for each of v1's unvisited neighbors n:
            dfs(n)
• How could we modify the pseudocode to look for a specific path?
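The pseudocode above might look roughly like this in C++. The sketch uses a standard library adjacency list rather than BasicGraph; Graph and dfs are names chosen for this example, and the order parameter is an addition that records the sequence in which vertices are visited.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Assumption for this sketch: an adjacency list mapping each vertex to its out-neighbors.
using Graph = std::map<std::string, std::vector<std::string>>;

// Recursive DFS from v: marks every vertex reachable from v as visited
// and appends each vertex to `order` the first time it is reached.
void dfs(const Graph& g, const std::string& v,
         std::set<std::string>& visited, std::vector<std::string>& order) {
    visited.insert(v);  // mark v as visited
    order.push_back(v);
    auto it = g.find(v);
    if (it == g.end()) {
        return;         // v has no outgoing edges
    }
    for (const std::string& n : it->second) {
        if (visited.count(n) == 0) {  // only recurse into unvisited neighbors
            dfs(g, n, visited, order);
        }
    }
}
```

Call it with an empty visited set and an empty order vector. Afterwards, visited holds exactly the vertices reachable from the start vertex. Each vertex is entered once and each adjacency list is scanned once, which is where the O(m + n) bound comes from.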

DFS that finds path
dfs from v1 to v2:
    mark v1 as visited, and add to path.
    perform a dfs from each of v1's unvisited neighbors n to v2:
        if dfs(n, v2) succeeds: a path is found! yay!
    if all neighbors fail: remove v1 from path.

• To retrieve the DFS path found, pass a collection parameter to each call and choose-explore-unchoose.
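A minimal sketch of the choose-explore-unchoose version follows, again with a standard library adjacency list in place of BasicGraph. The names Graph and dfsPath, and the choice to return a bool while filling in a path vector, are assumptions of this example.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Assumption for this sketch: an adjacency list mapping each vertex to its out-neighbors.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns true if v2 is reachable from v1. On success, `path` holds the
// vertices from the original start to v2; on failure it is left as it was.
bool dfsPath(const Graph& g, const std::string& v1, const std::string& v2,
             std::set<std::string>& visited, std::vector<std::string>& path) {
    visited.insert(v1);
    path.push_back(v1);            // choose
    if (v1 == v2) {
        return true;               // reached the target
    }
    auto it = g.find(v1);
    if (it != g.end()) {
        for (const std::string& n : it->second) {
            if (visited.count(n) == 0 && dfsPath(g, n, v2, visited, path)) {
                return true;       // explore: some neighbor found a path
            }
        }
    }
    path.pop_back();               // unchoose: every neighbor failed
    return false;
}
```

On a hypothetical graph with edges a -> b, a -> d, b -> e, d -> g and g -> h, a search from a to h first tries b and e, backs out of both, and then succeeds through d and g, leaving {a, d, g, h} in the path.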


DFS observations
• discovery: DFS is guaranteed to find a path if one exists.
• retrieval: It is easy to retrieve exactly what the path is (the sequence of edges taken) if we find it
  – choose - explore - unchoose
• optimality: not optimal. DFS is guaranteed to find a path, not necessarily the best/shortest path
  – Example: dfs(a, i) returns {a, b, e, f, c, i} rather than {a, d, h, i}.


Announcements
• Assignment 7 will go out this Friday, is due Wed. after break
  – Short graphs assignment (Google Maps!), implementing algorithms from this week
• Assignment 8 will go out the Wed. after break, is due the last day of class (Fri)
  – Graphs and inheritance assignment (Excel!)


Finding Shortest Paths
• We can find paths between two nodes, but how can we find the shortest path?
  – Fewest number of steps to complete a task?
  – Least amount of edits between two words?
• When have we solved this problem before?

Breadth-First Search (BFS)
• Idea: processing a node involves knowing we need to visit all its neighbors (just like DFS)
• Need to keep a TODO list of nodes to process
• Keep a Queue of nodes as our TODO list
• Idea: dequeue a node, enqueue all its neighbors
• Still will return the same nodes as reachable, just might have shorter paths

BFS

Dequeue a node
Add all its unseen neighbors to the queue

[Figure: step-by-step BFS animation on a graph with nodes a through i. Queue contents at successive steps:]
queue: a
queue: e, g
queue: g, f
queue: f, h
queue: h
queue: i
queue: c

BFS Details
• In an n-node, m-edge graph, takes O(m + n) time with an adjacency list
  – Visit each edge once, visit each node at most once
• Pseudocode:
    bfs from v1 to v2:
        create a queue of vertexes to visit, initially storing just v1.
        mark v1 as visited.
        while queue is not empty and v2 is not seen:
            dequeue a vertex v from it,
            and add each unvisited neighbor n of v to the queue, marking n as visited.
• How could we modify the pseudocode to look for a specific path?
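The queue-based loop can be sketched as follows. The code uses a standard library adjacency list and std::queue rather than the course's BasicGraph and Queue; Graph and bfsOrder are names invented for this example. It explores everything reachable from the start instead of stopping at a target, and it marks a vertex as visited at the moment it is enqueued, so no vertex enters the queue twice.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Assumption for this sketch: an adjacency list mapping each vertex to its out-neighbors.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns the vertices reachable from `start` in the order BFS dequeues them.
std::vector<std::string> bfsOrder(const Graph& g, const std::string& start) {
    std::vector<std::string> order;
    std::set<std::string> visited;
    std::queue<std::string> todo;  // the TODO list of vertices to process
    todo.push(start);
    visited.insert(start);
    while (!todo.empty()) {
        std::string v = todo.front();
        todo.pop();
        order.push_back(v);
        auto it = g.find(v);
        if (it == g.end()) {
            continue;              // v has no outgoing edges
        }
        for (const std::string& n : it->second) {
            // insert() reports whether n was newly added, i.e. not yet visited
            if (visited.insert(n).second) {
                todo.push(n);
            }
        }
    }
    return order;
}
```

On a hypothetical graph with edges a -> b, a -> d, b -> c, d -> c and c -> e, starting from a, the dequeue order is a, b, d, c, e: all vertices one edge away come out before any vertex two edges away.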

BFS observations
• optimality:
  – always finds the shortest path (fewest edges).
  – in unweighted graphs, finds optimal cost path.
  – in weighted graphs, not always optimal cost.
• retrieval: harder to reconstruct the actual sequence of vertices or edges in the path once you find it
  – conceptually, BFS is exploring many possible paths in parallel, so it's not easy to store a path array/list in progress
  – solution: We can keep track of the path by storing predecessors for each vertex (each vertex can store a reference to a previous vertex).
• DFS uses less memory than BFS, easier to reconstruct the path once found; but DFS does not always find shortest path. BFS does.


Recap
• Recap: Graphs
• Practice: Twitter Influence
• Depth-First Search (DFS)
• Announcements
• Breadth-First Search (BFS)

Next time: more graph searching algorithms

Overflow

BFS that finds path
bfs from v1 to v2:
    create a queue of vertexes to visit, initially storing just v1.
    mark v1 as visited.
    while queue is not empty and v2 is not seen:
        dequeue a vertex v from it,
        and add each unvisited neighbor n of v to the queue, marking n as visited and setting n's previous to v.
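A sketch of this idea follows, using a separate prev map in place of a "previous" field stored on each vertex. Graph and bfsPath are hypothetical names, and the graph is a standard library adjacency list rather than a BasicGraph. Each vertex is marked as visited when it is enqueued, so its predecessor is recorded exactly once, by the first vertex that reaches it.

```cpp
#include <algorithm>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Assumption for this sketch: an adjacency list mapping each vertex to its out-neighbors.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns a path from v1 to v2 with the fewest edges, or an empty vector if none exists.
std::vector<std::string> bfsPath(const Graph& g, const std::string& v1,
                                 const std::string& v2) {
    std::map<std::string, std::string> prev;  // prev[n] = vertex that first reached n
    std::set<std::string> visited;
    std::queue<std::string> todo;
    todo.push(v1);
    visited.insert(v1);
    while (!todo.empty() && visited.count(v2) == 0) {
        std::string v = todo.front();
        todo.pop();
        auto it = g.find(v);
        if (it == g.end()) {
            continue;
        }
        for (const std::string& n : it->second) {
            if (visited.insert(n).second) {   // n was not visited before
                prev[n] = v;
                todo.push(n);
            }
        }
    }

    std::vector<std::string> path;
    if (visited.count(v2) == 0) {
        return path;                          // v2 was never reached
    }
    // Walk the predecessor links backwards from v2, then reverse.
    for (std::string at = v2; at != v1; at = prev.at(at)) {
        path.push_back(at);
    }
    path.push_back(v1);
    std::reverse(path.begin(), path.end());
    return path;
}
```

On a hypothetical graph with edges a -> b, a -> d, b -> e, e -> f, f -> i, d -> h and h -> i, the call bfsPath(g, "a", "i") yields {a, d, h, i}, the three-edge route, even though b is listed before d among a's neighbors.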
