graphs - profs.info.uaic.rosd/eng/res/eng_curs-07.pdf · dna word design in biomolecular computing...

Post on 02-Jan-2020


Graphs

DS 2019/2020

Content

Abstract data type Graph

Abstract data type Digraph

The implementation with adjacency matrices

The implementation with adjacency linked lists

Graph traversal algorithms (DFS, BFS)

Finding the (strongly) connected components

FII, UAIC Lecture 7 DS 2019/2020 2 / 51

Graphs

- G = (V, E)
  - V: a set of vertices
  - E: a set of edges; an edge = an unordered pair of distinct vertices

V = {0, 1, 2, 3}
E = {{0, 1}, {0, 2}, {1, 2}, {2, 3}}
u = {0, 1} = {1, 0}

- 0 and 1 are the ends of u
- u is incident with 0 and 1
- 0 and 1 are adjacent (neighbors)

Graphs

- Walk from u to v: u = i_0, {i_0, i_1}, i_1, ..., {i_{k−1}, i_k}, i_k = v
  Example: 3, {3, 2}, 2, {2, 0}, 0, {0, 1}, 1, {1, 3}, 3, {3, 2}, 2
- Trail: a walk where any two edges are distinct
- Circuit = a closed trail where any two intermediate edges are distinct
- Path: a walk where any two vertices are distinct
- Cycle: closed path (i_0 = i_k)
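The definitions above can be checked mechanically. The following is a minimal sketch, assuming a walk is given only by its vertex sequence i_0, ..., i_k and that edges are stored as Python frozensets; the function names are hypothetical, and the cycle test additionally assumes at least three edges, which the slide does not state.

```python
def is_walk(edges, seq):
    """Consecutive vertices of seq must be joined by an edge of the graph."""
    return all(frozenset((a, b)) in edges for a, b in zip(seq, seq[1:]))

def is_trail(edges, seq):
    """A walk in which no edge is used twice."""
    used = [frozenset(p) for p in zip(seq, seq[1:])]
    return is_walk(edges, seq) and len(used) == len(set(used))

def is_path(edges, seq):
    """A walk in which no vertex is repeated."""
    return is_walk(edges, seq) and len(seq) == len(set(seq))

def is_cycle(edges, seq):
    """A closed walk (i_0 = i_k) whose other vertices are distinct.
    Assumption: a cycle has at least 3 edges."""
    return (len(seq) >= 4 and seq[0] == seq[-1]
            and is_walk(edges, seq)
            and len(set(seq[:-1])) == len(seq) - 1)

# The example graph from the first slide
E = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (2, 3)]}
```

For instance, `[3, 2, 0, 1, 2, 3]` is a walk in this graph but not a trail, because the edge {2, 3} is used twice.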

Induced subgraph

- G = (V, E): a graph; W: a subset of V
- Induced subgraph: G′ = (W, E′), where E′ = {{i, j} | {i, j} ∈ E and i ∈ W, j ∈ W}
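As a small illustration of the definition, the sketch below computes E′ for a given W. It assumes edges are stored as frozensets; the name `induced_subgraph` is hypothetical.

```python
def induced_subgraph(edges, w):
    """Return E' = the edges of the graph having both ends in W."""
    w = set(w)
    return {e for e in edges if e <= w}   # e <= w: both ends of e belong to W

E = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (2, 3)]}
triangle = induced_subgraph(E, {0, 1, 2})   # keeps {0,1}, {0,2}, {1,2}
```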

Graphs - Connectivity

Any graph can be expressed as the disjoint union of induced, connected and maximal subgraphs (connected components).

- i R j if and only if there is a path from i to j
- R is an equivalence relation
- V_1, ..., V_p: the equivalence classes
- G_1, ..., G_p: the connected components, where G_i = (V_i, E_i) is the subgraph induced by V_i
- connected graph = a graph with a single connected component

Abstract data type Graph

- objects:
  - graphs G = (V, E), V = {0, 1, ..., n − 1}
- operations:
  - emptyGraph()
    - input: nothing
    - output: the empty graph (∅, ∅)
  - isEmptyGraph()
    - input: G = (V, E)
    - output: true if G = (∅, ∅), false otherwise
  - insertEdge()
    - input: G = (V, E), i, j ∈ V
    - output: G = (V, E ∪ {{i, j}})
  - insertVertex()
    - input: G = (V, E), V = {0, 1, ..., n − 1}
    - output: G = (V′, E), V′ = {0, 1, ..., n − 1, n}

Abstract data type Graph

- removeEdge()
  - input: G = (V, E), i, j ∈ V
  - output: G = (V, E \ {{i, j}})
- removeVertex()
  - input: G = (V, E), V = {0, 1, ..., n − 1}, k
  - output: G = (V′, E′), V′ = {0, 1, ..., n − 2}, where
    {i′, j′} ∈ E′ ⇔ (∃ {i, j} ∈ E) i ≠ k, j ≠ k,
    i′ = if (i < k) then i else i − 1,
    j′ = if (j < k) then j else j − 1

Abstract data type Graph

- adjacencyList()
  - input: G = (V, E), i ∈ V
  - output: the list of vertices adjacent to i
- listOfReachableVertices()
  - input: G = (V, E), i ∈ V
  - output: the list of vertices reachable from i
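The abstract data type can be made concrete in many ways. The following is one possible sketch, assuming the edge set is kept as a Python set of frozensets; the class and method names are hypothetical renderings of the operations above, and `remove_vertex` applies the renumbering rule i′ = i if i < k, else i − 1.

```python
class Graph:
    """Undirected graph on the vertices 0..n-1 (illustrative sketch only)."""

    def __init__(self):            # emptyGraph()
        self.n = 0
        self.edges = set()

    def is_empty(self):            # isEmptyGraph()
        return self.n == 0 and not self.edges

    def insert_vertex(self):       # insertVertex(): the new vertex is n
        self.n += 1

    def insert_edge(self, i, j):   # insertEdge()
        if i == j or not (0 <= i < self.n and 0 <= j < self.n):
            raise ValueError("an edge joins two distinct existing vertices")
        self.edges.add(frozenset((i, j)))

    def remove_edge(self, i, j):   # removeEdge()
        self.edges.discard(frozenset((i, j)))

    def remove_vertex(self, k):    # removeVertex()
        def shift(v):
            return v if v < k else v - 1
        # drop the edges incident with k, renumber the ends of the others
        self.edges = {frozenset(shift(v) for v in e)
                      for e in self.edges if k not in e}
        self.n -= 1

    def adjacency_list(self, i):   # adjacencyList()
        return sorted(v for e in self.edges if i in e for v in e if v != i)
```

Removing vertex 1 from the example graph leaves the edges {0, 2} and {2, 3}, which are renumbered to {0, 1} and {1, 2}.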


Digraph (directed graph)

- D = (V, A)
  - V: a set of vertices
  - A: a set of arcs (directed edges); an arc = an ordered pair of distinct vertices

V = {0, 1, 2, 3}
A = {(0, 1), (2, 0), (1, 2), (3, 2)}
a = (0, 1) ≠ (1, 0)

- 0: the tail of a
- 1: the head of a

Digraph

- Walk: i_0, (i_0, i_1), i_1, ..., (i_{k−1}, i_k), i_k
  Example: 3, (3, 2), 2, (2, 0), 0, (0, 1), 1, (1, 2), 2, (2, 0), 0
- Trail: a walk where any two arcs are distinct
- Circuit = a closed trail where any two intermediate arcs are distinct
- Path: a walk where any two vertices are distinct
- Cycle: closed path (i_0 = i_k)

Digraph - Connectivity

- i R j if and only if there is a path from i to j and a path from j to i
- R is an equivalence relation
- V_1, ..., V_p: the equivalence classes
- G_1, ..., G_p: the strongly connected components, where G_i = (V_i, A_i) is the subdigraph induced by V_i
- strongly connected digraph = a digraph with a single strongly connected component
- connected digraph

V_1 = {0, 1, 2}, A_1 = {(0, 1), (1, 2), (2, 0)}
V_2 = {3}, A_2 = ∅

Abstract data type Digraph

- objects: digraphs D = (V, A)
- operations:
  - emptyDigraph()
    - input: nothing
    - output: the empty digraph (∅, ∅)
  - isEmptyDigraph()
    - input: D = (V, A)
    - output: true if D = (∅, ∅), false otherwise
  - insertArc()
    - input: D = (V, A), i, j ∈ V
    - output: D = (V, A ∪ {(i, j)})
  - insertVertex()
    - input: D = (V, A), V = {0, 1, ..., n − 1}
    - output: D = (V′, A), V′ = {0, 1, ..., n − 1, n}

Abstract data type Digraph

- removeArc()
  - input: D = (V, A), i, j ∈ V
  - output: D = (V, A \ {(i, j)})
- removeVertex()
  - input: D = (V, A), V = {0, 1, ..., n − 1}, k
  - output: D = (V′, A′), V′ = {0, 1, ..., n − 2}, where
    (i′, j′) ∈ A′ ⇔ (∃ (i, j) ∈ A) i ≠ k, j ≠ k,
    i′ = if (i < k) then i else i − 1,
    j′ = if (j < k) then j else j − 1

Abstract data type Digraph

- outAdjacencyList()
  - input: D = (V, A), i ∈ V
  - output: the list of direct successors of i
- inAdjacencyList()
  - input: D = (V, A), i ∈ V
  - output: the list of direct predecessors of i
- listOfReachableVertices()
  - input: D = (V, A), i ∈ V
  - output: the list of vertices reachable from i

The representation of graphs as digraphs

G = (V, E) ⟹ D(G) = (V, A)
{i, j} ∈ E ⟹ (i, j), (j, i) ∈ A

- the topology is preserved
- the adjacency list of i in G = the out (= in) adjacency list of i in D
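The conversion can be sketched in a few lines. This is only an illustration, assuming edges are given as pairs and arcs are stored as a set of ordered pairs; all function names are hypothetical.

```python
def graph_to_digraph(edges):
    """Replace every edge {i, j} by the two arcs (i, j) and (j, i)."""
    arcs = set()
    for i, j in edges:
        arcs.add((i, j))
        arcs.add((j, i))
    return arcs

def out_adjacency(arcs, i):
    """Direct successors of i."""
    return sorted(head for tail, head in arcs if tail == i)

def in_adjacency(arcs, i):
    """Direct predecessors of i."""
    return sorted(tail for tail, head in arcs if head == i)

A = graph_to_digraph([(0, 1), (0, 2), (1, 2), (2, 3)])
```

In D(G) the out and in adjacency lists of every vertex coincide, which is the second property stated above.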


The implementation of digraphs with adjacency matrices

- the representation of digraphs
  - n: the number of vertices
  - m: the number of arcs (optional)
  - a matrix (a[i, j] | 0 ≤ i, j ≤ n − 1), where
    a[i, j] = if (i, j) ∈ A then 1 else 0
- if the digraph represents a graph, then the matrix a is symmetric
- the out adjacency list of i ⊆ row i
- the in adjacency list of i ⊆ column i

The implementation with adjacency matrices

     0  1  2  3
0    0  1  0  0
1    0  0  1  0
2    1  0  0  0
3    0  1  1  0
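The matrix above can be built from a list of arcs, and the adjacency lists read back from its rows and columns. The sketch below assumes a list-of-lists matrix; the function names are hypothetical, and the arc list is simply the one encoded by the matrix shown.

```python
def adjacency_matrix(n, arcs):
    """a[i][j] = 1 if (i, j) is an arc, else 0."""
    a = [[0] * n for _ in range(n)]
    for i, j in arcs:
        a[i][j] = 1
    return a

def out_adjacency(a, i):
    """Read the successors of i from row i."""
    return [j for j, bit in enumerate(a[i]) if bit]

def in_adjacency(a, i):
    """Read the predecessors of i from column i."""
    return [t for t, row in enumerate(a) if row[i]]

a = adjacency_matrix(4, [(0, 1), (1, 2), (2, 0), (3, 1), (3, 2)])
```

Both lookups scan n entries, so each costs O(n) regardless of how many neighbors the vertex has.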

The implementation with adjacency matrices

- operations
  - emptyDigraph: n ← 0; m ← 0
  - insertVertex: O(n)
  - insertArc: O(1)
  - removeArc: O(1)

The implementation with adjacency matrices

- removeVertex()

Procedure removeVertex(a, n, k)
begin
    for i ← 0 to n − 1 do
        for j ← 0 to n − 1 do
            if (i ≠ k and j ≠ k) then
                // shift up the rows and left the columns located after k
                i′ ← if (i < k) then i else i − 1
                j′ ← if (j < k) then j else j − 1
                a[i′, j′] ← a[i, j]
    n ← n − 1
end

The execution time: O(n²)
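A hedged Python rendering of vertex removal on the matrix representation follows. It is a sketch that applies the renumbering rule of the abstract data type (i′ = i if i < k, else i − 1) in place, assuming a list-of-lists matrix; the name `remove_vertex` and the final trimming of the unused last row and column are illustrative choices.

```python
def remove_vertex(a, n, k):
    """Delete vertex k from the n x n matrix a in place; return the new n."""
    for i in range(n):
        for j in range(n):
            if i != k and j != k:
                ni = i if i < k else i - 1
                nj = j if j < k else j - 1
                # (ni, nj) never comes after (i, j) in row-major order,
                # so no entry is overwritten before it has been read
                a[ni][nj] = a[i][j]
    a.pop()                 # drop the now unused last row
    for row in a:
        row.pop()           # and the last column
    return n - 1

a = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 1, 0]]
n = remove_vertex(a, 4, 1)
```

After the call, the old vertices 2 and 3 are renumbered 1 and 2, and only the arcs (2, 0) and (3, 2) survive, as (1, 0) and (2, 1).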

The implementation with adjacency matrices

- listOfReachableVertices()
  - if i = j, then j is reachable from i
  - if i ≠ j, then there is a path i ⇝ j if there is an arc i → j, or there is a vertex k such that there are paths i ⇝ k and k ⇝ j

The implementation with adjacency matrices

- listOfReachableVertices()

Procedure reflTransClosure(a, n, b) // (Warshall, 1962)
begin
    for i ← 0 to n − 1 do
        for j ← 0 to n − 1 do
            b[i, j] ← a[i, j]
            if (i = j) then
                b[i, j] ← 1
    for k ← 0 to n − 1 do
        for i ← 0 to n − 1 do
            if (b[i, k] = 1) then
                for j ← 0 to n − 1 do
                    if (b[k, j] = 1) then
                        b[i, j] ← 1
end

The execution time: O(n³)
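The procedure translates almost line by line into Python. The sketch below assumes a list-of-lists 0/1 matrix and returns the closure as a new matrix instead of filling an output parameter; the helper `list_of_reachable_vertices` is a hypothetical wrapper that reads row i of the closure.

```python
def refl_trans_closure(a):
    """Reflexive and transitive closure of the 0/1 matrix a (Warshall)."""
    n = len(a)
    # copy a and set the diagonal to 1 (every vertex reaches itself)
    b = [[1 if i == j else a[i][j] for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if b[i][k]:
                for j in range(n):
                    if b[k][j]:
                        b[i][j] = 1     # i reaches k and k reaches j
    return b

def list_of_reachable_vertices(a, i):
    return [j for j, bit in enumerate(refl_trans_closure(a)[i]) if bit]

# Matrix of the example digraph A = {(0,1), (2,0), (1,2), (3,2)}
a = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 0]]
```

In this digraph vertex 3 reaches every vertex, while no vertex other than 3 itself reaches 3.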


The implementation with adjacency lists

- the representation of digraphs with adjacency lists
  - a vector a[0..n − 1] of linked lists (pointers)
  - a[i] is the out adjacency list corresponding to i

The implementation with adjacency lists

- operations
  - emptyDigraph
  - insertVertex: O(1)
  - insertArc: O(1)
  - removeVertex: O(n + m)
  - removeArc: O(m)
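To make the costs above tangible, here is a minimal sketch of the adjacency-list representation. It assumes Python lists in place of linked lists (so `insert_vertex` is amortized O(1)); the class name `ListDigraph` and its method names are hypothetical.

```python
class ListDigraph:
    """Digraph stored as a vector of out adjacency lists (sketch only)."""

    def __init__(self):              # emptyDigraph
        self.a = []                  # a[i] = out adjacency list of vertex i

    def insert_vertex(self):
        self.a.append([])            # the new vertex gets an empty list

    def insert_arc(self, i, j):
        self.a[i].append(j)          # O(1); no duplicate check is performed

    def remove_arc(self, i, j):
        if j in self.a[i]:
            self.a[i].remove(j)      # scans a single list, at most O(m)

    def remove_vertex(self, k):
        del self.a[k]                # O(n): drop the list of k
        # one pass over all lists, O(n + m): delete arcs entering k and
        # renumber the vertices greater than k
        self.a = [[v if v < k else v - 1 for v in out if v != k]
                  for out in self.a]
```

Removing a vertex is the expensive operation here because arcs entering k may sit in any of the lists, so all of them have to be inspected.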


Digraphs: systematic exploration

- it manages two sets
  - S = the set of already visited vertices
  - SB ⊆ S: the subset of vertices that may still have neighbors not visited yet
- the adjacency list of i is divided into the part already examined and the "waiting" part

Digraphs: systematic exploration

- the current step
  - read a vertex i from SB
  - extract j from the "waiting" list of i (if it is nonempty)
  - if j isn't in S, then add it to S and to SB
  - if the "waiting" list of i is empty, then remove i from SB
- initially
  - S = SB = {i0}
  - the "waiting" list of i = the adjacency list of i
- termination: SB = ∅

Digraphs: systematic exploration

Procedure exploration(a, n, i0, S)
begin
    for i ← 0 to n − 1 do
        p[i] ← a[i]
    SB ← (i0)
    visit(i0); S ← (i0)
    while (SB ≠ ∅) do
        i ← read(SB)
        if (p[i] = NULL) then
            SB ← SB − {i}
        else
            j ← p[i]→varf        // varf: the vertex stored in the list node
            p[i] ← p[i]→succ
            if (j ∉ S) then
                SB ← SB ∪ {j}
                visit(j); S ← S ∪ {j}
end
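The exploration procedure can be sketched in Python as follows. This is an illustration under stated assumptions: adjacency lists are Python lists, the pointer p[i] becomes an index into a[i], visit() is replaced by recording the visiting order, and read(SB) is taken to return the most recently added vertex (one of several valid choices).

```python
def exploration(a, i0):
    """a[i] is the out adjacency list of i; returns the visiting order."""
    p = [0] * len(a)       # p[i]: start of the "waiting" part of a[i]
    S = {i0}               # the visited vertices
    SB = [i0]              # vertices that may still have unvisited neighbors
    order = [i0]           # stands in for visit()
    while SB:
        i = SB[-1]                 # read(SB); assumption: last vertex added
        if p[i] == len(a[i]):      # the "waiting" list of i is empty
            SB.pop()               # SB <- SB - {i}
        else:
            j = a[i][p[i]]         # extract j from the "waiting" list
            p[i] += 1
            if j not in S:
                SB.append(j)
                S.add(j)
                order.append(j)
    return order

order = exploration([[1, 2], [3], [3], []], 0)
```

Every vertex enters SB at most once and every adjacency list is scanned once, which is where an O(n + m) bound comes from when the set operations cost O(1).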

Systematic exploration: complexity

Theorem. Assuming that the operations over S and SB, as well as visit(), are achieved in O(1), the worst-case time complexity of the exploration algorithm is O(n + m).

The DFS (Depth First Search) exploration

- SB is implemented as a stack

SB ← (i0)        ⇔  SB ← emptyStack(); push(SB, i0)
i ← read(SB)     ⇔  i ← top(SB)
SB ← SB − {i}    ⇔  pop(SB)
SB ← SB ∪ {j}    ⇔  push(SB, j)

The DFS exploration: example


The BFS (Breadth First Search) exploration

- SB is implemented as a queue

SB ← (i0)        ⇔  SB ← emptyQueue(); insert(SB, i0)
i ← read(SB)     ⇔  read(SB, i)
SB ← SB − {i}    ⇔  remove(SB)
SB ← SB ∪ {j}    ⇔  insert(SB, j)
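The queue mapping above can be sketched with `collections.deque` from the Python standard library. The function name `bfs` and the use of an index p[i] in place of a list pointer are assumptions made for illustration; visit() is again replaced by recording the order.

```python
from collections import deque

def bfs(a, i0):
    """Exploration with SB as a queue; a[i] is the out adjacency list of i."""
    p = [0] * len(a)               # p[i]: start of the "waiting" part of a[i]
    S = {i0}
    SB = deque([i0])               # SB <- emptyQueue(); insert(SB, i0)
    order = [i0]
    while SB:
        i = SB[0]                  # read(SB, i): the front of the queue
        if p[i] == len(a[i]):
            SB.popleft()           # remove(SB)
        else:
            j = a[i][p[i]]
            p[i] += 1
            if j not in S:
                SB.append(j)       # insert(SB, j)
                S.add(j)
                order.append(j)
    return order

order = bfs([[1, 2], [3], [3], []], 0)
```

Because the front vertex stays in the queue until its whole adjacency list has been examined, all neighbors of a vertex are visited before any vertex farther away.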

The BFS exploration: example


Finding the connected components (undirected graphs)

Function ConnectedCompDFS(D)
begin
    for i ← 0 to n − 1 do
        color[i] ← 0
    k ← 0
    for i ← 0 to n − 1 do
        if (color[i] = 0) then
            k ← k + 1
            DfsRecConnectedComp(i, k)
    return k
end

Finding the connected components (undirected graphs)

Procedure DfsRecConnectedComp(i, k)
begin
    color[i] ← k
    for (each vertex j in listaDeAdiac(i)) do    // listaDeAdiac(i): the adjacency list of i
        if (color[j] = 0) then
            DfsRecConnectedComp(j, k)
end
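The two procedures can be combined into one Python function. The sketch below swaps the recursion for an explicit stack (a substitution made here to avoid Python's recursion limit on large graphs, not the slides' formulation); it assumes `adj[i]` is the adjacency list of i, and returns both the number of components and the color array.

```python
def connected_components(adj):
    """Label every vertex with the number (1..k) of its connected component."""
    n = len(adj)
    color = [0] * n                # 0 = not visited yet
    k = 0
    for start in range(n):
        if color[start] == 0:
            k += 1                 # a new component is discovered
            color[start] = k
            stack = [start]
            while stack:
                i = stack.pop()
                for j in adj[i]:
                    if color[j] == 0:
                        color[j] = k
                        stack.append(j)
    return k, color

k, color = connected_components([[1], [0], [3], [2], []])
```

Two vertices belong to the same connected component exactly when they receive the same color.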

The strongly connected components (digraphs)


The strongly connected components: example


Finding the strongly connected components

Procedure DfsStronglyConnectedComp(D)
begin
    for i ← 0 to n − 1 do
        color[i] ← 0
        parent[i] ← −1
    time ← 0
    for i ← 0 to n − 1 do
        if (color[i] = 0) then
            DfsRecStronglyConnectedComp(i)
end

Finding the strongly connected components

Procedure DfsRecStronglyConnectedComp(i)
begin
    time ← time + 1
    color[i] ← 1
    for (each vertex j in adiacList(i)) do
        if (color[j] = 0) then
            parent[j] ← i
            DfsRecStronglyConnectedComp(j)
    time ← time + 1
    finalTime[i] ← time
end

Finding the strongly connected components

Notation: D^T = (V, A^T), (i, j) ∈ A ⇔ (j, i) ∈ A^T

Procedure StronglyConnectedComp(D)
begin
    1. DfsStronglyConnectedComp(D)
    2. compute D^T
    3. DfsStronglyConnectedComp(D^T), but considering, in the main for loop, the vertices in descending order of their final visiting times finalTime[i]
    4. return each tree computed at step 3 as a distinct strongly connected component
end

Finding the strongly connected components: complexity

- DfsStronglyConnectedComp(D): O(n + m)
- compute D^T: O(m)
- DfsStronglyConnectedComp(D^T): O(n + m)
- Total: O(n + m)
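The four steps of StronglyConnectedComp can be sketched in Python as below. The sketch assumes `adj[i]` is the out adjacency list of i, replaces the explicit finalTime array by the order in which vertices finish (which carries the same information), and uses hypothetical function names; the recursive DFS is kept for readability and would need an explicit stack for very deep graphs.

```python
def _dfs(graph, i, visited, finished):
    visited[i] = True
    for j in graph[i]:
        if not visited[j]:
            _dfs(graph, j, visited, finished)
    finished.append(i)     # i finishes after all the vertices it discovered

def strongly_connected_components(adj):
    n = len(adj)
    # step 1: DFS on D, recording the vertices by increasing final time
    visited = [False] * n
    by_finish = []
    for i in range(n):
        if not visited[i]:
            _dfs(adj, i, visited, by_finish)
    # step 2: compute the transposed digraph D^T
    transposed = [[] for _ in range(n)]
    for i in range(n):
        for j in adj[i]:
            transposed[j].append(i)
    # steps 3 and 4: DFS on D^T in descending order of final time;
    # every tree found is one strongly connected component
    visited = [False] * n
    components = []
    for i in reversed(by_finish):
        if not visited[i]:
            tree = []
            _dfs(transposed, i, visited, tree)
            components.append(sorted(tree))
    return components

# The example digraph A = {(0,1), (2,0), (1,2), (3,2)}
components = strongly_connected_components([[1], [2], [0], [2]])
```

On the example digraph this yields the components {0, 1, 2} and {3}, matching V_1 and V_2 from the connectivity slide.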

Applications

- Algorithms, path problems, computer networks (routing), genomics (alignment networks, genome assembly), multi-relational data mining, operations research (scheduling), artificial intelligence (constraint satisfaction), etc.

Applications

The Königsberg Bridge Problem (1736): starting from one of the land masses, walk over each of the seven bridges exactly once

The land masses: vertices; the bridges: edges

Is it possible to choose a vertex, proceed along the edges and return to the chosen vertex, covering each edge exactly once?
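The question can be explored with a short degree count. The sketch below relies on Euler's classical criterion (a connected multigraph has such a closed walk exactly when every vertex has even degree), which the slide itself does not state; the labels A to D for the four land masses and the function names are hypothetical, and the bridge list follows the usual historical layout.

```python
from collections import Counter

# Seven bridges between four land masses (A is the central island)
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def degrees(edges):
    """Number of edge ends at every vertex of a multigraph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def has_euler_circuit(edges):
    """Euler's criterion; assumes the multigraph is connected."""
    return all(d % 2 == 0 for d in degrees(edges).values())

answer = has_euler_circuit(bridges)
```

All four land masses have odd degree, so under this criterion the requested walk does not exist.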

Applications

- Google search engine: the PageRank algorithm, which determines how important a given web page is
- Geographic Information Systems (GIS): Google Maps, Bing Maps
- Social networks

DNA word design

- The design of DNA codes that satisfy combinatorial constraints; used in biomolecular computing to store information, or to manipulate molecules in chemical libraries
- Find the largest set S of strings of length n over the alphabet {A, C, G, T} such that:
  - GC Content Constraint: each word has 50% of its symbols from {C, G}
  - Hamming Distance Constraint: each pair of words w_1 ≠ w_2 differ in at least d positions: H(w_1, w_2) ≥ d
  - Reverse Complement Hamming Distance Constraint: H(R(w_1), C(w_2)) ≥ d, where R(w) is the reverse of the word w and C(w) is the complement of w (C ↔ G, A ↔ T)
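The three constraints are easy to express as predicates. The following sketch assumes words are Python strings over "ACGT"; all function names are hypothetical, and a "conflict" simply means that the corresponding distance falls below d.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")   # A <-> T, C <-> G

def hamming(u, v):
    """H(u, v): number of positions where two equal-length words differ."""
    return sum(x != y for x, y in zip(u, v))

def gc_ok(w):
    """GC content constraint: exactly half of the symbols are C or G."""
    return 2 * sum(c in "CG" for c in w) == len(w)

def reverse(w):
    return w[::-1]

def complement(w):
    return w.translate(COMPLEMENT)

def hd_conflict(w1, w2, d):
    return hamming(w1, w2) < d

def rc_conflict(w1, w2, d):
    return hamming(reverse(w1), complement(w2)) < d
```

Since H(R(w_1), C(w_2)) = H(w_1, R(C(w_2))), the last predicate compares w_1 with the reverse complement of w_2.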

Graph modeling

- every DNA word has an assigned vertex v_i
- E = E_HD ∪ E_RC (E_HD: pairs of words with an HD conflict, E_RC: pairs of words with an RC conflict)
- a solution: a maximum independent set

Figure: Graphs for words of size 2 and Hamming distance d = 2 (left) and d = 3 (right); in both graphs the vertices are AC, TC, AG, TG, CA, CT, GA, GT
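For very small instances the graph model can be solved by exhaustive search. The sketch below is only an illustration under stated assumptions: conflicts are considered between distinct words only, the search is a brute force over subsets (exponential, so usable only for tiny n), and every function name is hypothetical.

```python
from itertools import combinations, product

COMP = str.maketrans("ACGT", "TGCA")

def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))

def words(n):
    """All words of length n with exactly 50% of the symbols in {C, G}."""
    return ["".join(t) for t in product("ACGT", repeat=n)
            if 2 * sum(c in "CG" for c in t) == n]

def conflict_edges(ws, d):
    """E = E_HD | E_RC over pairs of distinct words."""
    e_hd = {(u, v) for u, v in combinations(ws, 2) if hamming(u, v) < d}
    e_rc = {(u, v) for u, v in combinations(ws, 2)
            if hamming(u[::-1], v.translate(COMP)) < d}
    return e_hd | e_rc

def maximum_independent_set(ws, edges):
    """Try the subsets from largest to smallest; return the first one
    that contains no conflicting pair."""
    for size in range(len(ws), 0, -1):
        for subset in combinations(ws, size):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                return list(subset)
    return []

ws = words(2)
edges = conflict_edges(ws, 2)
solution = maximum_independent_set(ws, edges)
```

The eight words produced by `words(2)` are the vertex labels of the figure; larger instances such as n = 8 need heuristic or specialised search instead of this enumeration.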

Solution of 136 words for the instance n = 8, d = 4

AAACCACC ACCAGTGT ACCCAAGA ACGTAGTG ACTGACGT AGGAAGCT
AGTCCTCT AGTTGGCA ATCCCGTT ATGGGCTT CAAACCTC CAAGAGAC
CAAGCAGT CACAGTTG CACCAATC CAGATGGT CAGGATCT CATCGTGT
CATGACTG CATTCGCT CCAGTCTT CCCTGATT CCGACTTT CCTCAGTT
CGAAGGTT CGACACAT CGATTTGG CGCACAAT CGCCTTTT CGCTAGTA
CGGTGTAT CGTAAAGG CGTGTGAT CTATGCCT CTCGTACT CTGAAGAG
CTGCAAGT CTTACCGT CTTCCTAG GAAAGCGT GAACAGCT GAACGTAG
GAAGGATC GACATGAG GACCTAGT GACTGTCT GAGAAGTC GAGACACT
GAGTACAG GATGCAAG GATGTCCT GCAATAGG GCAGCTAT GCCTAGAT
GCGATCAT GCGGAATT GCTCGAAT GCTTATGG GGAAATGC GGACCATT
GGATAACG GGCAACTT GGGTTGTT GGTATTCG GGTTCCAT GGTTTAGC
GTAACCAG GTAGAGTG GTATCGGT GTCAGTAC GTCCAAAG GTCGATGT
GTGAGATG GTGCTTCT GTTAGGCT GTTCTCTG GTTGACAC TAACACGC
TAAGCTCG TACACAGC TACCGCTT TAGATCCG TAGGAAGG TAGGCGTT
TAGTGTGC TATCGACG TATGTGGC TCAACGTG TCACGTCT TCAGACAG
TCATGCTC TCCATGCT TCCCATTG TCCGTATC TCCTCAAG TCGAAGGA
TCGAGTAG TCGCAAAC TCGGTTGT TCGTACCT TCTACCAC TCTCCTGA
TCTCTGAG TCTGCACT TGAACCCT TGACCTAC TGAGAGGT TGATGGAG
TGCAGTCA TGCGTTAG TGCTACAC TGCTCTGT TGGAGAGT TGGATGAC
TGGCTATG TGGGATTC TGTAGCTG TGTCTCGT TGTGACCA TGTGGAAC
TGTTCGTC TTAAGGGC TTACCAGG TTAGTCCC TTCAACGG TTCCTTGC
TTCGCCAT TTCGGGTA TTCTGACC TTGACTCC TTGCCCTA TTGCGGAT
TTGTTGGG TTTCAGCC TTTGGTGG TTTTCCCG