
THE PROBLEM SOLVING PROCESS

MATHEMATICAL MODEL → ABSTRACT DATA TYPES → DATA STRUCTURES

INFORMAL ALGORITHM → PSEUDO-LANGUAGE PROGRAM (OR OTHER FORMAL DESCRIPTION) → PROGRAM (Pascal, C, C++, etc.)

DATA TYPE VERSUS ABSTRACT DATA TYPE

DATA TYPE: Set of values (or objects)

ABSTRACT DATA TYPE (ADT): Set of objects + a mathematical model with a collection of operations defined on the model.

DATA TYPE: Set of values (or objects).

Fortran 77:
Basic types: INTEGER, REAL, CHARACTER/STRING, LOGICAL
Composite types: array of integers, array of reals, etc.

Pascal:
Basic types: integer, real, character, Boolean
Composite types: array of integers, array of characters, etc.; record of integers/reals/characters, etc.; set of ...; file of ...

THERE ARE OPERATIONS ASSOCIATED WITH EACH TYPE

AGGREGATING TOOLS: array, record, file

C:

Basic types: int, float, char

Composite types: arrays, structures

WHAT ABOUT POINTERS?

A pointer can be treated as a data type, but usually it is treated as a DATA STRUCTURING FACILITY.

ABSTRACT DATA TYPE (ADT)

Set of objects plus a mathematical model with a collection of operations defined on the model

Example: List a1, a2 ,…, an

LIST (of integers) is an ADT with the following operations:

1. Calculate the length of the list

2. Get the first member of the list and return null if empty

3. Retrieve the member at position P and return null if P doesn’t exist

4. Locate X in the list

5. Insert X into the list at position P

6. Delete the member at position P

P = 1, 2, 3, 4, 5, 6, 7, 8
L = 50, 60, 23, 47, 21, 39, 60, 40

1. LENGTH (L) = 8

2. FIRST (L) = 50

3. RETRIEVE (4, L) = 47
RETRIEVE (9, L) = null

4. LOCATE (60, L) = 2

5. INSERT (30, 5, L) gives the result:

L = 50, 60, 23, 47, 30, 21, 39, 60, 40

6. DELETE (3, L) gives the result:

L = 50, 60, 47, 30, 21, 39, 60, 40

ALL OPERATIONS ARE ATOMIC EXCEPT FIRST, SINCE FIRST (L) = RETRIEVE (1, L)

EXAMPLE: ADT STACK (OF INTEGERS)

1. Retrieve the top element (TOP)
2. Delete the top element (POP)
3. Insert x at the top (PUSH)
4. Test if the stack is empty (EMPTY)

S = 27, 40, 32 (top element first)

1. TOP (S) = 27

2. POP (S) results in S = 40, 32

3. PUSH (0, S) results in S = 0, 27, 40, 32

4. EMPTY (S) = false

Example: ADT MATRIX (OF REALS)

1. Return number of rows

2. Return number of columns

3. Multiply matrices A and B

4. Add A and B

5. Compute the transpose of matrix A

6. Delete a row/column

7. Add a row/column

8. Multiply matrix A by a real number c

OBSERVATIONS:

1. Domain of an operation may involve more than one ADT

2. Some operations are partial

3. Range of an operation may be a different ADT

A simple application of ADTs: evaluation of arithmetic expressions

a + b*c/d **e +f

Algorithm: Value (x : expression);
  oprnd : STACK OF REALS
  optor : STACK OF CHARS
  x1, x2 : REAL
  i : INTEGER

  Initialize oprnd and optor
  for i := 1 to LEN (x) do
    case x[i] of
      real: PUSH (x[i], oprnd)
      char: if TOP (optor) < x[i] then      (compares operator precedence)
              PUSH (x[i], optor)
            else
              repeat
                x2 := TOP (oprnd); POP (oprnd);
                x1 := TOP (oprnd); POP (oprnd);
                x1 := x1 TOP (optor) x2;
                PUSH (x1, oprnd);
                POP (optor)
              until TOP (optor) < x[i];
              PUSH (x[i], optor)
            end if
    endcase
  Value := TOP (oprnd)
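The two-stack idea above can be sketched in Python. This is only an illustration under stated assumptions: the input is already split into tokens, the precedence table `PREC` is invented here (the notes only say that TOP(optor) is compared with x[i]), `**` is treated as left-associative for simplicity, and the operators still on the stack are applied at the end of the input.

```python
import operator

# Assumed precedence levels; higher binds tighter.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2, "**": 3}
APPLY = {"+": operator.add, "-": operator.sub, "*": operator.mul,
         "/": operator.truediv, "**": operator.pow}


def value(tokens):
    """Evaluate a tokenized infix expression with an operand and an operator stack."""
    oprnd, optor = [], []

    def reduce_top():
        # Pop two operands and one operator, push the result.
        x2 = oprnd.pop()
        x1 = oprnd.pop()
        oprnd.append(APPLY[optor.pop()](x1, x2))

    for tok in tokens:
        if tok in PREC:
            # Apply stacked operators that bind at least as tightly as tok.
            while optor and PREC[optor[-1]] >= PREC[tok]:
                reduce_top()
            optor.append(tok)
        else:
            oprnd.append(float(tok))
    while optor:          # drain what is left at the end of the input
        reduce_top()
    return oprnd[0]
```

For example, `value([2, "+", 3, "*", 4, "/", 2, "**", 2, "+", 1])` evaluates 2 + 3*4/2**2 + 1.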

Comparison of ADT’s with procedure – the advantages

1. GENERALIZATION

Procedures are generalizations of primitive operations (e.g. +, -, *, ...)

ADT’s are generalizations of primitive data types.

2. ENCAPSULATION (OR MODULARITY)

A procedure encapsulates all the statements relevant to a certain aspect of a program.


An ADT encapsulates all the definitions and the operations relevant to a data type.

How to implement an ADT?

Note that a data structure doesn’t have to be associated with an ADT.

Data Structure: A collection of data objects connected in various ways.

A data structure is always associated with a specific programming language.

FORTRAN 77: the only data structuring facility is ARRAY

PASCAL: we have: ARRAY, RECORD, and POINTER

C: ARRAY, STRUCTURE, POINTER

Some important terms:


Cell: a box capable of holding a value drawn from some basic or composite data types (e.g. integer, record…)

CELL IS BASIC BUILDING BLOCK OF DATA STRUCTURE

Pointer: a cell whose value indicates another cell

Cursor: an integer-valued cell, used as a pointer to an array


Example: A simple data structure is given below. It may be used in the implementation of ADT MATRIX.

[Figure: a pointer A leads to a grid of cells holding the matrix elements; each cell points to its right and down neighbours.]

Type
  celltype = record
    element : real;
    down : ↑ celltype;
    right : ↑ celltype
  end

Example: The data structure below may be used in the implementation of ADT LIST.

L = 7.8, 1.2, 5.6, 3.4

[Figure: a cursor A = 4 points into an array of four cells.]

Index   Element   Next
1       1.2       3
2       3.4       0
3       5.6       2
4       7.8       1

Type
  Recordtype = record
    Cursor : integer;
    Ptr : ↑ Recordtype
  end

-1 ≡ nil pointer; cursor 0 ≡ nil pointer
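As a small illustration of how a cursor-linked list is read, the sketch below walks the array from the example. The names `cells`, `head` and `to_list` are made up for this sketch; index 0 is left unused so that cursor 0 can act as the nil cursor.

```python
# cells[i] = (element, next_cursor); index 0 is unused so that 0 can mean "nil".
cells = [None, (1.2, 3), (3.4, 0), (5.6, 2), (7.8, 1)]
head = 4  # the cursor A from the example


def to_list(cells, head):
    """Follow the cursors from head until the nil cursor 0 is reached."""
    out = []
    cursor = head
    while cursor != 0:
        element, next_cursor = cells[cursor]
        out.append(element)
        cursor = next_cursor
    return out
```

Calling `to_list(cells, head)` visits the cells in the order 4, 1, 3, 2.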

ALGORITHM VERSUS PROGRAM

An algorithm is a finite sequence of instructions satisfying the following criteria:

1) Definiteness: each instruction must be clear and unambiguous

2) Finiteness: the algorithm will terminate after a finite number of steps in all cases

3) Effectiveness: each instruction can be performed using a finite amount of resources (time and space)

A (well-defined) program is in principle described similarly, but a program:

1) is always associated with a specific programming language

2) may not halt (e.g. operating systems)

All the programs we are interested in halt; pseudo-Pascal is our chosen language.

WE WILL USE ALGORITHM AND PROGRAM INTERCHANGEABLY

Examples:

proc search (x : integer; A : array [1..10] of integer)
  i := 1;
  while i <= 10 and x <> A[i] do i := i + 1;   (test i <= 10 first so that A[11] is never read)
  search := i
end

proc print (S : set)
  print the elements of S
end

proc Pi
  print all the digits of Pi   (never ends)
end

The running time of a program depends on

1) Computer speed
2) Compiler quality
3) Input to program
4) Program efficiency (or quality)

The TIME COMPLEXITY of a program is defined as a function of the input, usually of the SIZE of the input.

Program A is of worst-case time complexity T(n) if the maximum running time of A on any input of size n is T(n).

THE UNITS OF T(n) ARE UNSPECIFIED.

Although the constants in T(n) are important, we are more interested in the growth rate (or order) of T(n).

e.g. $2n \approx 10n + 1000$

$2n \ll n^2$ when n is large

$f(n) \ll g(n) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$

IMPORTANT DEFINITION

T(n) is O(f(n)) if there are constants C and $n_0$ such that

$T(n) \le C f(n)$ when $n \ge n_0$

Note: O(f(n)) actually denotes a class of functions of the same or slower growth rate, and it would be better to write

$T(n) \in O(f(n))$

Examples:

$3n^2 + 16n + 8 = O(n^2)$, with $T(n) = 3n^2 + 16n + 8$ and $f(n) = n^2$

$C = 4$, $n_0 = 17$: for $n \ge 17$, $3n^2 + 16n + 8 \le 4n^2$

Here "=" stands for "is", not "equals".

$n \log n = O(n^2)$

For $n > 0$, $n \log n < n^2$, so $C = 1$, $n_0 = 1$

$3n^3 - 6n^2 \ne O(n^2)$

If $n > 0$ then $3n^3 - 6n^2 > Cn^2 \iff 3n - 6 > C$. Hence if $n > \frac{C + 6}{3}$ then $3n^3 - 6n^2 > Cn^2$, i.e.

$\forall C \; \exists n_0 \; \forall n > n_0 : \; 3n^3 - 6n^2 > Cn^2$

$\sum_{i=0}^{k} a_i n^i = O(n^k)$ when $a_k > 0$

$10^6 = O(1) = O(2)$

$100n + 10^5 = O(n)$

$n^4 + n^2 + n + 6 = O(n^4)$

$2^n + n^{100} = O(2^n)$

$3^n \ne O(2^n)$, since $3^n \gg 2^n$

$\log_{10} n = O(\log_2 n)$ since $\log_{10} n = \frac{\log_2 n}{\log_2 10}$

O(f(n)) is an upper bound on the growth rate (order) of T(n) if T(n) = O(f(n))
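The definition of O can be checked numerically for given witnesses C and n0. The helper below is a hypothetical sketch (the name `holds` and the finite range `upto` are assumptions); passing the check over a finite range does not prove the bound, but a failure does refute a proposed pair of constants.

```python
def holds(T, f, C, n0, upto=10_000):
    """Check T(n) <= C * f(n) for every integer n with n0 <= n <= upto."""
    return all(T(n) <= C * f(n) for n in range(n0, upto + 1))


def quadratic(n):
    return 3 * n * n + 16 * n + 8


def cubic(n):
    return 3 * n ** 3 - 6 * n ** 2


def square(n):
    return n * n
```

With these definitions, `holds(quadratic, square, 4, 17)` succeeds while `holds(quadratic, square, 4, 16)` fails at n = 16, and no constant rescues `cubic` once n exceeds (C + 6)/3.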

To specify a lower bound, we use Ω.

DEFINITION:

T(n) is Ω(f(n)) if there is a constant C > 0 such that $T(n) \ge C f(n)$ infinitely often.

Example: $T(n) = \frac{1}{2} n + 100 = \Omega(n)$, with $f(n) = n$

$C = \frac{1}{2}$: $\frac{1}{2} n + 100 > C n$

Example:

$T(n) = n$ if n is odd and $n \ge 1$
$T(n) = n^2 / 100$ if n is even and $n \ge 0$

$T(n) = \Omega(n^2)$

$C = 1/100$: $T(n) \ge C n^2$ for n = 0, 2, 4, 6, ...

WHY IS IT IMPORTANT?

[Figure: running times T(n) of 4 programs, $100n$, $5n^2$, $n^3/2$ and $2^n$, plotted for n from 5 to 20 with T(n) from 1000 to 3000.]

1000 sec ≈ 17 minutes

HOW LARGE A PROBLEM CAN WE SOLVE?

SUPPOSE THAT WE NOW BUY A MACHINE THAT RUNS 10 TIMES FASTER AT NO ADDITIONAL COST. THEN FOR THE SAME COST WE CAN SPEND 10^4 SECONDS ON A PROBLEM WHERE WE SPENT 10^3 SECONDS BEFORE.

Running time T(n)   Max problem size   Max problem size   Increase in
                    for 10^3 sec       for 10^4 sec       max problem size
100n                10                 100                1000%
5n^2                14                 45                 320%
n^3/2               12                 27                 230%
2^n                 10                 13                 130%

THE O(2^n) PROGRAMS CAN SOLVE ONLY SMALL PROBLEMS NO MATTER HOW FAST THE UNDERLYING COMPUTER IS.
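The table can be reproduced approximately with a few lines of Python. This is a sketch with assumed names (`max_size`, `PROGRAMS`); it takes the largest integer n with T(n) within the budget, so some entries come out one lower than the rounded values in the table (for example 44 rather than 45 for 5n^2, and 9 rather than 10 for 2^n).

```python
def max_size(T, budget):
    """Largest integer n with T(n) <= budget, for an increasing function T."""
    n = 0
    while T(n + 1) <= budget:
        n += 1
    return n


# The four running-time functions from the table.
PROGRAMS = {
    "100n": lambda n: 100 * n,
    "5n^2": lambda n: 5 * n * n,
    "n^3/2": lambda n: n ** 3 / 2,
    "2^n": lambda n: 2 ** n,
}

if __name__ == "__main__":
    for name, T in PROGRAMS.items():
        print(name, max_size(T, 10 ** 3), max_size(T, 10 ** 4))
```

The exponential program gains only a handful of input sizes from a tenfold budget, which is the point of the slide.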

THEOREM

IF $T_1(n) = O(f(n))$ AND $T_2(n) = O(g(n))$

THEN $T_1(n) + T_2(n) = O(\max(f(n), g(n)))$.

PROOF

THERE ARE $c_1, n_1, c_2, n_2$ SUCH THAT

$n \ge n_1 \Rightarrow T_1(n) \le c_1 f(n)$
$n \ge n_2 \Rightarrow T_2(n) \le c_2 g(n)$

LET $n_3 = \max(n_1, n_2)$. THEN

$n \ge n_3 \Rightarrow T_1(n) + T_2(n) \le c_1 f(n) + c_2 g(n) \le (c_1 + c_2) \max(f(n), g(n))$.

ENDPROOF

HENCE: $O(f(n)) + O(g(n)) = O(\max(f(n), g(n)))$

$O(n^2) + O(n^3) = O(n^3)$
$O(n^2) + O(2n^2) = O(2n^2) = O(n^2)$

THEOREM

IF $T_1(n) = O(f(n))$ AND $T_2(n) = O(g(n))$

THEN $T_1(n) \, T_2(n) = O(f(n) \, g(n))$

PROOF

THERE ARE $c_1, n_1, c_2, n_2$ SUCH THAT

$n \ge n_1 \Rightarrow T_1(n) \le c_1 f(n)$
$n \ge n_2 \Rightarrow T_2(n) \le c_2 g(n)$

LET $n_3 = \max(n_1, n_2)$. THEN

$n \ge n_3 \Rightarrow T_1(n) \, T_2(n) \le c_1 c_2 \, f(n) \, g(n)$

ENDPROOF

HENCE: $O(f(n)) \cdot O(g(n)) = O(f(n) \, g(n))$

$O(n^2) \cdot O(n^5) = O(n^7)$
$O(n^2) \cdot O(2^n) = O(n^2 2^n) = O(2^{n + 2 \log n})$

OTHER IMPLICATIONS

$\sum_{i=1}^{f(n)} O(g(i, n)) = O\left(\sum_{i=1}^{f(n)} g(i, n)\right)$

$\max(O(f(n)), O(g(n))) = O(\max(f(n), g(n)))$

$O(f(n)) = O(g(n))$ IS ASYMMETRIC! It means $O(f(n)) \subseteq O(g(n))$: "=" MEANS "IS".

MY CAT IS BLACK ≠ BLACK IS MY CAT

O : FUNCTION → SET OF FUNCTIONS, e.g. $n^2 \mapsto O(n^2)$, a set that contains $n^2$, $1000n^2 + 5$, ...

DEF: $O(f(n)) \circ O(g(n)) = O(f(n) \circ g(n))$ for any operator ∘ (+, ·, etc.)

OTHER USEFUL RULES

$f(n) = O(f(n))$

$c \cdot O(f(n)) = O(f(n))$

$O(f(n)) + O(f(n)) = O(f(n))$

$O(O(f(n))) = O(f(n))$

$O(f(n)) \cdot O(g(n)) = O(f(n) \, g(n))$

$O(f(n) \, g(n)) = f(n) \cdot O(g(n))$

REMEMBER: "=" HERE IS ASYMMETRIC!

CALCULATING COMPLEXITIES OF ALGORITHMS

procedure bubble (var A : array [1..n] of integer);
{ BUBBLE SORT A INTO INCREASING ORDER }
var i, j, temp : integer;
begin
(1)  for i := 1 to n-1 do
(2)    for j := n downto i+1 do
(3)      if A[j-1] > A[j] then begin   { swap A[j-1] and A[j] }
(4)        temp := A[j-1];
(5)        A[j-1] := A[j];
(6)        A[j] := temp
         end
end

(3) - (6) TAKES O(1)

(2) - (6) TAKES (n - i) · O(1) + O(1) = O(n - i)

(1) - (6) TAKES:

$\sum_{i=1}^{n-1} [O(n-i) + O(1)] = O\left(\sum_{i=1}^{n-1} (n-i)\right) = O\left(\frac{n(n-1)}{2}\right)$

$= O\left(\frac{n^2 - n}{2}\right) = O(n^2)$

function test (m : integer) : Boolean;
{ TESTS IF m IS A POWER OF 2, I.E. m = 2^k FOR SOME k }
begin
(1)  if m = 1 then
(2)    test := true
     else
(3)    if (m mod 2 = 0) then
(4)      test := test (m div 2)
       else
(5)      test := false
end

LET T(m) = time complexity of test. Cost of each line:

(1) → C1
(2) → C2
(3) → C1
(4) → C2 + T(m/2)
(5) → C2

T(m) = C1 + C2                  if m = 1
T(m) = 2C1 + C2                 if m is odd, m > 1
T(m) = 2C1 + C2 + T(m/2)        if m is even

A recurrence equation

Define a new function:

T'(m) = C1 + C2                   if m ≤ 1
T'(m) = 2C1 + C2 + T'(m/2)        if m > 1

Then T(m) ≤ T'(m) for all m > 0, i.e. T'(m) is an upper bound of T(m).

Note: T'(m) is defined for all real numbers.

$T'(m) = (2C_1 + C_2) + T'(m/2)$
$\quad = 2(2C_1 + C_2) + T'(m/2^2)$
$\quad = 3(2C_1 + C_2) + T'(m/2^3)$
$\quad \dots$
$\quad = \lceil \log_2 m \rceil (2C_1 + C_2) + T'\left(m / 2^{\lceil \log_2 m \rceil}\right)$
$\quad = (2C_1 + C_2) \lceil \log_2 m \rceil + C_1 + C_2$
$\quad = O(\log m)$

THUS: T'(m) = O(log m), T(m) = O(log m)

Worst case occurs when m = 2^k
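To see the logarithmic bound concretely, the recursive test can be instrumented to count its own calls. This is a sketch, not the course's code: the function name `is_power_of_two` and the returned call count are additions made here, and m is assumed to be a positive integer.

```python
def is_power_of_two(m):
    """Return (answer, number_of_calls) for the recursive power-of-2 test, m >= 1."""
    if m == 1:
        return True, 1
    if m % 2 == 0:
        answer, calls = is_power_of_two(m // 2)
        return answer, calls + 1
    return False, 1
```

For m = 2^k the function is entered k + 1 times, which matches the worst case noted above; an odd m > 1 stops after a single call.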

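Ceiling:

$\lceil x \rceil$ is the smallest integer ≥ x, e.g. $\lceil 1.5 \rceil = 2$, $\lceil 3.1 \rceil = 4$, $\lceil 3.0 \rceil = 3$

NOTE THAT:

$2^{\lceil \log_2 m \rceil} \ge m$

IF $m = 2^k$ THEN $\log_2 m = k$, $\lceil \log_2 m \rceil = k$ and $2^{\lceil \log_2 m \rceil} = m$

IF m = 6: $\log_2 4 = 2$ and $\log_2 8 = 3$, so $2 < \log_2 6 < 3$, so $\lceil \log_2 6 \rceil = 3$ and $2^{\lceil \log_2 m \rceil} = 8 > m$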

Problem:

What is T(m)? Is m the length of the input?

m is 100, 15, 64, etc., just a number!

IF m is written in BINARY and n is the number of bits of m, THEN

$n = \lceil \log_2 m \rceil$

and $T(m) = O(\log_2 m)$ → $T(n) = T(\lceil \log_2 m \rceil) = O(\log_2 m)$

i.e. T(n) = O(n)

m CAN ALSO BE TREATED AS 000...0, I.E. m UNITS.

THEN THE "LENGTH" OF m IS n = m, and

T(n) = T(m) = O(log n)

DESIGN OF A PROGRAM

TOP-DOWN / BOTTOM-UP APPROACH

STEPWISE REFINEMENT, CHOOSE ADTS AND DATA STRUCTURES

CODING

A REMARK ABOUT RUNNING TIME

ALTHOUGH THE ORDER OF RUNNING TIME IS VERY IMPORTANT, WE SHOULD ALSO CONSIDER THE FOLLOWING FACTORS IN PRACTICE.

1. THE TIME IT TAKES TO WRITE AND DEBUG THE PROGRAM

2. READABILITY, MODULARITY, ETC.: HOW HARD IS IT TO MAINTAIN THE PROGRAM?

3. SOMETIMES CONSTANTS ARE ALSO IMPORTANT

4. SPACE (OR STORAGE) COMPLEXITY

5. ACCURACY

ADT LIST

A list is a sequence of zero or more elements of a given type (element type).

$L = a_1, a_2, a_3, \dots, a_n$, where each $a_i$ is of some data type

length = n
first = $a_1$
last = $a_n$

$a_i$ is at position i

$a_{i-1}$ precedes $a_i$

$a_i$ follows $a_{i-1}$

END(L) = position n+1

Operations

INSERT(x, p, L); DELETE(p, L);

LOCATE(x, L); RETRIEVE(p, L);

MAKENULL(L): L ← ε

FIRST(L); NEXT(p, L); PREVIOUS(p, L)

PRINT(L); LENGTH(L); REVERSE(L)

CONCAT(L1, L2); EMPTY(L); etc.

Array implementation of lists

[Figure: an array with positions 1..max; positions 1..last hold the list a1, a2, ..., an and positions last+1..max are empty.]

const max = ?;
type
  position = 1..max;
  LIST = record
    elements : array [position] of elementtype;
    last : 0..max
  end;

function END (var L : LIST) : integer;
begin
  END := L.last + 1
end;

procedure INSERT (x : elementtype; p : position; var L : LIST);
var q : position;
begin
  if L.last = max then
    error ('list is full')
  else if (p > L.last + 1) or (p < 1) then
    error
  else begin
    for q := L.last downto p do
      L.elements[q+1] := L.elements[q];   // shifting to the right //
    L.last := L.last + 1;
    L.elements[p] := x
  end
end;

Time complexity:
INSERT, DELETE, LOCATE - O(n)
RETRIEVE, NEXT, PREVIOUS, END, FIRST, MAKENULL - O(1)
Avg. time: INSERT, DELETE, LOCATE - O(n)
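A minimal Python sketch of the array implementation follows, assuming 1-based positions as in the notes. The class name `ArrayList` and the choice of exceptions are made up for this illustration; index 0 of the underlying Python list is simply left unused.

```python
class ArrayList:
    """List ADT stored in positions 1..last of a fixed-size array."""

    def __init__(self, maxlen):
        self.max = maxlen
        self.elements = [None] * (maxlen + 1)  # index 0 is unused
        self.last = 0

    def end(self):
        return self.last + 1

    def insert(self, x, p):
        if self.last >= self.max:
            raise OverflowError("list is full")
        if p > self.last + 1 or p < 1:
            raise IndexError("position does not exist")
        # Shift elements p..last one place to the right: O(n) in the worst case.
        for q in range(self.last, p - 1, -1):
            self.elements[q + 1] = self.elements[q]
        self.last += 1
        self.elements[p] = x

    def items(self):
        return self.elements[1:self.last + 1]
```

Inserting 30 at position 5 of the list 50, 60, 23, 47, 21, 39, 60, 40 shifts four elements, which is where the O(n) cost of INSERT comes from.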

Pointer implementation (linked list)

[Figure: L points to a header cell (cell 0), followed by cell 1, cell 2, ..., cell n holding a1, a2, ..., an; the next pointer of the last cell is nil.]

type
  celltype = record
    element : elementtype;
    next : ↑ celltype
  end;
  LIST = ↑ celltype;
  position = ↑ celltype;

Position i: a pointer to cell i-1, 1 ≤ i ≤ n+1

function END (L : LIST) : position;
var q : position;
begin
  q := L;
  while q↑.next <> nil do
    q := q↑.next;
  END := q
end;

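Alternative declaration of LIST, keeping a pointer to the last cell as well:

LIST = record
  first : ↑ celltype;
  last : ↑ celltype
end;

Insert x at p: time O(1)

[Figure: a new cell is linked in after the cell that p points to.]

Delete cell at p: time O(1)

[Figure: the cell after the one p points to is bypassed.]

PREVIOUS (p, L): time O(n)

[Figure: the list is walked from the header of L until the cell before p is found.]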

INSERT, DELETE, RETRIEVE, NEXT, FIRST, MAKENULL - O(1)
PREVIOUS, LOCATE, END - O(n)

Compare the two implementations

1. maximum size of the list - array

2. waste of space - both

3. operation speeds

            array    pointer
INSERT      O(n)     O(1)
DELETE      O(n)     O(1)
PREVIOUS    O(1)     O(n)
END         O(1)     O(1) or O(n)

4. pointer representation can be dangerous!

e.g.
q := NEXT (p, L);
INSERT (x, p, L);
. . .
if q = NEXT (p, L) then ...   { the test fails: after the insertion q ≠ NEXT (p, L)! }

[Figure: the positions p and q before and after the insertion.]
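The danger described in point 4 can be reproduced with a small linked list in which, as in the notes, position i is a pointer to cell i-1. The names `Cell`, `insert`, `next_pos` and `retrieve` are assumptions of this sketch.

```python
class Cell:
    def __init__(self, element=None, next=None):
        self.element = element
        self.next = next


def insert(x, p):
    """Insert x at position p, where p points to the cell before the target."""
    p.next = Cell(x, p.next)


def next_pos(p):
    return p.next


def retrieve(p):
    return p.next.element


# Build header -> a -> b; position 1 is the header cell itself.
header = Cell()
insert("b", header)
insert("a", header)

p = header
q = next_pos(p)          # q is position 2, i.e. a pointer to the cell holding a
insert("x", p)           # the list is now x, a, b
stale = q is not next_pos(p)
```

After the insertion `q` still points to the cell holding a, but NEXT(p, L) now points to the new cell, so a saved position silently changes meaning.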

DOUBLY-LINKED LISTS

[Figure: cell 1, cell 2, cell 3, ..., cell n holding a1, a2, a3, ..., an, each linked to its neighbours in both directions.]

type
  celltype = record
    element : elementtype;
    next, previous : ↑ celltype
  end;
  position = ↑ celltype;

Position i: a pointer to cell i

function LAST (L)
begin
  LAST := L.previous
end;

WHAT HAPPENS IF POINTERS AREN'T AVAILABLE? USE CURSORS!

PATTERN MATCHING IN STRINGS

ALPHABET $A = \{a_1, a_2, \dots, a_k\}$; each $a_i$ is a SYMBOL/CHARACTER

A STRING $x = a_1 a_2 \dots a_n$, $n \ge 0$, $a_i \in A$

STRINGS ARE A SPECIAL CASE OF LISTS

PATTERN MATCHING: $x = a_1 a_2 \dots a_n$, $pat = b_1 b_2 \dots b_m$

Is pat a substring of x?

i.e. $(\exists i : 1 \le i \le n - m + 1) \;\; a_i a_{i+1} \dots a_{i+m-1} = b_1 b_2 \dots b_m$

position: 1 2 3 4 5 6 7 8 9 10 11
x       = a a b b a b b b a a  a

pat = bab => yes, i = 4

pat = abab => no

SIMPLE ALGORITHM

x = aabbabbbaaa, pat = bab

aabbabbbaaa
bab            NO

aabbabbbaaa
 bab           NO

aabbabbbaaa
   bab         YES!

BUT FOR pat = aaa WE NEED TO MOVE

FROM  aabbabbbaaa   TO  aabbabbbaaa
      aaa                       aaa

SIMILARLY for pat = abab, from position 1 to position 8 = 11 - 4 + 1:

FROM  aabbabbbaaa   TO  aabbabbbaaa
      abab                     abab

WORST CASE

[Figure: pat = b1 b2 ... bm aligned under x = a1 a2 ... an, first at position 1 and finally at position n-m+1.]

n-m+1 passes. EACH PASS TAKES O(m) comparisons, HENCE
(n-m+1) · O(m) = O(m(n-m+1)) = O(mn)

procedure find (x, pat : STRING; var found : Boolean; var i : position);
{ found is set to false if pat doesn't occur in x; otherwise found is set to true and i is set to the first position in x where pat begins }
var p, q : position;
begin
  if not EMPTY (x) and not EMPTY (pat) then
  begin
    found := false; i := FIRST (x);
    while not found and i ≠ END (x) do
    begin
      p := i; q := FIRST (pat);
      while RETRIEVE (p, x) = RETRIEVE (q, pat) and not found do
      begin
        p := NEXT (p, x); q := NEXT (q, pat);
        if q = END (pat) then found := true
      end;
      if not found then i := NEXT (i, x)
    end
  end
end

IF END(L) IS O(1) THEN T(n, m) = O(mn)
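The simple algorithm can be sketched on Python strings as follows. This is an illustrative version, not the procedure above: it returns a 1-based position (0 when pat does not occur) instead of setting `found` and `i`, and it only tries the n-m+1 alignments at which pat fits inside x.

```python
def find(x, pat):
    """Return the first 1-based position where pat occurs in x, or 0 if none."""
    n, m = len(x), len(pat)
    if n == 0 or m == 0:
        return 0
    for i in range(1, n - m + 2):          # the n-m+1 possible alignments
        q = 0
        while q < m and x[i - 1 + q] == pat[q]:
            q += 1                         # up to m comparisons per pass
        if q == m:
            return i
    return 0
```

On the example string, `find("aabbabbbaaa", "aaa")` has to slide the pattern all the way to position 9 before it matches.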

THE KNUTH, MORRIS, PRATT ALGORITHM (KMP)

x   = abaababaabacabaababaabaab
pat = abaababaabaab

The first 11 characters match (abaababaaba); MISMATCH at position 12 (c in x, a in pat).

WHAT DO YOU DO NEXT?

Let u be the longest proper prefix of the matched part that is also a suffix of it. Slide pat to the right so that its prefix u lies under the suffix u of the matched part of x, and START COMPARING at the mismatched character of x:

u = abaaba: start comparing & mismatch
u = aba: start comparing & mismatch
u = a: start comparing & mismatch
u empty: start comparing & mismatch, so move on to the next character of x

abaababaabacabaababaabaab
            abaababaabaab      COMPARING: match at position 13

NUMBER OF COMPARISONS IS O(n). BUT HOW TO FIND OUT WHAT u IS?

LET $pat = b_1 b_2 \dots b_m$

FOR EACH 1 ≤ j ≤ m, LET

f(j) = the largest i such that 0 < i < j and $b_1 \dots b_i = b_{j-i+1} \dots b_j$
f(j) = 0 if such i does not exist

f(j) < j. f IS THE FAILURE FUNCTION.

j    : 1 2 3 4 5 6 7 8 9 10 11 12 13
pat  : a b a a b a b a a b  a  a  b
f(j) : 0 0 1 1 2 3 2 3 4 5  6  4  5

Prefixes of pat for j = 3, ..., 13: aba, abaa, abaab, abaaba, abaabab, abaababa, abaababaa, abaababaab, abaababaaba, abaababaabaa, abaababaabaab

TIME COMPLEXITY:

T(n, m) = O(n + complexity of computing g) = O(n + complexity of computing f)

f(1) = 0

For j > 1: $f(j) = f^s(j-1) + 1$, where s is the smallest i such that $b_{f^i(j-1)+1} = b_j$; f(j) = 0 if no such i exists

$f^i(j-1) = f(f(\dots f(j-1) \dots))$ (i times), e.g. $f^3(j-1) = f(f(f(j-1)))$

[Figure: the prefix u of length f(j-1) and the equal suffix u of b1 ... b(j-1). If the character after the prefix u equals bj, then f(j) = f(j-1) + 1. Otherwise let i = f(j-1) and try the shorter prefix of length f(i) = f(f(j-1)) = f^2(j-1), which gives f^2(j-1) + 1, and so on.]

proc fail (pat : array [1..m] of char; var f : array [1..m] of integer);
var i, j : integer;
begin
  f[1] := 0;
  for j := 2 to m do
  begin
    i := f[j-1];
    while (pat[j] ≠ pat[i+1] and i > 0) do
      i := f[i];
    if pat[j] = pat[i+1] then f[j] := i + 1
    else f[j] := 0
  end
end

T(m) = O(m)!
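The procedure translates almost line for line into Python. In this sketch (the name `failure` is an assumption) the pattern is a 0-based Python string, so pat[j] in the notes becomes `pat[j - 1]` and pat[i+1] becomes `pat[i]`; the result is returned as a list whose entry k-1 holds f(k).

```python
def failure(pat):
    """Return [f(1), ..., f(m)] for the failure function f of pat."""
    m = len(pat)
    f = [0] * (m + 1)                      # f[j] for 1 <= j <= m; f[0] is unused
    for j in range(2, m + 1):
        i = f[j - 1]
        while i > 0 and pat[j - 1] != pat[i]:
            i = f[i]                       # fall back to the next shorter border
        f[j] = i + 1 if pat[j - 1] == pat[i] else 0
    return f[1:]
```

Running it on the pattern abaababaabaab reproduces the f(j) row of the table given earlier.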

procedure KMP (x, pat, g, found, i);
{ x[1..n], pat[1..m] are strings; g[j] = g(j), 1 ≤ j ≤ m }
var p, q : position;
begin
  if n ≠ 0 and m ≠ 0 then
  begin
    p := 1; q := 1;
    while (p ≠ n+1 and q ≠ m+1) do
      if x[p] = pat[q] then begin p := p+1; q := q+1 end
      else if q = 1 then p := p+1
      else q := g[q];
    if q = m+1 then begin found := true; i := p-m end
    else found := false
  end
  else ……
end;

Computing g takes O(m); the while loop takes O(n).
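A runnable sketch of the scan is given below. It assumes that g(q) = f(q-1) + 1, i.e. that after a mismatch at pattern position q > 1 the pattern is realigned using the failure value of the matched prefix; the notes do not spell this relation out, so treat it as an assumption of the sketch. The helper `failure` and the 0-means-not-found return convention are also choices made here.

```python
def failure(pat):
    """Failure function of pat as a list: entry k-1 holds f(k)."""
    m = len(pat)
    f = [0] * (m + 1)
    for j in range(2, m + 1):
        i = f[j - 1]
        while i > 0 and pat[j - 1] != pat[i]:
            i = f[i]
        f[j] = i + 1 if pat[j - 1] == pat[i] else 0
    return f[1:]


def kmp_find(x, pat):
    """Return the first 1-based position where pat occurs in x, or 0 if none."""
    n, m = len(x), len(pat)
    if n == 0 or m == 0:
        return 0
    f = failure(pat)
    p = q = 1                              # 1-based positions in x and pat
    while p != n + 1 and q != m + 1:
        if x[p - 1] == pat[q - 1]:
            p += 1
            q += 1
        elif q == 1:
            p += 1
        else:
            q = f[q - 2] + 1               # assumed g(q) = f(q-1) + 1
    return p - m if q == m + 1 else 0
```

The position p in x never moves backwards, which is why the scan needs only O(n) comparisons.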

ADT STACK

"LAST-IN-FIRST-OUT" LIST (LIFO)

OPERATIONS:

MAKENULL(S): make stack S empty

TOP(S): return the top element of S
TOP(S) = RETRIEVE(FIRST(S), S)

POP(S): delete the top element of S. Sometimes POP is defined as a function that returns the element being popped.
POP(S) = DELETE(FIRST(S), S)

PUSH(x, S): insert x at the top of S
PUSH(x, S) = INSERT(x, FIRST(S), S)

EMPTY(S): return true if S is empty, false otherwise

A SIMPLE EXAMPLE:

#: erase character; it cancels the previous uncancelled character

@: kill character; it cancels all previous characters on the line

abc#d@aa#b = ab

procedure EDIT;
var S : STACK of char; c : char;
begin
  MAKENULL (S); read (c);
  while not end (c) do
  begin
    if c = '#' then POP (S)
    else if c = '@' then MAKENULL (S)
    else PUSH (c, S);
    read (c)
  end;
  PRINT S IN REVERSE ORDER
end
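The same editing rule can be tried out in Python with a list used as the stack. This is a sketch with an assumed function name `edit`; unlike the procedure above it ignores a `#` typed on an empty line instead of popping an empty stack, and it returns the edited text rather than printing it.

```python
def edit(line):
    """Apply the erase (#) and kill (@) characters to one input line."""
    stack = []
    for c in line:
        if c == "#":
            if stack:              # nothing to erase on an empty line
                stack.pop()
        elif c == "@":
            stack.clear()          # MAKENULL
        else:
            stack.append(c)
    # The stack holds the characters bottom-to-top, i.e. in typing order.
    return "".join(stack)
```

Joining the list from bottom to top plays the role of "PRINT S IN REVERSE ORDER".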

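ARRAY IMPLEMENTATION OF STACKS

[Figure: an array with positions 1..max. Positions 1..k-1 are free; the stack occupies positions k..max, with the 1st element at position top = k, then the 2nd element, down to the last element at position max.]

type
  position = 1..max;
  STACK = record
    top : 1..max+1;
    elements : array [position] of elementtype
  end;

PUSH, POP, TOP - O(1)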

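MORE SPACE-EFFICIENT IMPLEMENTATIONS

POINTER IMPLEMENTATION

[Figure: a stack stored as a linked list of cells a, b, c; the next pointer of the last cell is nil.]

MANY STACKS IN ONE ARRAY

[Figure: arrays TOP[1..3] and BOTTOM[1..3] point into a shared stack space that holds STACK 1, STACK 2 and STACK 3 with free space between them.]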

ADT QUEUE

A QUEUE IS A "First-in-First-Out" LIST (FIFO)

OPERATIONS:

MAKENULL (Q);

FRONT (Q) : return the first element of Q

FRONT (Q) = retrieve (first (Q), Q)

ENQUEUE (x, Q) : inserts x at the end of Q ENQUEUE (X, Q) = INSERT (X, END (Q), Q)

DEQUEUE (Q): DELETES THE FIRST ELEMENT OF Q

DEQUEUE (Q) = DELETE (FIRST (Q),Q)

EMPTY (Q):

POINTER IMPLEMENTATION

[Figure: a linked list with a header cell followed by a1, a2, ..., an; front points to the header and rear points to the last cell.]

type
  celltype = record
    element : elementtype;
    next : ↑ celltype
  end;
  QUEUE = record
    front, rear : ↑ celltype
  end;

function EMPTY (Q : QUEUE) : Boolean;
begin
  if Q.front = Q.rear then EMPTY := true
  else EMPTY := false
end

EACH OPERATION - O(1)

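ARRAY IMPLEMENTATION

[Figure: an array with positions 1..max; the queue occupies positions front..rear (1st element, 2nd element, ..., last element) and the remaining positions are free.]

ENQUEUE - O(n)

CIRCULAR ARRAY IMPLEMENTATION (BUFFER!)

[Figure: positions 1, 2, ..., max-1, max arranged in a circle; the queue a1, a2, ..., an runs from front to rear.]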

HOW DO WE DISTINGUISH BETWEEN FULL AND EMPTY?

EITHER MAINTAIN AN EXTRA BIT, OR KEEP ONE CELL UNUSED SO THAT FULL ≡ (FRONT = addone(addone(rear)))
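A sketch of the circular array with one cell kept unused is shown below. The class name `CircularQueue`, the method names, the convention that the queue starts with front = 1 and rear = max, and the use of `IndexError` are all assumptions of this illustration; with them, the queue is empty when addone(rear) = front and full when addone(addone(rear)) = front.

```python
class CircularQueue:
    """Queue in positions 1..max of a circular array; holds at most max-1 items."""

    def __init__(self, maxlen):
        self.max = maxlen
        self.elements = [None] * (maxlen + 1)   # index 0 is unused
        self.front = 1
        self.rear = maxlen

    def addone(self, i):
        return i % self.max + 1                 # next position, wrapping max -> 1

    def empty(self):
        return self.addone(self.rear) == self.front

    def full(self):
        return self.addone(self.addone(self.rear)) == self.front

    def enqueue(self, x):
        if self.full():
            raise IndexError("queue is full")
        self.rear = self.addone(self.rear)
        self.elements[self.rear] = x

    def front_element(self):
        if self.empty():
            raise IndexError("queue is empty")
        return self.elements[self.front]

    def dequeue(self):
        if self.empty():
            raise IndexError("queue is empty")
        self.front = self.addone(self.front)
```

Both ENQUEUE and DEQUEUE only move one index, so every operation is O(1) at the price of one unused cell.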

Mark[i, j] = 1 if we have been to (i, j), 0 otherwise

IF THERE IS NO WAY OUT, BACK UP ONE CELL AND TRY A DIFFERENT MOVE

MUST STORE THE CURRENT PATH SOMEWHERE

A PATH: (i1, j1), (i2, j2), ..., (is, js)

[Figure: the path kept in a STACK, with (is, js) on top, then (is-1, js-1), ..., (i2, j2), and (i1, j1) at the bottom.]

type offsets = record
  x : -1..1;
  y : -1..1
end

NW (i-1, j-1)    N (i-1, j)    NE (i-1, j+1)
W  (i, j-1)        (i, j)      E  (i, j+1)
SW (i+1, j-1)    S (i+1, j)    SE (i+1, j+1)

directions = (N, NE, E, SE, S, SW, W, NW);

var move : array [directions] of offsets

d     move[d].x   move[d].y
N     -1          0
NE    -1          1
E     0           1
SE    1           1
S     1           0
SW    1           -1
W     0           -1
NW    -1          -1

var maze : array [0..m+1, 0..n+1] of 0..1;
    mark : array [1..m, 1..n] of 0..1

type dir = (N, NE, E, SE, S, SW, W, NW, D);   { D = DEAD END }
type elementtype = record
  x : 1..m;
  y : 1..n;
  start : dir
end

STACK = ……..

var path : STACK

function NEXTMOVE (loc : elementtype) : dir;
var d : dir; s, r, i, j : integer; found : Boolean;
begin
  i := loc.x; j := loc.y; d := loc.start; found := false;
  while d ≠ D and not found do
  begin
    s := move[d].x; r := move[d].y;
    if maze[i+s, j+r] = 0 and mark[i+s, j+r] = 0 then found := true
    else d := succ (d)
  end;
  NEXTMOVE := d
end

proc rat (var maze : array [0..m+1, 0..n+1] of 0..1);
var mark : array [1..m, 1..n] of 0..1;
    path : STACK; location : elementtype; d : dir;

function NEXTMOVE (X : elementtype) : dir; ….;

begin
  { initialization (should be specified last) }
  mark := (0); MAKENULL (path);
  location := (1, 1, E); mark[1, 1] := 1; PUSH (location, path);

  while not EMPTY (path) do
  begin
    location := TOP (path); POP (path);
    d := NEXTMOVE (location);
    if d ≠ D then
    begin
      location.start := succ (d); PUSH (location, path);
      location.x := location.x + move[d].x;
      location.y := location.y + move[d].y;
      if location.x = m and location.y = n then
      begin print (path); return end
      else
      begin
        PUSH ((location.x, location.y, N), path);
        mark[location.x, location.y] := 1
      end
    end
  end
end
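The backtracking search can be sketched in Python as below. Assumptions of this sketch: the maze is a list of lists with a border of 1s (walls) around the m by n interior, the entrance is (1, 1) and the exit is (m, n), directions are tried in the order N, NE, E, SE, S, SW, W, NW from every cell, and the function returns the path as a list of cells (or None) instead of printing it.

```python
# Row and column offsets for N, NE, E, SE, S, SW, W, NW.
MOVES = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def rat(maze):
    """Return a path from (1, 1) to (m, n) through cells equal to 0, or None."""
    m, n = len(maze) - 2, len(maze[0]) - 2
    mark = [[0] * (n + 2) for _ in range(m + 2)]
    mark[1][1] = 1
    path = [(1, 1, 0)]                      # (row, col, next direction to try)
    while path:
        i, j, d = path.pop()
        while d < 8:                        # NEXTMOVE: first open, unmarked neighbour
            r, c = i + MOVES[d][0], j + MOVES[d][1]
            if maze[r][c] == 0 and mark[r][c] == 0:
                break
            d += 1
        if d < 8:                           # d = 8 plays the role of DEAD END
            path.append((i, j, d + 1))      # resume here with the next direction
            if (r, c) == (m, n):
                return [(a, b) for a, b, _ in path] + [(r, c)]
            mark[r][c] = 1
            path.append((r, c, 0))
    return None
```

Each cell is marked at most once and tries at most eight moves, which is where the O(mn) bound for the marked version comes from.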

TIME COMPLEXITY OF RAT: O(mn)

SPACE COMPLEXITY OF RAT: O(mn)

BUT WITHOUT USING MARK: $O(8^{mn}) = O(2^{3mn})$

An application of queues - breadth-first search in trees

[Figure: a binary tree T with nodes V1, ..., V11 holding data a1, ..., a11; V1 is the root with children V2 and V3.]

For a binary tree: LEFT(v) is the left child of v, RIGHT(v) is the right child of v

e.g., LEFT(V3) = null, RIGHT(V3) = V6, DATA(V1) = a1

ROOT(T) = V1

Searching in tree

Given tree T and data x,

find a node v of T s.t. DATA(v) = X.

Possible approaches:

1. Breadth-first search

try level 1, then level 2, then

level 3, ...etc.

2. Depth-first search:

search along the leftmost path until the leaf is reached, then back-

up, try the 2nd leftmost path, ...etc.

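Breadth-first

X = 20

[Figure: the example tree with nodes V1, ..., V12 and their data values; the search is for X = 20.]

Searching order: v1, v2, v3, v4, v5, v6, . . .

Depth-first

[Figure: the same tree with its nodes numbered in depth-first order.]

Searching order: v1, v2, v3, v4, v5, v6, . . .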

procedure DSearch (x, T);
begin
  if x = DATA (ROOT (T)) then print ROOT (T)
  else begin
    DSearch left subtree;
    DSearch right subtree
  end
end;

nonrecursive version

procedure DSearch (x, T);
var path : STACK of nodes; v : node;
begin
  v := ROOT (T); MAKENULL (path); PUSH (v, path);
  while not EMPTY (path) do
  begin
    v := TOP (path); POP (path);
    if DATA (v) = x then print v
    else begin
      PUSH (RIGHT (v), path);   // swapped: RIGHT is pushed first so that LEFT is searched first //
      PUSH (LEFT (v), path)
    end
  end
end

Time: O(n)
Space: O(n)
Space avg: O(log n)

procedure BSearch (x, T);
var level : QUEUE of nodes; v : node;
begin
  v := ROOT (T); MAKENULL (level); ENQUEUE (v, level);
  while not EMPTY (level) do
  begin
    v := FRONT (level); DEQUEUE (level);
    if DATA (v) = x then begin print v; stop end
    else begin
      ENQUEUE (LEFT (v), level);
      ENQUEUE (RIGHT (v), level)
    end
  end
end;

Time: O(n), where n = |T| is the size of T

Space: O(n)

Avg: O(n)
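The only difference between the two searches is the container: a stack gives depth-first order and a queue gives breadth-first order. The sketch below makes this visible by returning the order in which nodes were examined; the class `Node`, the sample tree `T` and the skipping of missing (None) children are assumptions of this illustration.

```python
from collections import deque


class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def dsearch(x, root):
    """Depth-first search with an explicit stack; returns (node, visit order)."""
    visited, path = [], [root]
    while path:
        v = path.pop()
        if v is None:
            continue
        visited.append(v.data)
        if v.data == x:
            return v, visited
        path.append(v.right)        # pushed first, so the left subtree comes first
        path.append(v.left)
    return None, visited


def bsearch(x, root):
    """Breadth-first search with a queue; returns (node, visit order)."""
    visited, level = [], deque([root])
    while level:
        v = level.popleft()
        if v is None:
            continue
        visited.append(v.data)
        if v.data == x:
            return v, visited
        level.append(v.left)
        level.append(v.right)
    return None, visited


# A small sample tree: 1 has children 2 and 3; 2 has children 4 and 5; 3 has right child 6.
T = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(6)))
```

Searching for 5 examines 1, 2, 4, 5 depth-first but 1, 2, 3, 4, 5 breadth-first.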

Application: implement a DOS command cd:\

cd:\name - change current directory to subdirectory name

What should cd:\letters do?

[Figure: a directory tree rooted at A: containing directories such as job, WP 5.0, project, study and homework, with several subdirectories named letters at different levels.]

BFS or DFS?

When do we use DFS?

e.g., solution tree

[Figure: nested activations. Proc. A(x1, x2, ...) with locals y1, y2, ... calls A(a1, a2, ...) and will return to label L1; that activation of A calls A(b1, b2, ...) with return label L3; that activation calls B(c1, c2, ...), which starts Proc. B(x1, x2, ...) with locals f1, f2, ...]

Elimination of Recursion

Sometimes it is absolutely necessary to eliminate recursion:

• recursive calls are not supported, e.g., FORTRAN

• speed is the first priority - do it by yourself

Solution: STACK of activation records

Generally, an activation record holds

1. current values of the parameters (pass by value)

2. current values of the local variables

3. a label indicating the return address

Assume that if the procedure is p(x1, x2, ..., var y1, y2, ...)

then the recursive call is p(a1, a2, ..., y1, y2, ...)


General Rules:

Procedure P (x1, x2: int; var y: int);

Var i, j: int;

Begin

____________________________________;

____________________________________; . . .

P(a1, a2,y)

_________________________________________;

_________________________________________;

. . .

end;

Example 1

procedure Ackermann (m, n : integer; var A : integer);
1.  begin
      if n < 0 or m < 0 then writeln ('error')
      else if m = 0 then A := n + 1
      else if n = 0 then Ackermann (m-1, 1, A)
      else begin
        Ackermann (m, n-1, A);
2.      Ackermann (m-1, A, A);
3.    end
    end;

Recursion Elimination

procedure Ackermann (m, n : int; var A : int);
label 1, 2, 3;
var S : STACK of record
      m, n, l : int
    end;
begin
  MAKENULL (S);
1: if n < 0 or m < 0 then writeln ('error')
   else if m = 0 then A := n + 1
   else if n = 0 then
     begin PUSH ((m, n, 3), S); m := m-1; n := 1; goto 1 end
   else begin
     PUSH ((m, n, 2), S); n := n-1; goto 1;
2:   PUSH ((m, n, 3), S); m := m-1; n := A; goto 1
   end;
3: if not EMPTY (S) then
   begin
     (m, n, l) := TOP (S); POP (S);
     case l of
       2: goto 2;
       3: goto 3
     end
   end
end; { Ackermann }
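Python has no goto, so the sketch below replaces recursion with an explicit stack in a simplified way: instead of full activation records (m, n, label) it stores only the pending value m-1 of each outer call A(m-1, A(m, n-1)), and it treats A(m, 0) = A(m-1, 1) as a plain jump. This is a swapped-in simplification, not a transcription of the procedure above; the function name and the `ValueError` are assumptions.

```python
def ackermann(m, n):
    """Ackermann's function computed with an explicit stack instead of recursion."""
    if m < 0 or n < 0:
        raise ValueError("arguments must be non-negative")
    stack = []                      # pending first arguments of outer calls
    while True:
        if m == 0:
            n = n + 1               # A(0, n) = n + 1
            if not stack:
                return n
            m = stack.pop()         # resume the outer call A(m-1, result)
        elif n == 0:
            m, n = m - 1, 1         # A(m, 0) = A(m-1, 1)
        else:
            stack.append(m - 1)     # A(m, n) = A(m-1, A(m, n-1))
            n = n - 1
```

For instance, `ackermann(2, 3)` runs without any Python-level recursion, so the interpreter's recursion limit is not a concern for moderately larger arguments.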

More details in [AHU] pp. 64- 69.

[HS] pp. 150-153.

* The method works only when there are no pass-by-reference parameters, or the same pass-by-reference parameters are passed each time (e.g., a function)

General case???

POINTER!!! Replace p(..., var x : type1) by p(..., xp : ↑type1)


no global variables

procedure R (x : integer; var y, z : integer);

var i: integer;

begin

---------------------

---------------------

y:=x*i

-------------------

-------------------

R(a,i,y);

-------------------

------------------

end;

Trees

Basic Terminology

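1. A single node is a tree; that node is also the root.

2. If T1, T2, ..., Tk are trees with roots n1, n2, ..., nk, and n is a new node, then the structure T obtained by making n the parent of n1, n2, ..., nk is a tree with root n.

[Figure: node n with the subtrees T1, T2, ..., Tk below it; their roots n1, n2, ..., nk are siblings.]

Each Ti is a subtree of n (and of T).

n1, n2, ..., nk are the children of n.

n is the parent of n1, n2, ..., nk

(Actually, this defines a rooted, or oriented, tree.)

[Figure: an example tree with nodes numbered 1 to 12.]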

Note that every node (except the root) has a unique parent.

A node with no children is a leaf.

A non-leaf node is also called an internal node.

[Figure: a chain of nodes n1, n2, n3, ..., nk, each the parent of the next.]

n1, n2, n3, ..., nk is a path of length k-1 from n1 to nk

Note: n1 alone is a path of length 0

n1 is an ancestor of nk

nk is a descendant of n1

height of n: length of the longest path from n to a leaf

height of a leaf is 0!

depth of n: length of the unique path from the root to n.

depth of the root is 0!

height (or depth) of a tree: height of the root.

Order of nodes

In a tree, siblings are ordered from left to right (ordered tree):

[Figure: the tree with root a and children b, c in that order ≠ the tree with root a and children c, b.]

If n1 is to the left of n2 then all descendants of n1 are to the left of all descendants of n2

Tree Traversals

[Figure: a tree T whose root n has the subtrees T1, T2, ..., Tk.]

Preorder traversal of T is

n, preorder traversal of T1, preorder traversal of T2, ..., preorder traversal of Tk   (DFS)

Inorder traversal of T is

i.t. of T1, n, i.t. of T2, ..., i.t. of Tk

Postorder traversal of T is

p.t. of T1, p.t. of T2, ..., p.t. of Tk, n   (used for evaluation of expression trees, divide-and-conquer)

Example

Preorder: 1, 2, 5, 3, 6, 7, 4, 8, 9, 10

Inorder: 5, 2, 1, 6, 3, 7, 8, 4, 9, 10

Postorder: 5, 2, 6, 7, 3, 8, 9, 10, 4, 1

Preorder: we list a node the first time we pass it

Postorder: we list a node the last time we pass it

Inorder: we list a leaf the first time we pass it, but list an interior node the second time we pass it

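[Figure: the example tree. Root 1 has children 2, 3, 4; node 2 has child 5; node 3 has children 6, 7; node 4 has children 8, 9, 10.]

procedure Preorder (T : tree);
var v : node;
begin
  v := ROOT (T);
  print v;
  for each subtree T' of v, from left to right, do
    Preorder (T')
end;

For Pre/In/Post:

time complexity: O(|T|), where |T| is the number of nodes

space complexity: O(height of T), for the stack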
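The three traversals of a general (ordered) tree can be written compactly in Python. The dictionary `CHILDREN` below encodes a tree whose shape was chosen to be consistent with the three listings given above (root 1 with children 2, 3, 4, and so on); the representation and the function names are assumptions of this sketch.

```python
# Children of each node, from left to right.
CHILDREN = {1: [2, 3, 4], 2: [5], 3: [6, 7], 4: [8, 9, 10],
            5: [], 6: [], 7: [], 8: [], 9: [], 10: []}


def preorder(v):
    out = [v]                               # the node first, then each subtree
    for c in CHILDREN[v]:
        out += preorder(c)
    return out


def inorder(v):
    kids = CHILDREN[v]
    if not kids:
        return [v]
    out = inorder(kids[0]) + [v]            # first subtree, the node, then the rest
    for c in kids[1:]:
        out += inorder(c)
    return out


def postorder(v):
    out = []
    for c in CHILDREN[v]:
        out += postorder(c)
    return out + [v]                        # the node last
```

Each function visits every node once, and the recursion depth never exceeds the height of the tree plus one.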

procedure Preorder (T : tree);   // no stack //
var v : node;
begin
  v := ROOT (T);
  while v ≠ null do
  begin
    print v;
    if v is not a leaf then v := 1st child of v
    else
    begin
      { back up until v is not the last child of Parent(v); the root counts as a last child and Parent(root) = null }
      while v ≠ null and v = last child of Parent (v) do
        v := Parent (v);
      if v ≠ null then v := next sibling of v
    end
  end
end;

time = O(|T|) if Parent() is O(1)

space = ?

Reconstructing a tree from its traversals

Preorder and Postorder traversals are sufficient.


Preorder and Inorder traversals aren’t sufficient.

Inorder and Postorder traversals aren’t sufficient.

example trees?

Any single traversal isn’t sufficient.

(pre/in/post)

Labelled Trees, Expression Trees

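[Figure: an expression tree. The root n1 is labelled *; its children n2 and n3 are both labelled +; n2 has the leaves n4 (a) and n5 (b) as children; n3 has the leaves n6 (a) and n7 (c) as children.]

n2 represents a+b

n3 represents a+c

n1 represents (a + b) * (a + c)

Evaluation can be done by a postorder traversal.

pre/in/post-order listings give the prefix (Polish), infix, and postfix (Reverse Polish) forms:

prefix: *+ab+ac
infix: a+b*a+c (the parentheses are lost)
postfix: ab+ac+*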

ADT TREE

1. PARENT(n, T): node. If n has no parent, return the null node.

2. LEFTMOST-CHILD(n, T): node

3. RIGHT-SIBLING(n, T): node

returns the sibling immediately following n.

4. LABEL(n, T): label

≡ DATA(n, T)

5. ROOT(T): node

6. MAKENULL(T)

7. CREATEi(v, T1, T2, ..., Ti): tree; i = 0, 1, 2, ...

[Figure: a new root v with the subtrees T1, T2, ..., Ti as its children.]

Alternative: ATTACH(T1, T2): tree

8. DELETE(n, T) - delete the subtree rooted at n.

[Figure: an example tree T with nodes n, n1, n2, ... labelled a1, ..., a9.]

LEFTMOST-CHILD(n1, T) = n2
RIGHT-SIBLING(n1, T) = n4

RIGHT-SIBLING(n7, T) = ^

procedure PREORDER (n : node);
// list labels of descendants of n in T (global) in preorder //
var c : node;
begin
  print LABEL (n, T);
  c := LEFTMOST-CHILD (n, T);
  while c ≠ ^ do
  begin
    PREORDER (c);
    c := RIGHT-SIBLING (c, T)
  end
end;

Array Implementation

[Figure: an example tree with nodes numbered 1 to 10 and labels a, b, c.]

node   : 1 2 3 4 5 6 7 8 9 10
parent : 0 1 1 2 2 5 5 5 3 3
label  : a b a c b a b c a b

0 = ^; e.g. PARENT(10, T) = 3

If node i is to the left of node j then i < j, i.e., number siblings from left to right (e.g., preorder, even inorder).

type
  node = 1..max;
  cell = record
    parent : 0..max;
    label : labeltype
  end;
  TREE = array [1..max] of cell;

function LEFTMOST-CHILD (n : node; T : TREE) : node;   { time: O(|T|) }
var i : integer;
begin
  i := 1;
  while i ≤ max and T[i].parent ≠ n do
    i := i + 1;
  if i > max then LEFTMOST-CHILD := 0
  else LEFTMOST-CHILD := i
end;

function RIGHT-SIBLING (n : node; T : TREE) : node;   { time: O(|T|) }
var i : integer; parent : node;
begin
  parent := T[n].parent;
  i := n + 1;
  while i ≤ max and T[i].parent ≠ parent do
    i := i + 1;
  if i > max then RIGHT-SIBLING := 0
  else RIGHT-SIBLING := i
end;
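The two functions can be exercised on the parent array from the example. In this sketch the array is a Python list with index 0 unused, 0 stands for the null node ^, and the names `PARENT`, `leftmost_child` and `right_sibling` are assumptions.

```python
# PARENT[i] is the parent of node i; index 0 is unused and 0 means "no node".
PARENT = [0, 0, 1, 1, 2, 2, 5, 5, 5, 3, 3]
MAX = len(PARENT) - 1


def leftmost_child(n):
    """First node (in numbering order) whose parent is n, or 0 if n is a leaf."""
    i = 1
    while i <= MAX and PARENT[i] != n:
        i += 1
    return 0 if i > MAX else i


def right_sibling(n):
    """Next node after n with the same parent as n, or 0 if there is none."""
    parent = PARENT[n]
    i = n + 1
    while i <= MAX and PARENT[i] != parent:
        i += 1
    return 0 if i > MAX else i
```

Both functions scan the array, which is the O(|T|) cost listed for this representation, while PARENT itself is a single array lookup.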

Trees as lists of children

[Figure: a header array indexed by node (1..10); each entry points to the list of that node's children, and a parallel array holds the labels.]

type
  node = 1..max;
  LIST = …;
  TREE = record
    header : array [1..max] of LIST;
    labels : array [1..max] of labeltype;
    root : node
  end;

No matter how LIST is implemented: LEFTMOST-CHILD, RIGHT-SIBLING - O(1); PARENT - O(|T|)

If O(1) is wanted for all, add a parent field

Considering CREATE(n, T1, T2, ..., Ti):

[Figure: the trees T1 and T2 stored in separate regions of a shared node space; CREATE combines them into a tree T.]

Simplified leftmost-child & right-sibling representation

[Figure: an example tree with nodes A to I stored in a cellspace array; each cell holds the fields leftmost-child, label and right-sibling.]

var cellspace : array [1..max] of record
  label : labeltype;
  leftmost-child, right-sibling : 0..max
end

SUMMARY

1. Array of Parents

• PARENT - O(1)

• LEFTMOST-CHILD, RIGHT-SIBLING - O(|T|)

• ALL-CHILDREN - O(m)

• simple, space-efficient

2. List of Children

• LEFTMOST-CHILD - O(1)

• PARENT, RIGHT-SIBLING - O(|T|)

• can store several trees, CREATE

3. Leftmost-child, Right-sibling

• LEFTMOST-CHILD, RIGHT-SIBLING - O(1)

• PARENT - O(|T|)

• make tree, CREATE, slightly more space than (2)

BINARY TREES

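1. A node is a binary tree.
2. If T is a binary tree and v is a node, then v with T as its left subtree is a binary tree, and v with T as its right subtree is a binary tree.
3. If T1 and T2 are binary trees and v is a node, then v with left subtree T1 and right subtree T2 is a binary tree.

A binary tree is NOT a tree!!! A child is either a left child or a right child, so the binary tree "A with left child B" ≠ "A with right child B". Binary trees are not really trees.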

full binary tree: every internal node has two children and leaves have the same depth

complete binary tree: obtained from a full binary tree as follows: fix a leaf and delete all the leaves to the right of it

• no. of nodes of depth i ≤ $2^i$

• size of a binary tree of depth i ≤ $\sum_{j=0}^{i} 2^j = 2^{i+1} - 1$

• If complete, $2^i - 1 < \text{size} \le 2^{i+1} - 1$

• If full, size = $2^{i+1} - 1$

• $\log_2(\text{size} + 1) - 1 \le \text{depth} \le \text{size} - 1$

Binary tree traversals

Let T be a binary tree with root v, left subtree T1 and right subtree T2.

Preorder(T): v, Preorder(T1), Preorder(T2)

Inorder(T): Inorder(T1), v, Inorder(T2)

Postorder(T): Postorder(T1), Postorder(T2), v

How to reconstruct a binary tree from its traversals?

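Just a preorder (or inorder, or postorder) traversal is not enough.

Preorder & postorder together aren't enough either! For example, "a with left child b" and "a with right child b" both have preorder a, b and postorder b, a.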

Preorder and Inorder

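Preorder: a1, a2, …, an    Inorder: b1, b2, …, bn

1. Find i s.t. a1 = bi (a1 is the root).

2. T1 = Reconstruct(a2, …, ai; b1, …, bi-1)

3. T2 = Reconstruct(ai+1, …, an; bi+1, …, bn)

4. T = the tree with root a1, left subtree T1 and right subtree T2.

Postorder & inorder: similar.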
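The reconstruction steps can be sketched in Python as follows. This is an illustration rather than code from the notes: it assumes that all labels are distinct, and it represents a tree as a nested tuple (label, left, right) with None for an empty subtree; the names reconstruct and postorder are chosen here.

```python
def reconstruct(preorder, inorder):
    """Rebuild a binary tree from its preorder and inorder listings.

    Assumes distinct labels. Returns (label, left, right) or None.
    """
    if not preorder:
        return None
    root = preorder[0]              # a1 is the root
    i = inorder.index(root)         # position of a1 in the inorder listing
    # The i inorder entries before the root form the left subtree.
    left = reconstruct(preorder[1:i + 1], inorder[:i])
    right = reconstruct(preorder[i + 1:], inorder[i + 1:])
    return (root, left, right)


def postorder(tree):
    """Postorder listing of a nested-tuple tree, used to check the result."""
    if tree is None:
        return []
    label, left, right = tree
    return postorder(left) + postorder(right) + [label]
```

For example, reconstruct(list("ABDECF"), list("DBEAFC")) yields a tree whose postorder listing is D, E, B, F, C, A. The list slicing makes this sketch O(n²) in the worst case; it favours clarity over speed.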

Representation of binary trees

[Figure: pointer representation of a binary tree with root A and children B and C; nil pointers are drawn as dots.]

type
  node = record
    label : labeltype;
    left, right : ↑node
  end;
  TREE = ↑node;

Notes: 1. cursors may also be used.

2. if operation PARENT ( ) is crucial, a parent field could be included.

3. but if traversal is the only concern, then the parent field is not really needed.

procedure PREORDER(T : TREE);
var temp, tempparent, tempchild : ↑node;

  procedure BACKUP;
  var stop : boolean;
  begin // find the successor of temp in preorder traversal //
    stop := false;
    temp := tempparent;
    while temp ≠ nil and not stop do begin
      if temp↑.tag = 0 then begin
        tempparent := temp↑.left;
        temp↑.left := tempchild;
        if temp↑.right ≠ nil then begin
          tempchild := temp↑.right;
          temp↑.right := tempparent;
          temp↑.tag := 1;
          tempparent := temp;
          temp := tempchild;
          stop := true;
          return
        end
      end
      else begin // temp↑.tag = 1 //
        tempparent := temp↑.right;
        temp↑.right := tempchild
      end;
      tempchild := temp;
      temp := tempparent
    end
  end; // end of BACKUP //

begin // print nodes of T in preorder //
  temp := T;
  tempparent := nil;
  while temp ≠ nil do begin
    print temp↑.label;
    if temp↑.left ≠ nil then begin
      tempchild := temp↑.left;
      temp↑.left := tempparent;
      temp↑.tag := 0;
      tempparent := temp;
      temp := tempchild
    end
    else if temp↑.right ≠ nil then begin
      tempchild := temp↑.right;
      temp↑.right := tempparent;
      temp↑.tag := 1;
      tempparent := temp;
      temp := tempchild
    end
    else // temp↑.left = temp↑.right = nil //
      BACKUP
  end
end; // end of PREORDER //

Threaded binary trees

102

lefttag = 0 → left = left child
lefttag = 1 → left = left thread (predecessor in inorder)

righttag = 0 → right = right child
righttag = 1 → right = right thread (successor in inorder)

The predecessor/successor in inorder can be found without using a stack or flipping pointers.

Representation of complete binary trees

[Figure: a complete binary tree whose nodes are numbered 1, 2, 3, … level by level from left to right and stored in an array T; cell i holds the label of node i.]

parent of node i = ⌊i/2⌋, 1 < i ≤ n   (⌊i/2⌋ is the largest integer ≤ i/2)

left child of node i = 2i, if 2i ≤ n

right child of node i = 2i + 1, if 2i + 1 ≤ n

type TREE = record
  n : 0..max;
  labels : array [1..max] of labeltype
end;

[Figure: example with the labels A B C D E F G H I J stored in cells 1 to 10.]

Declarations used by the pointer-flipping PREORDER procedure:

type
  node = record
    label : labeltype;
    left, right : ↑node;
    tag : 0..1
  end;
  TREE = ↑node;

var
  temp, tempparent, tempchild : ↑node;

tag = 0 → left points to parent
tag = 1 → right points to parent

An application of binary trees - Huffman codes

characters: A = {a1, a2, …, ak}

string or message: x1 x2 … xn, each xi ∈ A

p(ai): the probability that ai will appear in a message

Encoding: assign a binary code c(ai) to each ai; then

c(x1 x2 … xn) = c(x1) c(x2) … c(xn)

Decoding: given code b1 b2 … bm, find the unique message x1 x2 … xn such that c(x1 x2 … xn) = b1 b2 … bm

Average code length: $\sum_{i=1}^{k} p(a_i)\,|c(a_i)|$, where |c(ai)| is the length of c(ai)

character   probability   code 1   code 2
a           .30           000      01
b           .10           001      0010
c           .10           010      0011
d           .10           011      000
e           .40           100      1
average                   3        2.1

code 3 (codes of a to e, concatenated): 0001100001; average 1.7

Prefix property: c(ai) is not a prefix of c(aj) for any j ≠ i

e.g., Code 1 and Code 2 have the prefix property, Code 3 doesn't!

Claim: the prefix property makes decoding easy, e.g., consider decoding the code 000…:

Code 1: a …   (on-line decoding)

Code 2: d …

Code 3: ??

Huffman Code - an optimal (least average code length) prefix code

107

Algorithm Huffman(a1, a2, …, an);
// find Huffman code c(ai) for each ai //
begin
  Let ai and aj be two characters such that p(ai) and p(aj) are the lowest among a1, a2, …, an;
  Let a' be a new character with p(a') = p(ai) + p(aj);
  Huffman({a1, a2, …, an} - {ai, aj} + {a'});
  c(ai) = c(a') 0;
  c(aj) = c(a') 1
end;

(When only two characters remain, they receive the codes 0 and 1.)

Example: a, b, c with p(a) = 0.5, p(b) = 0.3, p(c) = 0.2

Huffman(a, b, c) calls Huffman(a, [bc]) => c(a) = 0, c([bc]) = 1

Hence c(a) = 0, c(b) = 10, c(c) = 11
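The recursive algorithm is usually run bottom-up: repeatedly merge the two least probable items. The Python sketch below does this with the standard heapq module. It is an illustration, not the implementation from the notes or from [AHU]; the function name huffman and the dictionary-based code table are assumptions made here. Which of the two merged items gets 0 and which gets 1 depends on tie-breaking, so individual codes may differ from the example while the code lengths agree.

```python
import heapq
from itertools import count


def huffman(prob):
    """prob: dict mapping character -> probability.

    Returns a dict mapping character -> binary code string.
    """
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(p, next(tiebreak), {ch: ""}) for ch, p in prob.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single character still needs a one-bit code
        return {ch: "0" for ch in prob}
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)  # lowest probability
        p2, _, codes2 = heapq.heappop(heap)  # second lowest
        # Merging prepends one bit to every code inside the two groups.
        merged = {ch: "0" + c for ch, c in codes1.items()}
        merged.update({ch: "1" + c for ch, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]
```

For the three-character example this gives a code of length 1 for a and codes of length 2 for b and c. For the five-character table above (probabilities .30, .10, .10, .10, .40) the average length comes out as 2.1, the same as code 2.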

Binary tree representation of prefix code

[Figure: code 1 and code 2 drawn as binary trees; each left edge is labelled 0, each right edge 1, and the leaves are the characters a, b, c, d, e.]

type node = record
  left, right : ↑node;
  probability : real;
  character : (a1, a2, …, ak)   // used only in leaves //
end;

A more efficient implementation is given in [AHU] pp. 94-101

Example:

character:     a    b    c    d    e    f    g    h
probability:  .10  .20  .05  .05  .10  .30  .10  .10

[Figure: steps (1) to (7) of the construction. Each step merges the two trees of lowest probability; the intermediate collection of trees is called a forest. In the final tree each left edge is labelled 0 and each right edge 1.]

Using a modified preorder listing, we can print the Huffman codes for the characters (using a stack).

Algorithm Huffman-Tree;
// construct a Huffman tree for characters a1, a2, …, an //
var
  forest : array [1..max] of TREE;
  p : real;
begin
  for i := 1 to n do begin
    new(forest[i]);
    forest[i]↑.left := nil;
    forest[i]↑.right := nil;
    forest[i]↑.probability := p(ai);
    forest[i]↑.character := ai
  end;
  while forest contains more than one tree do begin
    i := index of the tree with the smallest probability;
    j := index of the tree with the second smallest probability;
    p := forest[i]↑.probability + forest[j]↑.probability;
    forest[i] := CREATE2((p, -), forest[i], forest[j]);
    delete tree forest[j]
  end
end

A set is a collection of elements/members

Notes:

1. An element can be a set!

2. A set can be infinite or empty.

3. Usually (in this course), members of a set are of the same type.

4. Members of a set are different (otherwise, a multiset).

5. Members could be linearly ordered.

A relation < is a linear order on some set S if

(i) for any a, b in S, exactly one of a < b, a = b, a > b is true (Trichotomy);

(ii) for a, b, c in S, a < b and b < c ==> a < c (Transitivity).

Some notation:

S = {a1, a2, …, an} or S = {x | x satisfies condition}

e.g. {1, 2, …, 10} = {x | x is an integer and 1 ≤ x ≤ 10}

∅ = {}

Membership: x ∈ S, x ∉ S

Inclusion (subset): S1 ⊆ S2, S1 ⊈ S2

Proper subset: S1 ⊂ S2 iff S1 ≠ S2 and S1 ⊆ S2

Superset: S1 ⊇ S2

Proper superset: S1 ⊃ S2

Union: S1 ∪ S2, e.g. {1, 2} ∪ {2, 3} = {1, 2, 3}

Intersection: S1 ∩ S2, e.g. {1, 2} ∩ {2, 3} = {2}

Difference: S1 - S2, e.g. {1, 2} - {2, 3} = {1}

ADT SET

1. MAKENULL(S): S := ∅

2. INSERT(x, S): S := S ∪ {x}

3. DELETE(x, S): S := S - {x}

4. MEMBER(x, S): true iff x ∈ S

5. ASSIGN(A, B): copy B into A

6. EQUAL(A, B): true iff A = B

7. UNION(A, B, C): C := A ∪ B

8. INTERSECTION(A, B, C): C := A ∩ B

9. DIFFERENCE(A, B, C): C := A - B

10. MERGE(A, B, C): if A ∩ B = ∅, C := A ∪ B; otherwise C is undefined

11. MIN(S): returns the minimum element in S, assuming S is linearly ordered

12. FIND(x): the sets A1, A2, …, An are disjoint (and global); find the unique Ai s.t. x ∈ Ai

13. SIZE(S), SUBSET(A, B), COMPLEMENT(A)

SET with Union, Intersection, Difference

Example - data-flow analysis

[Flowchart of a program with blocks B1 to B8; the arcs between the blocks are not reproduced here.]

block   statements                           GEN          KILL
B1      1. t := ?   2. p := ?   3. q := ?    {1, 2, 3}    {4, 5, 6, 7, 8, 9}
B2      4. read(p)   5. read(q)              {4, 5}       {2, 3, 7, 8}
B3      q ≤ p ?                              ∅            ∅
B4      6. t := p                            {6}          {1, 9}
B5      7. p := q   8. q := t                {7, 8}       {2, 3, 4, 5}
B6      p mod q = 0 ?                        ∅            ∅
B7      write(q)                             ∅            ∅
B8      9. t := p mod q                      {9}          {1, 6}

GEN[i] = data definitions in Bi

KILL[i] = {d | d ∉ Bi and some d' ∈ Bi defines the same variable as d}, i.e., data definitions not in Bi but defining the same variables as GEN[i]

DEFIN[i] = {d | ∃ a path B1 … Bj Bi such that d is the last definition of its variable on the part B1 … Bj of the path}   (reaching definitions of Bi)

In the flowchart, two of the blocks are annotated with DEFIN = {1, 4, 5} and DEFIN = {4, 5, 6, 7, 8, 9}.

DEFOUT[i] = {d | same as in DEFIN[i], except that the path considered is B1 … Bj Bi, including Bi}   (leaving definitions)

DEFOUT[i] = (DEFIN[i] - KILL[i]) ∪ GEN[i]

DEFIN[i] = ∪ DEFOUT[j], taken over all Bj that are predecessors of Bi (i.e., there is an arc from Bj to Bi)

Algorithm dataflow(GEN, KILL; var DEFIN);
var
  temp : SET;
  i : integer;
  changed : boolean;
begin
  // n = no. of blocks //
  for i := 1 to n do begin
    MAKENULL(DEFIN[i]);
    MAKENULL(DEFOUT[i])
  end;
  repeat
    changed := false;
    for i := 1 to n do begin
      DIFFERENCE(DEFIN[i], KILL[i], temp);
      UNION(temp, GEN[i], temp);
      if not EQUAL(temp, DEFOUT[i]) then begin
        ASSIGN(DEFOUT[i], temp);
        changed := true
      end
    end;
    for i := 1 to n do begin
      MAKENULL(DEFIN[i]);
      for each predecessor Bj of Bi do
        UNION(DEFIN[i], DEFOUT[j], DEFIN[i])
    end
  until not changed
end;
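The iteration can be sketched in Python with built-in sets standing in for the SET operations. The GEN and KILL sets below are those of the four-block example in these notes; the predecessor lists (B3 → B1, B1 → B2, B2 → B3, B3 → B4) are an assumption, inferred from the iteration table of that example because the flowchart arcs did not survive. The name reaching_definitions is chosen here.

```python
def reaching_definitions(gen, kill, preds):
    """Iterate DEFOUT = (DEFIN - KILL) | GEN and DEFIN = union of DEFOUT
    over predecessors until nothing changes."""
    defin = {i: set() for i in gen}
    defout = {i: set() for i in gen}
    changed = True
    while changed:
        changed = False
        for i in gen:
            temp = (defin[i] - kill[i]) | gen[i]
            if temp != defout[i]:
                defout[i] = temp
                changed = True
        for i in gen:
            # Union over all predecessors; empty if there are none.
            defin[i] = set().union(*(defout[j] for j in preds[i]))
    return defin, defout


GEN = {1: {1, 2}, 2: {3, 4}, 3: set(), 4: {5}}
KILL = {1: {3, 5}, 2: {1}, 3: set(), 4: {2}}
PREDS = {1: [3], 2: [1], 3: [2], 4: [3]}  # assumed arcs, see lead-in
DEFIN, DEFOUT = reaching_definitions(GEN, KILL, PREDS)
```

Under these assumptions the fixed point agrees with the last column of the example's table, e.g. DEFIN[1] = {2, 3, 4} and DEFOUT[4] = {3, 4, 5}.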

Example

block   statements                        GEN       KILL
B1      1. read(x)   2. read(y)           {1, 2}    {3, 5}
B2      3. x := x + y   4. z := 10.0      {3, 4}    {1}
B3      test comparing x and z            ∅         ∅
B4      5. y := x * z                     {5}       {2}

DEFOUT[i] = (DEFIN[i] - KILL[i]) ∪ GEN[i]

DEFIN[i] = ∪ DEFOUT[j], where Bj is a predecessor of Bi

iteration    0    1        2          3          4
DEFIN[1]     ∅    ∅        {3,4}      {2,3,4}    {2,3,4}
DEFOUT[1]    ∅    {1,2}    {1,2}      {1,2,4}    {1,2,4}
DEFIN[2]     ∅    {1,2}    {1,2}      {1,2,4}    {1,2,4}
DEFOUT[2]    ∅    {3,4}    {2,3,4}    {2,3,4}    {2,3,4}
DEFIN[3]     ∅    {3,4}    {2,3,4}    {2,3,4}    {2,3,4}
DEFOUT[3]    ∅    ∅        {3,4}      {2,3,4}    {2,3,4}
DEFIN[4]     ∅    ∅        {3,4}      {2,3,4}    {2,3,4}
DEFOUT[4]    ∅    {5}      {5}        {3,4,5}    {3,4,5}

BIT-VECTOR IMPLEMENTATION

Universal set: {1, 2, …, N} (or, e.g., {A, B, …, Z}); S ⊆ {1, 2, …, N}

[Figure: array with cells 1, 2, …, i, …, N; cell i is true iff i ∈ S.]

const N = ?;
type SET = packed array [1..N] of boolean;

procedure UNION(A, B : SET; var C : SET);
var i : integer;
begin
  for i := 1 to N do
    C[i] := A[i] or B[i]
end;

MEMBER, INSERT, DELETE: O(1)
MAKENULL, ASSIGN, EQUAL, UNION, DIFFERENCE, INTERSECTION, EMPTY: O(N)

Linked-list implementation

Most general: the size is unlimited. Efficient if the sets are ordered by '<'; in that case, a set is represented as a sorted list a1, a2, …, an where a1 < a2 < … < an.

Unsorted

MAKENULL, EMPTY: O(1)

INSERT, MEMBER, DELETE, ASSIGN: O(n)

EQUAL, UNION, INTERSECTION, DIFFERENCE: O(nm)

Sorted

MAKENULL, EMPTY: O(1)

INSERT, MEMBER, DELETE, ASSIGN, EQUAL, UNION, INTERSECTION, DIFFERENCE: O(n) or O(n+m)

Can be improved to O(log n) if balanced search trees are used

ADT Dictionary

SET with INSERT, DELETE, MEMBER, and MAKENULL

Example

Dean's list data base

program deanlist(input, output);
type
  name = packed array [1..20] of char;
  grade = -1..12;
var
  student : name;
  average : grade;
  database : DICTIONARY (of names);
begin
  MAKENULL(database);
  readln(student, average);
  while student ≠ '' do begin
    case average of
      10..12 : INSERT(student, database);
      0..9 : DELETE(student, database);
      -1 : if MEMBER(student, database)
             then writeln('yes')
             else writeln('no')
    end; // case //
    readln(student, average)
  end
end.

A modified dictionary

type elementtype = record
  key : keytype;
  data : datatype
end;

Then the operations are MAKENULL, INSERT, DELETE and

QUERY(x : keytype) : datatype;

with calls of the form

INSERT((key, data), dictionary)

DELETE(key, dictionary)

QUERY(key, dictionary)

Implementation of dictionary

1. Bit-vector, if the universal set is {1, 2, …, N}: INSERT, DELETE, MEMBER: O(1)

2. Sorted or unsorted list: O(n)

3. Array (of some constant size):
   sorted (if the set is ordered): INSERT, DELETE: O(n); MEMBER: O(log n)
   unsorted: INSERT, DELETE, MEMBER: O(n)

Unsorted array (of some constant size):

type DICTIONARY = record
  last : 0..max+1;
  data : array [1..max] of elementtype
end;

procedure MAKENULL(var A : DICTIONARY);   // O(1) //
begin
  A.last := 0
end;

function MEMBER(x : elementtype; var A : DICTIONARY) : boolean;   // O(n) //
var i : integer;
begin
  for i := 1 to A.last do
    if A.data[i] = x then return(true);
  return(false)
end;

procedure INSERT(x : elementtype; var A : DICTIONARY);   // O(n) //
begin
  if not MEMBER(x, A) then
    if A.last < max then begin
      A.last := A.last + 1;
      A.data[A.last] := x
    end
    else error('full')
end;

procedure DELETE(x : elementtype; var A : DICTIONARY);   // O(n) //
var i : integer;
begin
  find the i s.t. A.data[i] = x or i > A.last;
  if A.data[i] = x then A.data[i] := …

Hashing: O(1) time per operation on average for INSERT, DELETE, MEMBER

Universal set {a1, a2, …, an}. To represent a set S: put ai in cell i if ai ∈ S.

[Figure: array of cells, one for each element a1, a2, …, an of the universal set.]

O(1) time if rank(a) = i can be computed in O(1) time.

Generally, partition the elements into groups and let all elements in a group share a cell.

O(1) time if h(x) = i, where x is in group i, can be computed in O(1) time.

Perfect if elements from the same group do not occur simultaneously!

Good if it is unlikely that two elements from the same group occur simultaneously!

Okay if not TOO MANY elements from the same group occur simultaneously!

Some hash functions: i mod p; sum of digits, e.g., h(135) = 9

Hashing

Goal: O(1) per operation on average for INSERT, DELETE, MEMBER

Pr(time > C) << 1.0

Open hashing (buckets)

Partition the elements into B classes. Hashing function: h(x) = i if x ∈ class i.

[Figure: bucket table with headers 0, 1, …, B-1; each header points to the list of elements in that bucket.]

Avg time = 1 + N/B per operation

If N ≤ C·B, avg time ≤ 1 + C
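Open hashing can be sketched in a few lines of Python. This is an illustration under assumptions: the elements are non-negative integers, the hash function is h(x) = x mod B (the "i mod p" function mentioned earlier), and each bucket is a plain Python list; the class name OpenHashSet is chosen here.

```python
class OpenHashSet:
    """Open hashing: a table of B buckets, each holding a list of elements."""

    def __init__(self, buckets=7):
        self.table = [[] for _ in range(buckets)]

    def _h(self, x):
        # Assumed hash function: x mod B.
        return x % len(self.table)

    def member(self, x):
        return x in self.table[self._h(x)]

    def insert(self, x):
        bucket = self.table[self._h(x)]
        if x not in bucket:  # a set holds no duplicates
            bucket.append(x)

    def delete(self, x):
        bucket = self.table[self._h(x)]
        if x in bucket:
            bucket.remove(x)
```

Each operation only inspects one bucket, so its cost is proportional to the length of that bucket, which is N/B on average when the elements spread evenly.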

Closed Hashing

[Figure: bucket table with cells 0, 1, …, B-1; the elements are stored in the table itself.]

Insert x: x is placed in bucket h1(x).

If bucket h1(x) is already taken (collision), then try bucket h2(x) (rehashing).

If bucket h2(x) is taken, then try bucket h3(x), and so on.

Member x: try buckets h1(x), h2(x), … until x is found or an empty bucket is met.

Example 2

Sorting using priority queue

Key ≡ priority

Pool: priority queue

procedure PQSort(var A : array [1..n] of …);
var
  pool : PRIORITY QUEUE of …;
  i : integer;
begin
  MAKENULL(pool);
  for i := 1 to n do
    INSERT(A[i], pool);
  for i := 1 to n do
    A[i] := DELETEMIN(pool)
end;

Obs: if INSERT and DELETEMIN are O(log n), then PQSort is O(n log n)

Previous implementations of sets

Bit-vector: O(N) DELETEMIN

Array: O(n) INSERT & DELETEMIN

Linked list: unsorted, O(n) DELETEMIN; sorted, O(n) INSERT

Hashing: O(n) DELETEMIN

Solution: heap (called a partially ordered tree in [AHU])

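Partially ordered tree: parent ≤ child

[Figure: example heap with root 3, stored level by level in an array.]

DELETEMIN:

[Figure: the root is removed, the last leaf takes its place and is pushed down, swapping with its smaller child, until parent ≤ child holds again.]

Generally, time = O(depth of tree) = O(log n)   ($2^{\text{depth}} \le n < 2^{\text{depth}+1}$)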

INSERT(4, heap):

[Figure: 4 is added as the last leaf and pushed up, swapping with its parent, until parent ≤ child holds again.]

time = O(depth) = O(log n)
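INSERT, DELETEMIN and the priority-queue sort can be sketched in Python on a list. The sketch is illustrative, not the code of [AHU]: it uses 0-based indices, so the parent of cell i is (i - 1) // 2 and its children are 2i + 1 and 2i + 2, and the function names are chosen here.

```python
def insert(heap, x):
    """Add x as the last leaf and push it up while it is below a larger parent."""
    heap.append(x)
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] > heap[i]:
        parent = (i - 1) // 2
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def deletemin(heap):
    """Remove and return the root; the last leaf replaces it and is pushed down."""
    smallest = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while 2 * i + 1 < n:
            child = 2 * i + 1
            if child + 1 < n and heap[child + 1] < heap[child]:
                child += 1  # pick the smaller child
            if heap[i] <= heap[child]:
                break
            heap[i], heap[child] = heap[child], heap[i]
            i = child
    return smallest


def pq_sort(a):
    """PQSort: insert everything, then repeatedly delete the minimum."""
    pool = []
    for x in a:
        insert(pool, x)
    return [deletemin(pool) for _ in range(len(a))]
```

Inserting 4 into the heap [3, 5, 9, 6, 8, 9, 10, 10, 18, 9] moves it up two levels and gives [3, 4, 9, 6, 5, 9, 10, 10, 18, 9, 8].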

A linear order '<' is a relation on elements such that

(i) for any two elements a and b, exactly one of a < b, a > b, or a = b holds;

(ii) a < b and b < c ⇒ a < c.

A set is ordered if a linear order '<' on its members exists, e.g. sets of integers, reals, character strings (by lexicographical order).

Note: the order in which elements appear in a set representation is unimportant, e.g. {1, 3, 4} = {3, 1, 4} = {4, 3, 1}

A sorted list: a1, a2, a3, a4, …, an-1, an with a1 < a2 < … < an

Representing ordered sets - binary search trees

Elements are ordered by '<'

Interested in operations:

MAKENULL, INSERT, MEMBER, DELETE, MIN

Previous implementations:

sorted linked list: MEMBER: O(n)

sorted array: INSERT, DELETE: O(n)

Solution: binary search tree, in which left subtree < parent < right subtree

[Figure: example binary search tree.]

Pascal Implementation

type
  elementtype = record
    key : real;
    data : datatype
  end;

  nodetypes = (leaf, interior);

  twothreenode = record
    case kind : nodetypes of
      leaf : (element : elementtype);
      interior : (first, second, third : ↑twothreenode;
                  lowofsecond, lowofthird : real)
  end;

  SET = ↑twothreenode;

Need parent : ↑twothreenode?

2-3 tree: 3-way B-tree

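AVL tree (Adelson-Velskii, Landis)

Balanced binary search tree [HS pp. 436-452]

[Figure: a root whose left and right subtrees are AVL trees of depths dL and dR.]

A binary search tree is an AVL tree if its left and right subtrees are AVL trees and |dL - dR| ≤ 1.

Empty trees and single nodes are also AVL trees.

An AVL tree is also called a height-balanced (or depth-balanced) binary tree.

Balance factor: BF = dL - dR, so BF is 1, 0 or -1

[Figure: example AVL tree with the balance factor written at each node.]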

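$n_d$: minimum number of nodes in an AVL tree with depth d.

Fact: $n_0 = 1$, $n_1 = 2$, $n_d = n_{d-1} + n_{d-2} + 1$

Similar to the Fibonacci numbers $F_d = F_{d-1} + F_{d-2}$:

$n_d \ge F_d \approx c^d / \sqrt{5}$, where $c = (1 + \sqrt{5})/2 > 1$

$d \le \log_c n_d + \log_c \sqrt{5}$

depth: O(log n)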

[Figure: example AVL tree with keys 2, 7, 8, 10, 12, 15, 19.]

MEMBER, INSERT, DELETE: O(log n)

Sets with MERGE and FIND

MERGE(A, B, C): if A ∩ B = ∅ then C := A ∪ B

environment = {A1, A2, …, Am}

FIND(x): the unique Ai s.t. x ∈ Ai

Example

Equivalence problem:

An equivalence relation '≡' on a set S satisfies

1. a ≡ a (reflexivity)

2. a ≡ b ⇒ b ≡ a (symmetry)

3. a ≡ b, b ≡ c ⇒ a ≡ c (transitivity)

e.g. congruence modulo K: i ≡K j iff (i - j) mod K = 0

Equivalence classes:

S = S1 ∪ S2 ∪ S3 ∪ … s.t.

a, b ∈ Si ⇒ a ≡ b

a ∈ Si, b ∈ Sj, i ≠ j ⇒ a ≢ b

e.g. {0, K, 2K, …}, {1, K+1, 2K+1, …}, …, {K-1, 2K-1, …}

S = {a1, a2, a3, a4, a5, a6, a7}

Fortran: EQUIVALENCE statements ai1 ≡ ai2, ai3 ≡ ai4, …

equivalence    classes after processing it
(initially)    {a1} {a2} {a3} {a4} {a5} {a6} {a7}
a1 ≡ a2        {a1, a2} {a3} {a4} {a5} {a6} {a7}
a5 ≡ a6        {a1, a2} {a3} {a4} {a5, a6} {a7}
a3 ≡ a5        {a1, a2} {a3, a5, a6} {a4} {a7}
a4 ≡ a7        {a1, a2} {a3, a5, a6} {a4, a7}

To process ai ≡ aj:

A := FIND(ai); B := FIND(aj);
MERGE(A, B, A);
MAKENULL(B);

U = {a1, a2, …, an} = A1 ∪ A2 ∪ … ∪ Am

A partition

ADT MFSET: A1, A2, …, Am are the components

1. MERGE(A, B): A := A ∪ B or B := A ∪ B

2. FIND(x)

3. INITIAL(A, x): A := {x}

A simple implementation (element-based)

type
  set-id-type = integer;
  membertype = 1..n;
  MFSET = array [membertype] of set-id-type;

e.g. U = {1, 2, …, 12} = {2, 3, 6} ∪ {1, 4, 9} ∪ {5, 8, 12} ∪ {7, 10, 11}

member:  1  2  3  4  5  6  7  8  9  10  11  12
set id:  2  1  1  2  3  1  4  3  2  4   4   3

function FIND(x : 1..n; var C : MFSET) : set-id-type;   // O(1) //
begin
  FIND := C[x]
end;

procedure MERGE(A, B : integer; var C : MFSET);   // A := A ∪ B, O(n) //
var x : 1..n;
begin
  for x := 1 to n do
    if C[x] = B then
      C[x] := A
end;

By some minor improvement, n merges can be done in O(n log n) time (using member lists).

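A tree implementation (component-based)

A = {1, 2, 3, 4}, B = {5, 6}, C = {7}

[Figure: each set is stored as a tree whose root carries the set name; MERGE(A, B) makes the root of B's tree a child of the root of A's tree.]

MERGE: time = O(1)

FIND(x): O(depth)

(*) Weight rule: if we always merge the smaller tree into the larger tree, then depth ≤ log2 n.

The root must contain the weight of the tree.

Path compression

[Figure: FIND(6) on a tree with nodes 1 to 8; afterwards every node on the path from 6 to the root is a child of the root.]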

With path compression only:

n consecutive FINDs: O(n) time

n intermixed FINDs and MERGEs: O(n log n)

With both path compression and rule (*):

n intermixed FINDs and MERGEs: O(n α(n))

α(n) = the least m s.t. n ≤ A(m, m)   (pseudo-inverse of Ackermann's function)

In practice, α(n) ≤ 4, since A(4, 4) is a tower of 2's, $2^{2^{\cdot^{\cdot^{\cdot^{2}}}}}$, of height 65536.
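The tree implementation with both the weight rule and path compression can be sketched in Python as follows. This is an illustration, not the code of the notes: elements are the integers 1 to n, a root is marked by parent[x] == x, and the class name MFSet is chosen here.

```python
class MFSet:
    """MERGE-FIND sets over the elements 1..n, stored as a forest."""

    def __init__(self, n):
        self.parent = list(range(n + 1))  # parent[x] == x marks a root
        self.weight = [1] * (n + 1)       # number of nodes, valid at roots

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): hang every node on the path
        # directly below the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def merge(self, a, b):
        """Merge the sets containing a and b; returns the new root."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.weight[a] < self.weight[b]:
            a, b = b, a                   # weight rule: smaller under larger
        self.parent[b] = a
        self.weight[a] += self.weight[b]
        return a
```

Processing the equivalences a1 ≡ a2, a5 ≡ a6, a3 ≡ a5, a4 ≡ a7 from the earlier example with merge(1, 2), merge(5, 6), merge(3, 5), merge(4, 7) leaves three classes.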

Ordered Sets with MERGE, FIND, SPLIT

SPLIT(S, S1, S2, x):

S1 := {a | a ∈ S and a < x}

S2 := {a | a ∈ S and a ≥ x}

Longest Common Subsequence Problem

Sequence = string, e.g. abcdaaa

A subsequence of a sequence x is obtained by removing zero or more (not necessarily contiguous) characters from x

e.g. ab, aaaa, ada are subsequences of the above sequence

Longest common subsequence (LCS) of x and y: a longest sequence that is a subsequence of both x and y

e.g. x = 1 2 1 4 3 2 1

     y = 2 5 1 3 4 1 2 1

2 1 4 2 1 is an LCS; 2 1 3 2 1 is another one

Applications: UNIX diff, DNA analysis, etc.

Solutions: x = x1 x2 … xn, y = y1 y2 … ym, |x| = n, |y| = m

1. O(nm): dynamic programming

2. O(p log n), where p is the size of {(i, j) | 1 ≤ i ≤ n, 1 ≤ j ≤ m, and xi = yj}

   worst case p = O(mn)

   in practice p = O(m+n)
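Solution 1, the O(nm) dynamic program, is not spelled out in these notes. A common formulation, sketched here in Python as an assumption about what is meant, fills a table L where L[i][j] is the LCS length of the first i characters of x and the first j characters of y, then walks back through the table to recover one LCS. The function name lcs is chosen here.

```python
def lcs(x, y):
    """Return one longest common subsequence of the sequences x and y."""
    n, m = len(x), len(y)
    # L[i][j] = length of an LCS of x[:i] and y[:j]
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # Walk back from the bottom-right corner to recover one LCS.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```

For x = 1214321 and y = 25134121 the result has length 5, in agreement with the example; which of the several longest common subsequences is returned depends on how ties are broken in the walk back.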

Key idea

Input: A = a1 a2 … an, B = b1 b2 … bm

To find |LCS(A, B)|:

for j := 1 to m do
  find |LCS(a1 … ai, b1 … bj)| for all i

Def. Sk = {i | |LCS(a1 … ai, b1 … bj)| = k}

position:  1 2 3 4 5 6 7 8
A =        1 2 1 4 3 2 1
B =        2 5 1 3 4 1 2 1

j    S0     S1               S2             S3           S4 … S7
1    {1}    {2,3,4,5,6,7}    ∅              ∅            ∅
2    {1}    {2,3,4,5,6,7}    ∅              ∅            ∅
3    ∅      {1,2}            {3,4,5,6,7}    ∅            ∅
4    ∅      {1,2}            {3,4}          {5,6,7}      ∅
5    ∅      {1,2}            {3}            {4,5,6,7}    ∅
6    ∅
7    ∅
8    ∅

Def. PLACES(a) = {i | 1 ≤ i ≤ n, ai = a}

All PLACES(a) can be obtained in O(n) time, assuming the alphabet is finite. If not, O(n log n).

PLACES[a] is kept in decreasing order, e.g. if PLACES(a) = {i1, i2, …, ik} then i1 > i2 > … > ik. (Hashing)

Intuitive fact: in iteration j (i.e. when considering bj), new matches happen at PLACES(bj) in A. These matches may move a position from Sk to Sk+1.

Rule: with the sets as they stand after iteration j-1, move r from Sk to Sk+1 iff

1. ar = bj (i.e., r ∈ PLACES(bj)), and
2. r ∈ Sk and r-1 ∈ Sk

[Figure: a1 … ar-1 ar … aligned with b1 … bj-1 bj …]

procedure LCS;
begin
(1) initialize S0 = {1, 2, …, n} and Si = ∅ for i = 1, 2, …, n;
(2) for j := 1 to m do
      // compute Sk's for position j //
(3)   for r in PLACES(bj) do begin
(4)     k := FIND(r);
(5)     if k = FIND(r-1) then begin
(6)       SPLIT(Sk, Sk, S'k, r);
(7)       MERGE(S'k, Sk+1, Sk+1)
        end
      end
end;

Obs: if FIND, MERGE, SPLIT can be done in O(log n), then the total time is

$O\left(\sum_{j=1}^{m} |\text{PLACES}(b_j)| \cdot \log n\right) = O(p \log n)$

Data structure for sets S0, S1, …, Sn

2-3 trees!!!

[Figure: a 2-3 tree whose leaves are the positions 8, 9, …, 14.]

FIND(r): O(depth) = O(log n)

MERGE(S'k, Sk+1, Sk+1):

[Figure: the tree for S'k is attached to the tree for Sk+1 to form the new Sk+1.]

Similar to INSERT, with repair. Takes O(log n) time.

SPLIT():

[Figure: a 2-3 tree with leaves 6, 7, …, 12 is split at r = 9 into a tree with leaves 6, 7, 8 and a tree with leaves 9, 10, 11, 12.]

time = O(log n)

Graphs: A Math Model

[Figure: highway map connecting Waterloo, Toronto, Hamilton and Niagara Falls via Hwy 401, Hwy 403, Hwy 6 and the QEW.]

[Figure: Flight Map (Imaginary) connecting Toronto, Minneapolis, London, Chicago, New York, Miami and New Orleans.]

[Figure: the relations KNOW and friends drawn as graphs on Bob, Mary, Alex and Mark.]

Misc: state transition diagrams

Directed Graphs (Digraphs)

e.g. V1 = {1, 2, 3, 4, 5}
E1 = {(1,2), (1,3), (2,3), (3,4), (4,5), (4,1), (5,1), (5,4)}

G1 = (V1, E1)

[Figure: drawing of G1.]

A digraph G = (V, E)

V: set of vertices/nodes

E: set of arcs/directed edges

The arc from vertex v to vertex w: v → w or (v, w), v ≠ w

v is the tail and w is the head; w is adjacent to v

|V| = n, |E| ≤ n(n-1) = O(n²)

A path: v1, v2, …, vm s.t. the arcs (v1, v2), (v2, v3), …, (vm-1, vm) exist.

Length of the path: m-1

The path passes through v2, v3, …,vm-1

The path is simple if all vertices on the path are distinct, except possibly the first and last.

(Simple) cycle: a (simple) path of length at least one that begins and ends at the same vertex.

e.g. 1, 2, 3, 4, 1, 3 is a path

1, 2, 3, 4, 5 is a simple path

1, 2, 3, 4, 5, 1 is a simple cycle

1, 2, 3, 4, 1, 3, 4 is a simple cycle

Labelled digraph

[Figure: digraph with arcs labelled a and b; the labels along paths spell strings such as bb, abab, abbaaabaa, …]

When the labels are numbers, the digraph is also called a network or weighted digraph.

Representation of digraphs

1. List of edges, e.g., (1,2), (1,3), (2,3), …

2. Adjacency matrix

G = (V, E), V = {1, 2, …, n}

The adjacency matrix for G is an n x n boolean matrix A with

A[i, j] = true (1) if (i, j) ∈ E, false (0) otherwise

Space: O(n²) even if |E| << n²

3. Adjacency list

[Figure: adjacency lists for G1: 1 → 2, 3; 2 → 3; 3 → 4; 4 → 5, 1; 5 → 1, 4.]

Space: O(|E|); to decide if i → j, we need O(n) time

ADT DIGRAPH

Single source shortest paths problem:

Given G = (V, E) and a source vertex, we need to determine the cost of the shortest path from the source to every other vertex.

Labels (costs) must be ≥ 0; if there is no arc the cost is +∞, e.g. cost(2, 1) = +∞.

$\text{cost}(v_1, v_2, \ldots, v_n) = \sum_{i=1}^{n-1} \text{cost}(v_i, v_{i+1})$

[Figure: example digraph with vertices 1 to 6 and arc costs.]

e.g. source = 1:

to          2    3    4    5    6
min cost    70   60   40   10   30

Dijkstra's algorithm

Source vertex = 1, G = (V, E), V = {1, 2, 3, …}

Distance D(i) = cost of the shortest path from 1 to i

Let S ⊆ V be a set of vertices.

DS(i) = cost of the shortest path from 1 to i that only passes through vertices in S

S is called a restriction set.

[Figure: small digraph on vertices 1, 2, 3 with arc costs 10, 4 and 5, in which D(3) = 9 but DS(3) = 10 if S = {1}.]

Fact: DV(i) = D(i) and D∅(i) = cost(1, i)

Idea: let S ⊆ V be some set s.t. 1 ∈ S.

Suppose we know DS(i) for each i ∈ V. Then we can enlarge S as follows:

1. choose w ∈ V-S such that DS(w) is the minimum;

2. S := S ∪ {w};

3. DS(i) := min(DS(i), DS(w) + cost(w, i)) for each i.

Algorithm

begin
  S := {1};
  for i := 1 to n do
    D[i] := cost(1, i);
  for i := 1 to n-1 do begin
    find w in V-S s.t. D[w] is a minimum;
    S := S ∪ {w};
    for j := 2 to n do
      D[j] := min(D[j], D[w] + cost(w, j))
  end
end;

Here D[i] ≡ DS(i). When w is chosen, DS(x) ≤ DS(w) for any x ∈ S.

Obs. D[i] = D(i) if i ∈ S.

Thus there is no need to update D[i] if i ∈ S.

Example

[Figure: Dijkstra's algorithm traced on a digraph with vertices 1 to 6, starting from S = {1}; at each step the current values D[i] are written beside the vertices (e.g. D[3] = 60 at one stage), and DS(i) = D(i) for i ∈ S.]

procedure Dijkstra;
// C[i, j] = cost(i, j) //
begin
  S := {1};
  for i := 2 to n do
    D[i] := C[1, i];
  for i := 2 to n-1 do begin
    1. find a w in V-S such that D[w] is a minimum;
       S := S ∪ {w};
    2. for each vertex v in V-S do
         D[v] := min(D[v], D[w] + C[w, v])
  end
end;
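A direct O(n²) Python sketch of the procedure, using a cost matrix, is given below. It is illustrative only: vertices are numbered from 0 instead of 1, the five-vertex graph is a hypothetical example rather than the six-vertex one in the figure, and the pred array (one possible answer to how the paths can be recovered) is an addition made here.

```python
INF = float("inf")


def dijkstra(cost, source=0):
    """cost[i][j] is the arc cost from i to j (INF if there is no arc).

    Returns (dist, pred): dist[v] is the cost of a shortest path from the
    source to v, and pred[v] is the vertex before v on such a path.
    """
    n = len(cost)
    dist = list(cost[source])
    dist[source] = 0
    pred = [source] * n
    in_s = [False] * n
    in_s[source] = True
    for _ in range(n - 1):
        # Step 1: pick w outside S with the smallest D[w].
        w = None
        for v in range(n):
            if not in_s[v] and (w is None or dist[v] < dist[w]):
                w = v
        in_s[w] = True
        # Step 2: update D[v] for the vertices still outside S.
        for v in range(n):
            if not in_s[v] and dist[w] + cost[w][v] < dist[v]:
                dist[v] = dist[w] + cost[w][v]
                pred[v] = w
    return dist, pred


# Hypothetical digraph with vertices 0..4.
COST = [
    [INF, 10, INF, 30, 100],
    [INF, INF, 50, INF, INF],
    [INF, INF, INF, INF, 10],
    [INF, INF, 20, INF, 60],
    [INF, INF, INF, INF, INF],
]
DIST, PRED = dijkstra(COST)
```

For this graph DIST is [0, 10, 50, 30, 60]; following PRED back from vertex 4 gives the path 0, 3, 2, 4.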

How to recover the shortest paths?

Time = O(n²)

With adjacency lists of costs (the list for w holds pairs (v, c), one for each arc w → v of cost c) and a priority queue for V-S: time O(|E| log n)

All-Pairs Shortest Paths Problem

Given: a digraph with nonnegative arc costs

Goal: for each pair v, w of vertices find the cost of the shortest path from v to w.

Application: construction of shortest flying time table

Solution 1: repeat Dijkstra's algorithm with source = 1, 2, …, n

time: O(n³) or O(n |E| log n)

Solution 2: Floyd's algorithm

Let D(i, j) and DS(i, j) be as before:

D(i, j): distance from i to j

DS(i, j): distance from i to j under restriction set S

Floyd's idea:

Let Sk = {1, 2, …, k}, 0 ≤ k < n.

Suppose DSk(i, j) is known for all 1 ≤ i, j ≤ n.

Then, with Sk+1 = {1, 2, …, k+1},

$D_{S_{k+1}}(i, j) = \min\{\, D_{S_k}(i, j),\; D_{S_k}(i, k+1) + D_{S_k}(k+1, j) \,\}$ for all 1 ≤ i, j ≤ n

Thus, we compute DS0(i, j), DS1(i, j), …, DSn(i, j), where DS0(i, j) = cost(i, j) and DSn(i, j) = D(i, j).

In the following procedure, A is an n x n matrix with A[i, j] = DSk(i, j) after the k-th iteration.

procedure Floyd(var A : array [1..n, 1..n] of real; C : …);
var i, j, k : integer;
begin
  for i := 1 to n do
    for j := 1 to n do
      A[i, j] := C[i, j];   // A = DS0 //
  for i := 1 to n do
    A[i, i] := 0;
  for k := 1 to n do
    for i := 1 to n do
      for j := 1 to n do
        if A[i, k] + A[k, j] < A[i, j] then
          A[i, j] := A[i, k] + A[k, j]
          // A[i, j] := min(A[i, j], A[i, k] + A[k, j]) //
end;

time = O(n³)

Recovering the paths

Use an n x n matrix P

Initially, P[i, j] := 0 for 1 ≤ i, j ≤ n

In procedure Floyd, insert the assignment to P:

if A[i, k] + A[k, j] < A[i, j] then begin
  A[i, j] := A[i, k] + A[k, j];
  P[i, j] := k
end

Meaning: the shortest path from i to j passes through vertex k

procedure path(i, j : integer);
// print a shortest path from i to j //
var k : integer;
begin
  k := P[i, j];
  if k ≠ 0 then begin // the path is not direct //
    path(i, k);
    writeln(k);
    path(k, j)
  end
end;
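Floyd's algorithm together with the path matrix P can be sketched in Python. The sketch numbers vertices from 0, so it uses None rather than 0 to mean "the path is direct"; the three-vertex cost matrix is a hypothetical example, and the function names are chosen here.

```python
INF = float("inf")


def floyd(cost):
    """Return (A, P): A[i][j] is the shortest distance from i to j and
    P[i][j] is an intermediate vertex on such a path (None if direct)."""
    n = len(cost)
    a = [row[:] for row in cost]
    p = [[None] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 0
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if a[i][k] + a[k][j] < a[i][j]:
                    a[i][j] = a[i][k] + a[k][j]
                    p[i][j] = k
    return a, p


def path(i, j, p):
    """Intermediate vertices of a shortest path from i to j, in order."""
    k = p[i][j]
    if k is None:
        return []
    return path(i, k, p) + [k] + path(k, j, p)


# Hypothetical digraph: 0 -> 1 (8), 0 -> 2 (5), 1 -> 0 (3), 2 -> 1 (2).
COST = [
    [INF, 8, 5],
    [3, INF, INF],
    [INF, 2, INF],
]
A, P = floyd(COST)
```

Here A is [[0, 7, 5], [3, 0, 8], [5, 2, 0]], and path(0, 1, P) is [2]: the shortest route from 0 to 1 goes through vertex 2.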

Transitive closure of adjacency matrix

Given: digraph G = (V, E) represented by adjacency matrix C

Goal: for each pair i, j, decide whether there exists a path from i to j

A[i, j] = 1 (true) if there is a path from i to j, 0 (false) otherwise, for 1 ≤ i, j ≤ n

A is called the transitive closure of C

Solution 1: use Floyd's algorithm.

Initialize A[i, j] = +∞ if C[i, j] = 0, and A[i, j] = 1 otherwise

At the end, set A[i, j] = 1 if A[i, j] ≠ +∞, and A[i, j] = 0 if A[i, j] = +∞

Solution 2: simplified Floyd's algorithm - Warshall's algorithm

In iteration k: A[i, j] := A[i, j] or (A[i, k] and A[k, j])

procedure Warshall(var A : array [1..n, 1..n] of boolean; C : …);
var i, j, k : integer;
begin
  for i := 1 to n do
    for j := 1 to n do
      A[i, j] := C[i, j];
  for k := 1 to n do
    for i := 1 to n do
      for j := 1 to n do
        if A[i, j] = false then
          A[i, j] := A[i, k] and A[k, j]
end;

time = O(n³)
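The same triple loop in Python, as a small sketch on a boolean adjacency matrix. Vertices are numbered from 0, and the four-vertex digraph (a chain 0 → 1 → 2 plus an isolated vertex 3) is a hypothetical example.

```python
def warshall(c):
    """Transitive closure of the boolean adjacency matrix c."""
    n = len(c)
    a = [row[:] for row in c]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if not a[i][j]:
                    a[i][j] = a[i][k] and a[k][j]
    return a


C = [
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, False],
    [False, False, False, False],
]
CLOSURE = warshall(C)
```

In the result CLOSURE[0][2] is True (path 0, 1, 2), while CLOSURE[2][0] and every entry in row 3 stay False.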

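$A = C + C \cdot C + C \cdot C \cdot C + \cdots + C^{n-1}$

'·': boolean multiplication, i.e. and

It is known that C·C can be done in $O(n^{2.376})$.

To obtain A, compute $C^2, C^4, C^8, \ldots, C^{n-1}$, which takes about log2 n multiplications.

time: $O(\log n \cdot n^{2.376}) < O(n^3)$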

Undirected graphs

G = (V, E)

e.g. V = {1, 2, 3, 4, 5}, E = {(1,2), (2,3), (3,4), (4,5), (5,1), (1,3), (2,5)}

(u, v) and (v, u) denote the same edge

(u, v) is incident upon u and v.

v1, v2, …, vn is a path if the edges (v1, v2), (v2, v3), …, (vn-1, vn) exist.

The path v1, v2, …, vn connects v1 and vn

The definitions of simple path and cycle are the same as for digraphs, except that a cycle must have length ≥ 3.

G1 = (V1, E1) is a subgraph of G2 = (V2, E2) if V1 ⊆ V2 and E1 ⊆ E2

If E1 contains all edges (u, v) in E2 such that u, v ∈ V1, G1 is called an induced subgraph of G2

[Figure: the example graph G, a subgraph of G on vertices 1, 2, 3, and the induced subgraph of G on vertices 1, 2, 3.]

Graph G is connected if every pair of G's vertices is connected by some path

Connected component of G: a maximal connected induced subgraph of G

G is cyclic if G contains at least one cycle

G is acyclic if G doesn't contain any cycles

Free tree: a connected acyclic graph

Fact:
1. Every n-node free tree has n-1 edges
2. If we add any edge to a free tree, we get a cycle

Claim: if n > 1, there must be a vertex with degree (i.e., number of edges incident upon the vertex) = 1

Proof of claim

Let G be a free tree with > 1 node. Suppose that G's nodes all have degree > 1. Start at any vertex v1 and keep leaving each vertex by an edge other than the one used to enter it. Since G is finite, some vertex must repeat: v1, v2, v3, …, vi, vi+1, vi+2, …, vj = vi.

∴ a cycle exists. A contradiction!!

Proof of (1): true if n = 1

Suppose (1) is true for n = k

Let G = (V, E) be a (k+1)-node free tree

Let u be a vertex of degree 1 and (u, w) be the only incident edge

G' = (V - {u}, E - {(u, w)}) is a free tree with k nodes

By the induction hypothesis, G' has k-1 edges

∴ G has k edges

Proof of (2): if no cycle is created, then the graph is still a free tree, but its number of edges = n.

Contradiction!!!

Representation

Adjacency matrix: symmetric, i.e. entry i,j = entry j,i

Adjacency list: redundancy, i.e. if edge (u, v) exists, then u is on the list for v and v is on the list for u.

Minimum-cost spanning tree

G = (V, E) is connected. Each edge (u, v) ∈ E has a cost C(u, v) (= C(v, u)).

A spanning tree of G is a subgraph of G which is a free tree connecting all vertices in V.

The cost of a spanning tree is the sum of the costs of the edges in the tree.

[Figure: example graph with edge costs and one of its spanning trees.]

The MST Property:

Let G = (V, E) be a connected graph

U ⊂ V is a proper subset of V

If (u, v) is an edge of the lowest cost s.t. u ∈ U and v ∈ V-U, then there is a minimum-cost spanning tree that includes (u, v) as an edge.

[Figure: the vertices split into U and V-U; C(u, v) ≤ C(u', v') for every other edge (u', v') between U and V-U.]

procedure Prim(G : graph; var T : set of edges);
// constructs a minimum-cost spanning tree T //
var
  U : set of vertices;
  u, v : vertex;
begin
  T := ∅;
  U := {1};
  while U ≠ V do begin
    find a lowest cost edge (u, v) s.t. u ∈ U and v ∈ V-U;
    T := T ∪ {(u, v)};
    U := U ∪ {v}
  end
end;

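An example:

[Figure: graph on vertices 1 to 5 with edge costs (1,2) = 8, (1,3) = 9, (1,5) = 9, (2,3) = 8, (2,5) = 7, (3,4) = 6, (3,5) = 3, (4,5) = 5; the tree built so far is shown for each U.]

U = {1}

U = {1, 2}: edge of cost 8 added

U = {1, 2, 5}: edge of cost 7 added

U = {1, 2, 3, 5}: edge of cost 3 added

Kruskal's algorithm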

procedure Kruskal(G : graph; var T : set of edges);
var
  u, v : vertex;
  E' : set of edges;
begin
  E' := E;
  T := ∅;
  while E' ≠ ∅ do begin
    find a lowest cost edge (u, v) in E';
    E' := E' - {(u, v)};
    if u and v are not in the same connected component w.r.t. T then
      T := T ∪ {(u, v)}
  end
end;

The component test uses MERGE and FIND:

K1 := FIND(u);
K2 := FIND(v);
if K1 ≠ K2 then MERGE(K1, K2)   // about α(n) time per operation //

E': PRIORITY QUEUE; connected components w.r.t. T: MFSET

Time: O(e log e), e = |E|
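Kruskal's algorithm can be sketched in Python with a sorted edge list in place of the priority queue and a small MERGE-FIND forest for the component test. The edge costs below are those read from the example figure in these notes (an inference from the figure's edge labels and the order in which the edges are processed, so treat them as an assumption); the function name kruskal is chosen here.

```python
def kruskal(n, edges):
    """n: number of vertices (numbered 1..n).
    edges: list of (cost, u, v). Returns the chosen tree edges."""
    parent = list(range(n + 1))  # MERGE-FIND forest, parent[x] == x at roots

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path while climbing
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):       # lowest cost edge first
        k1, k2 = find(u), find(v)
        if k1 != k2:                       # u and v in different components
            parent[k1] = k2                # MERGE
            tree.append((u, v, cost))
    return tree


# Edge costs assumed from the example figure.
EDGES = [(8, 1, 2), (9, 1, 3), (9, 1, 5), (8, 2, 3),
         (7, 2, 5), (6, 3, 4), (3, 3, 5), (5, 4, 5)]
TREE = kruskal(5, EDGES)
```

With these costs the edges are accepted in the order (3,5), (4,5), (2,5), (1,2), for a total cost of 23, which matches the order of additions listed in the example.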

Example (the graph with edge costs 8, 9, 9, 7, 5, 3, 8, 6 on vertices 1 to 5):

1. add (3, 5) to T

2. add (4, 5); discard (3, 4)

3. add (2, 5); discard (2, 3)

4. add (1, 2); discard (1, 3), (1, 5)

[Figure: the forest T after each step.]

Graph Traversal and Search

Digraphs - depth-first search

Go as far as you can following the arcs!

type
  digraph = array [1..n] of adjacency list;
  vertex = 1..n;

var
  v : vertex;
  mark : array [vertex] of (visited, unvisited);

// main program, time O(e) //
for v := 1 to n do
  mark[v] := unvisited;
for v := 1 to n do
  if mark[v] = unvisited then dfs(v);

procedure dfs(v : vertex);
var w : vertex;
begin
  mark[v] := visited;
  print v;   // or anything //
  for each vertex w on L[v] do
    if mark[w] = unvisited then dfs(w)
end;
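The main loop and dfs can be sketched in Python with a dictionary of adjacency lists. This is an illustration with names chosen here; the example graph is the digraph G1 defined earlier in these notes.

```python
def dfs_order(adj):
    """adj maps each vertex to the list of its successors.

    Returns the vertices in the order in which dfs visits them.
    """
    mark = set()   # the visited vertices
    order = []

    def dfs(v):
        mark.add(v)
        order.append(v)          # "print v"
        for w in adj[v]:
            if w not in mark:
                dfs(w)

    for v in adj:                # restart from every unvisited vertex
        if v not in mark:
            dfs(v)
    return order


# G1 from the notes: arcs (1,2),(1,3),(2,3),(3,4),(4,5),(4,1),(5,1),(5,4).
G1 = {1: [2, 3], 2: [3], 3: [4], 4: [5, 1], 5: [1, 4]}
```

On G1 the visiting order is 1, 2, 3, 4, 5. The recursion depth can reach the number of vertices, so very long paths would need an explicit stack in Python.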

Example

[Figure: digraph with vertices 1 to 10.]

DFS order: 1, 2, 4, 7, 10, 5, 3, 6, 8, 9

Depth-first spanning forest:

[Figure: the depth-first spanning forest of the example, with the dfnumber of each vertex.]

tree arc: an arc of the spanning forest

forward arc: ancestor → descendant, e.g. (3, 8)

back arc: descendant → ancestor, e.g. (7, 1)

cross arc: all the others, e.g. (7, 4), (9, 1)

Fact: if (v, w) is a

(1) tree/forward arc, then dfnumber(v) < dfnumber(w);

(2) back/cross arc, then dfnumber(v) > dfnumber(w)

An application - test for acyclicity

Fact: a digraph is cyclic iff a back arc is encountered in any DFS.

[Figure: a cycle v1, v2, …, v5 in which v is the vertex whose dfnumber is the smallest in the cycle and w is the vertex before v on the cycle; the arc w → v is a back arc.]

How to detect a back arc?

In dfs, include a dfnumber for each node encountered. Also, keep the current path in an array.

[Figure: the current path stored in an array, one vertex per position 1, 2, 3, 4, …]

Breadth-first search

go as broadly as possible

v

w

V1

V2

V3

V4

V5

178

procedure bfs(v: vertex);
var Q: QUEUE of vertex;
    x, y: vertex;
begin
    mark[v] := visited;
    print v; // or anything //
    MAKENULL(Q);
    ENQUEUE(v, Q);
    while not EMPTY(Q) do begin
        x := FRONT(Q);
        DEQUEUE(Q);
        for each vertex y adjacent to x do
            if mark[y] = unvisited then begin
                mark[y] := visited;
                ENQUEUE(y, Q)
            end
    end
end;

Time: O(e)
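As a runnable illustration of the procedure above, here is a minimal Python sketch. The dict-of-lists graph representation and the name `bfs_order` are assumptions; the queue is `collections.deque` from the standard library.

```python
from collections import deque

def bfs_order(adj, start):
    """Return the vertices reachable from start in breadth-first order."""
    visited = {start}
    order = [start]
    q = deque([start])                 # MAKENULL + ENQUEUE(start)
    while q:
        x = q.popleft()                # FRONT + DEQUEUE
        for y in adj[x]:
            if y not in visited:
                visited.add(y)         # mark on enqueue so no vertex is queued twice
                order.append(y)
                q.append(y)
    return order
```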

BFS order:1, 2, 5, 4, 6, 3, 7

bfnumber, bf spanning forest

(Undirected) Graphs

DFS: very similar to DFS for digraphs.


DFS order: 1, 2, 4, 3, 6, 5, 7, 8, 10 , 9

DFS spanning forest (a single DFS spanning tree if the graph is connected)

dfnumber(v): the order in which v is first visited

Tree edges: the edges of the spanning forest

Back edges: (1,4), (4,6), (2,5)

No cross edges!!!

BFS:

[Figure: example graph on vertices 1-10 and its BFS spanning tree]

For the above graph, the BFS order is: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Applications of DFS and BFS:

1. Test for acyclicity

acyclic iff no back edges

2. Test for connectivity

connected iff there is only one tree in the DFS/BFS spanning forest

Generally, each tree in the forest gives a connected component.

3. Biconnected components (next lecture)

Articulation points and biconnected components

[Figure: Flight Map 1]

Articulation point: a vertex whose removal leaves the remaining graph disconnected.

Def. A vertex v is called an articulation point or cutpoint if there exist vertices x, w s.t. x ≠ v, w ≠ v, x ≠ w, and v is in every path connecting x and w.

Def. A connected graph is biconnected if it does not have any articulation points.

Fact: The following are equivalent:

1. G is biconnected

2. Deletion of any single vertex fails to disconnect G

3. Every pair of vertices is connected by two disjoint paths (n ≥ 3)

[Figure: example graph]

Def. A connected graph is k-connected if deletion of any k-1 vertices fails to disconnect the graph.

Def. A connected graph is k edge-connected if deletion of any k-1 edges fails to disconnect the graph.

Biconnected component (or bicomponent): a maximal induced biconnected subgraph, e.g. the above graph has 5 bicomponents.

Problem

Given a connected graph G, identify all its articulation points and bicomponents.

Trivial algorithm: O(n·e). We want O(e)!

To identify the articulation points:

Step 1 Do a depth-first search of G.

Note:

1. a single df spanning tree

2. only tree and back edges

Fact:

1. A leaf cannot be an articulation point.

2. The root is an articulation point iff it has more than one child.

3. Let v be an interior node other than the root. v is an articulation point iff some subtree of v has no back edge incident upon a proper ancestor of v.

Obs. Let w be any proper descendant of v and (w,x) be a back edge. x is a proper ancestor of v iff dfnumber(x) < dfnumber(v).

Def.

low(v) = the smallest dfnumber of v or of any node reachable by following a back edge from some descendant of v (including v itself).

[Figure: df spanning tree with nodes labelled by dfnumber 1-11]

dfnumber   1  2  3  4  5  6  7  8  9  10  11
low        1  2  1  1  5  1  6  6  9   1   1

low(v) = min of:
    dfnumber(v),
    dfnumber(x) for every back edge (v,x),
    low(y) for every child y of v

Step 2 Traverse the df spanning tree in postorder and compute low(v) for all nodes v.

Note: if v is a leaf,

low(v) = min of: dfnumber(v), dfnumber(x) for every back edge (v,x)

Step 3 Identify articulation points by traversing the tree in postorder. (This step can be done in parallel with Step 2.)

A non-root node v is an articulation point iff for some child w of v

low(w) ≥ dfnumber(v)

Step 4 In Step 3, whenever an articulation point v is found, delete the subtree rooted at w and output the bicomponent given by the subtree and v.
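Steps 1 to 3 can be folded into one recursive DFS, as in the following Python sketch. It assumes a connected undirected graph without parallel edges, given as a dict of neighbour lists; the function name and representation are illustrative, and Step 4 (emitting the bicomponents) is left out.

```python
def articulation_points(adj):
    """adj: dict vertex -> list of neighbours of a connected undirected graph."""
    dfnumber, low, points = {}, {}, set()
    root = next(iter(adj))

    def dfs(v, parent):
        dfnumber[v] = low[v] = len(dfnumber) + 1
        children = 0
        for w in adj[v]:
            if w not in dfnumber:                    # tree edge (v, w)
                children += 1
                dfs(w, v)
                low[v] = min(low[v], low[w])         # low(y) for a child y
                if v != root and low[w] >= dfnumber[v]:
                    points.add(v)                    # subtree of w cannot climb above v
            elif w != parent:                        # back edge (v, w)
                low[v] = min(low[v], dfnumber[w])
        if v == root and children > 1:               # the root rule
            points.add(v)

    dfs(root, None)
    return points
```

Because low(w) is final when the recursive call returns, the postorder computation of low and the articulation test happen in the same pass.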

[Figure: df spanning tree and back edges]

Time: O(e)

Matching in Graphs

[Figure: bipartite graph joining teachers to the courses they can teach]

G = (V, E) is a graph.

A matching in G is a set of edges with no two edges incident upon the same vertex.

A matching is maximal if the number of its edges is the maximum.

A matching is complete/perfect if every vertex in V is an endpoint of some edge in the matching.

G is bipartite if V = V1 ∪ V2, V1 ∩ V2 = ∅, and each edge in E has one end in V1 and the other end in V2.

Problem: Given a bipartite G, find a maximal matching in G.

Solution #1: (Brute force) Enumerate all possible matchings. Pick one that has the largest number of edges.

Time: O(n!) = O(n^n)

Solution #2: Augmenting paths

Time: O(ne)

[Figure: bipartite graph with V1 = {1, …, 5} and V2 = {6, …, 10}]

M = (2,7), (3,6), (4,9)

Let M be a matching.

A vertex v is matched if it is an endpoint of an edge in M, e.g. 2, 3, 4, 6, 7, 9 are matched.

An augmenting path relative to M: a path connecting two unmatched vertices in which alternate edges in the path are in M.

e.g. P1 = 1, 6, 3, 9, 4, 10
     P2 = 5, 10

Fact: if P is an augmenting path relative to M, then M ⊗ P is a bigger matching.

e.g. M ⊗ P1 = (2,7), (1,6), (3,9), (4,10)
     M ⊗ P2 = (3,6), (2,7), (4,9), (5,10)

(⊗ is the Exclusive-Or on sets, i.e. A ⊗ B = (A-B) ∪ (B-A), the symmetric difference)

Fact: M is maximal iff there is no augmenting path relative to M.

Proof: “only if”: straightforward

“if”: i.e., if M is not maximal then there must be an augmenting path.

Let N be a matching s.t. |N| > |M|.

Then each connected component of (V, N ⊗ M) must be one of the following:

1. a simple cycle with edges alternating between N and M (equal number from each)

2. an augmenting path relative to N (1 more edge from M)

3. an augmenting path relative to M (1 more edge from N)

4. a path with an equal number of edges from N and M

Since N ⊗ M has more edges from N than from M, at least one component must be of type 3, i.e. an augmenting path relative to M.

Algorithm

M := ∅;
repeat
    find an augmenting path P relative to M;
    M := M ⊗ P
until no more augmenting path exists

[Figure: bipartite graph with V1 = {1, …, 5} and V2 = {6, …, 10}]

Successive values of M ⊗ P:

(1,6)

(3,6), (1,8)

(2,8), (1,6), (3,9)

(2,8), (1,6), (3,9), (4,7)

Algorithm to find an augmenting path relative to matching M

// G = (V, E), V = V1 ∪ V2 //

Build an augmenting path graph level by level as follows:

level 0 := unmatched vertices in V1;

repeat
    level 2i+1 := new vertices that are adjacent to a vertex at level 2i through an edge not in M; also add the edge;
    level 2i+2 := new vertices that are adjacent to a vertex at level 2i+1 through an edge in M; also add the edge

Stop when an unmatched vertex is added at an odd level, or when no more vertices can be added (i.e. no augmenting path exists).

The path from that unmatched vertex to a vertex at level 0 is an augmenting path.
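The following Python sketch shows one way the augmenting-path search can drive the whole matching algorithm. It is a simplified variant, not the exact level-by-level construction: each BFS starts from a single unmatched vertex of V1 instead of all of them at once. The names `max_matching`, `adj` and `left` are assumptions made here.

```python
from collections import deque

def max_matching(adj, left):
    """adj: dict V1-vertex -> list of V2-vertices. Returns dict V2-vertex -> V1-vertex."""
    match_r, match_l = {}, {}

    def augment(start):
        prev, q = {}, deque([start])       # prev[r] = V1 vertex from which r was reached
        while q:
            u = q.popleft()
            for r in adj[u]:
                if r in prev:
                    continue
                prev[r] = u                # edge (u, r) is not in M: odd level
                if r not in match_r:       # unmatched vertex reached at an odd level
                    while r is not None:   # walk back to start, replacing M by M xor P
                        u = prev[r]
                        nxt = match_l.get(u)
                        match_l[u], match_r[r] = r, u
                        r = nxt
                    return True
                q.append(match_r[r])       # follow the edge in M: even level
        return False

    for u in left:
        augment(u)
    return match_r
```

Each successful call enlarges the matching by one edge, and each search is a BFS costing O(e), which is consistent with the O(ne) bound quoted for Solution #2.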

Example

[Figure: augmenting path graph built level by level from level 0, with vertices of V1 and V2 on alternate levels]

The process is very similar to BFS.

Time: O(e) if adjacency lists are used

Internal Sorting


Internal: data are stored in the main memory which is a RAM. Thus, access to each data item takes constant time.

Data Item: a record with one or more fields. One field contains the key of the record.

‘≤’: a linear ordering on keys (compare ‘<’)

Sorting: arrange a sequence of records r1, r2, …, rn into a sequence ri1, ri2, …, rin so that the keys form a nondecreasing sequence, i.e.

ri1.key ≤ ri2.key ≤ … ≤ rin.key

Bubble Sort

Move the lighter records to the top.

for i := 1 to n-1 do
    for j := n downto i+1 do
        if A[j].key < A[j-1].key then
            swap(A[j], A[j-1])

In place. Time: O(n²). Bad sequence: descending.

Insertion Sort

Insert A[i] into A[1], A[2], …, A[i-1] at its rightful position.

A[0].key := -∞; // sentinel //
for i := 2 to n do begin
    j := i;
    while A[j].key < A[j-1].key do begin
        swap(A[j], A[j-1]);
        j := j-1
    end
end;

In place. Time: O(n²), reached on a descending sequence.

Selection Sort

Select the smallest record and place it at its rightful position.

for i := 1 to n-1 do begin
    select the smallest among A[i], …, A[n];
    swap it with A[i]
end

Time: O(n²). In place. Better than bubble sort when the records are large.

Shell Sort (diminishing-increment)

[Figure: passes with incr = 6 and incr = 3]

Time: O(n^(3/2)) for some increment sequences; in place

Heap Sort

Q: PRIORITY QUEUE

for i := 1 to n do
    INSERT(A[i], Q);
for i := n downto 1 do
    A[i] := DELETEMIN(Q);

Time: O(n log n)

In place if Q is implemented using array A[1..n]

Details in [AHU]

Quick Sort

procedure quicksort(i, j);
if A[i..j] contains at least two distinct keys then begin
    find the larger of the first two distinct keys, v (called the pivot);
    arrange A[i..j] so that for some k, i+1 ≤ k ≤ j,
        A[i], …, A[k-1] < v and A[k], …, A[j] ≥ v;
    quicksort(i, k-1);
    quicksort(k, j)
end
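A Python sketch of this scheme is given below. The pivot rule (larger of the first two distinct keys) follows the description; the particular swap-based `partition` and the helper names `find_pivot`, `partition` and `quicksort` are assumptions chosen for illustration, using 0-based indices.

```python
def find_pivot(a, i, j):
    """Index of the larger of the first two distinct keys in a[i..j], or -1 if all equal."""
    for k in range(i + 1, j + 1):
        if a[k] > a[i]:
            return k
        if a[k] < a[i]:
            return i
    return -1

def partition(a, i, j, v):
    """Rearrange a[i..j]; return k such that a[i..k-1] < v and a[k..j] >= v."""
    l, r = i, j
    while True:
        a[l], a[r] = a[r], a[l]
        while a[l] < v:            # stops because some key >= v exists
            l += 1
        while a[r] >= v:           # stops because some key < v exists
            r -= 1
        if l > r:
            return l

def quicksort(a, i=0, j=None):
    if j is None:
        j = len(a) - 1
    p = find_pivot(a, i, j)
    if p != -1:                    # at least two distinct keys
        k = partition(a, i, j, a[p])
        quicksort(a, i, k - 1)
        quicksort(a, k, j)
```

Choosing the larger of two distinct keys guarantees that both parts are non-empty, so the recursion always makes progress.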

Example

5 7 2 1 4 3 9 5 1 7                  partition, v = 7
5 1 2 1 4 3 5 | 9 7 7                partition, v = 5 (left) and v = 9 (right)
3 1 2 1 4 | 5 5 | 7 7 | 9            partition, v = 3
1 1 2 | 3 4 | 5 5 | 7 7 | 9          partition, v = 2 and v = 4
1 1 | 2 | 3 | 4 | 5 5 | 7 7 | 9

Worst-case time complexity: O(n²)

pivot(i,j): O(j-i+1)

partition(i,j,pivot): O(j-i+1)

T(n) = O(n) + T(n-1)
     = …
     = O(n²)

Not in place! Stack space can be reduced to O(log n).

Average time complexity

Assumptions:

1. all orderings are equally likely

2. the keys are distinct

Pr(1st group is of size i)

= Pr(A[1] is the (i+1)st smallest and A[1] is the pivot)

+ Pr(A[2] is the (i+1)st smallest and A[2] is the pivot)

= (1/n)(i/(n-1)) + (1/n)(i/(n-1)) = 2i/(n(n-1))

\[ T_{avg}(1) = c_0 \]

\[ T_{avg}(n) \le \sum_{i=1}^{n-1} \frac{2i}{n(n-1)} \left[ T_{avg}(i) + T_{avg}(n-i) \right] + cn \]

\[ \le \frac{2}{n-1} \sum_{i=1}^{n-1} T_{avg}(i) + cn \]

Suppose T_avg(i) ≤ k i log i for some constant k and all 2 ≤ i < n. Then

\[ T_{avg}(n) \le \frac{2}{n-1} \sum_{i=1}^{n-1} k\, i \log i + cn \]

\[ \le \frac{2k}{n-1} \left[ \sum_{i=1}^{n/2} i \log i + \sum_{i=n/2+1}^{n-1} i \log n \right] + cn \]

\[ \le \frac{2k}{n-1} \left[ \sum_{i=1}^{n/2} i\,(\log n - 1) + \sum_{i=n/2+1}^{n-1} i \log n \right] + cn \]

\[ \le kn \log n - \frac{kn}{4} - \frac{kn}{2(n-1)} + cn \]

\[ \le kn \log n, \text{ if } k \text{ is large enough} \]

Hence T_avg(n) = O(n log n).

2-way merge sort

• divide-and-conquer

• can be used for external sorting

• can be generalized to m-way

Algorithm Msort(A[1..n]);
if n > 1 then begin
    m := ⌊n/2⌋;
    Msort(A[1..m]);
    Msort(A[m+1..n]);
    Merge(A[1..m], A[m+1..n], B[1..n]);
    A[1..n] := B[1..n]
end;

Let k = 2^⌈log n⌉

(i.e. k is the smallest power of 2 that is greater than or equal to n)

If Merge takes O(n) time, then

T(n) ≤ T(k) = T(k/2) + T(k/2) + ck
            = 2T(k/2) + ck
            = 4T(k/4) + ck + 2c(k/2)
            = 8T(k/8) + 3ck
            …
            = ck log2 k + k·O(1) = O(n log n)

The nonrecursive version

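[Figure: merge trees of height log n for the nonrecursive version and the recursive version]

NOTE: Merging order may be different in the nonrecursive version.

type afile = array[1..max] of elementtype;

Merging two sorted lists

[Figure: X[l..m] and X[m+1..n], scanned by indices i and j, are merged into Z[l..n], filled at index k]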

procedure merge(var X, Z: afile; l, m, n: integer);
// Merge X[l..m] and X[m+1..n] into Z[l..n] //
var i, j, k: integer;
begin
    i := l; j := m+1; k := l;
    while i ≤ m and j ≤ n do begin
        if X[i].key ≤ X[j].key then begin
            Z[k] := X[i]; i := i+1
        end
        else begin
            Z[k] := X[j]; j := j+1
        end;
        k := k+1
    end;
    // move the remaining items //
    if i > m then Z[k..n] := X[j..n]
    else Z[k..n] := X[i..m]
end;

Time: O(n-l)

Union of ordered sets represented by sorted lists!

procedure onepass(var X, Y: afile; n, l: integer);
// This procedure performs one pass of the merge sort. It merges adjacent pairs of segments of length l from list X to list Y. n = |X| //
var i: integer;
begin
    i := 1;
    while i ≤ n-2l+1 do begin
        merge(X, Y, i, i+l-1, i+2l-1);
        i := i+2l
    end;
    // merge remaining segments of total length < 2l //
    if (i+l-1) < n then
        merge(X, Y, i, i+l-1, n)
    else
        Y[i..n] := X[i..n]
end;

Time: O(n)

procedure Msort(var X: afile; n: integer);
var l: integer; Y: afile;
begin
    // l is the size of the segments currently being merged //
    l := 1;
    while l < n do begin
        onepass(X, Y, n, l);
        l := 2*l;
        onepass(Y, X, n, l);
        l := 2*l
    end
end;

At most ⌈log2 n⌉ + 1 passes.

Each pass takes O(n) time.

Total: O(n log n)
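The nonrecursive merge sort can be sketched in Python as below. This is an illustration under assumptions: it uses 0-based half-open ranges and returns new lists instead of alternating between two fixed arrays X and Y, and `msort` also returns the list after every pass so the passes can be inspected.

```python
def merge(x, lo, mid, hi):
    """Merge the sorted slices x[lo:mid] and x[mid:hi] into a new list."""
    out, i, j = [], lo, mid
    while i < mid and j < hi:
        if x[i] <= x[j]:
            out.append(x[i])
            i += 1
        else:
            out.append(x[j])
            j += 1
    return out + x[i:mid] + x[j:hi]        # move the remaining items

def onepass(x, l):
    """Merge adjacent pairs of segments of length l."""
    y = []
    for i in range(0, len(x), 2 * l):
        y += merge(x, i, min(i + l, len(x)), min(i + 2 * l, len(x)))
    return y

def msort(x):
    """Return (sorted list, list of the sequence after each pass)."""
    l, passes = 1, [list(x)]
    while l < len(x):
        x = onepass(x, l)
        passes.append(x)
        l *= 2
    return x, passes
```

Clamping the segment bounds with `min` covers the leftover segment that the pseudocode handles in a separate branch.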

Example

X: 3 5 6 4 5 9 3 7 2 8 4 6 1
    pass with l = 1
Y: 3 5 4 6 5 9 3 7 2 8 4 6 1
    pass with l = 2
X: 3 4 5 6 3 5 7 9 2 4 6 8 1
    pass with l = 4
Y: 3 3 4 5 5 6 7 9 1 2 4 6 8
    pass with l = 8
X: 1 2 3 3 4 4 5 5 6 6 7 8 9

Obs: The lists X and Y are scanned sequentially from left to right at most ⌈log2 n⌉ + 1 times.

Bin Sorting

Is Ω(n log n) the lower bound for sorting n elements?

Yes, if we don’t make any assumption about the keytype and only use comparisons such as key1 ≤ key2.

What if we know 1 ≤ key ≤ n and the n elements have distinct keys?

To sort such n elements,

for i := 1 to n do
    B[A[i].key] := A[i];

Time: O(n)

or

for i := 1 to n do
    while A[i].key ≠ i do
        swap(A[i], A[A[i].key]);

Time: O(n)

Example

Sorting records that have a small number of distinct keys:

n records, O(logn) distinct keys

Can we do better than O(nlogn)?

An algorithm using a modified 2-3 tree (AVL also okay):

[Figure: 2-3 tree over the distinct keys]

Size of tree: O(log n)
Each insert: O(log log n)
Total time: O(n log log n)

Bin Sorting


Key = 1..m (any finite and discrete type)

[Figure: bin table B with bins 1, 2, …, m]

procedure binsort;
var i: integer; v: keytype;
begin
    for i := 1 to n do                                   // O(n) //
        INSERT(A[i], END(B[A[i].key]), B[A[i].key]);
    for v := 2 to m do                                   // O(m) //
        CONCAT(B[1], B[v])
end;

Bin sorting when m = n^k for some k

Example

k = 2

keytype = 0..n²-1

Step 1: Place each integer i into bin i mod n. Append i to the end of the list for bin i mod n.

Step 2: Concatenate the lists.

Step 3: Place each integer i into bin ⌊i/n⌋.

Step 4: Concatenate the lists.

Each step: O(n)

Total time: O(n)

n = 10

Given: 45, 36, 21, 64, 60, 33, 12, 27, 30, 25

BIN (i mod 10)   CONTENTS
0                60, 30
1                21
2                12
3                33
4                64
5                45, 25
6                36
7                27
8
9

New list: 60, 30, 21, 12, 33, 64, 45, 25, 36, 27

BIN (⌊i/10⌋)     CONTENTS
0
1                12
2                21, 25, 27
3                30, 33, 36
4                45
5
6                60, 64
7
8
9
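The four steps can be sketched in Python as two passes over the same loop body. The function name `binsort_two_pass` is an assumption; Python lists stand in for the bin lists.

```python
def binsort_two_pass(a, n):
    """Sort integers in 0..n*n-1 with n bins: first by i mod n, then by i div n."""
    for key in (lambda i: i % n, lambda i: i // n):
        bins = [[] for _ in range(n)]
        for i in a:
            bins[key(i)].append(i)           # appending keeps arrival order within a bin
        a = [i for b in bins for i in b]     # concatenate the bins
    return a
```

The second pass only works because appending to the end of each bin preserves the order produced by the first pass.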

Radix Sort

type keytype = record
    f1: t1;
    f2: t2;
    …              // each ti is finite and discrete //
    fk: tk
end;

key1 = (a1, a2, …, ak)

key2 = (b1, b2, …, bk)

key1 < key2 iff

1. a1 < b1, or

2. a1 = b1 and a2 < b2, or

…

k. a1 = b1, …, ak-1 = bk-1, and ak < bk

i.e. ∃ i, 0 ≤ i < k, s.t. aj = bj for 1 ≤ j ≤ i and ai+1 < bi+1

e.g. abc < aca (called lexicographic order)

var Bi: array[ti] of linked list;    // one bin table for each field fi //

procedure radixsort;
// binsort list A first on fk, concatenate the bins in Bk, binsort on fk-1, and so on //
begin
    for i := k downto 1 do begin
        for each value v of type ti do
            make Bi[v] empty;
        for each record r on list A do
            move r from A onto the end of bin Bi[r.fi];    // binsort on fi //
        for each value v of type ti, from lowest to highest do
            concatenate Bi[v] onto the end of A
    end
end;

Time: O(Σ (i = 1..k) (|ti| + n)) = O(kn + Σ (i = 1..k) |ti|)

Example

A = hact, fact, sack, camp, duck, kuck, codd, less, more

Binsort on the 4th letter:
D   codd
E   more
K   sack, duck, kuck
P   camp
S   less
T   hact, fact

Binsort on the 3rd letter:
C   sack, duck, kuck, hact, fact
D   codd
M   camp
R   more
S   less

Binsort on the 2nd letter:
A   sack, hact, fact, camp
E   less
O   codd, more
U   duck, kuck

Binsort on the 1st letter:
C   camp, codd
D   duck
F   fact
H   hact
K   kuck
L   less
M   more
S   sack
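For fixed-length strings, the radix sort procedure can be sketched in Python as below. This is illustrative only: the name `radixsort` and the dict of bins (created on demand rather than one bin per possible value) are assumptions.

```python
def radixsort(words, k):
    """Sort strings of equal length k, binsorting on the last character first."""
    for i in range(k - 1, -1, -1):
        bins = {}
        for w in words:
            bins.setdefault(w[i], []).append(w)            # move w to the end of its bin
        words = [w for c in sorted(bins) for w in bins[c]]  # concatenate, lowest bin first
    return words
```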

Odd-even merge sort

(Useful when you have a parallel computer)

Algorithm Odd-even-merge-sort (a0, a1, …, a2n-1)

1. Split the list a0, a1, …, a2n-1 into two lists a0, a1, …, an-1 and an, an+1, …, a2n-1

2. Odd-even-merge-sort (a0, a1, …, an-1)

3. Odd-even-merge-sort (an, an+1, …, a2n-1)

4. Odd-even-merge (a0, a1, …, an-1; an, an+1, …, a2n-1)

Algorithm Odd-even-merge (a0, a1, …, an-1; b0, b1, …, bn-1)

1. c0, c1, …, cn-1 := Odd-even-merge (a0, a2, …, an-2; b0, b2, …, bn-2)

2. d0, d1, …, dn-1 := Odd-even-merge (a1, a3, …, an-1; b1, b3, …, bn-1)

3. For all i > 0, compare ci and di-1 and interchange if necessary

4. Return c0 c1 d0 c2 d1 c3 d2 … cn-1 dn-2 dn-1
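A sequential Python sketch of the merge step follows. It assumes both inputs are sorted, have the same length n, and that n is a power of 2 so that the recursion bottoms out at single elements; the base case and the name `odd_even_merge` are additions made for this illustration.

```python
def odd_even_merge(a, b):
    """Merge two sorted lists of equal length n, n a power of 2."""
    n = len(a)
    if n == 1:                                   # base case: one compare-interchange
        return [min(a[0], b[0]), max(a[0], b[0])]
    c = odd_even_merge(a[0::2], b[0::2])         # even-indexed elements
    d = odd_even_merge(a[1::2], b[1::2])         # odd-indexed elements
    out = [c[0]]
    for i in range(1, n):                        # compare c[i] with d[i-1]
        out += [min(c[i], d[i - 1]), max(c[i], d[i - 1])]
    out.append(d[n - 1])
    return out
```

All the compare-interchange operations in the final loop are independent of each other, which is why this step takes constant time on a parallel machine.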

Example

Odd-even-merge (4,5,8,11,20,25; 2,9,10,27,30,31)

Odd-even-merge (4,8,20; 2,10,30) returns 2,4,8,10,20,30

Odd-even-merge (5,11,25; 9,27,31) returns 5,9,11,25,27,31

c: 2 4 8 10 20 30

d: 5 9 11 25 27 31

Result: 2 4 5 8 9 10 11 20 25 27 30 31

Sequential time complexity:

Odd-even-merge: T1(n) = 2T1(n/2) + cn = O(n log n)

Odd-even-merge-sort: T2(n) = 2T2(n/2) + c1 n log n = O(n log² n)

In parallel: Odd-even-merge takes O(log n) time; Odd-even-merge-sort takes O(log² n) time.

Odd-even-merge (a0 a1 a2 a3, a4 a5 a6 a7)

a0 a1 a2 a3, a4 a5 a6 a7
a0 a2 a4 a6, a1 a3 a5 a7
a0 a4 a2 a6, a1 a5 a3 a7
b0 b1 b2 b3, b4 b5 b6 b7
c0 c1 c2 c3, c4 c5 c6 c7
c0 c1 c4 c2, c5 c3 c6 c7
d0 d1 d2 d3, d4 d5 d6 d7

Lower bound for sorting

Definition:

Let B be a problem and f(n) a function. B requires Ω(f(n)) time if every algorithm for B has time complexity Ω(f(n)) (i.e., the running time is at least on the order of f(n) in the worst case for inputs of length n).

f(n) is a time lower bound for B.

Theorem: Sorting by comparisons requires Ω(n log n) time (in fact, Ω(n log n) comparisons).

Assumption: The only operations on keys are comparisons of key values.

Without loss of generality, assume the keys are distinct.

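Decision trees

Let P be any sorting algorithm. Denote the input by:

A[1..n]: a1, a2, …, an

Define a binary tree as follows: each internal node is a comparison A[i] < A[j]? made by P (e.g. key1 < key2?), with a yes branch and a no branch leading to the next comparison. Each leaf is an outcome, i.e. a sorted list ar1, ar2, …, arn.

[Figure: root A[i1] < A[j1]? with children A[i2] < A[j2]? and A[i3] < A[j3]?]

This tree is called the decision tree for P on size n.

Decision tree for bubble sort with n = 3

for i := 1 to 2 do
    for j := 3 downto i+1 do
        if A[j] < A[j-1] then swap(A[j], A[j-1])

A[1..3] = a b c

[Figure: decision tree with root A[3] < A[2]? and leaves cba, acb, bca, abc, cab, bac]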

Fact: For any sorting algorithm A, the decision tree for A must have at least n! leaves.

Proof: There are n! possible outcomes when A sorts n elements.

Fact: The depth of the decision tree must be at least ⌈log2(n!)⌉.

Proof: Let depth = d. A binary tree of depth d has at most 2^d leaves, so

n! ≤ 2^d

d ≥ ⌈log2(n!)⌉

Corollary: A requires at least log2(n!) comparisons in the worst case.

n! ≥ (n/e)^n, e = 2.7183…

log2(n!) ≥ n log2(n/e)

n log2(n/e) = Ω(n log n)

Sorting requires Ω(n log n) comparisons. Sorting requires Ω(n log n) time.

Average Time complexity for sorting

= average depth of the leaves in the decision tree

Claim: Among the n! leaves, at least half of them have depth greater than log2(n!) - 1.

Proof: The maximum number of leaves with depth ≤ log2(n!) - 1 is

2^(log2(n!) - 1) = 2^(log2(n!/2)) = n!/2

Hence, on average, sorting requires

Ω(log2(n!) / 2) = Ω(n log n) time

Problem: Given a1, a2, …, an s.t. a1 < a2 < … < an, and x, find i s.t. ai = x.

Binary search: O(log n) time

Fact: Searching requires Ω(log n) time.

Proof: Any of a1, a2, …, an could be x, or x could be absent, so there are at least n + 1 outcomes when we search for x in a1, …, an. A decision tree must therefore have depth ≥ log2(n+1).

Problem: Given a1, a2, …, an, find the smallest element.

[Figure: decision tree of comparisons ai < aj with yes/no branches]

n-1 elements must all have lost some comparison, so at least n-1 comparisons are needed!

External sorting

Assumption : The number of data items to be sorted is too large.

The data items (records) are stored on external storage devices in the form of (sequential) files.

External storage devices:

Magnetic tape

[Figure: tape passing a read/write head; blocks … BLOCK i, BLOCK i+1 … separated by inter-block gaps]

Operations: Read(B), Write(B), fast-forward, rewind

Magnetic disk

[Figure: disk showing a track, a sector (block) and the R/W head]

To access a sector:

1. locate the correct track by shifting the R/W head

2. wait until the correct sector arrives

The time needed to access a block:

seek time + actual R/W time >>> main memory access time

File: a sequence of blocks, each holding a fixed number of records.

In external sorting, the dominating factor is the number of block accesses.

It is desirable to scan a file from beginning to end.

The model

[Figure: CPU connected to file1, file2, file3, …]

Objective: sorting with the minimum number of passes through the file (thus, the minimum number of block accesses).

Bubble, insertion, …, quick and heap sorts: require on the order of n passes.

2-way merge sort: only requires ⌈log2 n⌉ passes!

ASSUME THAT FILES ARE STORED ON DISKS. THUS, SEEK TIME IS THE “SAME” FOR ALL BLOCKS.

MAIN MEMORY: C BLOCKS

EXAMPLE

Sort file F = A1, A2, …, A2100

A block = 100 records

Working main memory space = 3 blocks (used as buffers)

Step 1: Internally sort three blocks (300 records) at a time. Store the resulting runs on disk.

Run 1: 1-300
Run 2: 301-600
Run 3: 601-900
Run 4: 901-1200
Run 5: 1201-1500
Run 6: 1501-1800
Run 7: 1801-2100

Step 2: Partition the main memory into three blocks. Two are used as input buffers and the third is used as an output buffer.

Merge runs 1 and 2.

Algorithm Merge(R1, R2)

Read a block from R1;

Read a block from R2;

Merge the records in the input buffers and store the result in the output buffer;

If the output buffer gets full, write its contents onto the disk and clear the buffer;

If an input buffer gets empty, read the next block from the same run.

Merge runs 3 and 4, then 5 and 6, then copy run 7.

The result of this is a file of 4 runs.

Merge these runs and produce a file of two runs.

Merge the two runs to obtain a single run (i.e., a sorted file).
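The run structure of this example can be simulated in memory with the Python sketch below. It only models runs and passes, not blocks, buffers or disk I/O; the function names are assumptions, and `heapq.merge` from the standard library does the two-way merging.

```python
import heapq

def make_runs(records, run_size):
    """Step 1: sort run_size records at a time to form the initial runs."""
    return [sorted(records[i:i + run_size]) for i in range(0, len(records), run_size)]

def merge_pass(runs):
    """Merge adjacent pairs of runs; a leftover odd run is copied unchanged."""
    return [list(heapq.merge(*runs[i:i + 2])) for i in range(0, len(runs), 2)]

def external_sort(records, run_size):
    """Return (sorted records, number of merge passes)."""
    runs, passes = make_runs(records, run_size), 0
    while len(runs) > 1:
        runs = merge_pass(runs)
        passes += 1
    return (runs[0] if runs else []), passes
```

With 2100 records and runs of 300, the run count goes 7, 4, 2, 1, so three merge passes are needed after the initial run formation.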

[Figure: file F is transformed into F1, then F2, then F3 by successive merge passes]

Notes:

1. If the number of initial runs is m, then ⌈log2 m⌉ passes suffice.

2. If the device is tape, then we need four tapes.

Initial runs:
Tape 1: Run 1, Run 3, Run 5, Run 7
Tape 2: Run 2, Run 4, Run 6

After pass 1:
Tape 3: Run 1, Run 3
Tape 4: Run 2, Run 4

After pass 2:
Tape 1: Run 1
Tape 2: Run 2

After pass 3:
Tape 3: Run 1
Tape 4: (empty)

3. Temp files can be discarded after being used.

4. k-way merge

[Figure: 3-way merge]

Generally, k-way merge sort requires

⌈log_k m⌉ = ⌈log2 m / log2 k⌉ passes

and k+1 buffers, with more comparisons (k-1 per record) in each pass.

For tapes, k-way merge requires 2k tapes.

General algorithm design techniques

Divide-and-conquer (top-down, recursive): e.g. merge sort, quick sort

Dynamic programming (bottom-up): e.g. longest common subsequence

Greedy: e.g. shortest paths, minimum-cost spanning tree

Brute force

Backtracking: e.g. rat-in-maze

Divide-and-conquer

To solve problem A:

if A is small enough then
    solve it directly
else begin
    break A into smaller problems A1, A2, …, Ak;    // smaller instances of the same problem //
    solve Ai for each i = 1, 2, …, k;
    combine the solutions for A1, …, Ak to obtain the solution for A
end

Example:

Towers of Hanoi

[Figure: three pegs A, B, C]

Algorithm Move(n, A, B, C)
// move n disks from A to B, using C as the spare peg //
if n = 1 then
    move the disk from A to B
else begin
    Move(n-1, A, C, B);
    Move(1, A, B, C);
    Move(n-1, C, B, A)
end

T(n) = c1                if n = 1
T(n) = 2T(n-1) + c2      otherwise

T(n) = O(2^n)
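A small Python sketch makes the exponential growth concrete by recording every move. The explicit spare-peg argument and the `moves` list are assumptions added so that the result can be inspected.

```python
def move(n, a, b, c, moves):
    """Move n disks from peg a to peg b using peg c as the spare; record each move."""
    if n == 1:
        moves.append((a, b))
    else:
        move(n - 1, a, c, b, moves)    # clear the top n-1 disks onto the spare
        move(1, a, b, c, moves)        # move the largest disk
        move(n - 1, c, b, a, moves)    # bring the n-1 disks back on top
    return moves
```

The number of recorded moves is 2^n - 1, matching the recurrence T(n) = 2T(n-1) + c2.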

Example

Given n integers, find both the maximum and minimum

Algorithm maximin(A[1..n], max, min)

if n = 1 then
    max := min := A[1]
else if n = 2 then
    if A[1] < A[2] then begin
        max := A[2]; min := A[1]
    end
    else begin
        max := A[1]; min := A[2]
    end
else begin // n > 2 //
    maximin(A[1..n/2], max1, min1);
    maximin(A[n/2+1..n], max2, min2);
    if max1 < max2 then max := max2
    else max := max1;
    if min1 ≤ min2 then min := min1
    else min := min2
end

C(n) = 1                 if n = 2
C(n) = 2C(n/2) + 2       otherwise (comparisons)

C(n) = (3/2)n - 2 (by induction, for n a power of 2)
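The comparison count can be checked with a short Python sketch. Returning the count as a third value is an assumption made here for illustration; indices are 0-based and inclusive.

```python
def maximin(a, lo, hi):
    """Return (max, min, comparisons) for a[lo..hi], hi inclusive."""
    if lo == hi:
        return a[lo], a[lo], 0
    if hi == lo + 1:                                   # two elements: one comparison
        return (a[hi], a[lo], 1) if a[lo] < a[hi] else (a[lo], a[hi], 1)
    mid = (lo + hi) // 2
    max1, min1, c1 = maximin(a, lo, mid)
    max2, min2, c2 = maximin(a, mid + 1, hi)
    # one comparison for the two maxima, one for the two minima
    return max(max1, max2), min(min1, min2), c1 + c2 + 2
```

For n = 8 this uses 10 comparisons, i.e. (3/2)·8 - 2, against 14 for two separate linear scans.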

Dynamic programming

There are situations where:

(i) There is no way to divide a problem into a small number of subproblems.

(ii) The subproblems overlap each other (too much redundancy if divide-and-conquer is used).

(iii) The total number of subproblems to tackle is not large, i.e. polynomial (n^k, usually k = 2 or 3).

Dynamic programming approach:

Systematically solve all the subproblems, with the smallest ones first. Keep track of the solutions to the solved subproblems by means of a table. Solutions to larger subproblems are found by combining solutions to smaller subproblems.

Example

Longest common subsequence problem

(LCS)

sequence: x = a1 a2 … an

subsequence of x: a sequence obtained from x by deleting some characters

e.g. the sequences a b c a b and c a b b a b

LCS Problem: given x = a1 a2 … an and y = b1 b2 … bm, find the length of an LCS of x and y.

Previous solution (using sets):

O(p log n) time, where p = the number of pairs of positions, one from each sequence, that have the same character.

In the worst case, p = O(nm).

Time: O(nm log n)

Dynamic programming solution

Given x = a1 a2 … an and y = b1 b2 … bm

Define an (n+1) × (m+1) matrix L:

L[i,j] = the length of an LCS of a1 a2 … ai and b1 b2 … bj, for all 0 ≤ i ≤ n, 0 ≤ j ≤ m

Note: L[n,m] is the length of an LCS of x and y. Each L[i,j] is a subproblem.

L[0,j] = 0, 0 ≤ j ≤ m

L[i,0] = 0, 0 ≤ i ≤ n

L[i,j] = max(L[i-1,j], L[i,j-1], t), 1 ≤ i ≤ n, 1 ≤ j ≤ m,
    where t = L[i-1,j-1] + 1 if ai = bj, and t = 0 otherwise

[Figure: matrix L with row 0 and column 0 filled with 0s; entry (i,j) is computed from entries (i-1,j), (i,j-1) and (i-1,j-1); the solution is entry (n,m)]

Algorithm LCS

// evaluate matrix L row by row, with row 0 first //

for j := 0 to m do
    L[0,j] := 0;
for i := 1 to n do
    L[i,0] := 0;
for i := 1 to n do
    for j := 1 to m do begin
        if ai = bj then
            temp := L[i-1,j-1] + 1
        else
            temp := 0;
        L[i,j] := max(L[i-1,j], L[i,j-1], temp)
    end;
writeln(L[n,m])

Time = O(nm)

Space = O(min(n,m)), since only the previous and the current row need to be kept
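The table-filling algorithm translates almost line for line into Python. This sketch keeps the full matrix for clarity, so it uses O(nm) space rather than the reduced bound; the name `lcs_length` is an assumption.

```python
def lcs_length(x, y):
    """Length of a longest common subsequence of the sequences x and y."""
    n, m = len(x), len(y)
    L = [[0] * (m + 1) for _ in range(n + 1)]      # row 0 and column 0 stay 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            temp = L[i - 1][j - 1] + 1 if x[i - 1] == y[j - 1] else 0
            L[i][j] = max(L[i - 1][j], L[i][j - 1], temp)
    return L[n][m]
```

For example, `lcs_length("abcab", "cabbab")` evaluates to 4 (one such subsequence is "abab").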

Recursive solution

(divide – and-conquer)

Algorithm LCS(n, m)

if n = 0 or m = 0 then
    LCS := 0
else begin
    l1 := LCS(n-1, m);
    l2 := LCS(n, m-1);
    if an = bm then
        l3 := LCS(n-1, m-1) + 1
    else
        l3 := 0;
    LCS := max(l1, l2, l3)
end;

T(n,m) = T(n-1,m) + T(n,m-1) + T(n-1,m-1) + c = O(3^(n+m))

Dynamic programming example 2

World Series Odds

Problem: Teams A and B play a match. Whoever wins n games first wins the match.

Assumption: A and B are equally competent, i.e., each has a 50% chance of winning a particular game.

P(i,j): the probability that A will eventually win the match, given that A needs i more games to win (i.e., A has won n-i games) and B needs j more games to win, 0 ≤ i, j ≤ n.

We want to compute P(s,t) for some particular 0 ≤ s, t ≤ n.

P(0,j) = 1, 1 ≤ j ≤ n

P(i,0) = 0, 1 ≤ i ≤ n

P(i,j) = (P(i-1,j) + P(i,j-1)) / 2, 1 ≤ i, j ≤ n

[Figure: table of P(i,j) with row 0 filled with 1s and column 0 filled with 0s; entry (i,j) is computed from entries (i-1,j) and (i,j-1)]

Order of evaluation:

1. row by row/column by column

2. diagonal
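A row-by-row evaluation of the table can be sketched in Python as follows. The function name `series_odds` is an assumption, and the sketch expects s ≥ 0 and t ≥ 1, since P(0,0) is not defined.

```python
def series_odds(s, t):
    """P(s, t): probability that A wins when A needs s more wins and B needs t more."""
    P = [[0.0] * (t + 1) for _ in range(s + 1)]    # column 0: P(i, 0) = 0
    for j in range(1, t + 1):
        P[0][j] = 1.0                              # row 0: P(0, j) = 1
    for i in range(1, s + 1):                      # row by row
        for j in range(1, t + 1):
            P[i][j] = (P[i - 1][j] + P[i][j - 1]) / 2
    return P[s][t]
```

For instance, `series_odds(2, 3)` gives 0.6875 (that is, 11/16), and any symmetric position such as `series_odds(4, 4)` gives 0.5.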

Greedy Algorithm

Setting: given n objects a1, a2, …, an, each with a weight (or cost) w(ai), we want to select a subset of objects ai1, ai2, …, aim, subject to some constraint, such that

Σ (j = 1..m) w(aij)

is the minimum.

Example

Coin changing: A1 = {c1, c2, …, cn} is a set of distinct coin types,

c1 > c2 > … > cn ≥ 1

How do we make up an exact amount x using a minimum total number of coins?

If cn = 1 then a greedy algorithm can be used.

Algorithm coinchange(x);
i := 1;
while x ≠ 0 do
    if ci ≤ x then begin    // select the largest coin whose value is ≤ x //
        writeln(ci);
        x := x – ci
    end
    else
        i := i+1

e.g. c1 = 25¢, c2 = 10¢, c3 = 5¢, c4 = 1¢

x = 73¢

change: c1, c1, c2, c2, c4, c4, c4

Notes:

1. The algorithm doesn’t necessarily generate change with the minimum total number of coins.

e.g., c1 = 5, c2 = 4, c3 = 1, x = 8: greedy gives 5, 1, 1, 1 (four coins), but 4, 4 uses only two.

2. It does so if A1 = {k^(n-1), k^(n-2), …, k^0} for some integer k > 1.
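The greedy procedure and both notes can be tried out with this Python sketch. It assumes the coin list is given in decreasing order and ends with 1, so the loop always terminates; the name `coinchange` mirrors the pseudocode.

```python
def coinchange(x, coins):
    """Greedy change for amount x; coins in decreasing order, last coin equal to 1."""
    change, i = [], 0
    while x != 0:
        if coins[i] <= x:              # largest coin whose value is <= x
            change.append(coins[i])
            x -= coins[i]
        else:
            i += 1
    return change
```

With coins 25, 10, 5, 1 and x = 73 it returns seven coins, while with coins 5, 4, 1 and x = 8 it returns four coins even though two 4s would do.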

Matching.

1. Start with M = ∅.

2. Find an augmenting path P relative to M and replace M by M ⊗ P.

3. Repeat (2) until no further augmenting path exists; then M is maximal.

[Figure: bipartite graph with vertices 1-5 on one side and 6-10 on the other, redrawn after each step]

P = (1,6)

M = ∅ ⊗ P = (1,6)

P = (2,6), (1,6), (1,7)

M = (1,6) ⊗ P = (2,6), (1,7)

P = (3,7), (1,7), (1,6), (6,2), (2,9)

M = (2,6), (1,7) ⊗ P = (3,7), (1,6), (2,9)

Doesn’t work

P = (4,9), (2,9), (1,6), (1,8)

M = (3,7), (1,6), (2,9) ⊗ P = (4,9), (3,7), (1,8)

P = (2,9), (4,9), (4,10)

M = (4,9), (1,8), (3,7) ⊗ P = (2,9), (1,8), (4,10), (3,7)

P = (5,6)

M = (2,9), (1,8), (4,10), (3,7) ⊗ P = (2,9), (1,8), (4,10), (3,7), (5,6)
