Chapter 3. Lexical Analysis (2)

Nondeterministic Finite Automata

A nondeterministic finite automaton (NFA) is a mathematical model that consists of

1. a set of states S
2. a set of input symbols Σ (the input symbol alphabet)
3. a transition function move that maps state-symbol pairs to sets of states
4. a state s0 that is distinguished as the start (or initial) state
5. a set of states F distinguished as accepting (or final) states

STATE    INPUT SYMBOL
         a         b
0        {0, 1}    {0}
1        --        {2}
2        --        {3}

Fig. 3.20. Transition table for the finite automaton of Fig. 3.19.

[Figure: transition graph with states 0 (start), 1, 2, and 3 (accepting); edges are labeled a and b.]

Fig. 3.19. A nondeterministic finite automaton.
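The transition table of Fig. 3.20 can be run directly as a program. The following is a minimal sketch, not a definitive implementation: the names MOVE, START, ACCEPTING, and nfa_accepts are illustrative, and it assumes state 3 is the only accepting state.

```python
# Transition table of Fig. 3.20: (state, symbol) -> set of next states.
# Missing keys stand for the "--" entries (no transition).
MOVE = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPTING = {3}  # assumption: state 3 is the accepting state of Fig. 3.19


def nfa_accepts(text):
    """Track every state the NFA could be in after each input symbol."""
    current = {START}
    for symbol in text:
        nxt = set()
        for state in current:
            nxt |= MOVE.get((state, symbol), set())
        current = nxt
    return bool(current & ACCEPTING)


print(nfa_accepts("aabb"))  # True
print(nfa_accepts("abab"))  # False
```

Keeping a set of current states is what makes the simulation nondeterministic: on input a, state 0 moves to both 0 and 1 at once.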

Deterministic Finite Automata

A deterministic finite automaton (DFA) is a special case of a nondeterministic finite automaton in which

1. no state has an ε-transition, i.e., a transition on input ε, and
2. for each state s and input symbol a, there is at most one edge labeled a leaving s.
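The two conditions can be checked mechanically on a transition table. This is a small sketch under stated assumptions: the table layout (a dict from state-symbol pairs to sets), the use of the empty string for ε, and both example tables are illustrative choices, not taken from the slides.

```python
EPSILON = ""  # assumption: the empty string labels ε-transitions


def is_deterministic(move):
    """Return True if the table satisfies both DFA conditions.

    `move` maps (state, symbol) pairs to sets of next states.
    """
    for (state, symbol), targets in move.items():
        if symbol == EPSILON and targets:
            return False  # condition 1 violated: an ε-transition exists
        if len(targets) > 1:
            return False  # condition 2 violated: two edges share one label
    return True


# Hypothetical tables used only to exercise the check.
nfa_table = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
dfa_table = {(0, "a"): {1}, (0, "b"): {0}, (1, "a"): {1}, (1, "b"): {2}}

print(is_deterministic(nfa_table))  # False
print(is_deterministic(dfa_table))  # True
```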

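Fig. 3.21. NFA accepting aa* | bb*.

[Figure: transition graph. From start state 0, ε-transitions lead to states 1 and 3. State 1 goes to accepting state 2 on a, and state 2 loops on a. State 3 goes to accepting state 4 on b, and state 4 loops on b.]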

Fig. 3.23. DFA accepting (a|b)*abb.

[Figure: transition graph with states 0 (start), 1, 2, and 3 (accepting); each state has one outgoing edge on a and one on b.]

Fig. 3.24. Operations on NFA states.

OPERATION       DESCRIPTION
ε-closure(s)    Set of NFA states reachable from NFA state s on ε-transitions alone.
ε-closure(T)    Set of NFA states reachable from some NFA state s in T on ε-transitions alone.
move(T, a)      Set of NFA states to which there is a transition on input symbol a from some NFA state s in T.
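The operations of Fig. 3.24 translate into short set computations. The sketch below is one possible rendering: the function names epsilon_closure and move_on, the dict-based table, and the three-state NFA are all hypothetical and serve only to exercise the definitions.

```python
EPSILON = ""  # assumption: the empty string labels ε-transitions


def epsilon_closure(states, move):
    """ε-closure(T): states reachable from T on ε-transitions alone."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in move.get((state, EPSILON), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def move_on(states, symbol, move):
    """move(T, a): states reachable from some state in T on one `symbol`."""
    result = set()
    for state in states:
        result |= move.get((state, symbol), set())
    return result


# Hypothetical three-state NFA, used only to exercise the functions.
TABLE = {
    (0, EPSILON): {1},
    (1, EPSILON): {2},
    (1, "a"): {1},
    (2, "b"): {0},
}

print(sorted(epsilon_closure({0}, TABLE)))      # [0, 1, 2]
print(sorted(move_on({0, 1, 2}, "a", TABLE)))   # [1]
```

ε-closure(s) for a single state s is the special case epsilon_closure({s}, move).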

Example 3.15

ε-closure(move(A, a)) = ε-closure(move({0, 1, 2, 4, 7}, a))
                      = ε-closure({3, 8})
                      = {1, 2, 3, 4, 6, 7, 8}

C = ε-closure({5}) = {1, 2, 4, 5, 6, 7}

The five different sets of states are:

A = {0, 1, 2, 4, 7}            D = {1, 2, 4, 5, 6, 7, 9}
B = {1, 2, 3, 4, 6, 7, 8}      E = {1, 2, 4, 5, 6, 7, 10}
C = {1, 2, 4, 5, 6, 7}
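The sets in Example 3.15 can be reproduced with a short subset-construction sketch. The edge list below is an assumption: it is a reading of the NFA N of Fig. 3.27 that is consistent with the sets computed in the example, and it should be checked against the figure. The function and variable names are illustrative.

```python
EPSILON = ""  # assumption: the empty string labels ε-transitions

# Assumed edge list for the NFA N of Fig. 3.27; consistent with the sets
# in Example 3.15, but it should be checked against the figure.
NFA = {
    (0, EPSILON): {1, 7}, (1, EPSILON): {2, 4},
    (2, "a"): {3}, (4, "b"): {5},
    (3, EPSILON): {6}, (5, EPSILON): {6}, (6, EPSILON): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}


def epsilon_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        for target in NFA.get((stack.pop(), EPSILON), set()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


def move(states, symbol):
    return {t for s in states for t in NFA.get((s, symbol), set())}


def subset_construction(start, alphabet):
    """Return (dstates, dtran); dstates lists the sets in discovery order."""
    dstates = [epsilon_closure({start})]
    dtran = {}
    unmarked = [dstates[0]]
    while unmarked:
        current = unmarked.pop(0)          # process sets first-in, first-out
        for symbol in alphabet:
            target = epsilon_closure(move(current, symbol))
            if not target:
                continue                   # no transition on this symbol
            if target not in dstates:
                dstates.append(target)
                unmarked.append(target)
            dtran[(current, symbol)] = target
    return dstates, dtran


dstates, dtran = subset_construction(0, "ab")
names = {s: chr(ord("A") + i) for i, s in enumerate(dstates)}
for s in dstates:
    print(names[s], sorted(s), names[dtran[(s, "a")]], names[dtran[(s, "b")]])
```

With first-in, first-out processing the sets are discovered in the order A, B, C, D, E, so the printed names line up with those used in the example.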

Fig. 3.27. NFA N for (a|b)*abb.

[Figure: transition graph with states 0 (start) through 10 (accepting); edges are labeled a, b, and ε.]

Fig. 3.28. Transition table Dtran for DFA.

STATE    INPUT SYMBOL
         a    b
A        B    C
B        B    D
C        B    C
D        B    E
E        B    C
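Once Dtran is known, recognizing a string takes one table lookup per input symbol. The sketch below copies the entries of Fig. 3.28. It assumes A is the start state and E is the only accepting state, since E is the only set that contains NFA state 10; the names DTRAN and dfa_accepts are illustrative.

```python
# Dtran from Fig. 3.28: (state, symbol) -> next state.
DTRAN = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}
START = "A"
ACCEPTING = {"E"}  # assumption: E accepts because it contains NFA state 10


def dfa_accepts(text):
    """Follow exactly one transition per input symbol."""
    state = START
    for symbol in text:
        state = DTRAN.get((state, symbol))
        if state is None:      # symbol outside the alphabet {a, b}
            return False
    return state in ACCEPTING


print(dfa_accepts("abb"))   # True
print(dfa_accepts("abba"))  # False
```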

Fig. 3.29. Result of applying the subset construction to Fig. 3.27.

[Figure: transition graph of the DFA with states A (start), B, C, D, and E (accepting); edges are labeled a and b.]

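Thompson’s construction (1/2)

1. For ε, construct the NFA

[Figure: start state i with an ε-edge to accepting state f.]

2. For a in Σ, construct the NFA

[Figure: start state i with an edge labeled a to accepting state f.]

3. Suppose N(s) and N(t) are NFA’s for regular expressions s and t.

a) For the regular expression s|t, construct the following composite NFA N(s|t):

[Figure: new start state i with ε-edges to the start states of N(s) and N(t), and ε-edges from the accepting states of N(s) and N(t) to a new accepting state f.]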

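Thompson’s construction (2/2)

b) For the regular expression st, construct the composite NFA N(st):

[Figure: N(s) followed by N(t). The start state i of N(s) is the start state of N(st), the accepting state of N(s) is merged with the start state of N(t), and the accepting state f of N(t) is the accepting state of N(st).]

c) For the regular expression s*, construct the composite NFA N(s*):

[Figure: new start state i and new accepting state f, with ε-edges from i to the start state of N(s) and to f, and from the accepting state of N(s) back to its start state and to f.]

d) For the parenthesized regular expression (s), use N(s) itself as the NFA.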
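The rules of Thompson’s construction can be sketched as a recursive builder. Several things here are assumptions rather than part of the slides: the regular expression arrives as nested tuples (a hypothetical syntax tree, so rule d needs no code), the class and function names are illustrative, and for st the builder links N(s) to N(t) with an ε-edge instead of merging the two states, which accepts the same language at the cost of one extra transition.

```python
import itertools

EPSILON = ""  # assumption: the empty string labels ε-transitions


class Thompson:
    """Builds an NFA from a regex given as nested tuples (an assumed AST)."""

    def __init__(self):
        self.edges = {}                  # (state, label) -> set of states
        self._ids = itertools.count()

    def _edge(self, src, label, dst):
        self.edges.setdefault((src, label), set()).add(dst)

    def build(self, node):
        """Return (start, accept) of the NFA fragment for `node`."""
        kind = node[0]
        if kind in ("eps", "sym"):                   # rules 1 and 2
            i, f = next(self._ids), next(self._ids)
            self._edge(i, EPSILON if kind == "eps" else node[1], f)
            return i, f
        if kind == "cat":                            # rule 3b
            s_start, s_accept = self.build(node[1])
            t_start, t_accept = self.build(node[2])
            self._edge(s_accept, EPSILON, t_start)   # ε-link, not a merge
            return s_start, t_accept
        if kind == "alt":                            # rule 3a
            s_start, s_accept = self.build(node[1])
            t_start, t_accept = self.build(node[2])
            i, f = next(self._ids), next(self._ids)
            self._edge(i, EPSILON, s_start)
            self._edge(i, EPSILON, t_start)
            self._edge(s_accept, EPSILON, f)
            self._edge(t_accept, EPSILON, f)
            return i, f
        if kind == "star":                           # rule 3c
            s_start, s_accept = self.build(node[1])
            i, f = next(self._ids), next(self._ids)
            self._edge(i, EPSILON, s_start)
            self._edge(i, EPSILON, f)
            self._edge(s_accept, EPSILON, s_start)
            self._edge(s_accept, EPSILON, f)
            return i, f
        raise ValueError(f"unknown node kind: {kind!r}")


def accepts(edges, start, accept, text):
    """Simulate the built NFA on `text`."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            for target in edges.get((stack.pop(), EPSILON), set()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    current = closure({start})
    for symbol in text:
        current = closure(
            {t for s in current for t in edges.get((s, symbol), set())}
        )
    return accept in current


# (a|b)*abb written as nested tuples.
a_or_b = ("alt", ("sym", "a"), ("sym", "b"))
regex = ("cat", ("cat", ("cat", ("star", a_or_b), ("sym", "a")),
                 ("sym", "b")), ("sym", "b"))

builder = Thompson()
start, accept = builder.build(regex)
print(accepts(builder.edges, start, accept, "babb"))  # True
print(accepts(builder.edges, start, accept, "ab"))    # False
```

Each rule adds at most two new states, so the number of states in the resulting NFA grows linearly with the length of the regular expression.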

Fig. 3.32. Space and time taken to recognize regular expressions.

AUTOMATON    SPACE        TIME
NFA          O(|r|)       O(|r| × |x|)
DFA          O(2^|r|)     O(|x|)
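The exponential space bound for DFAs is reached by some regular expressions. The sketch below uses a standard illustration that does not appear on the slide: the language "the n-th symbol from the end is a", that is, (a|b)*a followed by n-1 copies of (a|b). The NFA encoded in step has n+1 states and no ε-transitions, and the function counts how many DFA states the subset construction reaches. The names are illustrative.

```python
def count_dfa_states(n):
    """Count DFA states reached by the subset construction for
    (a|b)*a followed by n-1 copies of (a|b)."""
    # NFA states are 0..n. State 0 loops on a and b, and on an `a` it also
    # guesses that exactly n-1 symbols remain. State n is accepting.
    def step(states, symbol):
        nxt = set()
        for s in states:
            if s == 0:
                nxt.add(0)
                if symbol == "a":
                    nxt.add(1)
            elif s < n:
                nxt.add(s + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for symbol in "ab":
            target = step(current, symbol)
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return len(seen)


for n in range(1, 6):
    print(n + 1, count_dfa_states(n))  # NFA states, DFA states
```

The NFA grows by one state per step while the DFA doubles, because the DFA must remember which of the last n symbols were a.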

Fig. 3.35. NFA recognizing three different patterns.

[Figure, panel (a): three separate NFAs with start states 1, 3, and 7, one per pattern; edges are labeled a and b.]

(a) NFA for a, abb, and a*b+.

[Figure, panel (b): the three NFAs joined under a new start state 0.]

(b) Combined NFA.
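A combined NFA like the one in panel (b) lets a lexer try all patterns at once and keep the longest match. The following is a sketch under assumptions: the edge list is a reading of the combined NFA and should be checked against the figure, ties between patterns are broken in favor of the pattern listed first, and the names EDGES, ACCEPTING, and longest_match are illustrative.

```python
EPSILON = ""  # assumption: the empty string labels ε-transitions

# Assumed edges of the combined NFA: new start state 0 with ε-edges to the
# start states 1, 3, and 7 of the three pattern NFAs.
EDGES = {
    (0, EPSILON): {1, 3, 7},
    (1, "a"): {2},                                   # pattern a
    (3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6},     # pattern abb
    (7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8},     # pattern a*b+
}
# Accepting state -> pattern, listed in priority order (earlier wins ties).
ACCEPTING = {2: "a", 6: "abb", 8: "a*b+"}


def closure(states):
    seen, stack = set(states), list(states)
    while stack:
        for target in EDGES.get((stack.pop(), EPSILON), set()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def longest_match(text):
    """Return (lexeme, pattern) for the longest matching prefix, or None."""
    current = closure({0})
    best = None
    for pos, symbol in enumerate(text, start=1):
        current = closure(
            {t for s in current for t in EDGES.get((s, symbol), set())}
        )
        if not current:
            break                       # no pattern can match any further
        for state, pattern in ACCEPTING.items():
            if state in current:
                best = (text[:pos], pattern)
                break
    return best


print(longest_match("aaba"))  # ('aab', 'a*b+')
print(longest_match("abb"))   # ('abb', 'abb')
```

On input abb both abb and a*b+ match the whole string, and the sketch reports abb because it is listed first.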

Fig. 3.38. NFA recognizing Fortran keyword IF.

[Figure: transition graph with states 0 (start) through 6 (accepting); edges are labeled I, F, (, any, ), and letter.]

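Fig. 3.41. firstpos and lastpos for nodes in syntax tree for (a|b)*abb#.

NODE                        firstpos     lastpos
cat: (a|b)*abb# (root)      {1,2,3}      {6}
cat: (a|b)*abb              {1,2,3}      {5}
cat: (a|b)*ab               {1,2,3}      {4}
cat: (a|b)*a                {1,2,3}      {3}
star: (a|b)*                {1,2}        {1,2}
leaf # (position 6)         {6}          {6}
leaf b (position 5)         {5}          {5}
leaf b (position 4)         {4}          {4}
leaf a (position 3)         {3}          {3}
or: a|b                     {1,2}        {1,2}
leaf b (position 2)         {2}          {2}
leaf a (position 1)         {1}          {1}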

Fig. 3.42. The function followpos.

NODE    followpos
1       {1, 2, 3}
2       {1, 2, 3}
3       {4}
4       {5}
5       {6}
6       -
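The values of firstpos, lastpos, and followpos for (a|b)*abb# can be recomputed with one pass over the syntax tree. This sketch assumes the usual rules for nullable, firstpos, lastpos, and followpos, which the slides do not spell out. The tuple encoding of the tree and the function name analyze are illustrative.

```python
from collections import defaultdict

followpos = defaultdict(set)


def analyze(node):
    """Return (nullable, firstpos, lastpos) of `node` and fill in followpos."""
    kind = node[0]
    if kind == "leaf":                    # ("leaf", symbol, position)
        return False, {node[2]}, {node[2]}
    if kind == "or":
        n1, f1, l1 = analyze(node[1])
        n2, f2, l2 = analyze(node[2])
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyze(node[1])
        n2, f2, l2 = analyze(node[2])
        for pos in l1:                    # firstpos(c2) follows lastpos(c1)
            followpos[pos] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f1, l1 = analyze(node[1])
        for pos in l1:                    # the starred body can repeat
            followpos[pos] |= f1
        return True, f1, l1
    raise ValueError(f"unknown node kind: {kind!r}")


# Syntax tree for (a|b)*abb# with leaf positions 1 through 6.
tree = ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2)))
for symbol, position in [("a", 3), ("b", 4), ("b", 5), ("#", 6)]:
    tree = ("cat", tree, ("leaf", symbol, position))

nullable, firstpos, lastpos = analyze(tree)
print(sorted(firstpos), sorted(lastpos))  # [1, 2, 3] [6]
for position in range(1, 7):
    print(position, sorted(followpos[position]))
```

Position 6, the end marker #, has an empty followpos set, which corresponds to the "-" entry in Fig. 3.42.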
