cs419 lec5 lexical analysis using dfa
WELCOME TO A JOURNEY TO
Compilers
CS419 Lecture 5
Scanning Using Deterministic Finite Automata (DFA)
Cairo University, FCI
Dr. Hussien Sharaf
Computer Science [email protected]
FINITE AUTOMATA FA
Its goal is to act as a recognizer for a specific language/pattern.
A problem can often be presented in the form of a decision problem that is answered by Yes/No.
Hence an FA (a machine with limited memory) can solve those problems that need only a fixed, finite amount of memory.
FAs used to be hard-wired devices, like the controller of a lift.
Dr. Hussien M. Sharaf 2
IMPLEMENTATION OF FINITE AUTOMATA IN CODE
Possible ways:
1. Use the position in the code (nested within tests) to maintain the state implicitly.
2. Use a variable to maintain the current state and write transitions as doubly nested case statements inside a loop, where the first case statement tests the current state and the nested second level tests the input character given the state, or vice versa.
EXAMPLE FOR METHOD 1
The position in the code maintains the state implicitly.
CODE 1
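The code from this slide is not preserved in the transcript. As a stand-in, here is a minimal Python sketch of method 1. It assumes the language of identifiers (a letter followed by letters or digits), and the name `is_identifier` is hypothetical. There is no state variable: which statement is executing tells us which state the automaton is in.

```python
def is_identifier(text: str) -> bool:
    """Recognize letter (letter | digit)* with the state kept implicitly.

    Assumption: Python's isalpha/isalnum stand in for the letter and digit
    classes (they also accept non-ASCII letters and digits).
    """
    pos = 0
    # Start state: we are here only before any character is consumed.
    if pos < len(text) and text[pos].isalpha():
        pos += 1
        # Accepting state: being inside this loop means a letter was seen.
        while pos < len(text) and text[pos].isalnum():
            pos += 1
        # Accept only if the whole input was consumed in the accepting state.
        return pos == len(text)
    # No leading letter: the automaton never leaves the start state.
    return False
```

For instance, `is_identifier("x1")` is true and `is_identifier("1x")` is false.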
EXAMPLE FOR METHOD 2
A variable maintains the current state.
CODE 2
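The code from this slide is not preserved in the transcript. As a stand-in, here is a minimal Python sketch of method 2, again assuming the language of identifiers. The state names and the function name `is_identifier_v2` are hypothetical, and `if`/`elif` chains play the role of the doubly nested case statements.

```python
START, IN_ID, ERROR = "start", "in_id", "error"  # hypothetical state names

def is_identifier_v2(text: str) -> bool:
    """Recognize letter (letter | digit)* with an explicit state variable."""
    state = START
    for ch in text:                  # the loop over the input
        if state == START:           # outer test: current state
            if ch.isalpha():         # inner test: input character
                state = IN_ID
            else:
                state = ERROR
        elif state == IN_ID:
            if ch.isalnum():
                state = IN_ID
            else:
                state = ERROR
        else:
            # ERROR is a trap state: every character keeps us here.
            state = ERROR
    return state == IN_ID            # IN_ID is the only accepting state
```

Compared with method 1, adding a state here means adding one more branch rather than restructuring the control flow.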
DETERMINISTIC FINITE AUTOMATA DFA
There is a fixed number of states, and the DFA can be in only one state at a time.
DFA = a 5-tuple (Q, Σ, δ, q0, F)
Q: {q0, q1, q2, …} is the set of states.
Σ: {a, b, …} is the alphabet.
δ (delta): a transition function, which is a total function from Q × Σ to Q. This function takes a state and an input symbol as arguments and returns a single state.
δ: Q × Σ → Q
q0 ∈ Q is the start state.
F ⊆ Q is the set of final/accepting states.
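The 5-tuple can be written down almost literally in code. The sketch below is one possible encoding, not part of the lecture: the class name `DFA`, its field names, and the example automaton `ends_in_b` are all hypothetical. The constructor checks that q0 is in Q, that F is a subset of Q, and that δ is total.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: set      # Q
    alphabet: set    # Σ
    delta: dict      # δ, stored as {(state, symbol): state}
    start: str       # q0
    finals: set      # F

    def __post_init__(self):
        if self.start not in self.states:
            raise ValueError("q0 must be in Q")
        if not self.finals <= self.states:
            raise ValueError("F must be a subset of Q")
        # δ must be total: one target state for every (state, symbol) pair.
        for q in self.states:
            for s in self.alphabet:
                if self.delta.get((q, s)) not in self.states:
                    raise ValueError(f"delta is not total at {(q, s)}")

    def accepts(self, word: str) -> bool:
        q = self.start
        for s in word:
            q = self.delta[(q, s)]   # exactly one state at a time
        return q in self.finals

# Hypothetical example: words over {a, b} that end in b.
ends_in_b = DFA(
    states={"q0", "q1"},
    alphabet={"a", "b"},
    delta={("q0", "a"): "q0", ("q0", "b"): "q1",
           ("q1", "a"): "q0", ("q1", "b"): "q1"},
    start="q0",
    finals={"q1"},
)
```

Here `ends_in_b.accepts("ab")` is true and `ends_in_b.accepts("ba")` is false.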
TRANSITION FUNCTION
δ: Q × Σ → Q maps from the domain of (state, letter) pairs to the range of states.
[Figure: mapping diagram from Q × Σ, with elements (q0, a), (q2, b), (q1, b), to Q, with elements q1, q2, q3]
RENAMING FUNCTION TO λ (LAMBDA)
Let's name it a conditional link instead of a transition.
λ: Q × Q → Σ maps from the domain of (state, state) pairs to the range of letters.
[Figure: mapping diagram from Q × Q, with elements (q0, q1), (q2, q3), (q1, q3), to Σ, with elements a, b, c]
This allows more than one link from the domain to the codomain, so it is not recommended.
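One way to see the problem with the (state, state) view is to regroup a transition function by source and target state. The sketch below uses a hypothetical δ and a hypothetical helper `links`. When two letters connect the same pair of states, the pair maps to a set of letters rather than to a single letter, so the result is no longer a function into Σ.

```python
from collections import defaultdict

def links(delta: dict) -> dict:
    """Regroup {(state, letter): state} as {(state, state): set of letters}."""
    grouped = defaultdict(set)
    for (src, letter), dst in delta.items():
        grouped[(src, dst)].add(letter)
    return dict(grouped)

# Hypothetical δ: both a and b lead from q0 to q1.
delta = {("q0", "a"): "q1", ("q0", "b"): "q1",
         ("q1", "a"): "q1", ("q1", "b"): "q0"}

lam = links(delta)
# lam[("q0", "q1")] holds two letters, {"a", "b"}: one pair, several links.
```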
EXAMPLE 1.1
Build an FA that accepts only aab. L = {aab}
[Figure: state diagram with start state S1 (-), states S2 and S3, final state S4 (+), and state S5 with a self-loop on a, b]
Transition table:
      a    b
S1    S2   S5
S2    S3   S5
S3    S5   S4
S4    S5   S5
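The transition table can be run directly as a lookup table. The sketch below copies rows S1 to S4 from the table; the S5 row (a self-loop on a and b) is read off the diagram labels and is an assumption, as is the function name `accepts_aab`.

```python
# Rows S1..S4 come from the transition table. The S5 row is an assumption
# based on the a,b self-loop in the diagram: S5 acts as a trap state.
TABLE = {
    "S1": {"a": "S2", "b": "S5"},
    "S2": {"a": "S3", "b": "S5"},
    "S3": {"a": "S5", "b": "S4"},
    "S4": {"a": "S5", "b": "S5"},
    "S5": {"a": "S5", "b": "S5"},
}
START = "S1"
FINALS = {"S4"}

def accepts_aab(word: str) -> bool:
    state = START
    for letter in word:
        state = TABLE[state][letter]   # one table lookup per input letter
    return state in FINALS
```

Only `accepts_aab("aab")` is true; any shorter, longer, or different word ends in a non-final state.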
EXAMPLE 1.2
Build a DFA that accepts only aab. This DFA is not well defined.
[Figure: state diagram with start state S1 (-), states S2 and S3, and final state S4 (+), linked by a, a, b]
Transition table:
      a    b
S1    S2   ?
S2    S3   ?
S3    ?    S4
S4    ?    ?
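The "?" entries can be found mechanically. The sketch below stores only the three links that a DFA for aab needs (S1 to S2 on a, S2 to S3 on a, S3 to S4 on b), lists the undefined pairs, and then completes the table with a trap state. The helper names are hypothetical, and naming the trap state S5 follows Example 1.1.

```python
def missing_transitions(table, states, alphabet):
    """Return every (state, letter) pair for which the table has no entry."""
    return [(q, s) for q in states for s in alphabet if (q, s) not in table]

def complete(table, states, alphabet, trap="S5"):
    """Make the table total by sending every undefined pair to a trap state."""
    total = dict(table)
    for q in list(states) + [trap]:
        for s in alphabet:
            total.setdefault((q, s), trap)
    return total

STATES = ["S1", "S2", "S3", "S4"]
ALPHABET = ["a", "b"]
PARTIAL = {("S1", "a"): "S2", ("S2", "a"): "S3", ("S3", "b"): "S4"}

missing = missing_transitions(PARTIAL, STATES, ALPHABET)
total = complete(PARTIAL, STATES, ALPHABET)
```

After `complete`, `missing_transitions(total, STATES + ["S5"], ALPHABET)` is empty, which is the well-defined machine of Example 1.1.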
EX2
(a+b)*
[Figure: a single state that is both start and final (±), with a self-loop on a, b]
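FA ACCEPTING NOTHING
1. FA with no final states.
[Figure: state diagram with a start state (-) and no final state; edge labels a, b, and a,b]
2. FA with a disconnected graph: the start state does not have a path to the final state.
[Figure: state diagram with a start state (-) and a final state (+) that cannot be reached from it; edge labels a, b, and a,b]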
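Both cases reduce to one test: no final state is reachable from the start state. The sketch below checks this with a breadth-first search. The function name `accepts_nothing` and the two example automata are hypothetical stand-ins for the lost diagrams.

```python
from collections import deque

def accepts_nothing(delta: dict, start, finals: set) -> bool:
    """True if no final state can be reached from the start state."""
    seen = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        if q in finals:
            return False             # a reachable final state accepts a word
        for (src, _letter), dst in delta.items():
            if src == q and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return True

# Case 1 (hypothetical): no final states at all.
no_finals = {(1, "a"): 2, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}

# Case 2 (hypothetical): final state 3 exists but no path leads to it.
disconnected = {(1, "a"): 2, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2,
                (3, "a"): 3, (3, "b"): 3}
```

`accepts_nothing(no_finals, 1, set())` and `accepts_nothing(disconnected, 1, {3})` are both true.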
EX3
All words with an even count of letters. ((a+b)(a+b))*
[Figure: state 1 (±) and state 2, with an edge on a, b from 1 to 2 and an edge on a, b from 2 back to 1]
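As a small sketch of this two-state machine, the table below follows the diagram labels: state 1 is both start and final, and every letter flips between states 1 and 2. The function name `even_length` is hypothetical.

```python
# State 1 is the start state and the only final state.
DELTA = {(1, "a"): 2, (1, "b"): 2,
         (2, "a"): 1, (2, "b"): 1}

def even_length(word: str) -> bool:
    state = 1
    for letter in word:
        state = DELTA[(state, letter)]   # each letter toggles the state
    return state == 1
```

The empty word is accepted because the start state is final, which matches the star in ((a+b)(a+b))*.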
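EX4.1
All words that start with “a”. a(a+b)*
[Figure: two state diagrams, each with start state 1 (-), state 2 reached on b, and final state 3 (+) reached on a; edge labels a, b, and a,b. One diagram is annotated “Does not accept all inputs”.]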
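EX4.2
All words that start with “a”. a(a+b)*
[Figure: state diagram with start state 1 (-), state 2 reached on b, final state 3 (+), and an additional final state 4 (+); edge labels a, b, and a,b]
A special accept state for the string “a” might give better performance in a hardware implementation.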
ASSIGNMENT 2
Compilers_DFA_Sheet_2.pdf. Deadline is 7 March 2013.
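EX5
All words that start with a triple letter. (aaa+bbb)(a+b)*
[Figure: state diagram with start state 1 (-), states 2 and 3 on the a branch, states 4 and 5 on the b branch, and final state 6 (+) with a self-loop on a, b]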
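The sketch below encodes one reading of the diagram: 1 to 2 to 3 to 6 on a, 1 to 4 to 5 to 6 on b, and a self-loop on 6. That assignment of state numbers is an assumption, since the figure is lost. Pairs with no entry are treated as rejection instead of being routed to an explicit trap state, and the function name `starts_with_triple` is hypothetical.

```python
DELTA = {
    (1, "a"): 2, (2, "a"): 3, (3, "a"): 6,   # the aaa branch
    (1, "b"): 4, (4, "b"): 5, (5, "b"): 6,   # the bbb branch
    (6, "a"): 6, (6, "b"): 6,                # final state loops on any letter
}

def starts_with_triple(word: str) -> bool:
    state = 1
    for letter in word:
        state = DELTA.get((state, letter))
        if state is None:
            return False    # undefined pair: behaves like a trap state
    return state == 6
```

For example, `starts_with_triple("bbbab")` is true, while `starts_with_triple("aab")` is false.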
EX6
{aa, ba, baba, aaaa, bbba, abba, aaabaa, …}
All words with an even count of letters that end with “a”. ((a+b)(a+b))*(a+b)a
[Figure: state diagram with a start state (-) and a final state (+); edge labels a, b, and a,b]
EX7
{aa, ba, baba, aaaa, ab, bb, bababa, aaba, …}
All words with an even count of letters that have “a” in every even position from the start, where the first letter is letter number one.
(a+b)a((a+b)a)*
[Figure: state diagram with a start state (-) and an edge labeled a, b]
EX8 COMPILERS CONSTRUCTION PAGE 50
Consider the following FA:
• It accepts identifiers; symbols that are not accepted lead to error handling.
• An identifier must start with a letter.
letter = [a-zA-Z]
Other = ~letter
Other# = ~(letter|digit)
[Figure: state diagram with a start state and an error state; edge labels letter, digit, letter, other, Other#, any]
EX9 COMPILERS CONSTRUCTION PAGE 52
Integer constants
[Figure: state diagram with a start state and an error state; edge labels digit, -, digit, digit, other, other]
EX10 COMPILERS CONSTRUCTION PAGE 52
Decimal/floating constants
[Figure: state diagram with a start state, accepting states (+), and an error state; edge labels digit, -, digit, digit, other, other, digit, ., digit, other]
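Since the diagram is lost, the sketch below shows only one plausible machine for decimal constants. It assumes an optional leading "-", one or more digits, and an optional "." followed by one or more digits. The state names, `char_class`, and `is_decimal_constant` are hypothetical. Characters are first mapped to classes (digit, minus, dot, other), and any pair missing from the table goes to the error state.

```python
def char_class(ch: str) -> str:
    """Map a character to the class used as the table's column."""
    if ch in "0123456789":
        return "digit"
    if ch == "-":
        return "minus"
    if ch == ".":
        return "dot"
    return "other"

# Assumed grammar: -? digit+ ( . digit+ )?
DELTA = {
    ("start", "minus"): "signed", ("start", "digit"): "int",
    ("signed", "digit"): "int",
    ("int", "digit"): "int", ("int", "dot"): "point",
    ("point", "digit"): "frac",
    ("frac", "digit"): "frac",
}
ACCEPTING = {"int", "frac"}

def is_decimal_constant(text: str) -> bool:
    state = "start"
    for ch in text:
        # Undefined pairs, including everything from "error", go to "error".
        state = DELTA.get((state, char_class(ch)), "error")
    return state in ACCEPTING
```

Under these assumptions `is_decimal_constant("-3.14")` and `is_decimal_constant("42")` are true, while `"3."`, `".5"`, and `"1.2.3"` are rejected.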
THANK YOU