computer languages - finite automata & lexical analysis · finite automata lexical analysis...

124
Finite automata Lexical analysis Computer Languages Finite automata & Lexical analysis 1 Finite automata Recognizing words Deterministic automata Nondeterministic automata 2 Lexical analysis January 21st

Upload: others

Post on 23-Sep-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Computer LanguagesFinite automata & Lexical analysis

1 Finite automataRecognizing wordsDeterministic automataNondeterministic automata

2 Lexical analysis

January 21st

Page 2: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 3: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 4: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 5: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 6: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 7: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 8: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 9: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Plan

What we know

1 Regular expressions can be usedto describe some (not soelaborate) languages.

Numbers in a programminglanguage, keywords in aprogramming language, e-mailaddresses, dates.

2 Scanner Generators can be usedto create a program that reads asequence of characters andidentifies the words of thelanguage described by a regularexpression!

What we will learn today

1 What kind of program is ascanner?

2 How can such a programbe generated by anotherprogram?

3 How can a scanner beused to do lexicalanalysis?

Page 10: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 11: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 12: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 13: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 14: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 15: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 16: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 17: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 18: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 19: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 20: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 21: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 22: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

One word

We want to recognize whether a stringforms the word for

readChar(c);

if(c!=’f’) do something else

else

readChar(c);

if(c!=’o’) do something else

else

readChar(c);

if(c!=’r’) do something else

else report success

We can represent thecode fragment using thediagram:

Page 23: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

States and transitions

s0, s1, s2, s3 are called states

s0 is marked as the initial state.

s3 is marked as one final state

x−−→ represent transitions from state to statebased on the input character.

Page 24: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

States and transitions

s0, s1, s2, s3 are called states

s0 is marked as the initial state.

s3 is marked as one final state

x−−→ represent transitions from state to statebased on the input character.

Page 25: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

States and transitions

s0, s1, s2, s3 are called states

s0 is marked as the initial state.

s3 is marked as one final state

x−−→ represent transitions from state to statebased on the input character.

Page 26: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

States and transitions

s0, s1, s2, s3 are called states

s0 is marked as the initial state.

s3 is marked as one final state

x−−→ represent transitions from state to statebased on the input character.

Page 27: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

States and transitions

s0, s1, s2, s3 are called states

s0 is marked as the initial state.

s3 is marked as one final state

x−−→ represent transitions from state to statebased on the input character.

Page 28: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More than one word

Just add proper code in the do something else fragments!

Page 29: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More than one word

Just add proper code in the do something else fragments!

Page 30: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Encoding an automata

Transitions (δ)

δ f i n o r t Others0 s5 s1 sE sE sE sE sEs1 s2 sE s3 sE sE sE sEs2 sE sE sE sE sE sE sEs3 sE sE sE sE sE s4 sEs4 sE sE sE sE sE sE sEs5 sE sE sE s6 sE sE sEs6 sE sE sE sE s7 sE sEsE sE sE sE sE sE sE sE

The automaton acceptsor rejects a string asfollows.

Starting in the startstate, for each char c inthe input change stateaccording to δ(s, c).After making ntransitions for ann-character string acceptif the state is a finalstate. Reject otherwise.

Page 31: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Encoding an automata

Transitions (δ)

δ f i n o r t Others0 s5 s1 sE sE sE sE sEs1 s2 sE s3 sE sE sE sEs2 sE sE sE sE sE sE sEs3 sE sE sE sE sE s4 sEs4 sE sE sE sE sE sE sEs5 sE sE sE s6 sE sE sEs6 sE sE sE sE s7 sE sEsE sE sE sE sE sE sE sE

The automaton acceptsor rejects a string asfollows.

Starting in the startstate, for each char c inthe input change stateaccording to δ(s, c).After making ntransitions for ann-character string acceptif the state is a finalstate. Reject otherwise.

Page 32: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Encoding an automata

Transitions (δ)

δ f i n o r t Others0 s5 s1 sE sE sE sE sEs1 s2 sE s3 sE sE sE sEs2 sE sE sE sE sE sE sEs3 sE sE sE sE sE s4 sEs4 sE sE sE sE sE sE sEs5 sE sE sE s6 sE sE sEs6 sE sE sE sE s7 sE sEsE sE sE sE sE sE sE sE

The automaton acceptsor rejects a string asfollows.

Starting in the startstate, for each char c inthe input change stateaccording to δ(s, c).After making ntransitions for ann-character string acceptif the state is a finalstate. Reject otherwise.

Page 33: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More complex words

What automaton recognizesnumbers?

Doesn’t work for numbers with morethan 4 digits!

We want to say thatδ(s2, 0) = s2, δ(s2, 1) = s2,δ(s2, 2) = s2,. . . δ(s2, 9) = s2.

Page 34: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More complex words

What automaton recognizesnumbers?

Doesn’t work for numbers with morethan 4 digits!

We want to say thatδ(s2, 0) = s2, δ(s2, 1) = s2,δ(s2, 2) = s2,. . . δ(s2, 9) = s2.

Page 35: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More complex words

What automaton recognizesnumbers?

Doesn’t work for numbers with morethan 4 digits!

We want to say thatδ(s2, 0) = s2, δ(s2, 1) = s2,δ(s2, 2) = s2,. . . δ(s2, 9) = s2.

Page 36: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More complex words

What automaton recognizesnumbers?

Doesn’t work for numbers with morethan 4 digits!

We want to say thatδ(s2, 0) = s2, δ(s2, 1) = s2,δ(s2, 2) = s2,. . . δ(s2, 9) = s2.

Page 37: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

More complex words

What automaton recognizesnumbers?

Doesn’t work for numbers with morethan 4 digits!

We want to say thatδ(s2, 0) = s2, δ(s2, 1) = s2,δ(s2, 2) = s2,. . . δ(s2, 9) = s2.

Page 38: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Deterministic finite state automata

1 Finite number of states

2 From each state only onetransition for a given character.

For every regular expression there isa deterministic finite state automatathat recognizes its language.

For every finite automata there is aregular expression that describes thelanguage it recognizes.

The proof of this theorem isthe algorithm that is used by ascanner generator!

Page 39: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Deterministic finite state automata

1 Finite number of states

2 From each state only onetransition for a given character.

For every regular expression there isa deterministic finite state automatathat recognizes its language.

For every finite automata there is aregular expression that describes thelanguage it recognizes.

The proof of this theorem isthe algorithm that is used by ascanner generator!

Page 40: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Deterministic finite state automata

1 Finite number of states

2 From each state only onetransition for a given character.

For every regular expression there isa deterministic finite state automatathat recognizes its language.

For every finite automata there is aregular expression that describes thelanguage it recognizes.

The proof of this theorem isthe algorithm that is used by ascanner generator!

Page 41: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Deterministic finite state automata

1 Finite number of states

2 From each state only onetransition for a given character.

For every regular expression there isa deterministic finite state automatathat recognizes its language.

For every finite automata there is aregular expression that describes thelanguage it recognizes.

The proof of this theorem isthe algorithm that is used by ascanner generator!

Page 42: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Sketch of the proof (algorithm)

We will see how to associate a nondeterministic finite automata toeach regular expression!

There might be more than one edge labeled with the samesymbol leaving a state.

There might be transitions on the empty string (labeled ∈)

They are easier to construct, but they do not help asprograms!

We will then see how to associate a deterministic finite automatato a nondeterministic one!

Page 43: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Sketch of the proof (algorithm)

We will see how to associate a nondeterministic finite automata toeach regular expression!

There might be more than one edge labeled with the samesymbol leaving a state.

There might be transitions on the empty string (labeled ∈)

They are easier to construct, but they do not help asprograms!

We will then see how to associate a deterministic finite automatato a nondeterministic one!

Page 44: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Sketch of the proof (algorithm)

We will see how to associate a nondeterministic finite automata toeach regular expression!

There might be more than one edge labeled with the samesymbol leaving a state.

There might be transitions on the empty string (labeled ∈)

They are easier to construct, but they do not help asprograms!

We will then see how to associate a deterministic finite automatato a nondeterministic one!

Page 45: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

In the following arbitrary NFAs can be used in place of the NFAsfor a and b

a is a RE for a ∈ Σ

is a NFA for a

b is a RE for b ∈ Σ

is a NFA for b

ab is a regular expression

is a NFA for ab

Page 46: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

In the following arbitrary NFAs can be used in place of the NFAsfor a and b

a is a RE for a ∈ Σ

is a NFA for a

b is a RE for b ∈ Σ

is a NFA for b

ab is a regular expression

is a NFA for ab

Page 47: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

In the following arbitrary NFAs can be used in place of the NFAsfor a and b

a is a RE for a ∈ Σ

is a NFA for a

b is a RE for b ∈ Σ

is a NFA for b

ab is a regular expression

is a NFA for ab

Page 48: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

In the following arbitrary NFAs can be used in place of the NFAsfor a and b

a is a RE for a ∈ Σ

is a NFA for a

b is a RE for b ∈ Σ

is a NFA for b

ab is a regular expression

is a NFA for ab

Page 49: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

In the following arbitrary NFAs can be used in place of the NFAsfor a and b

a is a RE for a ∈ Σ

is a NFA for a

b is a RE for b ∈ Σ

is a NFA for b

ab is a regular expression

is a NFA for ab

Page 50: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

a|b is a regular expression

is a NFA for a|b

a∗ is a regular expression

is a NFA for a∗

Page 51: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

a|b is a regular expression

is a NFA for a|b

a∗ is a regular expression

is a NFA for a∗

Page 52: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

a|b is a regular expression

is a NFA for a|b

a∗ is a regular expression

is a NFA for a∗

Page 53: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

a|b is a regular expression

is a NFA for a|b

a∗ is a regular expression

is a NFA for a∗

Page 54: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Thompson’s construction

a|b is a regular expression

is a NFA for a|b

a∗ is a regular expression

is a NFA for a∗

Page 55: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

And now?

As we see, NFAs recognizingthe languages generated byregular expressions are easy toconstruct!

However, they are no so easyto implement! (How do wedeal with guessing?)

Page 56: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

And now?

As we see, NFAs recognizingthe languages generated byregular expressions are easy toconstruct!

However, they are no so easyto implement! (How do wedeal with guessing?)

Page 57: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

And now?

As we see, NFAs recognizingthe languages generated byregular expressions are easy toconstruct!

However, they are no so easyto implement! (How do wedeal with guessing?)

Page 58: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

NFA to DFA

There is a way to transform a NFA to a deterministic automata bysimulating that we make all possible choices at once!

Example

Page 59: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

NFA to DFA

There is a way to transform a NFA to a deterministic automata bysimulating that we make all possible choices at once!

Example

Page 60: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

NFA to DFA

There is a way to transform a NFA to a deterministic automata bysimulating that we make all possible choices at once!

Example

Page 61: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Why all this?

The lexical structure of computer languages is described usingregular expressions.

The first part of the compiler reads a sequence of characters

Ignores comments and white spaces.

Finds lexemes that correspond to the lexical structure of thelanguage.

Generates a sequence of tokens for the rest of the compiler!

It is a deterministic finite state automaton generated by a scannergenerator!

Page 62: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Why all this?

The lexical structure of computer languages is described usingregular expressions.

The first part of the compiler reads a sequence of characters

Ignores comments and white spaces.

Finds lexemes that correspond to the lexical structure of thelanguage.

Generates a sequence of tokens for the rest of the compiler!

It is a deterministic finite state automaton generated by a scannergenerator!

Page 63: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Why all this?

The lexical structure of computer languages is described usingregular expressions.

The first part of the compiler reads a sequence of characters

Ignores comments and white spaces.

Finds lexemes that correspond to the lexical structure of thelanguage.

Generates a sequence of tokens for the rest of the compiler!

It is a deterministic finite state automaton generated by a scannergenerator!

Page 64: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Why all this?

The lexical structure of computer languages is described usingregular expressions.

The first part of the compiler reads a sequence of characters

Ignores comments and white spaces.

Finds lexemes that correspond to the lexical structure of thelanguage.

Generates a sequence of tokens for the rest of the compiler!

It is a deterministic finite state automaton generated by a scannergenerator!

Page 65: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Why all this?

The lexical structure of computer languages is described usingregular expressions.

The first part of the compiler reads a sequence of characters

Ignores comments and white spaces.

Finds lexemes that correspond to the lexical structure of thelanguage.

Generates a sequence of tokens for the rest of the compiler!

It is a deterministic finite state automaton generated by a scannergenerator!

Page 66: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Why all this?

The lexical structure of computer languages is described usingregular expressions.

The first part of the compiler reads a sequence of characters

Ignores comments and white spaces.

Finds lexemes that correspond to the lexical structure of thelanguage.

Generates a sequence of tokens for the rest of the compiler!

It is a deterministic finite state automaton generated by a scannergenerator!

Page 67: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Lexeme

A legal word in a language.For example, in Java the words

while, class, A, empty, {

are all lexeme.

Token

A category of lexeme. For example, in Java, there are tokens for

WHILE where the only lexeme is while

IDENTIFIER where there are infinitely many lexemes, forexample A, empty.

OPENBRACE where the only lexeme is {

Page 68: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Lexeme

A legal word in a language.For example, in Java the words

while, class, A, empty, {

are all lexeme.

Token

A category of lexeme. For example, in Java, there are tokens for

WHILE where the only lexeme is while

IDENTIFIER where there are infinitely many lexemes, forexample A, empty.

OPENBRACE where the only lexeme is {

Page 69: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Lexeme

A legal word in a language.For example, in Java the words

while, class, A, empty, {

are all lexeme.

Token

A category of lexeme. For example, in Java, there are tokens for

WHILE where the only lexeme is while

IDENTIFIER where there are infinitely many lexemes, forexample A, empty.

OPENBRACE where the only lexeme is {

Page 70: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Lexeme

A legal word in a language.For example, in Java the words

while, class, A, empty, {

are all lexeme.

Token

A category of lexeme. For example, in Java, there are tokens for

WHILE where the only lexeme is while

IDENTIFIER where there are infinitely many lexemes, forexample A, empty.

OPENBRACE where the only lexeme is {

Page 71: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Lexeme

A legal word in a language.For example, in Java the words

while, class, A, empty, {

are all lexeme.

Token

A category of lexeme. For example, in Java, there are tokens for

WHILE where the only lexeme is while

IDENTIFIER where there are infinitely many lexemes, forexample A, empty.

OPENBRACE where the only lexeme is {

Page 72: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Lexeme

A legal word in a language.For example, in Java the words

while, class, A, empty, {

are all lexeme.

Token

A category of lexeme. For example, in Java, there are tokens for

WHILE where the only lexeme is while

IDENTIFIER where there are infinitely many lexemes, forexample A, empty.

OPENBRACE where the only lexeme is {

Page 73: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Regular expressions are used to describe the legal lexeme thatbelong to a token. There will be a regular expression for WHILE,one for IDENTIFIER, one for OPENBRACE.

When we represent tokens, we can use integers (for the token) andsome extra info if needed for further understanding of the source(for example, it is not enough with knowing that we saw anidentifier, we need to keep track of the lexeme!)

Page 74: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Regular expressions are used to describe the legal lexeme thatbelong to a token. There will be a regular expression for WHILE,one for IDENTIFIER, one for OPENBRACE.

When we represent tokens, we can use integers (for the token) andsome extra info if needed for further understanding of the source(for example, it is not enough with knowing that we saw anidentifier, we need to keep track of the lexeme!)

Page 75: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Some notation

Regular expressions are used to describe the legal lexeme thatbelong to a token. There will be a regular expression for WHILE,one for IDENTIFIER, one for OPENBRACE.

When we represent tokens, we can use integers (for the token) andsome extra info if needed for further understanding of the source(for example, it is not enough with knowing that we saw anidentifier, we need to keep track of the lexeme!)

Page 76: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 77: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 78: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 79: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The Front End

Scanner Parser Types Trans.sourcecode

tokens AS AS IR

errors

The Scanner (lexical analyzer) transforms a sequence ofcharacters (source code) into a sequence of tokens: arepresentation of the lexemes of the language.

The Parser (syntactical analyzer) takes the sequence of tokensand generates a tree representation, the Abstract Syntax.

This tree is analyzed by the type checker and is then used togenerate the intermediate representation.

Page 80: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 81: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 82: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 83: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 84: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 85: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 86: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 87: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 88: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 89: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 90: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 91: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 92: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 93: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 94: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 95: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 96: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 97: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 98: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 99: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 100: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 101: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 102: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 103: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 104: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 105: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 106: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer

Example

class Factorial{public static void main(String[] a)}

System.out.println(new Fac().ComputeFac(10));}

}

c l a s s ’ ’ F a c t o r i a l { ’\n’ ’\t’ p u b l i c ’ ’

scanner

CLASS (ID, Factorial) { PUBLIC

Page 107: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 108: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 109: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 110: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 111: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 112: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 113: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 114: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 115: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

The lexical analyzer - cont.

What are the tokens of minijava?www.cambridge.org/resources/052182060X/

each keyword is a tokeneach punctuation symbol is a tokeneach operator is a tokenan identifier is a token (and we are interested in its value!)an integer literal is a token (and we are interested in its value!)spaces, new lines, tabs and comments are ignored!

look at the appendices at the end of the book! The terminalsfor the grammar are the tokens!

Page 116: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 117: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 118: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 119: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 120: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 121: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 122: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

Source code for JFlex

JFlex specification

usercode

%%options and declarations

%%lexical rules

Code to be placed at the begining of the class with the generatedlexer. (package and imports.)%%Directives to adapt the lexer class to other programs.Code that will be included in the generated classDefinitions used in regular expressions%%Regular expressions and the actions to be taken on recognizingtokens.

Page 123: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

%%%debug%class minijavaLexer%implements minijavaTokens%int%unicode%line%column%{Object semanticValue;int token;%}nl = \ n | \ r | \ r \ nnls = nl | [ \ f \ t]%%"class" {return token = CLASS;}"+" {return token = ’+’;}{nls} {/* ignore new lines and spaces */}

Page 124: Computer Languages - Finite automata & Lexical analysis · Finite automata Lexical analysis Plan What we know 1 Regular expressionscan be used to describe some (not so elaborate)

Finite automata Lexical analysis

interface minijavaTokens {int ENDINPUT = 0;int CLASS = 1;int error = 2;// ’+’ (code=43)

}