chc5025 foundations of computation

CHC5025 Foundations of Computation (Pushdown Automata) 2021-2022 Semester 1 Week 6 Lecture Maged Refat Fakirah


Post on 04-May-2022


Page 1: CHC5025 Foundations of Computation

CHC5025 Foundations of Computation

(Pushdown Automata)

2021-2022 – Semester 1

Week 6 – Lecture

Maged Refat Fakirah

Page 2: CHC5025 Foundations of Computation

Overview

• Context Free Grammars

• Pushdown automata (stack machines)

• Counter machines

• Lexical analyzers, parsers and compilers

Page 3: CHC5025 Foundations of Computation

1. Context Free Grammars

Page 4: CHC5025 Foundations of Computation

Context free grammars and pushdown automata

• We have seen that every regular grammar corresponds to a nondeterministic finite automaton.

• A context free language is a language generated by some context free grammar.

• Similarly, every context free grammar corresponds to a pushdown automaton.

• Recall that a regular grammar is a special kind of context free grammar.

Page 5: CHC5025 Foundations of Computation

Context free grammars

- A context free grammar (CFG) is defined by a 4-tuple G = (V, Σ, S, P), where:

• V is a finite set of variables (also called nonterminals).

• Σ is a finite set of terminal symbols, disjoint from V.

• S is the start symbol, S ∈ V.

• P is a finite set of production rules.

- Every production rule of a CFG has the form:

X → x

where X ∈ V and x ∈ (V ∪ Σ)*
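
The 4-tuple definition can be sketched as a small data structure. This is only an illustration: the class name `CFG`, its field names and the method `is_well_formed` are assumptions made here, not part of the lecture. A rule X → x is stored as a pair (head, body), with the empty body standing for λ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for a grammar G = (V, Σ, S, P)."""
    variables: frozenset  # V
    terminals: frozenset  # Σ
    start: str            # S
    rules: tuple          # P, as (head, body) pairs; body is a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check S ∈ V, V ∩ Σ = ∅, and that every rule has the form X → x."""
        if self.start not in self.variables:
            return False
        if self.variables & self.terminals:
            return False
        symbols = self.variables | self.terminals
        return all(
            head in self.variables and all(s in symbols for s in body)
            for head, body in self.rules
        )


# Sample grammar: S → xAy, A → xAy | λ (the empty tuple encodes λ)
g = CFG(
    variables=frozenset({"S", "A"}),
    terminals=frozenset({"x", "y"}),
    start="S",
    rules=(("S", ("x", "A", "y")), ("A", ("x", "A", "y")), ("A", ())),
)
print(g.is_well_formed())
```

Note that λ does not appear in the terminal set: it is encoded as a rule body of length zero.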

Page 6: CHC5025 Foundations of Computation

Example

Two grammars that generate the language {xⁿyⁿ | n > 0} have the following production rules:

1) S → xAy, A → xAy | λ

2) S → xA, A → Sy | y

Here, the set of variables is V = {S, A}, the set of terminals is Σ = {x, y} (λ denotes the empty string and is not a terminal), and the start symbol is S, but these details are often inferred from the context.

• Why is this grammar not regular?

• Accordingly, a CFG is a formal grammar used to generate all the strings of a given formal language.
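
To see a grammar "generating" strings, the sketch below enumerates everything derivable from grammar 1 (S → xAy, A → xAy | λ) up to a length bound, always expanding the leftmost variable. The function name `generate` and the dictionary encoding of the rules are assumptions made for this illustration. The pruning step relies on the fact that terminals never disappear from a sentential form, and the search is only guaranteed to stop for grammars like this one, in which the number of variables in a sentential form does not grow.

```python
from collections import deque

# Grammar 1 from the example; "" stands for λ
RULES = {"S": ["xAy"], "A": ["xAy", ""]}


def generate(start="S", max_len=6):
    """Return all terminal strings of length <= max_len derivable from start."""
    found, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is all terminals
        idx = next((i for i, c in enumerate(form) if c in RULES), None)
        if idx is None:
            found.add(form)
            continue
        for body in RULES[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            terminal_count = sum(c not in RULES for c in new)
            # Prune forms that already contain too many terminals
            if terminal_count <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=len)


print(generate(max_len=6))  # ['xy', 'xxyy', 'xxxyyy']
```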

Page 7: CHC5025 Foundations of Computation

Arithmetic expressions

Page 8: CHC5025 Foundations of Computation

Questions

1. What language is generated by the following grammar?

S → xSx | ySy | x | y | λ

2. Give the production rules of a grammar that generates the language

{xⁿyⁿ⁺² | n ≥ 0}

3. Give the production rules of a grammar that generates the set of all odd-length strings in {x, y}* with middle symbol x.

Page 9: CHC5025 Foundations of Computation

Questions 4

Page 10: CHC5025 Foundations of Computation

2. Pushdown Automata

(Stack Machines)

Page 11: CHC5025 Foundations of Computation

Pushdown automata

• A pushdown automaton (PDA) is sometimes called a stack machine.

• Whereas NDFAs and DFAs can only read symbols, a PDA can also write a symbol, or "push" it onto the stack, and then read it back ("pop" it) later.

• Note that all reading and writing can only be done at the top of the stack: it is a "last in, first out" storage device.

• The size of the stack is unlimited

• If the stack is empty, a PDA can detect this.

• Initially, the stack is empty.
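
The stack discipline described above can be sketched with a Python list used as the stack. The example recognizes {xⁿyⁿ | n > 0}: it pushes one symbol per x and pops one per y. The function name `accepts_xn_yn` and the two phase names are assumptions made for this illustration, not the lecture's own machine.

```python
def accepts_xn_yn(word: str) -> bool:
    """Stack-based recognizer for { x^n y^n | n > 0 } (illustrative sketch)."""
    stack = []        # initially, the stack is empty
    phase = "push"    # "push" while reading x's, "pop" once the first y is seen
    for symbol in word:
        if phase == "push" and symbol == "x":
            stack.append("x")      # write (push) at the top of the stack
        elif symbol == "y" and stack:
            stack.pop()            # read back (pop) from the top of the stack
            phase = "pop"
        else:
            return False           # x after y, or y with an empty stack
    # Accept only if at least one y was read and the stack is empty again
    return phase == "pop" and not stack


print(accepts_xn_yn("xxyy"), accepts_xn_yn("xxy"))  # True False
```

The final check uses the ability of a PDA to detect that its stack is empty.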

Page 12: CHC5025 Foundations of Computation

Operations

Page 13: CHC5025 Foundations of Computation

Example

Page 14: CHC5025 Foundations of Computation

Programs

Page 15: CHC5025 Foundations of Computation

Example

Page 16: CHC5025 Foundations of Computation

Questions

• Draw the stack machine that recognizes the language

{xⁿyyxⁿ | n > 0}

• Write down the grammar for this language

Page 17: CHC5025 Foundations of Computation

3. Counter Machines

Page 18: CHC5025 Foundations of Computation

Two Kinds of Counter Device

• The unsigned counter can increase or decrease a counter by 1, and

test whether the counter is zero.

• The signed counter can increase or decrease a counter by 1, and test

whether the counter is zero, positive or negative.

• The size of the counter is unlimited, but for an unsigned counter it can

never be negative

• Initially, the counter is zero

• Counter machines cannot write symbols

• Machines can contain more than one counter or stack, or both.

Page 19: CHC5025 Foundations of Computation

Counter Operations

Unsigned counter

• INC: increment the counter by 1

• DEC: decrement a non-zero counter by 1

• QZERO: test that the counter is zero

• QPOS: test that the counter is positive

Signed counter

• INC: increment the counter by 1

• DEC: decrement the counter by 1

• QZERO: test that the counter is zero

• QPOS: test that the counter is positive

• QNEG: test that the counter is negative

In addition, either device can do no operation (NOOP) at any
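
The two counter devices can be sketched as one small class. The class name `Counter`, the `signed` flag and the convention that every operation returns True when it succeeds (and False when it is not applicable or its test fails) are assumptions made for this illustration.

```python
class Counter:
    """Illustrative counter device; unsigned by default."""

    def __init__(self, signed: bool = False):
        self.signed = signed
        self.value = 0  # initially, the counter is zero

    def inc(self) -> bool:
        self.value += 1
        return True

    def dec(self) -> bool:
        # An unsigned counter can never become negative,
        # so DEC is only applicable when it is non-zero.
        if not self.signed and self.value == 0:
            return False
        self.value -= 1
        return True

    def qzero(self) -> bool:
        return self.value == 0

    def qpos(self) -> bool:
        return self.value > 0

    def qneg(self) -> bool:
        if not self.signed:
            raise ValueError("QNEG is only available on a signed counter")
        return self.value < 0

    def noop(self) -> bool:
        return True


unsigned = Counter()
print(unsigned.dec())                 # False: DEC on a zero unsigned counter
signed = Counter(signed=True)
print(signed.dec(), signed.qneg())    # True True
```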

Page 20: CHC5025 Foundations of Computation

Example

Page 21: CHC5025 Foundations of Computation

Program

The unsigned counter machine that recognises the language

{x²ⁿyⁿ | n > 0} has type

machine [control, input, unsigned counter]

and set of instructions

[ (1 → 2, SCANx, NOOP),

(2 → 1, SCANx, INC),

(1 → 3, SCANy, DEC),

(3 → 3, SCANy, DEC),

(1 → 4, EOF, QZERO),

(3 → 4, EOF, QZERO) ]
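
The instruction list can be run directly by a small interpreter. The sketch below assumes that state 1 is the start state and state 4 the accepting state (neither is stated on the slide), and that the machine rejects as soon as no instruction applies or a counter operation fails. The names `PROGRAM` and `run` are introduced here for illustration.

```python
# (from_state, to_state, scanned symbol, counter operation), copied from the slide
PROGRAM = [
    (1, 2, "x", "NOOP"),
    (2, 1, "x", "INC"),
    (1, 3, "y", "DEC"),
    (3, 3, "y", "DEC"),
    (1, 4, "EOF", "QZERO"),
    (3, 4, "EOF", "QZERO"),
]


def run(word: str, start: int = 1, final: int = 4) -> bool:
    """Simulate the unsigned counter machine on word (start/final are assumed)."""
    state, counter = start, 0
    for scanned in list(word) + ["EOF"]:
        # The program is deterministic: at most one instruction matches
        for src, dst, scan, op in PROGRAM:
            if src == state and scan == scanned:
                break
        else:
            return False              # no applicable instruction
        if op == "INC":
            counter += 1
        elif op == "DEC":
            if counter == 0:
                return False          # DEC needs a non-zero unsigned counter
            counter -= 1
        elif op == "QZERO" and counter != 0:
            return False              # zero test fails
        state = dst
    return state == final


print(run("xxy"), run("xxxxyy"), run("xxyy"), run("xy"))  # True True False False
```

Under these assumptions the listed instructions also accept the empty input through (1 → 4, EOF, QZERO), which corresponds to the case n = 0.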

Page 22: CHC5025 Foundations of Computation

Nondeterminism

• Recall that NDFAs are no more expressive than DFAs: both

recognize only regular languages

• In contrast, nondeterministic PDAs are more expressive than deterministic ones: not every context free language can be accepted by a deterministic PDA.

• Theorem: A language is context free if and only if some (possibly nondeterministic) pushdown automaton with a single stack recognizes it.
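
A standard example of a context free language that needs nondeterminism is the set of even-length palindromes over {x, y}: the machine has to guess where the middle of the input is. The sketch below simulates such a nondeterministic PDA by exploring every possible guess. The function name `accepts_even_palindrome` and the configuration encoding are assumptions made for this illustration.

```python
def accepts_even_palindrome(word: str) -> bool:
    """Simulate a nondeterministic PDA for { w wᴿ | w in {x, y}* } by search."""
    # A configuration is (input position, phase, stack contents)
    frontier = [(0, "push", ())]
    while frontier:
        pos, phase, stack = frontier.pop()
        if pos == len(word):
            if not stack:
                return True        # input consumed and stack empty: accept
            continue
        symbol = word[pos]
        if phase == "push":
            # Choice 1: keep pushing the first half
            frontier.append((pos + 1, "push", stack + (symbol,)))
            # Choice 2: guess that the middle is here and start popping
            frontier.append((pos, "pop", stack))
        elif stack and stack[-1] == symbol:
            # Popping phase: the input must mirror the stack
            frontier.append((pos + 1, "pop", stack[:-1]))
    return False


print(accepts_even_palindrome("xyyx"), accepts_even_palindrome("xyx"))  # True False
```

A deterministic machine would have to commit to a single choice at each step, which is exactly what it cannot do here without knowing where the middle is.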

Page 23: CHC5025 Foundations of Computation

4. Applications

Page 24: CHC5025 Foundations of Computation

Applications of Abstract Machines: Compilers

Page 25: CHC5025 Foundations of Computation

Phases of Compilation

Page 26: CHC5025 Foundations of Computation

Lexical Analysis and Parsing

• The lexical analyzer scans the source program, character by character,

and recognizes tokens such as numbers and identifiers.

• It also strips off any white space (i.e. tab, newline and space characters).

• Tokens are defined by regular grammars.

• In contrast, the grammar used by the parser is usually a context free

grammar.

• The parser verifies that a given sequence of tokens is acceptable

according to this grammar.

Page 27: CHC5025 Foundations of Computation

Example

• Consider the following language of arithmetic expressions:

E → I | E + E | E ∗ E | (E)

I → a | b | Ia | Ib | I0 | I1

• Identifiers (I) are defined by a regular grammar.

• Expressions (E) are defined by a context free grammar.

• The lexical analyzer takes as input the definition of I, together with the values of the terminal symbols in E: +, *, (, ). It outputs two kinds of token: identifiers and terminal symbols.

Page 28: CHC5025 Foundations of Computation

Example (continued)

• The parser takes as input the definition of E and the names of the

tokens output by the lexical analyzer.

• It then checks that the tokens form a valid expression.

• Suppose that the following string is given as input to the lexical analyser: a01 + ba

• The lexical analyser will read the string, character by character, and output the three tokens a01, '+' and ba
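
The behaviour of the lexical analyser on this input can be sketched with a regular expression. The identifier rule I → a | b | Ia | Ib | I0 | I1 describes strings that start with a or b and continue with any of a, b, 0, 1, which is the pattern `[ab][ab01]*`. The names `TOKEN_RE`, `tokenize` and the token kinds "I" and "SYM" are assumptions made for this illustration.

```python
import re

# Optional white space, then either an identifier or one of the symbols + * ( )
TOKEN_RE = re.compile(r"\s*(?:(?P<I>[ab][ab01]*)|(?P<SYM>[+*()]))")


def tokenize(source: str):
    """Split source into (kind, text) tokens, skipping white space."""
    tokens, pos = [], 0
    source = source.rstrip()  # trailing white space carries no token
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"lexical error at position {pos}: {source[pos:]!r}")
        kind = match.lastgroup          # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


print(tokenize("a01 + ba"))  # [('I', 'a01'), ('SYM', '+'), ('I', 'ba')]
```

An input that starts with a digit, such as 0ab, matches neither alternative, so this sketch raises a ValueError for it.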

Page 29: CHC5025 Foundations of Computation

Example (continued)

• The parser will then verify that the token sequence I + I is a valid

expression, where I denotes an identifier

• If the lexical analyser is given a string that contains an invalid identifier,

such as 0ab, then it will output an error message

• Similarly, if the parser is given an invalid sequence of tokens, such as I+,

it will output an error message
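
The parser's check can be sketched as a recursive-descent recognizer over token names. The grammar E → I | E + E | E ∗ E | (E) is left-recursive and ambiguous, so this sketch uses an equivalent form chosen here as an assumption: E → T ((+ | ∗) T)* and T → I | (E). It accepts the same token sequences but says nothing about operator precedence. The function name `is_valid_expression` is also an assumption.

```python
def is_valid_expression(tokens) -> bool:
    """tokens is a list of token names drawn from "I", "+", "*", "(", ")"."""
    pos = 0

    def term() -> bool:
        # T → I | ( E )
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "I":
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if expr() and pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
        return False

    def expr() -> bool:
        # E → T ((+ | *) T)*
        nonlocal pos
        if not term():
            return False
        while pos < len(tokens) and tokens[pos] in ("+", "*"):
            pos += 1
            if not term():
                return False
        return True

    # Valid only if an expression was parsed and no tokens are left over
    return expr() and pos == len(tokens)


print(is_valid_expression(["I", "+", "I"]))  # True
print(is_valid_expression(["I", "+"]))       # False
```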

Page 30: CHC5025 Foundations of Computation

Summary

• Context free grammars generate context free languages

• Context free languages are recognised by PDAs

• The lexical analyser processes a file, character by character, according

to a regular grammar, into output tokens

• The parser checks that strings of tokens form valid strings according to

a context free grammar