recognising languages we will tackle the problem of defining languages by considering how we could...

20
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising infinite languages? i.e. given a language description and a string, is there an algorithm which will answer yes or no correctly? We will define an abstract machine which takes a candidate string and produces the answer yes or no. The abstract machine will be the specification of the language.

Upload: michelle-malone

Post on 28-Mar-2015

214 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Recognising Languages

We will tackle the problem of defining languagesby considering how we could recognise them.

Problem:

Is there a method of recognising infinitelanguages?

i.e. given a language description and a string,is there an algorithm which will answer yes orno correctly?

We will define an abstract machine which takes a candidate string and produces the answer yes or no.

The abstract machine will be the specification of the language.

Page 2: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Finite State Automata

A finite state automaton is an abstract modelof a simple machine (or computer).

The machine can be in a finite number ofstates. It receives symbols as input, and theresult of receiving a particular input in aparticular state moves the machine to aspecified new state. Certain states arefinishing states, and if the machine is in one ofthose states when the input ends, it has endedsuccessfully (or has accepted the input).

Example: A1

1

2 3

4

a

bb

b

a

aa,b

Page 3: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

FSA: Formal Definition

A Finite State Automaton (FSA) is a 5-tuple (Q, I, F, T, E) where:Q = states = a finite set;I = initial states = a subset of Q;F = final states = a subset of Q;T = an alphabet;E = edges = a subset of Q (T + ) Q.

FSA = labelled, directed graph = set of nodes (some final/initial) +

each arc may have a label from an alphabet +directed arcs (arrows) between nodes.

Example: formal definition of A1

Q = {1, 2, 3, 4}I = {1}F = {4}T = {a, b}E = { (1,a,2), (1,b,4), (2,a,3), (2,b,4), (3,a,3), (3,b,3), (4,a,2), (4,b,4) }

A11

2 3

4

a

bb

b

a

aa,b

Page 4: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Try going backwards

Given:

Q = {1, 2, 3, 4, 5}I = {1}F = {3}T = {a, c, h, m, t}E = { (1,c,2), (1,h,2), (1,m,2), (1,a,3), (1,h,5), (2,a,3), (3,t,4), (5,a,3) }

What’s the diagram?

Page 5: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Definitions

If (x,a,y) is an edge, x is its start state and y isits end state.

A path is a sequence of edges such that theend state of one is the start state of the next.

path p1 = (2,b,4), (4,a,2), (2,a,3)

A path is successful if the start state of thefirst edge is an initial state, and the endstate of the last is a final state.

path p2 = (1,b,4),(4,a,2),(2,b,4),(4,b,4)

The label of a path is the sequence of edgelabels.

label(p1) = baa.

Page 6: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Definitions (cont.)

A string is accepted by a FSA if it is the labelof a successful path.

Let A be a FSA. The language accepted by Ais the set of strings accepted by A, denoted L(A).

babb = label(p2) is accepted by A1.

The language accepted by A1 is the set ofstrings of a's and b's which end in a b, and inwhich no two a's are adjacent.

1

2 3

4

a

bb

b

a

a a,b

Page 7: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Non-Determinism

We would like to use FSAs as a mechanism to recognise languages, simply following the edges as we input symbols from a string.

But

problem – non-deterministic. That is, we may have a choice.

1. From one state there could be a number of edges with the same label. have to remember possible branching points as we trace out a

path, and investigate all branches.

2. Some of the edges could be labelled with , the empty string how does this affect our algorithm?

3. May be more than one initial state where do we start?

1

2

4

a

aa

b

b

Not Algorithmic

Page 8: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Deterministic FSA

A FSA is deterministic if:(i) there are no -labelled edges;(ii) for any pair of state and symbol (q,t), there is at most one edge (q,t, p); and(iii) there is only one initial state.

Notation: DFSA and NDFSA stand fordeterministic and non-deterministic FSArespectively.

A DFSA is a specification of an algorithm fordetermining whether or not a string is a member of a given language.

All three conditions must hold

Page 9: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Transition Functions

The transition function of A is the function

: (x, t) {y : edge (x, t, y) in A}

If A = (Q,I,F,T,E), then the transition matrixfor A is a matrix with one row for each statein Q, one column for each symbol in T, s.t.the entry in row x and column t is (x,t).

Example:

a b1 {2} {4}2 {3} {4}3 {3} {3}4 {2} {4}

where x denotes an initial state and x denotes a finish state.

Page 10: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Recognition Algorithm

Problem: Given a DFSA, A = (Q,I,F,T,E), anda string w, determine whether w L(A).

Note: denote the current state by q, and thecurrent input symbol t. Since A is deterministic, (q,t) will always be a singleton set or will be undefined. If it is undefined, denote it by ( Q).

Algorithm:

Add symbol # to end of w.q := initial statet := first symbol of w#.while (t # and q )begin

q := (q,t)t := next symbol of w#

endreturn ((t == #) & (q F))

Page 11: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Equivalence of DFSA's and NDFSA's

Theorem: DFSA = NDFSA

Let L be a language. L is accepted by a NDFSA iff

L is accepted by a DFSA

Algorithm: NDFSA -> DFSA

create unique initial stateremove -edgesremove edge choices

There is a simple algorithm to create a DFSAequivalent (i.e. accepts same language) toa given NDFSA. The reverse direction is trivial.

Page 12: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Algorithms

Algorithm: create unique initial state (informal)

Input: (Q, I, F, T, E)Q := Q {i} (where i Q)for each q I, E:= E {(i,,q)}I := {i}

a

b

1 2 3

654

ab

a

b

78

Page 13: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Algorithms

Algorithm: remove -edges (informal)

remove all single edges of form (q,,q) from Ewhile there are cycles (paths starting and endingat the same state) of -edges in Ebegin

select a cyclemerge all states in path into a single state, keeping all edges into and out of path

endwhile there are -edges in Ebegin

select an -edge (p, ,q)for each edge (q,t,r) add edge (p,t,r)if q F, add p to Fremove (p, ,q) from E

end

Page 14: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Removing Edge Choices

Algorithm: remove edge choices (subset construction)

Input: (Q,I,F,T,E)

I' := {I} F' := {} E' := {} S := {I} Q' : = {I}while S is not emptybegin

select X Sfor each t Tbegin

S' = {q Q:(p,t,q) E, for some p X}if S' F {}, then F' := F'{S'}if S' {}, then E' := E' {(X,t,S')}S := S {S'}\Q'Q' := Q' {S'}

endS := S\{X}

endreturn (Q',I',F',T,E')

1 2 3

a,b

a ba,b

(Q = {1,2,3}, I = {1}, F={3}, T={a,b}, E = {(1,a,1),(1,a,2),(1,b,1),

(2,b,3),(3,a,3),(3,b,3)})

REMEMBER

S is a set of sets!

Page 15: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Removing Edge Choices

I' F' E' S Q' X t S'

{1}({1},a,{1,2})({1},b,{1})({1,2},a,{1,2})({1,2},b,{1,3})

({1,3),a,{1,2,3})({1,3},b,{1,3})({1,2,3},a,{1,2,3})({1,2,3},b,{1,3})

{1}{1,2}-{1}

{1,3}-{1,2}{1,2,3}-{1,3}

-{1,2,3}

{1}{1,2}

{1,3}

{1,2,3}

{1}

{1,2}

{1,3}

{1,2,3}

abab

abab

{1,2}{1}{1,2}{1,3}

{1,2,3}{1,3}{1,2,3}{1,3}

{1,3}

{1,2,3}

a

1 1,2 1,2,3

b

a b

a

1,3

bb

a

REMEMBER S is a set of sets!

Page 16: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Minimum Size of FSA's

Let A = (Q,I,F,T,E). For any two strings, x, yin T*, x and y are distinguishable w.r.t. A ifthere is a string z T* s.t. exactly one ofxz and yz are in L(A). z distinguishes x and yw.r.t. A.

This means that with x and y as input, A must end in different states - A has to distinguish x and y in order to give the right results for xz and yz. This is used to prove the next result:

Theorem:

L T*. If, for some integer n, there are n elements of T* s.t. any two are distinguishablew.r.t. A, then any FSA that recognises L musthave at least n states.

Page 17: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Unique Minimal DFSA's

Theorem: Minimal DFSA

For a given language L, there exists a minimalDFSA accepting L, and it is unique.

Algorithm: DFSA => minimal DFSA

Input: (Q,I,F,T,E)

R := { }for all edges (p,t,q) E do add (q,t,p) to Rremove edge choices from (Q,I,F,T,R) to get (Q',I',F',T',E')Z := equivalent_states(Q, Q')(Q'',I'',F'',T,E'') := merge(Z,Q,I,F,T,E)

return (Q'',I'',F'',T,E'')

1

2 4

5

a

b

a b

3 a,bb

ab b

1

2

5

a

b

a 3,4

4 a,

b

a,b

b

Page 18: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Equivalent States

Algorithm: equivalent_statesInput: Q, Q'set all cells of M to tfor all p in Q do for all sets S in Q' do if p is in S then for each q in Q do if q not in S then M[p][q] := f else for each q in Q do if q in S then M[p][q] := fS := { }for all p in Q do Z := { } if M[p][] = t then for each q in Q do if M[p][q] = t then add q to Z M[q] [] := f add Z to Sreturn S

Note: M is a 2d array, indexed by Q and {} Q

Page 19: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

merging equivalent statesAlgorithm: merge

Input: Z,(Q,I,F,T,E)

for all S in Z do select p in S for all q in S s.t. q p do delete q from Q for each edge (q,t,w) in E do delete (q,t,w) from E if (p,t,w) not in E then add (p,t,w) to E for each edge (w,t,q) in E do delete (w,t,q) from E if (w,t,p) not in E then add (w,t,p) to E if q in I then delete q from I if p not in I then add p to I if q in F then delete q from F if p not in F then add p to Freturn (Q,I,F,T,E)

Page 20: Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising

Example Minimisation

1

2 4

5

a

b

a b

3 a,bb

ab b

reversed

the set Q’ from remove edge choices is:{ {3,5}, {2,3,5}, {2,3,4,5}, {1,2,3,4,5} }

1

2 4

5

a

b

a b

3 a,bb

ab b

1 2 3 4 5

12345

ttttt

ttttt

ttttt

ttttt

ttttt

ttttt

1

2

5

a

b

a 3,44 a,b

a,b

b