recognising languages we will tackle the problem of defining languages by considering how we could...
TRANSCRIPT
Recognising Languages
We will tackle the problem of defining languagesby considering how we could recognise them.
Problem:
Is there a method of recognising infinitelanguages?
i.e. given a language description and a string,is there an algorithm which will answer yes orno correctly?
We will define an abstract machine which takes a candidate string and produces the answer yes or no.
The abstract machine will be the specification of the language.
Finite State Automata
A finite state automaton is an abstract modelof a simple machine (or computer).
The machine can be in a finite number ofstates. It receives symbols as input, and theresult of receiving a particular input in aparticular state moves the machine to aspecified new state. Certain states arefinishing states, and if the machine is in one ofthose states when the input ends, it has endedsuccessfully (or has accepted the input).
Example: A1
1
2 3
4
a
bb
b
a
aa,b
FSA: Formal Definition
A Finite State Automaton (FSA) is a 5-tuple (Q, I, F, T, E) where:Q = states = a finite set;I = initial states = a subset of Q;F = final states = a subset of Q;T = an alphabet;E = edges = a subset of Q (T + ) Q.
FSA = labelled, directed graph = set of nodes (some final/initial) +
each arc may have a label from an alphabet +directed arcs (arrows) between nodes.
Example: formal definition of A1
Q = {1, 2, 3, 4}I = {1}F = {4}T = {a, b}E = { (1,a,2), (1,b,4), (2,a,3), (2,b,4), (3,a,3), (3,b,3), (4,a,2), (4,b,4) }
A11
2 3
4
a
bb
b
a
aa,b
Try going backwards
Given:
Q = {1, 2, 3, 4, 5}I = {1}F = {3}T = {a, c, h, m, t}E = { (1,c,2), (1,h,2), (1,m,2), (1,a,3), (1,h,5), (2,a,3), (3,t,4), (5,a,3) }
What’s the diagram?
Definitions
If (x,a,y) is an edge, x is its start state and y isits end state.
A path is a sequence of edges such that theend state of one is the start state of the next.
path p1 = (2,b,4), (4,a,2), (2,a,3)
A path is successful if the start state of thefirst edge is an initial state, and the endstate of the last is a final state.
path p2 = (1,b,4),(4,a,2),(2,b,4),(4,b,4)
The label of a path is the sequence of edgelabels.
label(p1) = baa.
Definitions (cont.)
A string is accepted by a FSA if it is the labelof a successful path.
Let A be a FSA. The language accepted by Ais the set of strings accepted by A, denoted L(A).
babb = label(p2) is accepted by A1.
The language accepted by A1 is the set ofstrings of a's and b's which end in a b, and inwhich no two a's are adjacent.
1
2 3
4
a
bb
b
a
a a,b
Non-Determinism
We would like to use FSAs as a mechanism to recognise languages, simply following the edges as we input symbols from a string.
But
problem – non-deterministic. That is, we may have a choice.
1. From one state there could be a number of edges with the same label. have to remember possible branching points as we trace out a
path, and investigate all branches.
2. Some of the edges could be labelled with , the empty string how does this affect our algorithm?
3. May be more than one initial state where do we start?
1
2
4
a
aa
b
b
Not Algorithmic
Deterministic FSA
A FSA is deterministic if:(i) there are no -labelled edges;(ii) for any pair of state and symbol (q,t), there is at most one edge (q,t, p); and(iii) there is only one initial state.
Notation: DFSA and NDFSA stand fordeterministic and non-deterministic FSArespectively.
A DFSA is a specification of an algorithm fordetermining whether or not a string is a member of a given language.
All three conditions must hold
Transition Functions
The transition function of A is the function
: (x, t) {y : edge (x, t, y) in A}
If A = (Q,I,F,T,E), then the transition matrixfor A is a matrix with one row for each statein Q, one column for each symbol in T, s.t.the entry in row x and column t is (x,t).
Example:
a b1 {2} {4}2 {3} {4}3 {3} {3}4 {2} {4}
where x denotes an initial state and x denotes a finish state.
Recognition Algorithm
Problem: Given a DFSA, A = (Q,I,F,T,E), anda string w, determine whether w L(A).
Note: denote the current state by q, and thecurrent input symbol t. Since A is deterministic, (q,t) will always be a singleton set or will be undefined. If it is undefined, denote it by ( Q).
Algorithm:
Add symbol # to end of w.q := initial statet := first symbol of w#.while (t # and q )begin
q := (q,t)t := next symbol of w#
endreturn ((t == #) & (q F))
Equivalence of DFSA's and NDFSA's
Theorem: DFSA = NDFSA
Let L be a language. L is accepted by a NDFSA iff
L is accepted by a DFSA
Algorithm: NDFSA -> DFSA
create unique initial stateremove -edgesremove edge choices
There is a simple algorithm to create a DFSAequivalent (i.e. accepts same language) toa given NDFSA. The reverse direction is trivial.
Algorithms
Algorithm: create unique initial state (informal)
Input: (Q, I, F, T, E)Q := Q {i} (where i Q)for each q I, E:= E {(i,,q)}I := {i}
a
b
1 2 3
654
ab
a
b
78
Algorithms
Algorithm: remove -edges (informal)
remove all single edges of form (q,,q) from Ewhile there are cycles (paths starting and endingat the same state) of -edges in Ebegin
select a cyclemerge all states in path into a single state, keeping all edges into and out of path
endwhile there are -edges in Ebegin
select an -edge (p, ,q)for each edge (q,t,r) add edge (p,t,r)if q F, add p to Fremove (p, ,q) from E
end
Removing Edge Choices
Algorithm: remove edge choices (subset construction)
Input: (Q,I,F,T,E)
I' := {I} F' := {} E' := {} S := {I} Q' : = {I}while S is not emptybegin
select X Sfor each t Tbegin
S' = {q Q:(p,t,q) E, for some p X}if S' F {}, then F' := F'{S'}if S' {}, then E' := E' {(X,t,S')}S := S {S'}\Q'Q' := Q' {S'}
endS := S\{X}
endreturn (Q',I',F',T,E')
1 2 3
a,b
a ba,b
(Q = {1,2,3}, I = {1}, F={3}, T={a,b}, E = {(1,a,1),(1,a,2),(1,b,1),
(2,b,3),(3,a,3),(3,b,3)})
REMEMBER
S is a set of sets!
Removing Edge Choices
I' F' E' S Q' X t S'
{1}({1},a,{1,2})({1},b,{1})({1,2},a,{1,2})({1,2},b,{1,3})
({1,3),a,{1,2,3})({1,3},b,{1,3})({1,2,3},a,{1,2,3})({1,2,3},b,{1,3})
{1}{1,2}-{1}
{1,3}-{1,2}{1,2,3}-{1,3}
-{1,2,3}
{1}{1,2}
{1,3}
{1,2,3}
{1}
{1,2}
{1,3}
{1,2,3}
abab
abab
{1,2}{1}{1,2}{1,3}
{1,2,3}{1,3}{1,2,3}{1,3}
{1,3}
{1,2,3}
a
1 1,2 1,2,3
b
a b
a
1,3
bb
a
REMEMBER S is a set of sets!
Minimum Size of FSA's
Let A = (Q,I,F,T,E). For any two strings, x, yin T*, x and y are distinguishable w.r.t. A ifthere is a string z T* s.t. exactly one ofxz and yz are in L(A). z distinguishes x and yw.r.t. A.
This means that with x and y as input, A must end in different states - A has to distinguish x and y in order to give the right results for xz and yz. This is used to prove the next result:
Theorem:
L T*. If, for some integer n, there are n elements of T* s.t. any two are distinguishablew.r.t. A, then any FSA that recognises L musthave at least n states.
Unique Minimal DFSA's
Theorem: Minimal DFSA
For a given language L, there exists a minimalDFSA accepting L, and it is unique.
Algorithm: DFSA => minimal DFSA
Input: (Q,I,F,T,E)
R := { }for all edges (p,t,q) E do add (q,t,p) to Rremove edge choices from (Q,I,F,T,R) to get (Q',I',F',T',E')Z := equivalent_states(Q, Q')(Q'',I'',F'',T,E'') := merge(Z,Q,I,F,T,E)
return (Q'',I'',F'',T,E'')
1
2 4
5
a
b
a b
3 a,bb
ab b
1
2
5
a
b
a 3,4
4 a,
b
a,b
b
Equivalent States
Algorithm: equivalent_statesInput: Q, Q'set all cells of M to tfor all p in Q do for all sets S in Q' do if p is in S then for each q in Q do if q not in S then M[p][q] := f else for each q in Q do if q in S then M[p][q] := fS := { }for all p in Q do Z := { } if M[p][] = t then for each q in Q do if M[p][q] = t then add q to Z M[q] [] := f add Z to Sreturn S
Note: M is a 2d array, indexed by Q and {} Q
merging equivalent statesAlgorithm: merge
Input: Z,(Q,I,F,T,E)
for all S in Z do select p in S for all q in S s.t. q p do delete q from Q for each edge (q,t,w) in E do delete (q,t,w) from E if (p,t,w) not in E then add (p,t,w) to E for each edge (w,t,q) in E do delete (w,t,q) from E if (w,t,p) not in E then add (w,t,p) to E if q in I then delete q from I if p not in I then add p to I if q in F then delete q from F if p not in F then add p to Freturn (Q,I,F,T,E)
Example Minimisation
1
2 4
5
a
b
a b
3 a,bb
ab b
reversed
the set Q’ from remove edge choices is:{ {3,5}, {2,3,5}, {2,3,4,5}, {1,2,3,4,5} }
1
2 4
5
a
b
a b
3 a,bb
ab b
1 2 3 4 5
12345
ttttt
ttttt
ttttt
ttttt
ttttt
ttttt
1
2
5
a
b
a 3,44 a,b
a,b
b