
Post on 14-Dec-2015


Finite State Automata

• A very simple and intuitive formalism suitable for certain tasks

• A bit like a flow chart, but can be used for both recognition and generation

• “Transition network”

• Unique start point

• Series of states linked by transitions

• Transitions represent input to be accounted for, or output to be generated

• Legal exit-point(s) explicitly identified

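Example (Jurafsky & Martin, Figure 2.10)

• The loop on q3 means that it can account for strings of any length

• “Deterministic” because in any state its behaviour is fully predictable: each input symbol allows at most one transition

[Figure: states q0 to q4 in a row; arcs q0 to q1 on b, q1 to q2 on a, q2 to q3 on a, a loop on q3 on a, and q3 to q4 on !]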

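Non-deterministic FSA (Jurafsky & Martin, Figure 2.18)

• At state q2 with input “a” there is a choice of transitions

• We can also have “jump” arcs (or empty transitions), which also introduce non-determinism

[Figure 2.18: states q0 to q4 with arcs labelled b, a, a and !, plus a second a arc that creates the choice at q2]

[Figure 2.19: states q0 to q4 with arcs labelled b, a, a and !, plus an ε arc]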

Augmented Transition Networks

• ATNs were used for parsing in the 60s and 70s

• For parsing, you need to pass constraints (e.g. for agreement) as well as account for input: the Transition Networks were “augmented” by having a “register” into/from which such information could be put/taken.

• It’s easy to write recognizers, but computing structure is difficult

• ATNs quickly become very complex; one solution is to have a “cascade” of ATNs, where transitions can call other networks

[Figure: example ATN. The S network has arcs “push NP” (put “num”) and “push VP” (get “num”). The NP network has arcs det (put “num”), adj, n (put “num”), prep and ε, ending in “pop NP”.]

Exercises

[Figure: the deterministic FSA, states q0 to q4, with each arc labelled by its triple from the list below]

fsa([[0,b,1],[1,a,2],[2,a,3],[3,a,3],[3,!,end]]).
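As a rough illustration of how a list of triples like this can drive a recognizer, here is a minimal Python sketch. It is not the Prolog program used in the exercises; the names `ARCS`, `build_table` and `recognize` are made up for this example, and the triples are assumed to mean (from-state, symbol, to-state) with `end` as the only final state.

```python
# Arcs copied from the fsa/1 fact: (from_state, symbol, to_state).
ARCS = [(0, "b", 1), (1, "a", 2), (2, "a", 3), (3, "a", 3), (3, "!", "end")]


def build_table(arcs):
    """Index arcs by (state, symbol); refuse a non-deterministic arc list."""
    table = {}
    for src, sym, dst in arcs:
        if (src, sym) in table:
            raise ValueError(f"two arcs leave state {src} on {sym!r}")
        table[(src, sym)] = dst
    return table


def recognize(symbols, arcs=ARCS, start=0, final="end"):
    """Return True if the symbol sequence leads from start to final."""
    table = build_table(arcs)
    state = start
    for sym in symbols:
        if (state, sym) not in table:
            return False  # no legal transition: reject
        state = table[(state, sym)]
    return state == final


print(recognize("b a a a a !".split()))
print(recognize("b a !".split()))
```

Because the table holds at most one target per (state, symbol) pair, the recognizer never has to choose or backtrack, which is what the deck means by deterministic.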

NDFSA

[Figure: states q0 to q4 with arcs labelled b, a, a and !, plus an ε arc, each arc labelled by its triple from the list below]

fsa([[0,b,1],[1,a,2],[2,a,3],[3,empty,2],[3,!,end]]).
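One way to recognize with empty arcs is to track every state the machine could be in at once, rather than backtracking over choices. The Python sketch below takes that approach; it is an illustration under stated assumptions, not the exercise program. `EMPTY`, `closure` and `recognize` are hypothetical names, and the symbol `empty` is assumed to mark an arc that consumes no input.

```python
EMPTY = "empty"  # label assumed to mark a jump arc, as in the fsa/1 fact

# Arcs copied from the fsa/1 fact: (from_state, symbol, to_state).
ARCS = [(0, "b", 1), (1, "a", 2), (2, "a", 3), (3, EMPTY, 2), (3, "!", "end")]


def closure(states, arcs):
    """Add every state reachable from `states` through empty arcs alone."""
    seen, todo = set(states), list(states)
    while todo:
        state = todo.pop()
        for src, sym, dst in arcs:
            if src == state and sym == EMPTY and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def recognize(symbols, arcs=ARCS, start=0, final="end"):
    """Simulate all possible paths in parallel as a set of current states."""
    current = closure({start}, arcs)
    for sym in symbols:
        moved = {
            dst
            for src, label, dst in arcs
            if src in current and label == sym and label != EMPTY
        }
        current = closure(moved, arcs)
    return final in current


print(recognize("b a a a a !".split()))
print(recognize("b a !".split()))
```

The set-of-states simulation accepts the same strings as a backtracking search would, but it never revisits input, so no trace of abandoned choices is produced.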

FSA and NDFSA programs

First load (consult) the file, e.g. 219.pl

| ?- help.
Options are as follows
run - a simple recognizer; on prompt type in string with space between each element, ending in . or ! or ?
run(v) - verbose recognizer gives trace of transitions
gen(X) - generate text; will interact at choice points
rec(X,quiet) - to generate text deterministically. Type ; to get other grammatical sequences

| ?- run.
Enter your string: b a a a a !

yes

| ?- run(v).
Enter your string: b a a a a !

0-b-1
1-a-2
2-a-3
3-skip-2
2-a-3
3-skip-2
2-a-3
3-skip-2
3-!-end

yes

| ?- gen(X).

Choice at state 3. Choose state from
(1) [!,end]
(2) [empty,2]
Select choice number: 2.

Choice at state 3. Choose state from
(1) [!,end]
(2) [empty,2]
Select choice number: 2.

Choice at state 3. Choose state from
(1) [!,end]
(2) [empty,2]
Select choice number: 1.

X = [b,a,a,a,a,!] ?

yes

| ?- rec(X,quiet).

X = [b,a,a] ? ;

X = [b,a,a,a] ? ;

X = [b,a,a,a,a] ? ;

X = [b,a,a,a,a,a] ?

yes
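Generation without interaction can be pictured as a breadth-first walk over the network that emits a string each time it reaches the final state. The Python sketch below illustrates the idea on the NDFSA arcs; it is not the deck's `rec/2` predicate. The name `generate` is invented here, `empty` is assumed to mark a jump arc, and each generated list includes the closing `!`.

```python
from collections import deque
from itertools import islice

EMPTY = "empty"  # label assumed to mark a jump arc
ARCS = [(0, "b", 1), (1, "a", 2), (2, "a", 3), (3, EMPTY, 2), (3, "!", "end")]


def generate(arcs=ARCS, start=0, final="end"):
    """Yield accepted symbol lists, exploring the network breadth-first."""
    queue = deque([(start, [])])
    while queue:
        state, output = queue.popleft()
        if state == final:
            yield output
            continue
        for src, sym, dst in arcs:
            if src == state:
                # An empty arc changes state without emitting a symbol.
                step = [] if sym == EMPTY else [sym]
                queue.append((dst, output + step))


# The language is infinite, so take only the first few strings.
for symbols in islice(generate(), 3):
    print(" ".join(symbols))
```

Breadth-first order is a design choice for this sketch: a depth-first walk would follow the loop forever and never return to the shorter strings.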

FSAs and regular expressions

• FSAs have a close relationship with “regular expressions”, a formalism for expressing strings, mainly used for searching texts, or stipulating patterns of strings

• Regular expressions are defined by combinations of literal characters and special operators

Regular expressions

Character   Meaning                 Examples
[ ]         alternatives            /[aeiou]/, /m[ae]n/
-           range                   /[a-z]/
[^ ]        not                     /[^pbm]/, /[^ox]s/
?           optionality             /Kath?mandu/
*           zero or more            /baa*!/
+           one or more             /ba+!/
.           any character           /cat.[aeiou]/
^, $        start, end of line
\           not special character   \.\?\^
|           alternate strings       /cat|dog/
( )         substring               /cit(y|ies)/
etc.
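The operators in the table can be tried out directly in Python's standard `re` module, which uses the same symbols but writes patterns without the surrounding slashes. The test strings below are invented for illustration, and `whole_match` is a hypothetical helper that requires the pattern to cover the whole string (a text search would use `re.search` instead).

```python
import re

# (pattern, string, expected result) triples based on the table's examples.
CASES = [
    (r"m[ae]n", "men", True),
    (r"m[ae]n", "mon", False),
    (r"[a-z]", "q", True),
    (r"[^ox]s", "as", True),
    (r"[^ox]s", "os", False),
    (r"Kath?mandu", "Katmandu", True),
    (r"Kath?mandu", "Kathmandu", True),
    (r"baa*!", "ba!", True),
    (r"ba+!", "b!", False),
    (r"cat.[aeiou]", "cat-a", True),
    (r"\.\?\^", ".?^", True),
    (r"cat|dog", "dog", True),
    (r"cit(y|ies)", "cities", True),
]


def whole_match(pattern, text):
    """True if the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


for pattern, text, expected in CASES:
    verdict = "ok" if whole_match(pattern, text) == expected else "MISMATCH"
    print(f"/{pattern}/ against {text!r}: {verdict}")
```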

Regular expressions

• A regular expression can be mapped onto an FSA

• Can be a good way of handling morphology

• Especially in connection with Finite State Transducers

Finite State Transducers

• A “transducer” defines a relationship (a mapping) between two things

• Typically used for “two-level morphology”, but can be used for other things

• Like an FSA, but each state transition stipulates a pair of symbols, and thus a mapping

Finite State Transducers

• Three functions:

– Recognizer (verification): takes a pair of strings and verifies if the FST is able to map them onto each other

– Generator (synthesis): can generate a legal pair of strings

– Translator (transduction): given one string, can generate the corresponding string
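The three functions can be sketched on a small transducer whose arcs carry a pair of symbols. The Python below is an illustration only: it assumes a loop-free network, the arcs are a fragment (the cat and goose paths) of the compiled network listed later in the deck, and the names `pairs`, `translate` and `recognize` are invented for this example. An empty string stands for ε.

```python
# Arcs: (from_state, lexical_symbol, surface_symbol, to_state); "" is ε.
ARCS = [
    ("s0", "c", "c", "s1"), ("s1", "a", "a", "s5"), ("s5", "t", "t", "s9"),
    ("s9", "N", "s", "s12"), ("s12", "P", "", "fs15"),
    ("s0", "g", "g", "s4"), ("s4", "o", "e", "s8"), ("s8", "o", "e", "s11"),
    ("s11", "s", "s", "s14"), ("s14", "e", "e", "s16"), ("s16", "N", "", "s12"),
]


def pairs(state="s0", final="fs15"):
    """Generator: yield every (lexical, surface) pair the network accepts.

    Only safe for loop-free networks, since it enumerates all paths.
    """
    if state == final:
        yield [], []
    for src, lex, surf, dst in ARCS:
        if src == state:
            for lex_rest, surf_rest in pairs(dst, final):
                yield ([lex] if lex else []) + lex_rest, \
                      ([surf] if surf else []) + surf_rest


def translate(lexical):
    """Translator: surface strings that correspond to a lexical string."""
    return [surf for lex, surf in pairs() if lex == lexical]


def recognize(lexical, surface):
    """Recognizer: does the network map this pair onto each other?"""
    return surface in translate(lexical)


print(translate("g o o s e N P".split()))
print(recognize("c a t N P".split(), "c a t s".split()))
```

Enumerating every path keeps the sketch short; a practical implementation would instead follow only the arcs that match the given input.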

Some conventions

• Transitions are marked by “:”

• A non-changing transition “x:x” can be shown simply as “x”

• Wild-cards are shown as “@”

• Empty string shown as “ε”

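An example (J&M Fig. 3.9, p.74)

[Figure: FST from the lexical to the intermediate level (lexical:intermediate), states q0 to q7. From q0, the regular stems f o x, c a t, d o g lead to q1; the irregular singular stems g o o s e, s h e e p, m o u s e lead to q2; the irregular plural stems g o:e o:e s e, s h e e p, m o:i u:ε s:c e lead to q3. N:ε arcs lead from q1, q2, q3 to q4, q5, q6. Arcs P:^ s # and S:# lead from q4 to q7, S:# from q5 to q7, and P:# from q6 to q7.]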

[0] f:f o:o x:x [1] N:ε [4] P:^ s:s #:# [7]
[0] f:f o:o x:x [1] N:ε [4] S:# [7]
[0] c:c a:a t:t [1] N:ε [4] P:^ s:s #:# [7]
[0] s:s h:h e:e e:e p:p [2] N:ε [5] S:# [7]
[0] g:g o:e o:e s:s e:e [3] N:ε [6] P:# [7]

f o x N P s # : f o x ^ s #
f o x N S : f o x #
c a t N P s # : c a t ^ s #
s h e e p N S : s h e e p #
g o o s e N P : g e e s e #

Lexical:surface mapping (J&M Fig. 3.14, p.78)

ε → e / {x s z} ^ __ s #

f o x N P s # : f o x ^ s #
c a t N P s # : c a t ^ s #

[Figure: transducer for the e-insertion rule, states q0 to q5, with arcs labelled z, s, x; ^:ε; ε:e; s; # and other]

f o x ^ s # : f o x e s #
c a t ^ s # : c a t s #

[0] f:f [0] o:o [0] x:x [1] ^:ε [2] ε:e [3] s:s [4] #:# [0]
[0] c:c [0] a:a [0] t:t [0] ^:ε [0] s:s [0] #:# [0]
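The effect of the e-insertion rule on these two examples can be reproduced with a regular-expression substitution. This is a stand-in for the transducer, not an implementation of it: the Python sketch below assumes symbols are separated by spaces as on the slide, and `to_surface` is a hypothetical name.

```python
import re


def to_surface(intermediate):
    """Apply e-insertion, then delete any remaining morpheme boundary ^."""
    symbols = "".join(intermediate.split())
    # ε → e / {x s z} ^ __ s # : the boundary after x, s or z, directly
    # before "s#", is rewritten as e.
    with_e = re.sub(r"([xsz])\^(?=s#)", r"\1e", symbols)
    # Elsewhere the boundary maps to ε, i.e. it simply disappears.
    return " ".join(with_e.replace("^", ""))


print(to_surface("f o x ^ s #"))
print(to_surface("c a t ^ s #"))
```

The lookahead `(?=s#)` checks the right-hand context without consuming it, mirroring the way the rule mentions `s #` only as context.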

FST

• Can be generated automatically

• Therefore, slightly different formalism

FST compiler

http://www.xrce.xerox.com/competencies/content-analysis/fsCompiler/fsinput.html

[d o g N P .x. d o g s ] |
[c a t N P .x. c a t s ] |
[f o x N P .x. f o x e s ] |
[g o o s e N P .x. g e e s e]

s0: c -> s1, d -> s2, f -> s3, g -> s4.
s1: a -> s5.
s2: o -> s6.
s3: o -> s7.
s4: <o:e> -> s8.
s5: t -> s9.
s6: g -> s9.
s7: x -> s10.
s8: <o:e> -> s11.
s9: <N:s> -> s12.
s10: <N:e> -> s13.
s11: s -> s14.
s12: <P:0> -> fs15.
s13: <P:s> -> fs15.
s14: e -> s16.
fs15: (no arcs)
s16: <N:0> -> s12.

[Figure: the start of the compiled network: s0 with arcs c, d, f, g to s1, s2, s3, s4]

fst([[s0,[c,s1], [d,s2], [f,s3], [g,s4]],
     [s1,[a,s5]],
     [s2,[o,s6]],
     [s3,[o,s7]],
     [s4,[[o,e],s8]],
     [s5,[t,s9]],
     [s6,[g,s9]],
     [s7,[x,s10]],
     [s8,[[o,e],s11]],
     [s9,[['N',s],s12]],
     [s10,[['N',e],s13]],
     [s11,[s,s14]],
     [s12,[['P',0],fs15]],
     [s13,[['P',s],fs15]],
     [s14,[e,s16]],
     [fs15, noarcs],
     [s16,[['N',0],s12]]]).
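The same network can also be run in the other direction, from a surface form back to its lexical analysis. The Python sketch below parses the compiler's text listing and searches for matching paths. It is a rough illustration rather than the deck's Prolog program: `parse` and `analyze` are invented names, `0` is assumed to stand for ε, and the `fs` prefix is assumed to mark a final state.

```python
import re

LISTING = """\
s0: c -> s1, d -> s2, f -> s3, g -> s4.
s1: a -> s5.
s2: o -> s6.
s3: o -> s7.
s4: <o:e> -> s8.
s5: t -> s9.
s6: g -> s9.
s7: x -> s10.
s8: <o:e> -> s11.
s9: <N:s> -> s12.
s10: <N:e> -> s13.
s11: s -> s14.
s12: <P:0> -> fs15.
s13: <P:s> -> fs15.
s14: e -> s16.
fs15: (no arcs)
s16: <N:0> -> s12.
"""

# Either a pair "<upper:lower>" or a single symbol, then "-> target".
ARC = re.compile(r"(?:<([^:\s]+):([^>\s]+)>|([^\s,<>]+)) -> (\w+)")


def parse(listing):
    """Map each state to a list of (lexical, surface, target) arcs."""
    net = {}
    for line in listing.splitlines():
        state, _, body = line.partition(":")
        arcs = []
        for match in ARC.finditer(body):
            upper, lower, same, target = match.groups()
            if same is not None:  # "x" is shorthand for "x:x"
                upper = lower = same
            # "0" is treated as the empty string (assumption).
            arcs.append(("" if upper == "0" else upper,
                         "" if lower == "0" else lower, target))
        net[state.strip()] = arcs
    return net


def analyze(surface, net, state="s0", pos=0):
    """Yield lexical symbol lists whose surface side equals `surface`."""
    if pos == len(surface) and state.startswith("fs"):
        yield []
    for lex, surf, target in net[state]:
        if surf == "":
            step = 0  # nothing consumed on the surface side
        elif pos < len(surface) and surface[pos] == surf:
            step = 1
        else:
            continue
        for rest in analyze(surface, net, target, pos + step):
            yield ([lex] if lex else []) + rest


NET = parse(LISTING)
for word in ("f o x e s", "g e e s e", "d o g s"):
    print(word, "->", [" ".join(a) for a in analyze(word.split(), NET)])
```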

FST 3.9

[Figure: the FST of Fig. 3.9 with the start state relabelled s0 and the number tags written as PL and SG]

FST 3.9 (portion)

[s0,[f,s1], [c,s3], [d,s5]],
[s1,[o,s2]],
[s2,[x,q1]],
[s3,[a,s4]],
[s4,[t,q1]],
[s5,[o,s6]],
[s6,[g,q1]],

[Figure: the portion of the network from s0 to q1, spelling out f o x through s1 and s2, c a t through s3 and s4, and d o g through s5 and s6]
