cs 154, lecture 2: finite automata, closure properties ...anatomy of deterministic finite automata...

CS 154, Lecture 2:Finite Automata,Closure PropertiesNondeterminism,

Why so Many Models?

1

42

00

Streaming Algorithms

Deterministic Finite Automata

00,1

00

1

1

1

Anatomy of Deterministic Finite Automata

states

states

q0

q1

q2

q3start state (q0)

accept states (F)transition: for every state and alphabet symbol

00,1

00

1

1

1

0111 111

11

1

The automaton accepts the string if this process ends in a double-circle (accept state). Otherwise, the automaton rejects the string.

The DFA reads its string from left to right

What and Why DFASimple: Constant-size memory & read-once input

Both as important?

Foundational or archaic object?

So “close” to a Turing Machine and yet so far: nondeterminism (verified guessing) is “weak”, optimizing (and some settings of learning) is easy

Allows us to talk about – useful vocabulary, nondeterminism, (learning), limitations, streaming, communication complexity

IBM JOURNAL APRIL 1959

Turing Award winning paper

An alphabet Σ is a finite set (e.g., Σ = {0,1})

A string over Σ is a finite length sequence of elements of Σ

For string x, |x| is the length of xThe unique string of length 0 is denoted by ε

and is called the empty string

Some Vocabulary

Σ* = the set of all strings over Σ

Every language L over Σ corresponds to a unique function f : Σ* → {0,1}:

Define f such that f(x) = 1 if x L= 0 otherwise

Languages

A language over Σ is a set of strings over ΣIn other words: a language is a subset of Σ*

Languages = Boolean functions over strings

Q is the set of states (finite)Σ is the alphabet (finite) : Q Σ→ Q is the transition functionq0 Q is the start stateF Q is the set of accept/final states

Definition. A DFA is a 5-tuple M = (Q, Σ, , q0, F)

Definition. A DFA is a 5-tuple M = (Q, Σ, , q0, F)

Let w1, ... , wn Σ and w = w1... wn Σ*M accepts w if there are r0, r1, ..., rn Q, s.t.

• r0 = q0

• (ri, wi+1) = ri+1 for all i = 0, ..., n-1, and • rn F

Q = {q0, q1, q2, q3}

Σ = {0,1}

: Q Σ → Q transition function*q0 Q is start state

F = {q1, q2} Q accept states

M = (Q, Σ, , q0, F) where

0 1

q0 q0 q1

q1 q2 q2

q2 q3 q2

q3 q0 q2

*

A DFA is a 5-tuple M = (Q, Σ, , q0, F)

L(M) = set of all strings that M accepts = “the language recognized by M”= the function computed by M

1

0

0,1

1010 010100

0,1

L(M) = { w | w begins with 1}

M

0,1q0

L(M) ={0,1}*

q0 q1

0 0

1

1

L(M) = { w | w has an even number of 1s}

0

1

1

0

p q

Theorem:L(M) = {w | w has odd

number of 1s }

Induction Step Suppose for all w Σ*, |w| = n,M accepts w w has odd number of 1s

A string of length n+1 has the form w0 or w1, |w|=n, continue with a case analysis …

Proof: By induction on n, the length of a string.

Base Case n=0: ε L and ε L(M)

q q00

1 0

1q0 q001

0 0 1

0,1

Build a DFA that accepts exactly the strings containing 001

Definition: A language L’ is regular if

L’ is recognized by a DFA;

that is, there is a DFA M where L’ = L(M).

L’ = { w | w contains 001} is regular

L’ = { w | w has an even number of 1s} is regular

Closure Properties for Regular LanguagesUnion: A B = { w | w A or w B }

Intersection: A B = { w | w A and w B }

Complement: A = { w Σ* | w A }

Reverse: AR = { w1 …wk | wk …w1 A, wi Σ}

Concatenation: A B = { vw | v A and w B }

Star: A* = { s1 … sk | k ≥ 0 and each si A }

Theorem: if A and B are regular then so are: A B, A B, A, AR , A B, and A*

Union Theorem for Regular LanguagesGiven two languages L1 and L2

recall that the union of L1 and L2 is

L1 L2 = { w | w L1 or w L2 }

Theorem: The union of two regular languages is also a regular language


Proof: Let

M1 = (Q1, Σ, 1, q0, F1) be a finite automaton for L1

and

M2 = (Q2, Σ, 2, q0, F2) be a finite automaton for L2

We want to construct a finite automaton M = (Q, Σ, , q0, F) that recognizes L = L1 ∪ L2

1

2

Proof Idea: Run both M1 and M2 “in parallel”!

Q = pairs of states, one from M1 and one from M2

= { (q1, q2) | q1 Q1 and q2 Q2 }

= Q1 Q2

q0 = (q0, q0)1 2

F = { (q1, q2) | q1 F1 OR q2 F2 }

( (q1,q2), ) = (1(q1, ), 2(q2, ))

M1 = (Q1, Σ, 1, q0, F1) recognizes L1 and

M2 = (Q2, Σ, 2, q0, F2) recognizes L2

1

2


q0 q1

00

1

1

p0 p1

11

0

0

q0,p0 q1,p0

1

1

q0,p1 q1,p1

1

1

0000

What about the INTERSECTIONof two languages?

Proof Idea: Run both M1 and M2 “in parallel”!

Q = pairs of states, one from M1 and one from M2

= { (q1, q2) | q1 Q1 and q2 Q2 }

= Q1 Q2

q0 = (q0, q0)1 2

F = { (q1, q2) | q1 F1 AND q2 F2 }

( (q1,q2), ) = (1(q1, ), 2(q2, ))

M1 = (Q1, Σ, 1, q0, F1) recognizes L1 and

M2 = (Q2, Σ, 2, q0, F2) recognizes L2

1

2

Union Theorem for Regular Languages:The union of two regular languages is

also a regular language

Intersection Theorem for Regular Languages:The intersection of two regular languages is

also a regular language

“Regular Languages Are Closed Under Union”

“Regular Languages Are Closed Under Intersextion”

Complement Theorem for Regular Languages

The complement of a regular language is also a regular language:

If L is regular than so is L= { w Σ* | w L }

Proof ?

Closure Properties for Regular LanguagesUnion: A B = { w | w A or w B }

Intersection: A B = { w | w A and w B }

Complement: A = { w Σ* | w A }

Reverse: AR = { w1 …wk | wk …w1 A, wi Σ}

Concatenation: A B = { vw | v A and w B }

Star: A* = { s1 … sk | k ≥ 0 and each si A }

Theorem: if A and B are regular then so are: A B, A B, A, AR , A B, and A*

The Reverse of a Language

Reverse of L:LR = { w1 …wk | wk …w1 L, wi Σ}

If L is recognized by the usual kind of DFA,Then LR is recognized by a DFA that reads its strings from right to left

How can every “Right-to-Left” DFA be replaced by a normal “Left-to-Right” DFA?

Theorem: If L is regular, then is LR also regular

Reversing DFAsAssume L is a regular language. Let M be a DFA that recognizes L

We want to build a machine MR that accepts LR

If M accepts w, then w describes a directed path in M from the start state to an accept state

First Attempt: Try to define MR as M with the arrows reversed, turn start state into a final state, turn final states into starts

Problem: MR is not always a DFA!

It could have many start states

Some states may have more than oneoutgoing edge, or none at all!

Non-deterministic Finite Automata (NFA)

What happens with 100?

We will say this new machine accepts a string if there is some path that reaches

some accept state from some start state

1

1

At each state, we can have any number of out arrows for some letter Σ, including ε

Another Example of an NFA

Set of strings accepted by this NFA = {w | w contains a 0}

Multiple Start StatesWe allow multiple start states for NFAs, and Sipser

allows only oneCan easily convert NFA with many start states into

one with a single start state:

εε

ε

L(M) = {1,00}

Yet Another Example of an NFA

Q is the set of states

Σ is the alphabet : Q Σε ! 2Q is the transition function

Q0 Q is the set of start states

F Q is the set of accept states

A non-deterministic finite automaton (NFA) is a 5-tuple N = (Q, Σ, , Q0, F) where

2Q is the set of all possible subsets of QΣε = Σ {ε}

Def. Let w Σ*. Let N be an NFA. N accepts w if there’s a sequence of states r0, r1, ..., rk Q and w can be written as w1... wk with wi Σ ∪ {ε} s.t.

1. r0 Q0

2. ri+1 (ri, wi+1 ) for all i = 0, ..., k-1, and 3. rn F

L(N) = the language recognized by N= set of all strings machine N accepts

Language L’ is recognized by an NFA N if L’ = L(N).

(q3,1) =

N = (Q, Σ, , Q0, F)

Q = {q1, q2, q3, q4}

Σ = {0,1}Q0 = {q1, q2}

F = {q4} Q

(q2,1) = {q4}

(q1,0) = {q3}00 L(N)?

01 L(N)?

NFAs are generally simpler than DFAs

A DFA recognizing the language {1}

An NFA recognizing the language {1}

DeterministicComputation

Non-DeterministicComputation

accept or reject accept

reject

Are these equally powerful???

Parting thoughts:

Regular Languages can be manipulated

When life hands you ambiguity define

Nondeterministic Finite Automata

Is verifying easier than computing (I)?

Questions?

cs 154, lecture 2: finite automata, closure properties ...anatomy of deterministic finite automata...

Documents