MA/CSSE 474 Theory of Computation: Regular Expressions Intro
![Page 1: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/1.jpg)
MA/CSSE 474 Theory of Computation
Regular Expressions Intro
![Page 2: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/2.jpg)
Your Questions?
• Previous class days' material
• Reading Assignments
• HW5 problems
• Anything else
Still more language ambiguity!
![Page 3: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/3.jpg)
Regular Languages
[Diagram: a Regular Expression describes a Regular Language; a Finite State Machine accepts a Regular Language.]
![Page 4: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/4.jpg)
Regular Expressions
The regular expressions over an alphabet $\Sigma$ are the strings that can be obtained as follows:
1. $\emptyset$ is a regular expression.
2. $\varepsilon$ is a regular expression.
3. Every element of $\Sigma$ is a regular expression.
4. If $\alpha$, $\beta$ are regular expressions, then so is $\alpha\beta$.
5. If $\alpha$, $\beta$ are regular expressions, then so is $\alpha \cup \beta$.
6. If $\alpha$ is a regular expression, then so is $\alpha^*$.
7. If $\alpha$ is a regular expression, then so is $\alpha^+$.
8. If $\alpha$ is a regular expression, then so is $(\alpha)$.
#7 is here for convenience only (syntactic sugar); many authors do not include + in the list of r.e. builders.
![Page 5: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/5.jpg)
Regular Expression Examples
If $\Sigma = \{a, b\}$, the following are regular expressions:
$a$
$(a \cup b)^*$
$(abba \cup \varepsilon)^+(a \cup bab)$
![Page 6: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/6.jpg)
Regular Expressions Define Languages
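Define $L$, a semantic interpretation function for regular expressions (let $\alpha$ and $\beta$ be arbitrary regular expressions over alphabet $\Sigma$).
1. $L(\emptyset) = \emptyset$.
2. $L(\varepsilon) = \{\varepsilon\}$.
3. If $c \in \Sigma$, $L(c) = \{c\}$.
4. $L(\alpha\beta) = L(\alpha)\,L(\beta)$.
5. $L(\alpha \cup \beta) = L(\alpha) \cup L(\beta)$.
6. $L(\alpha^*) = (L(\alpha))^*$.
7. $L(\alpha^+) = L(\alpha\alpha^*) = L(\alpha)\,(L(\alpha))^*$. If $L(\alpha)$ is equal to $\emptyset$, then $L(\alpha^+)$ is also equal to $\emptyset$. Otherwise $L(\alpha^+)$ is the language that is formed by concatenating together one or more strings drawn from $L(\alpha)$.
8. $L((\alpha)) = L(\alpha)$.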
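The eight rules can be read as a recursive evaluator. The sketch below is one way to try them out in Python, under assumptions that are not in the slides: regular expressions are encoded as nested tuples (a hypothetical encoding), and every language is truncated to strings of length at most `n` so that the sets stay finite. Rule 8 needs no case of its own because tuple nesting already does the grouping.

```python
# Hypothetical tuple encoding of regular expressions (not from the slides):
#   ("empty",)        the empty-set expression
#   ("eps",)          epsilon
#   ("sym", c)        a single alphabet symbol c
#   ("cat", x, y)     concatenation
#   ("union", x, y)   union
#   ("star", x)       Kleene star
#   ("plus", x)       one or more repetitions

def concat(A, B, n):
    """Concatenate two languages, keeping only strings of length <= n."""
    return {u + v for u in A for v in B if len(u + v) <= n}

def star(A, n):
    """Bounded Kleene star: every concatenation of strings from A, up to length n."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor; stop when nothing new appears.
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result

def L(regex, n):
    """Return the strings of length <= n in the language of `regex`."""
    kind = regex[0]
    if kind == "empty":                                   # rule 1
        return set()
    if kind == "eps":                                     # rule 2
        return {""}
    if kind == "sym":                                     # rule 3
        return {regex[1]} if n >= 1 else set()
    if kind == "cat":                                     # rule 4
        return concat(L(regex[1], n), L(regex[2], n), n)
    if kind == "union":                                   # rule 5
        return L(regex[1], n) | L(regex[2], n)
    if kind == "star":                                    # rule 6
        return star(L(regex[1], n), n)
    if kind == "plus":                                    # rule 7
        base = L(regex[1], n)
        return concat(base, star(base, n), n)
    raise ValueError(f"unknown node: {kind}")
```

For instance, `L(("plus", ("empty",)), 3)` is the empty set while `L(("star", ("empty",)), 3)` is `{""}`, which mirrors the remark in rule 7 and the fact that the star of the empty language contains only the empty string.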
![Page 7: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/7.jpg)
The Role of the Rules
• Rules 1, 3, 4, 5, and 6 give the language its power to define sets.
• Rule 8 has as its only role grouping other operators.
• Rules 2 and 7 appear to add functionality to the regular expression language, but they don’t.
2. $\varepsilon$ is a regular expression. (It defines the same language as $\emptyset^*$.)
7. If $\alpha$ is a regular expression, then so is $\alpha^+$. (It defines the same language as $\alpha\alpha^*$.)
![Page 8: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/8.jpg)
Operator Precedence in Regular Expressions
| | Regular Expressions | Arithmetic Expressions |
|---|---|---|
| Highest | Kleene star and $+$ | exponentiation |
| | concatenation | multiplication |
| Lowest | union | addition |
| Example | $a\,b^* \cup c\,d^*$ | $x\,y^2 + i\,j^2$ |
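Python's `re` module follows the same precedence order, writing union as `|`. The short check below is only an illustration (the sample strings are arbitrary choices, not from the slides): it compares the unparenthesized pattern with a fully grouped version, and with a differently grouped one that defines another language.

```python
import re

# `ab*|cd*` is read as (a(b*)) | (c(d*)): star binds tightest, union loosest.
pattern = re.compile(r"ab*|cd*")
grouped = re.compile(r"(?:a(?:b*))|(?:c(?:d*))")

# Parentheses that override the default precedence change the language.
regrouped = re.compile(r"a(b*|c)d*")

samples = ["a", "abbb", "c", "cdd", "abab", "ac", "abcd", "acd", ""]
same_as_grouped = all(
    bool(pattern.fullmatch(s)) == bool(grouped.fullmatch(s)) for s in samples
)
```

Here `"acd"` is matched by `regrouped` but not by `pattern`, because in `pattern` the union separates the `a`-part from the `c`-part.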
![Page 9: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/9.jpg)
Analyzing a Regular Expression
$L((a \cup b)^*b) = L((a \cup b)^*)\,L(b)$
$= (L((a \cup b)))^*\,L(b)$
$= (L(a) \cup L(b))^*\,L(b)$
$= (\{a\} \cup \{b\})^*\,\{b\}$
$= \{a, b\}^*\,\{b\}$.
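The result says that the expression defines exactly the strings over {a, b} that end in b. A brute-force comparison over all short strings is one way to sanity-check such a derivation. This is a sketch only: the length bound 6 is arbitrary, and Python's `|` stands in for union.

```python
import re
from itertools import product

def strings_up_to(alphabet, n):
    """Yield every string over `alphabet` of length 0 through n."""
    for length in range(n + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

pattern = re.compile(r"(a|b)*b")

# Strings on which the regular expression and "ends in b" disagree.
mismatches = [
    w for w in strings_up_to("ab", 6)
    if bool(pattern.fullmatch(w)) != w.endswith("b")
]
```

An empty `mismatches` list does not prove the equality, but a non-empty one would pinpoint a mistake in the derivation.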
![Page 10: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/10.jpg)
From English to reg exps
$L = \{w \in \{a, b\}^* : |w| \text{ is even}\}$
$L = \{w \in \{0, 1\}^* : w \text{ is a binary representation of a multiple of 4}\}$
$L = \{w \in \{a, b\}^* : w \text{ contains an odd number of a’s}\}$
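The slide leaves these three as exercises. The candidates below are one possible set of answers, not necessarily the ones given in class, written in Python `re` syntax and checked by brute force. Two assumptions are mine: a binary representation is non-empty, and leading zeros are allowed.

```python
import re
from itertools import product

def strings_up_to(alphabet, n):
    """Yield every string over `alphabet` of length 0 through n."""
    for length in range(n + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

# (alphabet, candidate regular expression, defining property of the language)
candidates = [
    ("ab", r"((a|b)(a|b))*", lambda w: len(w) % 2 == 0),
    ("01", r"(0|1)*00|0", lambda w: w != "" and int(w, 2) % 4 == 0),
    ("ab", r"b*ab*(ab*ab*)*", lambda w: w.count("a") % 2 == 1),
]

def counterexamples(alphabet, pattern, predicate, n=8):
    """Strings of length <= n where the candidate and the property disagree."""
    compiled = re.compile(pattern)
    return [
        w for w in strings_up_to(alphabet, n)
        if bool(compiled.fullmatch(w)) != predicate(w)
    ]
```

Calling `counterexamples(*candidate)` for each entry returns an empty list when the candidate agrees with the English description on every string up to the bound.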
![Page 11: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/11.jpg)
The Details Matter
$a^* \cup b^* \neq (a \cup b)^*$
$(ab)^* \neq a^*b^*$
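A single witness string is enough to separate each pair. The strings below are my own choices, shown with Python's `re` module (`|` for union):

```python
import re

def matches(pattern, w):
    """True if the whole string w is in the language of `pattern`."""
    return re.fullmatch(pattern, w) is not None

# Each entry: (string, pattern that accepts it, pattern that rejects it).
witnesses = [
    ("ab",   r"(a|b)*", r"a*|b*"),   # mixes both letters
    ("abab", r"(ab)*",  r"a*b*"),    # an a appears after a b
    ("aab",  r"a*b*",   r"(ab)*"),   # two a's in a row
]

separated = all(
    matches(accepts, w) and not matches(rejects, w)
    for (w, accepts, rejects) in witnesses
)
```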
![Page 12: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/12.jpg)
More Regular Expression Examples
$L((aa^*) \cup \varepsilon) =$
$L((a \cup \varepsilon)^*) =$
$L = \{w \in \{a, b\}^* : \text{there is no more than one b in } w\}$
$L = \{w \in \{a, b\}^* : \text{no two consecutive letters in } w \text{ are the same}\}$
![Page 13: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/13.jpg)
The Details Matter
$L_1 = \{w \in \{a, b\}^* : \text{every a is immediately followed by a b}\}$
A regular expression for $L_1$:
An FSM for $L_1$:
$L_2 = \{w \in \{a, b\}^* : \text{every a has a matching b somewhere}\}$
A regular expression for $L_2$:
An FSM for $L_2$:
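For the first language, one candidate answer (a sketch, not necessarily the one drawn in class) is the expression (b ∪ ab)* together with a three-state DFSM that uses a dead state. The state numbering and helper names below are hypothetical. The second language is left alone here because "matching" can be read in more than one way.

```python
import re
from itertools import product

# Hypothetical DFSM for the first language:
#   state 0: no pending a (start state, accepting)
#   state 1: just read an a, so a b must come next
#   state 2: dead state (some a was not followed by b)
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 2,
}
ACCEPTING = {0}

def dfsm_accepts(w):
    """Run the DFSM on w and report whether it halts in an accepting state."""
    state = 0
    for c in w:
        state = DELTA[(state, c)]
    return state in ACCEPTING

# Candidate regular expression, with | standing in for union.
L1_REGEX = re.compile(r"(b|ab)*")

def disagreements(max_len):
    """Strings up to max_len on which the regex and the DFSM disagree."""
    found = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if bool(L1_REGEX.fullmatch(w)) != dfsm_accepts(w):
                found.append(w)
    return found
```

`disagreements(8)` comes back empty, which is consistent with the two descriptions defining the same language, although a bounded check is not a proof.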
![Page 14: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/14.jpg)
Kleene’s Theorem
Finite state machines and regular expressions define the same class of languages.
To prove this, we must show:
Theorem: Any language that can be defined by a regular expression can be accepted by some FSM and so is regular.
Theorem: Every regular language (i.e., every language that can be accepted by some DFSM) can be defined with a regular expression.
![Page 15: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/15.jpg)
For Every Regular Expression There is a Corresponding FSM
We’ll show this by construction. An FSM for:
$\emptyset$:
A single element $c$ of $\Sigma$:
$\varepsilon$:
![Page 16: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/16.jpg)
Union
If $\alpha$ is the regular expression $\beta \cup \gamma$ and if both $L(\beta)$ and $L(\gamma)$ are regular:
![Page 17: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/17.jpg)
Concatenation
If $\alpha$ is the regular expression $\beta\gamma$ and if both $L(\beta)$ and $L(\gamma)$ are regular:
![Page 18: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/18.jpg)
Kleene Star
If $\alpha$ is the regular expression $\beta^*$ and if $L(\beta)$ is regular:
![Page 19: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/19.jpg)
An Example
$(b \cup ab)^*$
An FSM for $b$:
An FSM for $a$:
An FSM for $b$:
An FSM for $ab$:
![Page 20: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/20.jpg)
An Example
$(b \cup ab)^*$
An FSM for $(b \cup ab)$:
![Page 21: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/21.jpg)
An Example
$(b \cup ab)^*$
An FSM for $(b \cup ab)^*$:
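The step-by-step build can be mimicked in code. The sketch below is an assumption-laden illustration rather than a copy of the slide diagrams (which did not survive extraction): it uses the common construction with ε-transitions, a new start state for union and for star, and ε-links from the first machine's accepting states for concatenation. The class and function names are hypothetical.

```python
from itertools import count

class NFA:
    """An epsilon-NFA: one start state, a set of accepting states, and labeled edges."""
    def __init__(self, start, accepting, edges):
        self.start = start
        self.accepting = set(accepting)
        self.edges = edges  # list of (source, label, target); label None means epsilon

_fresh = count()  # supplies unused state numbers

def symbol(c):
    s, t = next(_fresh), next(_fresh)
    return NFA(s, {t}, [(s, c, t)])

def union(m1, m2):
    s = next(_fresh)  # new start state with epsilon edges to both old starts
    edges = m1.edges + m2.edges + [(s, None, m1.start), (s, None, m2.start)]
    return NFA(s, m1.accepting | m2.accepting, edges)

def concat(m1, m2):
    # epsilon edges from every accepting state of m1 to the start of m2
    edges = m1.edges + m2.edges + [(a, None, m2.start) for a in m1.accepting]
    return NFA(m1.start, m2.accepting, edges)

def star(m):
    s = next(_fresh)  # new accepting start state, so the empty string is accepted
    back = [(a, None, m.start) for a in m.accepting]
    return NFA(s, m.accepting | {s}, m.edges + [(s, None, m.start)] + back)

def closure(nfa, states):
    """All states reachable from `states` using epsilon edges only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for (src, label, dst) in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    current = closure(nfa, {nfa.start})
    for c in w:
        moved = {dst for (src, label, dst) in nfa.edges
                 if src in current and label == c}
        current = closure(nfa, moved)
    return bool(current & nfa.accepting)

# (b U ab)* built bottom-up, in the same order as the slides.
machine = star(union(symbol("b"), concat(symbol("a"), symbol("b"))))
```

With this machine, `accepts(machine, "babb")` holds while `accepts(machine, "ba")` does not, since the final a is not followed by a b.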
![Page 22: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/22.jpg)
For Every FSM There is a Corresponding Regular Expression
• We’ll show this by construction. The construction is different than the textbook's.
• Let $M = (\{q_1, \ldots, q_n\}, \Sigma, \delta, q_1, A)$ be a DFSM. Define $R_{ijk}$ to be the set of all strings $x \in \Sigma^*$ such that
  • $(q_i, x) \vdash_M^* (q_j, \varepsilon)$, and
  • if $(q_i, y) \vdash_M^* (q_\ell, \varepsilon)$ for any prefix $y$ of $x$ (except $y = \varepsilon$ and $y = x$), then $\ell \le k$.
• That is, $R_{ijk}$ is the set of all strings that take us from $q_i$ to $q_j$ without passing through any intermediate states numbered higher than $k$.
  • In this case, "passing through" means both entering and leaving.
  • Note that either $i$ or $j$ (or both) may be greater than $k$.
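The definition translates directly into a membership test: run the machine on x and make sure that every state reached after a proper, non-empty prefix is numbered at most k. The sketch below assumes states are represented by their numbers and `delta` is a dictionary keyed by (state, symbol). The two-state machine used to exercise it is a hypothetical example, not one from the slides.

```python
def in_R(delta, i, j, k, x):
    """True if x takes the DFSM from state i to state j while every
    intermediate state (after a proper, non-empty prefix) is numbered <= k."""
    state = i
    for position, symbol in enumerate(x):
        state = delta[(state, symbol)]
        is_last = position == len(x) - 1
        if not is_last and state > k:
            return False
    return state == j

# Hypothetical DFSM: state 1 = even number of a's so far, state 2 = odd.
parity = {
    (1, "a"): 2, (1, "b"): 1,
    (2, "a"): 1, (2, "b"): 2,
}
```

For this machine, `in_R(parity, 1, 1, 0, "bb")` is false because state 1 is entered and left in the middle, while `in_R(parity, 1, 1, 1, "bb")` is true. The empty string belongs to the set exactly when i equals j.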
![Page 23: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/23.jpg)
Example: $R_{ijk}$
• $R_{ijk}$ is the set of all strings that take us from $q_i$ to $q_j$ without passing through any intermediate states numbered higher than $k$.
  • In this case, "passing through" means both entering and leaving.
  • Note that either $i$ or $j$ (or both) may be greater than $k$.
$R_{000}$
$R_{010}$
$R_{011}$
$R_{021}$
$R_{022}$
$R_{232}$
$R_{233}$
![Page 24: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/24.jpg)
DFA → Reg. Exp. construction
• $R_{ijk}$ is the set of all strings that take $M$ from $q_i$ to $q_j$ without passing through any intermediate states numbered higher than $k$.
• Examples: $R_{ijn}$ is
• Also note that $L(M)$ is the union of $R_{1jn}$ over all $q_j$ in $A$.
• We will show that for all $i, j \in \{1, \ldots, n\}$ and all $k \in \{0, \ldots, n\}$, $R_{ijk}$ is defined by a regular expression.
  – We already know that the union of languages defined by reg. exps. is defined by a reg. exp.
![Page 25: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/25.jpg)
DFA → Reg. Exp. continued
• $R_{ijk}$ is the set of all strings that take $M$ from $q_i$ to $q_j$ without passing through any intermediate states numbered higher than $k$.
It can be computed recursively:
• Base cases ($k = 0$):
  – If $i \neq j$, $R_{ij0} = \{a \in \Sigma : \delta(q_i, a) = q_j\}$
  – If $i = j$, $R_{ii0} = \{a \in \Sigma : \delta(q_i, a) = q_i\} \cup \{\varepsilon\}$
• Recursive case ($k > 0$): $R_{ijk}$ is $R_{ij(k-1)} \cup R_{ik(k-1)}(R_{kk(k-1)})^*R_{kj(k-1)}$
• We show by induction that each $R_{ijk}$ is defined by some regular expression $r_{ijk}$.
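The recursion can be run symbolically to produce an actual expression. The following is a minimal sketch under several assumptions of mine: states are the integers 1..n with start state 1, `delta` is a dictionary keyed by (state, symbol), the output uses Python `re` syntax (`|` for union, an empty alternative for ε), and `None` stands for the empty-set expression. The helper names are hypothetical, and the result is not simplified.

```python
import re
from functools import lru_cache

def dfa_to_regex(n, delta, accepting):
    """Return a Python `re` pattern for L(M), or None if L(M) is empty."""

    def union(x, y):
        if x is None:
            return y
        if y is None:
            return x
        return x if x == y else f"(?:{x}|{y})"

    def cat(*parts):
        # Concatenating with the empty set gives the empty set.
        if any(p is None for p in parts):
            return None
        return "".join(parts)

    def star(x):
        # The star of the empty set, or of epsilon, is just epsilon.
        return "" if x in (None, "") else f"(?:{x})*"

    @lru_cache(maxsize=None)
    def r(i, j, k):
        if k == 0:
            # Base case: direct transitions, plus epsilon when i == j.
            result = "" if i == j else None
            for (q, a), target in delta.items():
                if q == i and target == j:
                    result = union(result, re.escape(a))
            return result
        # Recursive case: avoid state k, or pass through it one or more times.
        return union(
            r(i, j, k - 1),
            cat(r(i, k, k - 1), star(r(k, k, k - 1)), r(k, j, k - 1)),
        )

    answer = None
    for j in sorted(accepting):
        answer = union(answer, r(1, j, n))
    return answer

# Hypothetical DFSM: state 1 = even number of a's so far, state 2 = odd.
parity = {(1, "a"): 2, (1, "b"): 1, (2, "a"): 1, (2, "b"): 2}
odd_as = re.compile(dfa_to_regex(2, parity, {2}))
```

The compiled pattern `odd_as` can be tested with `fullmatch`. The generated expression is long and redundant, which is typical of this construction when no algebraic simplification is applied.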
![Page 26: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/26.jpg)
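DFA → Reg. Exp. Proof pt. 1
• Base case definition ($k = 0$):
  – If $i \neq j$, $R_{ij0} = \{a \in \Sigma : \delta(q_i, a) = q_j\}$
  – If $i = j$, $R_{ii0} = \{a \in \Sigma : \delta(q_i, a) = q_i\} \cup \{\varepsilon\}$
• Base case proof: $R_{ij0}$ is a finite set of symbols, each of which is either $\varepsilon$ or a single symbol from $\Sigma$. So $R_{ij0}$ can be defined by the reg. exp. $r_{ij0} = a_1 \cup a_2 \cup \cdots \cup a_p$ (or $a_1 \cup a_2 \cup \cdots \cup a_p \cup \varepsilon$ if $i = j$), where $\{a_1, a_2, \ldots, a_p\}$ is the set of all symbols $a$ such that $\delta(q_i, a) = q_j$.
• Note that if $M$ has no direct transitions from $q_i$ to $q_j$, then $r_{ij0}$ is $\emptyset$ (it is $\varepsilon$ if $i = j$ and there is no "loop" on that state).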
![Page 27: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/27.jpg)
DFA → Reg. Exp. Proof pt. 2
• Recursive definition ($k > 0$): $R_{ijk}$ is $R_{ij(k-1)} \cup R_{ik(k-1)}(R_{kk(k-1)})^*R_{kj(k-1)}$
• Induction hypothesis: For each $\ell$ and $m$, there is a regular expression $r_{\ell m(k-1)}$ such that $L(r_{\ell m(k-1)}) = R_{\ell m(k-1)}$.
• Induction step. By the recursive parts of the definition of regular expressions and the languages they define, and by the above recursive definition of $R_{ijk}$: $R_{ijk} = L(r_{ij(k-1)} \cup r_{ik(k-1)}(r_{kk(k-1)})^*r_{kj(k-1)})$
![Page 28: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/28.jpg)
DFA → Reg. Exp. Proof pt. 3
• We showed by induction that each $R_{ijk}$ is defined by some regular expression $r_{ijk}$.
• In particular, for all $q_j \in A$, there is a regular expression $r_{1jn}$ that defines $R_{1jn}$.
• Then $L(M) = L(r_{1j_1n} \cup \cdots \cup r_{1j_pn})$, where $A = \{q_{j_1}, \ldots, q_{j_p}\}$.
![Page 29: MA/CSSE 474 Theory of Computation Regular Expressions Intro](https://reader036.vdocuments.us/reader036/viewer/2022062517/56649f275503460f94c3f319/html5/thumbnails/29.jpg)
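An Example
[State diagram: a DFSM with start state $q_1$ and states $q_2$, $q_3$; edges labeled 0, 1, and 0,1.]

| | $k=0$ | $k=1$ | $k=2$ |
|---|---|---|---|
| $r_{11k}$ | $\varepsilon$ | $\varepsilon$ | $(00)^*$ |
| $r_{12k}$ | $0$ | $0$ | $0(00)^*$ |
| $r_{13k}$ | $1$ | $1$ | $0^*1$ |
| $r_{21k}$ | $0$ | $0$ | $0(00)^*$ |
| $r_{22k}$ | $\varepsilon$ | $\varepsilon \cup 00$ | $(00)^*$ |
| $r_{23k}$ | $1$ | $1 \cup 01$ | $0^*1$ |
| $r_{31k}$ | $\emptyset$ | $\emptyset$ | $(0 \cup 1)(00)^*0$ |
| $r_{32k}$ | $0 \cup 1$ | $0 \cup 1$ | $(0 \cup 1)(00)^*$ |
| $r_{33k}$ | $\varepsilon$ | $\varepsilon$ | $\varepsilon \cup (0 \cup 1)0^*1$ |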
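The k = 2 column can be cross-checked against the path-based definition of the sets. The sketch below rests on an inference: the transition function is read off the k = 0 column (the diagram itself is not in the text), with the integers 1, 2, 3 standing for the three states. The entries are rewritten in Python `re` syntax, and the length bound 7 is arbitrary.

```python
import re
from itertools import product

# Transitions inferred from the k = 0 column of the table.
DELTA = {
    (1, "0"): 2, (1, "1"): 3,
    (2, "0"): 1, (2, "1"): 3,
    (3, "0"): 2, (3, "1"): 2,
}

# The k = 2 column in `re` syntax: | is union, an empty alternative is epsilon.
K2_COLUMN = {
    (1, 1): r"(00)*",       (1, 2): r"0(00)*",     (1, 3): r"0*1",
    (2, 1): r"0(00)*",      (2, 2): r"(00)*",      (2, 3): r"0*1",
    (3, 1): r"(0|1)(00)*0", (3, 2): r"(0|1)(00)*", (3, 3): r"|(0|1)0*1",
}

def in_R(i, j, k, x):
    """True if x leads from state i to state j with all intermediate states <= k."""
    state = i
    for position, symbol in enumerate(x):
        state = DELTA[(state, symbol)]
        if position < len(x) - 1 and state > k:
            return False
    return state == j

def column_mismatches(max_len=7):
    """Triples (i, j, x) where a table entry and the definition disagree."""
    bad = []
    for (i, j), pattern in K2_COLUMN.items():
        compiled = re.compile(pattern)
        for n in range(max_len + 1):
            for letters in product("01", repeat=n):
                x = "".join(letters)
                if bool(compiled.fullmatch(x)) != in_R(i, j, 2, x):
                    bad.append((i, j, x))
    return bad
```

An empty result from `column_mismatches()` means every entry in the column agrees with the definition on all strings up to the bound. The slides do not list the accepting states in the text, so the final expression for the whole machine is not computed here.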