automata - chap1+introduction
TRANSCRIPT
-
8/10/2019 Automata - chap1+introduction
1/42
WELCOME TO SSK5204
AUTOMATA THEORY AND FORMAL LANGUAGE
DR. NOR FAZLIDA MOHD SANI
DEPT. OF COMPUTER SCIENCE,
& INFORMATION SECURITY RESEARCH GROUP
-
CHAPTER 1: INTRODUCTION
Why study automata?
Terminology and mathematical concepts
Formal proof
Concepts of automata theory
-
WHY STUDY AUTOMATA?
Automata theory is the study of abstract computing devices, or machines.
In the 1930s, before there were computers, Alan Turing studied an abstract machine that had all the capabilities of today's computers.
Turing's goal was to describe what a computer could and could not do.
In the 1940s and 1950s, simpler kinds of machines, which today we call finite automata, were studied by a number of researchers.
Originally proposed to model brain function, they turned out to be extremely useful for a variety of other purposes.
-
CONT.
In the late 1950s, the linguist N. Chomsky began the study of formal grammars.
While not strictly machines, these grammars have a close relationship to abstract automata and serve today as the basis of some important software components, including parts of compilers.
In 1969, S. Cook extended Turing's study.
Cook was able to separate those problems that can be solved efficiently by computer from those problems that can in principle be solved, but in practice take so much time that they are called intractable, or NP-hard.
All these theoretical developments bear directly on what computer scientists do today.
Finite automata and certain formal grammars are used in the design and construction of important kinds of software.
The Turing machine helps us understand what we can expect from our software.
-
CONT.
Regular expressions are used in many systems.
E.g., the UNIX-style pattern a.*b.
E.g., document-type definitions (DTDs) describe XML tags with an RE format like person (name, addr, child*).
Finite automata model protocols and electronic circuits.
The theory is used in model checking.
Context-free grammars are used to describe the syntax
of essentially every programming language.
Not to forget their important role in describing natural
languages.
And DTDs, taken as a whole, are really CFGs.
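As a small illustration of the regular expression a.*b mentioned above, here is a minimal sketch using Python's standard re module. The helper name matches_whole is an assumption made for this example, and requiring the whole string to match is one possible reading of the pattern.

```python
import re

# The pattern from the slide: an 'a', then any characters, then a 'b'.
PATTERN = re.compile(r"a.*b")

def matches_whole(text: str) -> bool:
    """Return True when the entire string has the form a...b (hypothetical helper)."""
    return PATTERN.fullmatch(text) is not None

print(matches_whole("ab"))      # True
print(matches_whole("axyzb"))   # True
print(matches_whole("ba"))      # False
```

Tools such as grep instead search for the pattern anywhere inside a line, which corresponds to re.search rather than re.fullmatch.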
-
CONT.
When developing solutions to real problems, we often confront the limitations of what software can do.
Undecidable things: no program whatever can do it.
Intractable things: there are programs, but no fast programs.
We'll learn how to deal formally with discrete systems.
Proofs: You never really prove a program correct, but you need to be thinking of why a tricky technique really works.
We'll gain experience with abstract models and constructions.
Models: layered software architectures.
-
AUTOMATA THEORY
Automata theory deals with the definitions and properties of mathematical models of computation.
These models play a role in several applied areas of computer science, such as:
Finite automata: used in text processing, compilers, and hardware design.
Context-free grammars: used in programming languages and artificial intelligence.
It is an excellent place to begin the study of the theory of computation.
It allows practice with formal definitions of computation as it introduces concepts relevant to other nontheoretical areas of computer science.
-
TERMINOLOGY AND MATHEMATICAL CONCEPTS - SETS
A set is a group of objects represented as a unit.
It may contain any type of object, including numbers, symbols, and even other sets.
The objects in a set are called its elements or members.
Thus, the set {7, 21, 57} contains the elements 7, 21, and 57.
The symbols ∈ and ∉ denote set membership and nonmembership:
7 ∈ {7, 21, 57} and 8 ∉ {7, 21, 57}
A is a subset of B, written A ⊆ B, if every member of A also is a member of B.
A is a proper subset of B, written A ⊊ B, if A is a subset of B and not equal to B.
-
SETS CONT.
An infinite set contains infinitely many elements.
The set of natural numbers N is written {1, 2, 3, …}.
The set of integers Z is written {…, -2, -1, 0, 1, 2, …}.
The set with 0 members is called the empty set, written ∅.
To describe a set containing elements according to some rule, write {n | rule about n}.
{n | n = m^2 for some m ∈ N} means the set of perfect squares.
Given two sets A and B, the union of A and B, written A ∪ B, is the set obtained by combining all the elements in A and B into a single set.
The intersection of A and B, written A ∩ B, is the set of elements that are in both A and B.
The complement of A, written Ā, is the set of all elements under consideration that are not in A.
Venn diagram examples
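The set notation above maps closely onto Python's built-in set type. The following is a minimal sketch; the finite set UNIVERSE is an assumption standing in for "all elements under consideration", and the particular sets A and B are chosen only for illustration.

```python
# Finite stand-in for the "elements under consideration" (an assumption of this sketch).
UNIVERSE = set(range(1, 11))

A = {1, 2, 3, 4}
B = {3, 4, 5}

# Membership and nonmembership.
print(7 in {7, 21, 57}, 8 not in {7, 21, 57})

# Subset (<=) and proper subset (<).
print({3, 4} <= A, {3, 4} < A, A < A)

# Set-builder notation {n | n = m^2 for some m}, restricted to UNIVERSE.
squares = {m * m for m in UNIVERSE if m * m in UNIVERSE}

union = A | B                   # A ∪ B
intersection = A & B            # A ∩ B
complement_of_A = UNIVERSE - A  # complement of A relative to UNIVERSE

print(sorted(squares), sorted(union), sorted(intersection), sorted(complement_of_A))
```

Infinite sets such as N cannot be stored this way, which is why the sketch restricts everything to a finite universe.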
-
SEQUENCES AND TUPLES
A sequence of objects is a list of these objects in some order.
Example: the sequence 7, 21, 57 is written (7, 21, 57).
In a sequence, order and repetition do matter.
Sequences may be finite or infinite.
Finite sequences are often called tuples.
A sequence with k elements is a k-tuple. Thus (7, 21, 57) is a 3-tuple. A 2-tuple is also called a pair.
Sets and sequences may appear as elements of other sets and sequences.
The power set of A is the set of all subsets of A.
If A is the set {0, 1}, the power set of A is the set {∅, {0}, {1}, {0, 1}}.
The set of all pairs whose elements are 0s and 1s is {(0, 0), (0, 1), (1, 0), (1, 1)}.
-
SEQUENCES AND TUPLES CONT.
If A and B are two sets, the Cartesian product or cross product of A and B, written A × B, is the set of all pairs wherein the first element is a member of A and the second element is a member of B.
Example: If A = {1, 2} and B = {x, y, z}, then
A × B = {(1, x), (1, y), (1, z), (2, x), (2, y), (2, z)}.
We can also take the Cartesian product of k sets A1, A2, …, Ak, written A1 × A2 × … × Ak. It is the set consisting of all k-tuples (a1, a2, …, ak) where ai ∈ Ai.
Example: If A and B are as in the example above, then
A × B × A = {(1, x, 1), (1, x, 2), (1, y, 1), (1, y, 2), (1, z, 1), (1, z, 2), (2, x, 1), (2, x, 2), (2, y, 1), (2, y, 2), (2, z, 1), (2, z, 2)}.
If we have the Cartesian product of a set with itself, we use the shorthand A × A × … × A (k times) = A^k.
Example: The set N^2 equals N × N. It consists of all pairs of natural numbers. We may also write it as {(i, j) | i, j ≥ 1}.
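Power sets and Cartesian products of small finite sets can be enumerated directly. The sketch below uses Python's standard itertools module; the helper name power_set is an assumption for this example, and subsets are stored as frozenset values because ordinary Python sets cannot be elements of other sets.

```python
from itertools import chain, combinations, product

def power_set(s):
    """Return the set of all subsets of s, each represented as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}

A = {1, 2}
B = {"x", "y", "z"}

pairs = set(product(A, B))                  # A × B
triples = set(product(A, B, A))             # A × B × A
bit_pairs = set(product({0, 1}, repeat=2))  # {0, 1}^2

print(len(power_set({0, 1})), len(pairs), len(triples))
print(sorted(bit_pairs))
```

The sizes follow the usual counting rules: a set with n elements has 2^n subsets, and the product of sets of sizes 2, 3, and 2 has 12 triples.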
-
FUNCTIONS AND RELATIONS
A function is an object that sets up an input-output relationship.
If f is a function whose output value is b when the input value is a, we write
f(a) = b.
A function is also called a mapping.
The set of possible inputs to the function is called its domain.
The outputs of a function come from a set called its range.
The notation for saying that f is a function with domain D and range R is
f: D → R
A specific function may be described in several ways:
A procedure for computing an output from a specified input.
A table that lists all possible inputs and gives the output for each input.
-
FUNCTIONS AND RELATIONS CONT.
Example: Consider the function f: {0, 1, 2, 3, 4} → {0, 1, 2, 3, 4}.
n    f(n)
0    1
1    2
2    3
3    4
4    0
This function adds 1 to its input and then outputs the result modulo 5.
A number modulo m is the remainder after division by m. For example, the minute hand on a clock face counts modulo 60. When we do modular arithmetic we define Zm = {0, 1, 2, …, m-1}. With this notation, the aforementioned function f has the form f: Z5 → Z5.
-
FUNCTIONS AND RELATIONS CONT.
Example: A two-dimensional table is used if the domain of the function is the Cartesian product of two sets. Consider the function g: Z4 × Z4 → Z4. The entry at the row labeled i and the column labeled j in the table is the value of g(i, j).
g    0    1    2    3
0    0    1    2    3
1    1    2    3    0
2    2    3    0    1
3    3    0    1    2
The function g is the addition function modulo 4.
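The two ways of describing a function (a procedure and a table) can be compared side by side. This is a minimal sketch in Python, where f and g are written as procedures and the tables are then generated from them; the names f_table and g_table are assumptions for this example.

```python
def f(n: int) -> int:
    """Add 1 and reduce modulo 5, so f maps Z5 to Z5."""
    return (n + 1) % 5

def g(i: int, j: int) -> int:
    """Addition modulo 4, so g maps Z4 × Z4 to Z4."""
    return (i + j) % 4

Z5 = range(5)
Z4 = range(4)

# Table form: one entry per input for f, one row per i and one column per j for g.
f_table = {n: f(n) for n in Z5}
g_table = [[g(i, j) for j in Z4] for i in Z4]

print(f_table)
for row in g_table:
    print(row)
```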
-
FUNCTIONS AND RELATIONS CONT.
When the domain of a function f is A1 × … × Ak for some sets A1, …, Ak, the input to f is a k-tuple (a1, a2, …, ak) and we call the ai the arguments to f.
A function with k arguments is called a k-ary function, and k is the arity of the function.
If k is 1, f has a single argument and f is called a unary function.
If k is 2, f is a binary function.
A predicate or property is a function whose range is {TRUE, FALSE}.
A property whose domain is a set of k-tuples A × … × A is called a relation, a k-ary relation, or a k-ary relation on A.
A common case is a 2-ary relation, called a binary relation.
If R is a binary relation, the statement aRb means that aRb = TRUE.
Similarly, if R is a k-ary relation, the statement R(a1, …, ak) means that R(a1, …, ak) = TRUE.
-
FUNCTIONS AND RELATIONS CONT.
Example: In a children's game called Scissors-Paper-Stone, the two players simultaneously select a member of the set {SCISSORS, PAPER, STONE} and indicate their selection with hand signals. If the two selections are the same, the game starts over. If the selections differ, one player wins, according to the relation beats.
beats       SCISSORS    PAPER    STONE
SCISSORS    FALSE       TRUE     FALSE
PAPER       FALSE       FALSE    TRUE
STONE       TRUE        FALSE    FALSE
From the table we can determine that SCISSORS beats PAPER is TRUE and that PAPER beats SCISSORS is FALSE.
Describing predicates with sets instead of functions is sometimes more convenient. The predicate P: D → {TRUE, FALSE} may be written (D, S), where S = {a ∈ D | P(a) = TRUE}, or simply S if the domain D is obvious from the context. Hence the relation beats may be written
{(SCISSORS, PAPER), (PAPER, STONE), (STONE, SCISSORS)}.
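The set-of-pairs view of a relation translates directly into code. The sketch below stores beats as a Python set of pairs and recovers the predicate by a membership test; the names BEATS, CHOICES, and beats are assumptions chosen for this example.

```python
CHOICES = ("SCISSORS", "PAPER", "STONE")

# The relation written as the set S of pairs for which the predicate is TRUE.
BEATS = {("SCISSORS", "PAPER"), ("PAPER", "STONE"), ("STONE", "SCISSORS")}

def beats(a: str, b: str) -> bool:
    """Predicate form of the relation: TRUE exactly when (a, b) is in BEATS."""
    return (a, b) in BEATS

# Rebuild the table from the set: one row per first player, one column per second.
for a in CHOICES:
    print(a, [beats(a, b) for b in CHOICES])
```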
-
FUNCTIONS AND RELATIONS CONT.
A special type of binary relation, called an equivalence relation, captures the notion of two objects being equal in some feature.
A binary relation R is an equivalence relation if R satisfies three conditions:
1. R is reflexive if for every x, xRx;
2. R is symmetric if for every x and y, xRy implies yRx; and
3. R is transitive if for every x, y, and z, xRy and yRz implies xRz.
-
FUNCTIONS AND RELATIONS CONT.
Example: Define an equivalence relation on the natural numbers, written ≡7. For i, j ∈ N, say that i ≡7 j if i - j is a multiple of 7. This is an equivalence relation because it satisfies the three conditions.
First, it is reflexive, as i - i = 0, which is a multiple of 7.
Second, it is symmetric, as i - j is a multiple of 7 if j - i is a multiple of 7.
Third, it is transitive, as whenever i - j is a multiple of 7 and j - k is a multiple of 7, then i - k = (i - j) + (j - k) is the sum of two multiples of 7 and hence a multiple of 7, too.
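The three conditions can be checked mechanically on a finite sample of the natural numbers. The following sketch is only a sanity check under that assumption, not a proof for all of N; the names equiv7, is_equivalence, and SAMPLE are hypothetical.

```python
from itertools import product

def equiv7(i: int, j: int) -> bool:
    """i ≡7 j: TRUE when i - j is a multiple of 7."""
    return (i - j) % 7 == 0

def is_equivalence(rel, domain) -> bool:
    """Check reflexivity, symmetry, and transitivity of rel on a finite domain."""
    items = list(domain)
    reflexive = all(rel(x, x) for x in items)
    symmetric = all(rel(y, x) for x, y in product(items, repeat=2) if rel(x, y))
    transitive = all(
        rel(x, z)
        for x, y, z in product(items, repeat=3)
        if rel(x, y) and rel(y, z)
    )
    return reflexive and symmetric and transitive

SAMPLE = range(1, 30)  # finite sample of N, an assumption of this sketch
print(is_equivalence(equiv7, SAMPLE))              # True
print(is_equivalence(lambda a, b: a <= b, SAMPLE)) # False: <= is not symmetric
```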
-
GRAPHS
An undirected graph, or simply a graph, is a set of points with lines connecting some of the points.
The points are called nodes or vertices, and the lines are called edges, as shown in the following figure.
[Figure: graph (a) on nodes 1 to 5 and graph (b) on nodes 1 to 4, captioned "(a) Degree =" and "(b) Degree ="]
The number of edges at a particular node is the degree of that node.
No more than one edge is allowed between any two nodes.
-
GRAPHS CONT.
In a graph G that contains nodes i and j, the pair (i, j) represents the edge that connects i and j.
The order of i and j doesn't matter in an undirected graph, so the pairs (i, j) and (j, i) represent the same edge.
If V is the set of nodes of G and E is the set of edges, we say G = (V, E).
A graph can be described with a diagram or more formally by specifying V and E.
Example:
[Figure: graphs (a) and (b)]
Formal description for graph (a) is
({1, 2, 3, 4, 5}, {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)})
Formal description for graph (b) is
({1, 2, 3, 4}, {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)})
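The formal description G = (V, E) can be stored as a pair of Python sets, and the degree of a node can be computed from it. In this sketch each undirected edge is kept as a frozenset so that (i, j) and (j, i) are the same edge; the helper names make_graph and degree are assumptions for this example.

```python
def make_graph(nodes, edge_pairs):
    """Build (V, E); each undirected edge is a frozenset, so order is ignored."""
    return set(nodes), {frozenset(pair) for pair in edge_pairs}

def degree(graph, node):
    """Number of edges at the given node."""
    _, edges = graph
    return sum(1 for edge in edges if node in edge)

graph_a = make_graph({1, 2, 3, 4, 5}, [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)])
graph_b = make_graph({1, 2, 3, 4}, [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])

degrees_a = {v: degree(graph_a, v) for v in sorted(graph_a[0])}
degrees_b = {v: degree(graph_b, v) for v in sorted(graph_b[0])}
print(degrees_a)
print(degrees_b)
```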
-
GRAPHS CONT.
Graphs frequently are used to represent data.
For convenience, we label the nodes and/or edges of a graph, which is then called a labeled graph.
Graph G is a subgraph of graph H if the nodes of G are a subset of the nodes of H, and the edges of G are the edges of H on the corresponding nodes.
The figure shows a graph H and a subgraph G (shown darker).
A path is a sequence of nodes connected by edges.
A simple path is a path that doesn't repeat any nodes.
A graph is connected if every two nodes have a path between them.
A path is a cycle if it starts and ends in the same node.
-
GRAPHS CONT.
A simple cycle is one that contains at least three nodes and repeats only the first and last nodes.
A graph is a tree if it is connected and has no simple cycles.
A tree may contain a specially designated node called the root.
The nodes of degree 1 in a tree, other than the root, are called the leaves.
If it has arrows instead of lines, the graph is a directed graph.
The number of arrows pointing from a particular node is the outdegree of that node, and the number of arrows pointing to a particular node is the indegree.
-
GRAPHS CONT.
In a directed graph, an edge from i to j is represented as a pair (i, j).
The formal description of a directed graph G is (V, E), where V is the set of nodes and E is the set of edges.
Formal description for the graph below:
({1, 2, 3, 4, 5, 6}, {(1, 2), (1, 5), (2, 1), (2, 4), (5, 4), (5, 6), (6, 1), (6, 3)}).
[Figure: directed graph on nodes 1 to 6]
A path in which all the arrows point in the same direction as its steps is called a directed path.
A directed graph is strongly connected if a directed path connects every two nodes.
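Outdegree, indegree, and strong connectivity can all be computed from the formal description (V, E) of a directed graph. The sketch below uses the edge set given above; the helper names are assumptions, and the reachability search is a plain depth-first traversal chosen for brevity rather than efficiency.

```python
V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 5), (2, 1), (2, 4), (5, 4), (5, 6), (6, 1), (6, 3)}

def outdegree(edges, node):
    """Number of arrows pointing from the node."""
    return sum(1 for (a, _) in edges if a == node)

def indegree(edges, node):
    """Number of arrows pointing to the node."""
    return sum(1 for (_, b) in edges if b == node)

def reachable(edges, start):
    """All nodes that can be reached from start along directed paths."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def strongly_connected(nodes, edges):
    """TRUE when every node reaches every other node by a directed path."""
    return all(reachable(edges, v) == set(nodes) for v in nodes)

print(outdegree(E, 1), indegree(E, 4))
print(strongly_connected(V, E))
```

For this particular graph the check fails, because nodes 3 and 4 have no outgoing arrows.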
-
STRINGS AND LANGUAGES
Strings and characters are fundamental building blocks in computer science.
We define an alphabet to be any nonempty finite set.
The members of the alphabet are the symbols of the alphabet.
We generally use the capital Greek letters Σ and Γ to designate alphabets and a typewriter font for symbols from an alphabet.
Examples of alphabets:
Σ1 = {0, 1};
Σ2 = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z};
Γ = {0, 1, x, y, z}.
-
STRINGS AND LANGUAGES CONT.
A string over an alphabet is a finite sequence of symbols from that alphabet, usually written next to one another and not separated by commas.
If Σ1 = {0, 1}, then 01001 is a string over Σ1.
If w is a string over Σ, the length of w, written |w|, is the number of symbols that it contains.
The string of length zero is called the empty string and is written ε; it plays the role of 0 in a number system.
If w has length n, we can write w = w1w2…wn where each wi ∈ Σ.
The reverse of w, written w^R, is the string obtained by writing w in the opposite order (i.e., wn wn-1 … w1).
String z is a substring of w if z appears consecutively within w.
-
STRINGS AND LANGUAGES CONT.
If we have string x of length m and string y of length n, the concatenation of x and y, written xy, is the string obtained by appending y to the end of x.
Superscript notation, as in x^k, is used to concatenate a string with itself many times.
The lexicographic ordering of strings is the same as the familiar dictionary ordering, except that shorter strings precede longer strings.
Thus the lexicographic ordering of all strings over the alphabet {0, 1} is
(ε, 0, 1, 00, 01, 10, 11, 000, …)
A language is a set of strings.
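The string operations above have direct counterparts on Python strings. This is a minimal sketch; the generator name shortlex is an assumption for this example, and it enumerates strings in the ordering described above (shorter strings first, dictionary order within each length).

```python
from itertools import count, islice, product

def shortlex(alphabet):
    """Yield every string over the alphabet, shorter strings first."""
    symbols = sorted(alphabet)
    for n in count(0):
        for tup in product(symbols, repeat=n):
            yield "".join(tup)

w = "01001"
length = len(w)        # |w|
reverse = w[::-1]      # w^R
concat = "ab" + "cd"   # xy
power = "ab" * 3       # x^k with k = 3
is_sub = "100" in w    # substring test

first_eight = list(islice(shortlex({"0", "1"}), 8))
print(length, reverse, concat, power, is_sub)
print(first_eight)     # the empty string ε appears first, as ''
```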
-
BOOLEAN LOGIC
Boolean logic is a mathematical system built around the two values TRUE and FALSE (the Boolean values), usually represented by the values 1 and 0.
Boolean values can be manipulated with specially designed operations, called Boolean operations, such as:
Negation, or NOT, with symbol ¬: the negation of a Boolean value is the opposite value.
Conjunction, or AND, with symbol ∧: the conjunction of two Boolean values is 1 if both of those values are 1.
Disjunction, or OR, with symbol ∨: the disjunction of two Boolean values is 1 if either of those values is 1.
0 ∧ 0 = 0    0 ∨ 0 = 0    ¬0 = 1
0 ∧ 1 = 0    0 ∨ 1 = 1    ¬1 = 0
1 ∧ 0 = 0    1 ∨ 0 = 1
1 ∧ 1 = 1    1 ∨ 1 = 1
-
BOOLEAN LOGIC CONT.
Other Boolean operations:
Exclusive or, or XOR, symbol ⊕, is 1 if either but not both of its two operands are 1.
Equality, symbol ↔, is 1 if both of its operands have the same value.
Implication, symbol →, is 0 if its first operand is 1 and its second is 0; otherwise it is 1.
0 ⊕ 0 = 0    0 ↔ 0 = 1    0 → 0 = 1
0 ⊕ 1 = 1    0 ↔ 1 = 0    0 → 1 = 1
1 ⊕ 0 = 1    1 ↔ 0 = 0    1 → 0 = 0
1 ⊕ 1 = 0    1 ↔ 1 = 1    1 → 1 = 1
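The truth tables above can be regenerated from short definitions on the values 0 and 1. The function names in this sketch (NOT, AND, OR, XOR, EQ, IMPLIES) are assumptions chosen to mirror the operation names; implication is written as (NOT a) OR b, which agrees with its table.

```python
def NOT(a): return 1 - a
def AND(a, b): return a & b
def OR(a, b): return a | b
def XOR(a, b): return a ^ b
def EQ(a, b): return 1 if a == b else 0
def IMPLIES(a, b): return OR(NOT(a), b)

# One row per pair of operands: a, b, then AND, OR, XOR, EQ, IMPLIES.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b), EQ(a, b), IMPLIES(a, b))
```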
-
BOOLEAN LOGIC CONT.
The distributive law for AND and OR is similar to the one for addition and multiplication, which states that a × (b + c) = (a × b) + (a × c). The Boolean version comes in two forms:
P ∧ (Q ∨ R) equals (P ∧ Q) ∨ (P ∧ R), and its dual
P ∨ (Q ∧ R) equals (P ∨ Q) ∧ (P ∨ R)
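Because each of P, Q, and R takes only two values, both distributive laws can be verified by checking all eight assignments. A minimal sketch in Python follows; the function name distributive_laws_hold is an assumption for this example.

```python
from itertools import product

def distributive_laws_hold() -> bool:
    """Check both Boolean distributive laws on every assignment of P, Q, R."""
    for p, q, r in product((False, True), repeat=3):
        if (p and (q or r)) != ((p and q) or (p and r)):
            return False
        if (p or (q and r)) != ((p or q) and (p or r)):
            return False
    return True

print(distributive_laws_hold())  # True
```

Exhaustive checking counts as a proof here only because the domain is finite; the same approach does not work for statements about all integers.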
-
FORMAL PROOF
Proof is something that every computer scientist needs to understand.
A formal proof of the correctness of a program should go hand-in-hand with the writing of the program itself.
For a tricky recursion or iteration, you are unlikely to write the code correctly at first; when testing tells you the code is incorrect, you still need to get it right.
To make a recursion or iteration correct, you need to set up an inductive hypothesis.
The process of understanding the workings of a correct program is the same as the process of proving theorems by induction.
Automata theory covers methodologies of formal proof:
1. Deductive (a sequence of justified steps), and
2. Inductive (recursive proofs of a parameterized statement that use the statement itself with lower values of the parameter).
-
DEDUCTIVE PROOFS
A deductive proof consists of a sequence of statements whose truth leads from some initial statement, called the hypothesis or the given statement(s), to a conclusion statement.
The hypothesis may be true or false, and typically consists of several independent statements connected by a logical AND.
A theorem is proved when we go from a hypothesis H to a conclusion C; the statement "if H then C" says that C is deduced from H.
-
DEDUCTIVE PROOFS CONT.
Example:
Theorem 1.3: If x ≥ 4, then 2^x ≥ x^2.
We can convince ourselves informally that Theorem 1.3 is true. The hypothesis H is "x ≥ 4". It has a parameter x, and thus is neither true nor false.
Its truth depends on the value of the parameter x; e.g., H is true for x = 6 and false for x = 2.
The conclusion C is "2^x ≥ x^2". It uses the parameter x and is true for certain values of x. C is false for x = 3, since 2^3 = 8, which is not as large as 3^2 = 9. On the other hand, C is true for x = 4, since 2^4 = 4^2 = 16. For x = 5, the statement is also true, since 2^5 = 32 and 5^2 = 25.
As x grows beyond 4, the left side 2^x doubles each time x increases by 1, while the right side x^2 grows by the ratio ((x+1)/x)^2, which is at most 1.5625 when x ≥ 4.
This completes an informal but accurate proof. (We shall return to the proof and make it more precise in inductive proofs.)
-
REDUCTION TO DEFINITIONS
In many theorems of automata theory, the terms used in the statement may be less obvious.
If you are not sure how to start a proof, convert all terms in the hypothesis to their definitions.
Example: Theorem 1.5: Let S be a finite subset of some infinite set U. Let T be the complement of S with respect to U. Then T is infinite.
Restating the given facts using definitions:
Original Statement          New Statement
S is finite                 There is an integer n such that ‖S‖ = n
U is infinite               For no integer p is ‖U‖ = p
T is the complement of S    S ∪ T = U and S ∩ T = ∅
-
REDUCTION TO DEFINITIONS CONT.
We need to use a common proof technique called proof by contradiction, in which we assume that the conclusion is false. We then use that assumption, together with parts of the hypothesis, to prove the opposite of one of the given statements of the hypothesis.
The contradiction of the conclusion is "T is finite". Restate the assumption that T is finite as ‖T‖ = m for some integer m.
One of the given statements is S ∪ T = U and S ∩ T = ∅. The elements of U are exactly the elements of S and T. Thus, there must be n + m elements of U. Since n + m is an integer, we have shown that ‖U‖ = n + m, and it follows that U is finite. But the statement that U is finite contradicts the given statement that U is infinite.
By the principle of proof by contradiction we may conclude the theorem is true.
-
REDUCTION TO DEFINITIONS CONT.
Proofs do not have to be so wordy.
The theorem can be reproved in a few lines:
PROOF: (of Theorem 1.5) We know that S ∪ T = U and S and T are disjoint, so ‖S‖ + ‖T‖ = ‖U‖. Since S is finite, ‖S‖ = n for some integer n, and since U is infinite, there is no integer p such that ‖U‖ = p. So assume that T is finite; that is, ‖T‖ = m for some integer m. Then ‖U‖ = ‖S‖ + ‖T‖ = n + m, which contradicts the given statement that there is no integer p equal to ‖U‖.
-
OTHER THEOREM FORMS
The "if-then" form of theorem is most common in typical areas of mathematics.
However, other kinds of statements are also proved as theorems.
Ways of Saying "If-Then"
Some other ways in which "if H then C" might appear:
"H implies C", "H only if C", "C if H", or "Whenever H holds, C follows" (and other variant forms).
If-And-Only-If Statements
These have the form "A if and only if B"; other forms are "A iff B", "A is equivalent to B", or "A exactly when B".
These statements are actually two if-then statements: "if A then B" and "if B then A".
We prove "A if and only if B" by proving two statements:
1. The if part: "if B then A", and
2. The only-if part: "if A then B", which is often stated in the equivalent form "A only if B".
-
ADDITIONAL FORMS OF PROOF
Proving Equivalences About Sets
In automata theory, we are frequently asked to prove a theorem which says that the sets constructed in two different ways are the same sets.
Often these sets are sets of character strings, and the sets are called languages.
If E and F are two expressions representing sets, the statement E = F means that the two sets represented are the same.
The commutative law of union says that we can take the union of two sets R and S in either order: R ∪ S = S ∪ R.
Contrapositive
The contrapositive of the statement "if H then C" is "if not C then not H".
A statement and its contrapositive are either both true or both false, so we can prove either to prove the other.
-
ADDITIONAL FORMS OF PROOF CONT.
Counterexample
We are often faced with something that seems true (a strategy for implementing a program, for example) and need to decide whether or not the theorem is true.
To resolve the question, we may alternately try to prove the theorem, and if we cannot, try to prove that the statement is false.
Proof by Contradiction
Another way to prove a statement of the form "if H then C" is to prove the statement "H and not C implies falsehood".
Start by assuming both the hypothesis H and the negation of the conclusion C.
Complete the proof by showing that something known to be false follows (for example, Theorem 1.5).
-
INDUCTIVE PROOFS
Induction is a form of proof that is essential when dealing with recursively defined objects or concepts such as trees and expressions of various sorts.
Inductions on Integers
We are given a statement S(n) about an integer n to prove. A common approach is to prove:
1. Basis: show S(i) for a particular integer i. Usually i = 0 or i = 1 (or maybe higher; i depends on S).
2. Induction step: assume n ≥ i, where i is the basis integer, and show that "if S(n) then S(n+1)".
The Induction Principle: If we prove S(i) and we prove that for all n ≥ i, S(n) implies S(n+1), then we may conclude S(n) for all n ≥ i.
-
INDUCTIVE PROOFS CONT.
Example: Theorem 1.3 states that if x ≥ 4, then 2^x ≥ x^2.
BASIS: If x = 4, then 2^x and x^2 are both 16. Thus 2^4 ≥ 4^2 holds.
INDUCTION: Suppose for some x ≥ 4 that 2^x ≥ x^2. We need to prove the same statement with x+1 in place of x, that is, 2^(x+1) ≥ (x+1)^2.
In this case, we can write 2^(x+1) as 2 × 2^x. Since S(x) tells us that 2^x ≥ x^2, we can conclude that 2^(x+1) = 2 × 2^x ≥ 2x^2.
But we need to show that 2^(x+1) ≥ (x+1)^2. One way to prove this statement is to prove that 2x^2 ≥ (x+1)^2 and then use the transitivity of ≥ to show 2^(x+1) ≥ 2x^2 ≥ (x+1)^2. In our proof that
2x^2 ≥ (x+1)^2    (1.1)
we may use the assumption that x ≥ 4. Begin by simplifying (1.1):
x^2 ≥ 2x + 1    (1.2)
-
INDUCTIVE PROOFS CONT.
Divide (1.2) by x, to get:
x ≥ 2 + 1/x    (1.3)
Since x ≥ 4, we know 1/x ≤ 1/4. Thus, the left side of (1.3) is at least 4, and the right side is at most 2.25. We have thus proved the truth of (1.3). Therefore, Equations (1.2) and (1.1) are also true. Equation (1.1) in turn gives us 2x^2 ≥ (x+1)^2 for x ≥ 4, which together with 2^(x+1) ≥ 2x^2 lets us prove statement S(x+1), which we recall was 2^(x+1) ≥ (x+1)^2.
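The inequalities used in the induction can be spot-checked numerically. The sketch below tests Theorem 1.3 and inequality (1.1) on a finite range of x; this is only a sanity check under that assumption and does not replace the inductive proof. The names claim, step_inequality, and CHECKED are hypothetical.

```python
def claim(x: int) -> bool:
    """Statement S(x): 2^x >= x^2."""
    return 2 ** x >= x ** 2

def step_inequality(x: int) -> bool:
    """Inequality (1.1): 2x^2 >= (x+1)^2."""
    return 2 * x ** 2 >= (x + 1) ** 2

CHECKED = range(4, 200)  # finite range, an assumption of this sketch

print(claim(3))                                  # False: 8 < 9
print(all(claim(x) for x in CHECKED))            # True
print(all(step_inequality(x) for x in CHECKED))  # True
```

The failure at x = 3 shows why the basis of the induction starts at 4.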
-
CONCEPTS OF AUTOMATA THEORY
The concepts include the alphabet (a set of symbols), strings (a list of symbols from an alphabet), and language (a set of strings from the same alphabet).
Languages
If Σ is an alphabet, and L ⊆ Σ*, then L is a language over Σ.
A problem in automata theory is the question of deciding whether a given string is a member of some particular language.
The problem L is: Given a string w in Σ*, decide whether or not w is in L.
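The membership problem can be illustrated with a small decision procedure. The language used here, binary strings containing an even number of 1s, is a hypothetical example chosen for this sketch and does not come from the slides; the names SIGMA, is_string_over, and in_L are likewise assumptions.

```python
SIGMA = {"0", "1"}

def is_string_over(w: str, alphabet) -> bool:
    """TRUE when every symbol of w belongs to the alphabet, i.e. w is in Σ*."""
    return all(symbol in alphabet for symbol in w)

def in_L(w: str) -> bool:
    """Decide membership in the example language L = {w in Σ* | w has an even number of 1s}."""
    if not is_string_over(w, SIGMA):
        raise ValueError("w is not a string over the alphabet")
    return w.count("1") % 2 == 0

print(in_L(""), in_L("0110"), in_L("01"))
```

Here the empty string and 0110 are in L while 01 is not; a finite automaton with two states could make the same decision.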