cous i n(x , y) par ent (x , xp) par ent (y , yp) we w oul d lik e to c omput e the relation clos er...
Post on 19-Aug-2020
1 Views
Preview:
TRANSCRIPT
Temporal Logic and Datalog to QueryDatabases E volving in T ime
a 00
A lexander TuzhilinZ '
vi M. Kedem
WOA
MeN
Es
Jamaw
[9
m;
E’
fi
U
Q
.
Tech nical Report 484
December 1989
Temp oral Logic and Datalog to Qu eryDatabas es E vo lving in T ime
8
Alexander TuzhilinZvi M. Kedem
QM"
3.
VHflOO
\1
Tech nical Rep ort 484
81001 .
‘A‘N
‘wx
Maw
18
Jamaw
L9
December 1989
Temporal Logic and Datalog to
Databases E volving in T ime"
Alexander Tuzh ilinT
Zvi M. KedemI
A bst ract
In thi s paper, w e study a query language abou t databases evolv ing in (infini te)t ime . The syntax of the query language is based on a predi cate temporal logic . The
semant ics of the language is defined wi th infini te sequences of database states,which
in thi s paper are determined ei ther by pure Datalog programs or by negated Datalogprograms wi t h inflat ionary semant ics. In general , other mechani sms for defini ng thesemant ics , such as product ion systems
,can be used. We analyze the relat ive expressive
power of such a query language and the standard Datalog quer ies for bot h pure and
negated Datalog programs . We show that our query language has more expressivepower than Datalog queries for both pure and negated Datalog programs in general .However , w e also prove a surpris ing techni cal resul t that the existential fragment oftemporal logic has the same expressive power as Datalog queries for negated Datal ogprograms wi th inflat ionary semant ics . Thi s resul t impli es the col lapse of the exi stent ialfragment of temporal log ic for such programs: any temporal logic formul a from thatfragment can be reduced to an equivalent formula wi th a single poss i b i li ty operator .
1 Introduct ion
Consider a Datalog program P,adapted from[Ull88, p . that computes cousin rela
tionsh ip
cous i n (X , Y) par ent (X , Xp) par ent (Y , Yp) par ent (XP, Z ) par ent (YP, Z )
a xr yé YP.
‘This w ork w as partial ly supported by the Office of Naval Research under the contract number NOQO14
85-K-0046 and by the Nat ional Science Foundat ion under grant number GGR-89-6949.
Tlnformation Systems Department ; Leonard S tern School of Bus iness , New York Univers i ty,40West 4th
S treet, Room 624,
New York,NY 10003; Internet :3Department of Computer Science ; Courant Inst itute of Mathematical Sciences ; New York Univers ity;
25 1 Mercer S treet ; New York,NY 10012- 1 185 ; Internet :
cous i n (X , Y ) par ent (X , Xp) par ent (Y , Yp)
We woul d like to compute the relat ion closerCousins , defined by closerCoasins (r , y, u , v)
if and only if and cousin (u ,v) and a: and y are
“closer related”than u and v
in the obvious sense .
closerCousins can be computed with a temporal logic for
mula cousin(r,y} b efore assuming appropriate temporal semant i cs . In this
paper , we formal ly define both syntax and semant i cs of temporal logic queries .
To define semant i cs of temporal logic queries , w e have to int roduce a new Datalog
semant i cs first . Tradi t ionally, there is no not ion of t ime associated with Datalog program s :
all the cousins belong to the same“snapshot
” of cousin relat ion. Therefore,Datalog
semant i cs is associated with the state of the database at the “fixpoint t ime .
” We call
this seman t i cs static because the intermediate states of the database are of no semant i c
interest,and only the (stat i c) state at the fixpoint t ime is meani ngful .
In thi s paper , we adapt Datalog to describe an evolut ion of a database by int roducing
t ime . Specifical ly, w e assume that t ime instances are described by natural numbers with
the present being t ime instance We can now define the evolut ion of a database in
t ime . Given D th e state of a database at some t ime instance i 0; Di“ ,the s tate
of th e database at t ime instance i 1,i s obtained by applying al l th e program rules
simultaneous ly to D,. This process result s in an (infini te) sequence of database s tates
DO,D1 ,D2 , Clearly, there is no need to assume the exi stence of a fixpoint . S ince w e
assignmeanings to the element s of this sequence , we refer to the interpretat ion of aDatalog
program with thi s sequence as dynamic semant i cs . We will use the term dynamic Batalay
to refer to Datalog with dynami c semant i cs .
A s shown in[KeTu89] , dynam i c seman t i cs of databases can be described with various
formali sms,some of them unrelated to Datalog. In this paper w e concent rate on using
the following two types of Datalog programs . The first one will be pure Datalog, without
negat ions . The second will be negated Batalay, Datalog“
(negat ions being al lowed only
in the bodies of rules) with inflationary semantics We chose
inflat ionary semant i cs as i t can be applied to arbitrary negated Datal og program s .
Once semant i cs for dynami c databases is defined,the nex t s tep is to define a query
1Of course, t ime can be“shi fted, to descri be fini te pas t too .
language abou t future t ime instances of dynami c databases . One of the at tract ive features
of Datalog is the uni ty of the data model and the query lan gu age . S ince w e selected
Datalog to describe dynam i c databases , we would like to use the s tandard query language
on Datalog, i .e. Datalog queries that simply const i tute intent ional database predi cates
(IDE‘ predi cates) taken at
“fixpoint t ime .
” Although Datalog queries determine the s tate
of a database at the fixpoint t ime , one can st i ll at tempt to use them to provide answers
about intermediate database states as well . This mi ght be achievable by expanding the
underlying Datalog program so that i t s t i ll“remembers importan t intermediate states
at the fixpoint t ime . Alternat ively, we can choose a fu ture-related fragment of predicate
temporal logic as a query language for dynam i c Datalog and i t s
extensions .
In thi s paper , we study th e relat ionship between Datalog queries and temporal logic
queries . Clear ly, any Datalog query Q(x ), where x i s the vector of free variables in Q ,
can be expressed in a temporal logi c for Datalog or Datalog“ program P as where
0 i s the possibi li ty operator of temporal logic[Kro87] . Intui t ively, Q(x ) is t rue at the
“fixpoint t ime” if at some point in the execu t ion of program P
, Q(x ) becomes t rue (this
lat ter statement is denoted by Thus,temporal logi c is at leas t as powerful as
Datalog queries . Indeed,as w e show
,temporal logic queries have more expressive power
than Datalog queries for both Datalog andDatalog“ program s with inflat ionary semant i cs .
It i s techni cal ly challenging to analyze when Datalog queries are as powerful as tem
poral logic queries . We will show that under mi ld rest rict ions,the existential fragment
of temporal logi c (to be defined later), has the sam e expressive power as Datalog queries
for safe Datalog“ with inflat ionary semant ics . Among other things
,i t follows from thi s
that under some mi ld rest ri ct ions the existent ial fragment of temporal logic for Datalog“
with inflat ionary semant i cs col lapses to a s ingle poss ibility operator, i .e . any temporal logi c
formula from the exi s tent ial fragment can be expressed with a single possibi li ty operator
It may be worthwhile to state succinct ly the advan tages of temporal logi c queries over
Datalog queries for dynam i c semant i cs defined by Datalog program s and i t s extensions .
F irst , as claimed above,they have more expressive power for Datalog and Datalog
“ wi th
inflat ionary semant i cs . Second,they do not depend on exi s tence of any fixpoints for some
of Datalog extensions , e .g. for Datalog (the extension of Datalog“ where negat ions are
al lowed both in the body and the head of a rule)[AbVi89] 2 . Third,w e bel ieve that
temporal logi c queries are conceptual ly simpler . For the specific case of Datal og“ with
inflat ionary seman t i cs , any Datalog query Q can be expressed as oQ in temporal logic ;
however,as we show later , i t seems that some of the temporal logic queries expressible as
Datalog queries , can be expressed so only in a very compli cated w ay.
We now describe the organizat ion of the rest of the paper while highlight ing our
contribut ions . In Sect ion 2 , we define both some preliminary concept s that will be used
in Sect ion 4 and define a query language about dynam i c databases whose syntax i s based
on temporal logic and whose seman t i cs i s based on dynami c Datalog and its ex tensions .
In Sect ion3, w e describe related work . In Sect ion 4, we analyze the relat ive expressive
powers of temporal logi c and Datalog queries for Datalog and Datalog“ programs with
dynami c seman t i cs . We show how domain- independent3and domain-dependent queries
from the existent ial fragment of temporal logi c can be expressed wi th Datalog queries . Our
technique is bas ed on expressing temporal queries wi th ordering predicates we introduce in
the paper . These predi cates characterize the relat ive t imes when var ious tuples are inserted
into the IDE predi cates,and
,most importan t ly, can be computed using safe Datalog
“ rules .
2 Prelim inar ies
In t his sect ion,we define preliminary concepts that are necessary for understanding the
material presented in Sect ion 4. However,certain concepts
,mainly in Sect ion const i
tu te our original cont ribu t ion . We wi ll delineate these concepts from the work of other
researchers .
Datalog and I ts E x tens ions
A pure Datalog program is a fini te set of Horn clauses i .e . rules of the form A 4 B l A
B where A is a posi t ive li teral and BI , E u are posi t ive li teral s or bui l t- in predi cates
x i xj , x , xi , x, xj , et c . (the I gl
s are variables or constant s). A negated Datalog
2Actual ly, as stated above,temporal logic queries can be app l ied to any formal ism ab le to express dynamic
semantics .3This concept corresponds to the fami l iar one for static databas es and w i l l be defined precisely later.
program,Datal og
“
program ,i s a set of rules in which negat ions are allowed in the body
of a rule . Datalog program P has two types of predi cates . E xtens ional database predicates
(EDB predicates never appear in the head of a rule in P . For each EDB predi cate,P has
a set of ground atoms (fact s) associated with that predi cate . Th e predi cates appearing
in the left-hand side of some rule in P are called intentional database predicates (IDB
predicates) For a general survey of Datalog, see e .g.[U1188] .If P is a Datalog or Datalog
“ program,the schema of P
,seh (P), is the database
schema consist ing of schemas of al l the relat ions occurring in P . A domain of program
P,DOMp
,is the domain associated with schema s eh (P). Th e domains can be infini te in
general .
The seman t i cs of Datal og program s is tradi t ionally defined in terms of the least fix
point , whereas there are several types of semant i cs proposed in the li terature for Datalog“
.
In this paper , we will use inflationary semantics for Datalog“
because it i s applicable to al l Datalog“ programs . Both types of semant i cs are defined
with a mapping E VAL from EDB and IDB predi cates to IDB predi cates of a program P,
which,for a given instan ce of IDB predi cates D ,
computes all the new facts derivable from
D and from EDE predi cates by applying ru les in P .
4 IDB predi cates are ini t ially made
empty. The leas t fixpoint semant i cs i s defined as the fixpoint of the recurrence equat ion
D,“ E VAL(D, and inflat ionary semant ics with the fixpoint of the recurrence equat ion
D,“ D; UE VAL(D; ) (a fixpoint i s reached when D, These two semant i cs
coincide on pure Datalog program s . We call these types of seman t i cs , based on fixpoints ,
static because they are defined only in term s of the“final”(stat ic) result .
Another meaning of Datalog andDatalog“
programs can be associated with the ent ire
sequence of database states DO,D1 ,D2 , ,Dn , We cal l thi s kind of semant i cs dynamic
because i t defines the “dynam i cs
” of Datalog and Datalog“ programs . This dynam i c
semant i cs will become important below when w e define semant i cs of temporal logic .
Next , we define safety. Intui t ively, a ru le is safe if i t cannot produce new constant s .
Formal ly, w e define limited variables as in[Ull88] . A variable is lim i ted if i t ei ther occursposit ively in one of the predi cates in the body of a rule or can be equated to a constant
or to some other variable that occurs posi t ively in the body of a rule through a chain of
4For precise defini t ion of E VAL see[Ull88, p . 1 15]
equal i t ies . A Datalog“ rule is cal led safe[AbVi88] if al l the variables in the head of the rule
are limi teds . A Datalog“
program i s safe if al l of it s ru les are safe . Clear ly, safe Datalog“
program s cannot introduce new symbols when ru les are applied to a database . Therefore,
a safe Datalog“ program with inflat ionary semant i cs always has a fixpoint[AbVi88] . The
defini t ion of safety presented here (and in[AbVi88] ) differs from the convent ional defini t ion
of safety from[Ull88] . In[Ull88] , a ru le is safe if all the var iables (not only in the head
but al so in the body of a rule) are limi ted. Therefore,the convent ional defini t ion of safety
[Ull88] i s more restri ct ive than the defini t ion from[AbVi88] adopted in this paper .If program P i s not safe then i t generates all the new symbols from DOMp because a
variable in the head not occurring posi t ively in the body i s not bound to any value during
the ru le mat ching process and,therefore
,i s assigned any value from D0Mp . Consequent ly,
w e get infini t e relat ions for unsafe programs . To avoid this si tuat ion,w e cons ider only safe
programs in thi s paper .
A Datalog query for a Datalog or Datalog“ program P i s a predicate Q appearing
am ong the IDE predi cates of P
Temporal Logic as a Query Language
In this sect ion,w e define a query lan guage based on temporal logic . The syntax of a query
language is based on a predi cate temporal logi c and the seman t i cs ei ther on pure Datalog
or on Datalog“
. Separately, temporal logi c and Datalog have been extensively studied
before . One of our cont ribut ions in thi s paper lies in combining the two separate concepts
in one integral approach .
We cons ider the standard predi cate temporal logic[McArt76,ReUr71 ] with standard
temporal Operators (poss ibility), D (necessity), o (next-moment); binary operators w h i l e ,
unt i l,unle s s
,b e fore
,atnex t [Kro87] ; and with t ime defined with natural numbers .
A atnex t B is true if A is true at the first t ime B is true . If B is never true then atnex t
yields false6 . Defini t ions of other operators can be found in[Kr687] . [Kro87] al so shows
how all of them can be expressed in terms of atnext .
5A ctual ly, this defini t ion of safety is a. rew ri tten defini t ion of strong safety from[AbVi88] .6Our atnex t operator differs from the defini t ion of atnex t in[Kro87] . S ince each of the atnex t
operators can be expressed in terms of the other , the s l ight al teration ofmeaning does not affect our resu l ts .
Semant ics of a temporal logi c formula i s defined wi th a temporal structure[Kro87] ,which comprises the val ues of al l i t s predi cates at all the times in the future . Formally,
a temporal structure is a mapping K N P1 x”Pk , where N is a set of nat ural
numbers , and 73; i s the set of all the possible interpretat ions of a predi cate P,. The mapping
K assigns to each t ime ins tance (a natural number) the truth values for predi cates Pi . We
will u se Kt instead of K (t) to denote the value of the temporal structure K at t ime t . We
make an important as sumpt ion, natural in the database contex t , that domains of predicates
do not change over time.
From the database perspect ive , a temporal structure can be viewed as an infini te
sequence of database states,i .e . DO,D1 D2 , Various methods for defining sequences of
database states have been considered in In this paper,we consider Datalog with
dynam i c seman t i cs and Datalog“ with dynami c inflat ionary semant ics as two mechanisms
for defining sequences of database states . Specifical ly, a Datalog or Datalog“ program P
defines the temporal s tructure KP DO,D1 ,D2 , where D, i s the state of the database
(i .e . instan ces of predi cates from seh (P)) at t ime i , such that D,“ E VAL(D, for
i 3, where E VAL is the mapping di scussed on page 5 .
Sequences of database states have been studied before in
Also,[SeSh87] defines t ime sequences and operat ions on them . This research is related to
our work because temporal st ructures can be defined with these sequences . However , in
this paper , we concentrate on the sequences defined with Datalog and Datalog“ programs
and use them to define semant i cs of temporal logi c queries .
A temporal logic formul a d on Datalog or Datalog“ program P
,with al l the predi cates
in d belonging to schema seh (P), defines a query on P . Th e answ er to query d on P i s the
set of tuples {x or, al ternat ively, i t is defined with a static (t ime independent )
predicate
where K 1D is the temporal structure determ ined by program P (K5D
means that KP is
evaluated at t ime t In other words, d}, specifies the set of tuples x sat isfying the
temporal logi c formula d at t ime 0with semant i cs determ ined by program P . We will
refer to first order logi c predi cate d}; as being induced by temporal logic formula (15 and
program P . Note that we are interested in making predi ct ions now about events in the
future. For a very simple example consider gb A (x ) atnex t E (x). Then is t rue at
t ime 0 if A (x ) i s t rue at the first t ime instance when E (x) becomes true .
Denote the class of al l temporal logic formulae as TL and the clas s of al l quantifier-free
temporal logic formulae TLO Consider al l the TL formulae having the form (3x 1 ) (Exu )
qb(x 1 , . xn ), where xn ) is a TLO formula, and al l the TL formulae equivalent to
them (produce the sam e answers for all the temporal st ructures). Call this subclass of TL
formu lae th'
e existential fragm ent of TL and denote it as TLa
Given a Datalog or Datalog“
program P ,w e can ask ei ther Datalog queries or queries
expressed in temporal logi c on P . In Sect ion 4, we analyze the express ive powers of the
tw o approaches , but first , we compare our work wi th the work of others .
3Related Work
Our work is related to research on temporal databases work on Datalog and its extensions
and to temporal logi c .
The work on temporal databases
is concerned with the issues of represent ing finite sequences of database states with the
actual (materialized) data and querying these representat ions . In cont rast , w e are inter
ested in the mechanisms that generate infinite sequences of database states,in general ,
such as Datalog and i t s extensions , as well as in querying the sequences generated by these
mechani sms .
There is a large body of resear ch studying queries on Datalog and i t s extensions (we
again refer the reader to[Ull88] However,as stated before
,this research does not
consider temporal aspects of Datal og and is mainly interested in fixpoint queries . The
paper[Ch lm88] const i tutes an except ion to this . It studies evolut ion of databases in t ime
by introducing a single monadi c funct ion successor per predi cate and dividing at tributes
into temporal and non-temporal types . That approach to modell ing t ime in the fram ework
of logi c programming8 i s more general than the dynami c seman t ics of Datalog. This can
be illus trated with an example : a Datalog rule A (r ) B(x ) can be converted to the rule
B(t, x ) A (t 1,x ) in the formalism of[Ch Im88] . In addi t ion ,
the complexi ty of query7This is l ist represents the scope of the w ork in the field and is not meant to be exhaust ive .
8S ince a funct ion symbol is introduced.
processing and quest ions related to fini teness of leas t fixpoints are studied in[Ch 1m88] .F inal ly, the i ssues of comput ing infini t e fixpoints wi th fini te computat ions by introducing
infini te object s are considered. However,the semant ics of program s in[Ch Im88] i s st i ll
defined in terms of fixpoints and i s,therefore
,stat i c . In other words
, queries are st i ll
asked about the predi cates at the fixpoint t ime and not about “intermediate” s tages as
is done in temporal logi c . In contrast,we consider Datalog and Datalog
“
programs as
temporal structures for temporal logi c queries,and w e analyze the express ive power of
Datalog queries with temporal logic queries for Datalog and Datalog“ programs .
There is a large body of work on temporal logic . Textbooks , such as
[vBen83] , provided a general descript ion of this resear ch . However,thi s research considers
arbi trary temporal st ructures , whereas we are interested in the temporal structures gener
ated by fini te formal i sms such as Datalog and i t s extensions and in the expressive power
of temporal logic queries on these st ructures .
The paper[ClWa83] uses intent ional logi c[G al 75 ] to provide a formal semant i cs of
hi stori cal databases . It also di scusses how t ime-related queries can be expressed in inten
tional logi c. It is well-known that temporal logi cs const i tute fragment s of the intent ional
logic[G al 75 ] . However , we use temporal logic and not the intent ional logi c as a query
language,as i t is bet ter su i ted for the relat ional model and therefore i t is more applicable
to Datalog program s .
As w as stated before,there has been no analys is of the relat ive expressive powers of
temporal logi c andDatalog queries . In the nex t sect ion,we car ry such an analysi s . In par
ticular, we prove a surprising result that for negated Datalog with inflat ionary semant ics ,
Datalog queries have the sam e express ive power as the domain independent (to be defined
later) exi s tent ial fragment of temporal logi c .
4 Temporal Logic vs . Datalog Quer ies
In this sect ion,w e compare the expressive power of temporal logi c and Datalog queries
for two groups of programs . The first group const i tutes pure Datalog programs , and
the second group consist s of Datalog“ program s with the inflat ionary semant i cs (we will
implici t ly assume that semant i cs is inflat ionary for Datalog“
programs in the sequel). F i rst ,
w e show that any Datalog query can be expressed in temporal logic with a simple formula
for both types of programs .
Pro p os it i on 1 For any Datalog query Q defined on Datalog“
program P there is a tem
poral logic query defined on P such that the tw o formulae define th e same mapping.
Pro of: The temporal logic formula i s simply {x 0 A tuple x belongs to the
fixpoint of P if and only if at some point in t ime Q(x ) i s true . I
S ince Datalog programs const i tute a subset ofDatalog“
programs,Proposi t ion 1 holds
for Datalog programs as well .
It i s simple to express Datalog queries in temporal logi c . However,the quest ion
whether or not temporal logi c queries can be expressed as Datalog queries is a non—t rivial
one. We determ ine the answer to this quest ion in the rest of this sect ion. Subsect ion
deal s wi th pure Datalog and Subsect ion with Datalog“
Temporal Logic vs . Datalog Quer ies for Datalog
Th eorem 2 TL ) has more expressive pow er than Datalog queries for Batalay programs .
Pro of : We claim that the query defined by the TLO formula A (x ) atnex t E (x ) where
A and B are EDBs i s not expressible in Datalog The claim follows from the fact that
this query is not monotone in predi cate B in general . In cont ras t,Datalog programs are
monotone in al l their predi cates . I
Temporal Logic vs . Datalog Queries for Negated Datalog
w ith Inflat ionary S emant ics
In this sect ion,w e prove the main techni cal result of this paper that the ex i s tent ial frag
ment TLO of temporal logic with mi ld restrict ions imposed on i t has the sam e expressive
power as Datalog queries for safe negated Datalog programs with inflat ionary semant ics .
Restrict ions imposed on TLO have the following nature . The an swer to a Datalog query
on a safe Datalog“ program const i tutes a fini te relat ion because safe rules cannot pro
duce new symbols . However , an arbi trary query from TLO can produce an infini te answer .
10
Therefore , we define the concept of domain independence of temporal logi c queries as a
general izat ion of domain independent relat ional queries[Ull88] and restri ct TLO to domain
independent queries to guarantee fini te answers . Formally, we will prove that for any safe
Datalog“
program P and a domain independent query d on P from TLg there is a safe
Datalog“ program P
’and a Datalog query Q such that d}, E Qp r
We al so prove another stronger result : we do not rest rict queries from TLo to be
domain independent and treat infini te an swers in some “uniform fashion by introducing
a special constant to ou tside DOMp such that (loosely speak ing) the val ue of a temporal
logic query on infini telymany tuples coincides with the value of th e query on w . Th e exact
mean ing of thi s statement will be provided later in this paper .
We structure the proof of the main resul t of this sect ion as follows . F irst,w e construct
ordering predi cates for a temporal logic query d coding the order in which various t uples
are added to the IDE predi cates . Second,w e show how these predi cates can be used to
produce a Datalog query“equ ival ent
”to d) . Third,
we show how these predi cates can be
produced with safe Datalog“ programs . In the next subsect ion
,w e provide an example
that i llustrates these s teps .
E xamp le
Consider a simple temporal formula A (x , y) atnex t F irst w e define
the predi cates coding the order in which tuples are inserted into the IDB predi cates .
Let tA (x , y) be the t ime instan ce when the tuple (x , y) i s inserted into the predi cate
A ,and let t3(y, 2 ) be the t ime instance when the tuple (y, z ) i s inserted into th e predi cate
B . tA (x , y) and t3(y,z ) are well defined, because under th e inflat ionary semant ics of
Datalog“
, once a tuple i s inserted into a predi cate , i t will never be removed from i t . In
general, 0S oo . tA (x ,y) 0means that (x , y) w as in A at t ime 0(and
therefore A w as an EDB), and tA (x , y) 00 means that (x , y) w as never inserted into A ;
simi lar ly for t3.It follows from thedefini t ion of 4b}; th e predi cate inducedby d ,
and from the defini t ion
of atnex t operator that
True if tA (x , y) 5 t3(y, z ) 00
“SP0“y,z ) F als e otherwise
1 1
Note that by the defini t ion of atnex t , if tB(y, z ) 00 then F alse .
Fur thermore,observe that the value of y, 2 ) depends only on the relat ive t imes when
A (x , y) and B(y, 2 ) become true , that i s when (x ,y) i s inserted into A and (y,z ) i s inserted
into B .
To formally code these relat ive order of“ insert ion t imes
,w e introduce 1 1 order
ing predi cates : RO=A=B<ooa RO=A<B<00 7 R0 A<B=om RO<A=B<00 a R0<A <B=om
R0<A=B=om R0=B<A=om These predi cates cover all
possibi li t ies of relat ive t imes of tuple insert ions into predi cates A and B
The notat ional structure of these predi cates is R000P1 01 P292 c>o awhere 90,91 ,92 E
and {PI ,P2 } {A ,B} . Such a predi cate i s defined by
TTUC If 090tp191 tp29200RoeoPi 61 P202 00F alse otherwi se
For example, i s t rue if and only if (x , y) w as not in A and (y, z )
w as not in B at t ime 0, and (x , y) w as inserted into A and (y, z ) w as inserted into B at
the sam e t ime instance .
Nex t,we show how th e ordering predi cates can be used to define a Datalog query
equival ent to 45. As z ) i s t rue if and only if tA (x , y) S t3(y, z ) 00,i t follows
that i }: i s equ ival ent to R0=A=B<oo V R0=A <B<oo V R0=A<B=oo V RO<A=B<oo V R0<A <B<oog
o
Clear ly, if we int roduce a new predi cate Q such that Q 4 R; for al l the five predi cates
R, appearing in the previous disjunct ion,then Q and d}; are equ ival ent .
In th e proof of the main theorem we will show how such ordering predi cates in the
di sjunct ion can be compu ted with safe Datalog“
program s . Thi s mean s that d is equ ival ent
to Q on a safe Datalog“ program .
O rde r ing P r ed i cat e s for Quant ifier-fre e Fo rmu lae
In this sect ion,w e rest ri ct our at tent ion only to quantifier-free temporal logic formu lae
TLC
We formally define ordering predi cates on a query d now . A temporal logi c formula
9In this examp le i t doesn’t matter w hether tA (x , y) 0or tA (x , y) 0, but in general such dist inct ionsneed to be made.
prove the lemma by induct ion on the number of operators in d ,which without loss of
general i ty are n,V
,atnex t . Recall that we only consider x sat isfying d (x ).
1 . d is A , for some i . Then i A IJ-(x ) The choi ce of interval s
clear ly does not depend on x (as long as
2 . d is By induct ion, U{ I ,-
(x ) j E Jl } , where J 1 does not depend on
x . Then U{I j (X )lj g J1 IJ-
(x ) Note that the set of intervals does
not depend on x .
3. gb is 451 V 9152 By induct ion, U{ I , 6 J1 } and E
Jz} , where J1 , J2 do not depend on x . Then U{ I j (X ) j 6 J I UJQ} . Clearly,
the set of interval s over whi ch is true does not depend on x .
i s (451 atnex t ¢2 )o Let J’U) fi
’
Kj'
2 j) (INK ) (0) (INK ) Q
and let
M 2138”Both[1 and J are well—defined because , by induct ion,
the set of interval s comprising
does not depend on x and because whether or not I j l (X ) 0) also does not
depend on x .
Then,
u{ l , (IMj) Q A010) n)}
By induct ion,the choice of intervals in dl does not depend on x . Therefore the set
of interval s comprising al so does not depend on x .
Lemma 4 Let (b be a TLO formula. Then for each i either (1570
¢i> (X l)
Pro o f: S ince R; (x ) holds if and only if the corresponding a ,
-
(x ) holds , based on
3, all w e have to do is to check if Io(x ) Q
14
Th e above lemma part i t ions the index set I into two subset s[True and I p ag e , where
Irm a fi l (X )) and[F als e fi l n M" (X )) l The nex t
lemma says that the an swer to a query from TLO i s determined by some set of ordering
predi cates .
Lemma 5 For any TLO formula (15 and for all x,V ; € 1Tm R,
-(x )
Pro o f: One par t of th e lemma follows from Lemma 4. To prove the other part , consider
any value of x . S ince R; ’s are collect ively exhaus t ive , E , (x ) will be true for some i 6 I . In
addi t ion,i t i s easy to see that this i i s in I
Note that the set I rm a is defined only by the formula (15 and i s independent of the
program P or the values of the EDBs .
Based on the above,the computat ion of d}; can be reduced to the computat ion of th e
individual predi cates R; . We wi ll exam ine under whi ch condi t ions they can be compu ted
with safe Datalog“ programs .
We define the set of base ordering predi cates for the predi cate instances A I ,A 2 , A ,l
from d . For any i ,j ,n
,let x ; ,j denote some sequence list ing the union of the
variables in x ; and x i . Then the base ordering predi cates are di vided into four types , each
of them being defined as follows .
1 . for i,n . i s t rue if and only if 0
2 . for i, j i s t rue if and only if 0
tAJ-(X i ) 00 .
3. for 2,j S O<A .
is true if and only if 0
tAJ (x , 00 .
4. for i,n . i s t rue if and only if 00 .
Predi cates of types 1 , 2 , and3will be cal led bounded; predi cate of type 4 will be calledunbounded.
Note that , for example , Roz "14 <A3=A l <A 2 =A 5 =w SO=A‘ A SA 5=00 "
Generally,
Lemma 6 Let {S j lj G J } be some enumeration of the base ordering predicates . E ach
ordering predicate R, is equivalent to AjGJ i S J‘ for some J, C J .
P ro o f: Follows from the fact that any 09;oA4,9,-,A , 9g, A,“9,-n oo can be represented as
a conjunct ion of a set of formu lae of the form0 A 0 A. A,
oo , 0 A , A,
00, A ; 00 .
Lemma 7 Bounded base ordering predicates can be computed by safe Datalog“ rules .
Pro o f: We consider each of the th ree types in turn and show for each type how to compute
predi cates of this type with safe rules .
1 . can be t rue only if A ; i s an EDB and holds . For each EDB A , w e
write the rule : No rules are writ ten for IDBs .
2 . To handle this type we want to state that there is a (fini te) instance in t ime t 0such
that and Aj(X j ) both become true for the first t ime (simultaneously). Thus ,
we wan t to say that for some t , A ; (x , and Aj(xj ) were fal se at t ime t 1 and became
t rue at t ime t . To handle this,for each IDE Ag, we introduce a trailing predi cate Afi
with the property that if A ; (x , became t rue at t, Af-(x i ) becomes t rue at t ime t + 1 .
Such IDBs A : can be computed by the rules : Using the trai ling
predi cates,w e write the rules : A i (x i )A
3. To handle thi s type , we use the trai ling predi cates defined above and the ru les com
pu t ing them . The derivat ion i s s light ly non- intu it ive,but the predi cates can be
compu ted after adding the rules : 4 Aj(xj ) A n AS
Note that the unbounded predi cates cannot in general be computedwith safe Datalog“
programs because they are in general infini t e,as i s the case for the formu la n (n A(x ) atnex t
A (x )) which is true for all x and for all programs . We will provide two solut ions to this
problem . The firs t solut ion i s to restrict the considerat ion of TLO formulae to the class
of domain independent formulae , to be defined below ,that guarantee fini te ins tan ces of
16
unbounded predi cates . The second approach will be di scussed in Sect ion To define
domain independence ,w e introduce first the not ion of the domain of a temporal logic query
with respect to a program .
D efin i t ion 8 G iven a Datalog“
program P and a temporal logic formula (15, the domain of
d) w ith respect to P,dompg ,
is the set of all the constants appearing in d , the constants
appearing in all the EDB predicates of P,in rules of P,
and the constants in all the future
instances of predicates in P,i . e. constants inferred by program P .
S ince we consider only safe programs , no new constant s will be added to dampy by
applying rules from P . Therefore,the domain of a safe formula contains only cons tant s in d ,
in EDB predi cates , and the remaining constant s of P . The domain of d with respect to P ,
dompm gives rise to a predi cate on DOMp which is true on the elements of dampy, and only
on these element s . If no confusion ar i ses,we will use the same notat ion for that predi cate
as for the domain i tself. Also, ,xn) will denote
Using Defini t ion 8, domain independence i s defined as follows .
Defini t ion 9 A temporal logic formula (b is domain independent if for any safe Datalog“
program P the predicate d}, is the same for any domain DOMp D dompm i . e d}, does
not depend on DOMp .
Note that Defini t ion 9consti tutes an ex tension of the defini t ion of a domain indepen
dent formula[Ull88] to temporal logic and Datalog“ programs . Also note that a domain
independent query returns only fini te answers,each constan t in the answer contained in
the domain Using the not ions just defined,we can state the following lemma.
Lemma 10 For each bas e ordering predicate S A ; =oo ; the predi cate 5 145 00domp
,¢
can be computed w ith safe Datalog“ rules .
P ro o f: The key observat ion is that a safe Datalog“ program reaches a fixpoint and i t is
possible to detect this fixpoint using safe Datalog" ru les , since dampy, i s fini te .
Let FP be the flag predicate, which becomes t rue one t ime instance after the fixpoint
i s reached. Then
FP
17
Clearly, the predicate domp,¢ can be computed with a Datal og program .
To finish the proof, note that FP can be computed using the rule
FP t" A § (x.
Th e ru le says that FP becomes true when the predi cates in P stop changing over t ime,
i .e . at the fixpoint t ime . This rul e is not normal ized; however , the variables x , range over
the domain which is fini te . Therefore, w e can remove universal quantifiers and
replace them with fini te conjunct ions . After that,the rule can be normalized to a set of
safe Datalog“ rules . I
Main Th eorem for th e Domain-independent C ase
In Sect ion we defined ordering and base ordering predicates and showed how ordering
predi cates can be used to compute the answer to a query. We al so showed how base ordering
predi cates could be computed with safe Datalog“ rules . We are ready to put these resul ts
together now and to state the preliminary version of the main theorem in the following
lemma.
Lemma 1 1 For any safe Datalog“
program P over some (poss ibly infinite) domain DOM p
and any domain independent temporal logic query (25 in TLO, there exists a safe Datalog“
program P’over DoMp and a Datalog query Q ,
such that (b; and Qp : define the same
predicate, w here denotes the IDE Q as computed by program P’
Pro o f: Based on Lemmas 5 and 6 , we can write
an
¢P Vi eIT rueAjeJ i 5 ]
Observe that for base ordering predi cates of of type 1 , 2 , and3: S , E S , A domp 'dn
and to s implify subsequent discussion we will write 5 ; for Sj for predi cates of this type .
(For base ordering predi cates of type 4, S ; w as introduced al ready in the proof of Lemma
S ince d}, i s domain independent ,
¢i> E ¢i> (10mm 5 AfeJ.(S i A (10mm )
18
which can be al so writ ten as
5 Vi G IT v-ue AJE J i 5 ;
From Lemmas 7 and 10 w e know that all 55, j E J can be compu ted wi th safe Datalog“
rules .
The program P’ consi s ts of program P
,the program s that compu te predicates S ; for
j E J and the following set of rules :
Q(x ) for i 6 [True
R, (x ) for i 6 IT , “
where x ; deno tes th e var iables of and Q is an IDE predicate not appearing in other
rules . Q const i tutes the query in quest ion . I t follows from (2 ) that Q and d}; define the
same predicate . I
Now ,w e are ready to state the main theorem in the version for domain independent
formu lae .
Theo rem 1 2 For any safe Datalog“
program P over some domain DOM p and any do
main independent temporal logic query (15 in TLa, there exists a safe Datalog“
program P’
over DOMp and a Datalog query Q such that d}; and Qp ' define the same predicate.
Pro o f: Consider a TLg formula d . By the defini t ion of TLa, d can be wri t ten as
where d’i s quantifier-free . By the previous lemma
,we can find a program and a query Q
that computes the sam e resul t as d’. Take project ion of Q
’on the free vari ables of d to
obtain the resu lt . I
C oro l lary 13Let P be a safe Datalog“
program over some domain DOMp and let d from
TL ; be such that all the base ordering predicates appearing in d‘
P as described in (I ) are
bounded. Then there exists a safe Datalog“
program P’over DOMp w hose rules do not
contain constants from the EDBs in P and a Datalog query Q ,such that d
'
Pand t define
the same predicate.
Pro o f: Note that we dropped the domain independence requirement because i t appeared
only in connect ion wi th unbounded predicates . The proof follows immediately from the
construct ion of the rules comput ing the bounded base ordering predicates . I
19
E xam p le 1
We return to the example at the beginning of the introduct ion. Th e program P w as
cous i n (X , Y) par ent (X , Xp) par ent (Y , Yp) par ent (XP, Z ) par ent (YP, Z )
& XP # YP
cous i n (X , Y) par ent (X , Xp) par ent (Y ,Yp) cous i n (Xp , Yp)
We were interested in comput ing relat ion defined with the
temporal logi c formul a cous in(x , y) b efore cous in(a ,v). The following Datalog
“
progr am
computes closerCousins in a w ay out lined in th e proof of Theorem 12 .
cous i nCX ,Y) par ent (X , Xp) par ent (Y , Yp) par ent (XP, Z ) par ent (YP, Z )
s xp sé rp
cous i n (X ,Y) parent (X , Xp) parent (Y , Yp) cous i n (Xp , Yp)
cous i n (X , Y)
cous i n (U,V)
Another interes t ing exam ple const i tutes the program comput ing closerThen relat ion
ship based ou t ransi t ive closure relat ionshi p transClosure. closerThan(x , y, u ,v) i s defined
as transClosure(x , y) before As before,a Datalogn program can be
used to compu te closerThan,thus comput ing which pai rs of nodes in the graph are closer
to each other than other pai rs of nodes .
C o ro l lary 1 4 For any domain independent query (15 from TL3and a safe Datalog“
pro
gram P there is a predicate Q and a safe Datalog“
program P’such that d}; (x )
w here (as w as defined in S ection 2) KO
P’
is a temporal structure defined by
program P'taken at present time.
Pro o f: Follows from Theorem 12 and Proposi t ion 1 . I
This corollary proves the collapse of the domain independent ex i stent ial fragment of
temporal logi c but only for temporal structures determined by Datalog“
programs wi th
inflat ionary semant i cs . Specifically, any domain independent TLa formula can be reduced
to a simple formula <>Q (but for a different program).
20
T heorem 1 6 For any safe Datalog“
program P over some domain DOMp and any tem
poral logic query d in TLa, there exists a safe Datalog“
program P”over DOMp U {w }
w here to Q’DOMp and a Datalog query Q such that for any x over DOMp if and
only if
Note that the replacement of x by x[w ] cannot be done by safe Datalog“ ru les,and
,
therefore,we part ial ly “step out side of”our formal i sm .
5 C onc lus ions
In this paper , we proposed a query language for dynami c databases . Th e syntax of this
lan guage is based on the fu ture- related fragment of temporal logi c and i t s semant i cs on
Datalog program s or their extensions (e .g. Datalog“ with inflat ionary semant ics ). We
compared thi s language wi th ordinary Datalog queries in terms of expressive power . It
w as shown that temporal logic has st ri ct ly more expressive power than Datalog queries
for Datalog and Datalog“
programs wi th inflat ionary semant i cs . However,for Datalog
“
programs with inflat ionary semant i cs we have proven the surprising resu lt that the do
main independent ex i s tent ial fragment of temporal logi c has th e sam e expressive power as
Datalog This result implies the collapse of the domain independent exi stent ial
fragment of temporal logic for Datalog“
program s wi th inflat ionary semant i cs .
Temporal logi c as a query language has some other importan t advantages over Datalog
queries besides more expressive power . F irst,i t i s more general than Datalog queries
because temporal logic can be asked on program s that have no fixpoints , whereas Datalog
queries depend on the exis tence of a canoni cal fixpoint , and because i t can be used with
other formali sms besides Datalog. In other words,temporal logi c can be used in the
contex ts where Datalog queries have no meaning . Second,as w e have argued,
temporal
logic i s generally eas ier to use than Datalog queries in the cases when temporal logi c queries
can be expressed with Datalog queries : some temporal logic queries can be expressed only
in a very complicated w ay in Datal og.
There has been substant ial research conducted recent ly on integrat ing product ion
sys tem s with databases . Th e special issue of the S IGMOD Record[S IGMOD89] provides10Some of our resu l ts can be general ized to formu lae outs ide the existent ial fragment .
22
an overview of this work . Also,[AbVi89] st udied an ex tension of Datalog
“ with negat ions
al lowed both in the head and in the body of a rule . Clearly, these two formali sms are
i somorphic . In both of these approaches,Datalog queries do not make sense in general
because a fixpoint of a program may not ex i s t . However,temporal logic queries on these
extensions are well-defined,and,
w e believe,more natural . In[KeTu89] , we proposed
a general approach to defining dynami c databases based on Relat ional Di screte E vent
Sys tems and Models (RDE Ses and RDBMS ). All the formali sms ment ioned in this paper
const i tute exam ples of RDBMS . It turns out that any RDEM,and not only Datalog and
Datalog“
,can be used as a semant i c bas i s for temporal logi c queries . This makes temporal
logi c a powerful approach for defining queries on dynami c databases .
A cknow ledgem ents
We wou ld like to thank Haim G ai fman Yuri Gurevich,andMihalis Yannakakis for useful
di scuss ions . Corollary 14 w as suggested to us by Haim G aifman.
References
[S IGMOD89] S IGMOD Record. September 1989. Special i ssue on ru le management andprocessing in expert database systems .
[AbVi88] S . Abiteboul and V . Vianu . Procedural and declarat ive database updatelanguages . In Proceedings of PODS Sympos ium,
pages 240— 250, 1988.
[AbVi89] S . Abiteboul and V . Vianu . Fixpoint extensions of first-order logic and
Datalog- like languages . In IE EE'
Sympos ium on Logic in Computer S cience,1989.
[AbVi90] S . Abiteboul and V . Vianu . A transact ion-based approach to relat ionaldatabase specificat ion . to appear in JA CM. (Techni cal Report CS87-102 ,Universi ty of California at San Diego , August
[Ar86 ] G . Ariav. A temporal ly oriented data model . TODS , — 5 27 , 1986 .
[Ch Im88] J . Chom i cki and T . Imielinski . Temporal deduct ive databases and infini teobject s . In Proceedings of PODS Sympos ium, pages 6 1— 73, 1988.
[ClCr87] J . Clifford and A . Croker . The historical data model (HRDM) and algebrabased on lifespans . In Proceedings of the International Conference on Data
E ngineering, 1987 . IE EE Computer Society.
23
J . Clifford and D. S . Warren. Formal semant i cs for t ime in databases .TODS
,— 254, 1983.
S . K. Gadia. A homogeneous relat ional model and query languages fortemporal databases . TODS , —448, 1988.
D. Gal lin . Intensional and Higher Order Modal Logic. North Holland,Am
sterdam , 1975 .
S . Ginsburg and K. Tanaka. Computat ion- tuple sequences and object histories . A CM Transactions on Database Systems
,— 2 1 2
,1986 .
Y . Gurevich and S . Shelah . F ixed-point ex tensions of first-order logic . An
nals of Pure and Applied Logic, — 280,1986 .
Z . M. Kedem and A . S . Tuzh ilin. Relat ional databas e behavior : u t i lizingrelat ional discrete event systems and models . In Proceedings of the A CM
Sympos ium on Princ iples of Databas e Systems,1989.
P. G . Kolai t is and C . H . Papadimitriou . Why not negat ion by fixpoint? InProceedings of PODS Symposium,
pages 231 — 239, 1988.
F . Kroger . Temporal Logic of Programs . Springer-Verlag 1987 . EATCS
Monographs on Theoret ical Computer Science .
N . A . Lorentzos andR. G . Johnson. TRA : amodel for a temporal relat ionalal gebra. In C . Rolland, F . Bodart
,and M. Leonard
,edi tors
,Temporal A s
peets in Information Systems,pages 95— 108, North-Holland, 1988.
R. P. McArthur . Tense Logic. D. Reidel Publishing Company, 1976 .
S . B. Navath e and R. Ahmed. TSQL a language interface for historydatabases . In C . Rolland, F . Bodart
,and M. Leonard
,edi tors , Temporal
A spects in Information Systems, pages 109— 122 , North-Holland, 1988.
N . Rescher and A . Urquhart . Temporal Logic. Springer-Verlag , 1971 .
A . Segev and A . Shoshan i . Logical modeling of temporal data. In Proceedings of A CM SI GMOD Conference, pages 454—4 66 , 1987 .
R. Snodgrass . Th e temporal query language TQuel . TODS , — 298,1987 .
A . U . Tan sel . Adding time dimension to relat ional model and extendingrelat ional algebra. Information Systems
,—355 , 1986 .
J . Ullman . Principles of Database and Know ledge-Bas e Systems . Volume 1,
Computer Science Press , 1988.
24
[vBen83] J . van Benthem . The Logic of Time. D. Reidel Publishing Company, 1983.[Via87] V . Vianu . Dynam i c funct ional dependencies and database aging .
— 59, 1987 .
25
NYU COMPS C I TR— 484
Tuz h i l i n , A l exande r
Us i ng t empo r a l l og i c and
Dat al og t o que ry dat abas e s
evo l v i ng i n t i me . c . 2
NYU COMPS C I TR— 484
Tuz h i l i n , A l exande rUs i ng t empo r al l og i c and
Dat a l og t o q ue ry dat abas e s
e vo l v i ng i n t i me .
BO RROW ER'
S NA ME
Th is book may be kept
FOURTE EN DA YSA fine w ill be charged for each day th e book is kept overtime.
top related