deductive databases presentation

Post on 17-Jun-2015

Deductive databases

Maroun Baydoun Inf 1312

Making smart applications

I. Overview

II. Theoretical study

III. Functional and Technical context

IV. Conclusion

Overview

Data: The backbone of applications

Complexity

• Performance issues.
• Large storage space.
• Administrative nightmares.

Create applications that rely on smaller volumes of data?

Make applications more intelligent?

Do this using existing data storage solutions?

Deal with multimedia to take full advantage of it?

Apply all of the above in today’s applications?

Theoretical study

Databases: The prime medium for data storage

Data storage has evolved over the years:

• Decks of cards.
• Magnetic tapes.
• Hard drives.
• Databases.
• Relational databases.

SQL: Making dialogs with the database

• Developed in the 1970s by IBM.

• Used as the query interface of all major RDBMS such as Oracle, Microsoft SQL Server and MySQL.

• Most RDBMS offer their own set of constructs to extend the functionality of standard SQL.

• Used to describe and access data.

SQL: The things we cannot express

• SQL has made querying databases a simple task.

• However, some queries are difficult or impossible to express.

• This is especially true of recursive queries.

What are the alternatives?

Alternatives to standard SQL

• Recursive SQL offers a way to traverse tree-like structures.

• Pros:
1. Can get the job done.
2. Similar syntax to standard SQL.

• Cons:
1. Not a portable solution: recursive queries only entered the standard with SQL:1999, and support differs.
2. Syntax varies among RDBMS.
3. Only permits linear recursion.

Recursive SQL

with t1 (parent, child) as (
    select parent, child from table2
    where parent in (select parent from table1)
    union all
    select t2.parent, t2.child from table2 as t2, t1
    where t2.parent = t1.child
)
select * from t1
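The query above can be tried out with a small script. This is only a sketch: it uses SQLite (which requires the RECURSIVE keyword) in place of a full RDBMS, and the sample rows in EDGES as well as the helper name descendants are made up for illustration.

```python
import sqlite3


def descendants(edges, roots):
    """Return every (parent, child) row reachable from the given root parents."""
    conn = sqlite3.connect(":memory:")
    # table1 holds the starting parents, table2 the parent/child edges.
    conn.execute("CREATE TABLE table1 (parent TEXT)")
    conn.execute("CREATE TABLE table2 (parent TEXT, child TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(r,) for r in roots])
    conn.executemany("INSERT INTO table2 VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE t1 (parent, child) AS (
            SELECT parent, child FROM table2
            WHERE parent IN (SELECT parent FROM table1)
            UNION ALL
            SELECT t2.parent, t2.child
            FROM table2 AS t2, t1
            WHERE t2.parent = t1.child
        )
        SELECT parent, child FROM t1
        """
    ).fetchall()
    conn.close()
    return sorted(rows)


# Hypothetical sample data: a chain a -> b -> c -> d and an unrelated edge x -> y.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
RESULT = descendants(EDGES, ["a"])
print(RESULT)
```

Only the edges below "a" are returned; the unrelated edge is never reached by the recursion.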

Datalog: Asking questions and getting answers

• Datalog is a nonprocedural query and rule language based on Prolog.

• It deals with:
• Rules.
• Facts.
• Queries.

Datalog rules

• A rule determines the logic behind the data.
• Composed of two parts:
• The head: an atom.
• The body: one or more ANDed atoms (subgoals).

p :- q, not r.

• p is the head.
• :- reads ‘is true if’.
• q is the first atom/subgoal.
• The comma means AND.
• not is negation.
• r is the second atom/subgoal.

• For a given set of values, a rule is considered true if all its subgoals evaluate to true for those values.

• The preceding example can be expressed as :

‘p is true if q is true AND not r is true’

• Two special cases of rules are worth noting:
• Unit rules: rules composed of only a head part

p.

• Recursive rules: rules whose head appears as a subgoal in their own body

p :- r, p.

Recursive rules

• Constitute the most important aspect in Datalog.

• Without such rules, Datalog is only as expressive as Select-From-Where statements in SQL.

Recursion in Datalog = solution to SQL shortcomings
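To make the role of recursion concrete, here is a minimal sketch of how a recursive rule can be evaluated bottom-up until no new facts appear. The parent and ancestor predicates, the sample names and the function ancestors are illustrative assumptions, not part of the presentation, and real Datalog engines use more efficient strategies than this naive loop.

```python
def ancestors(parent_facts):
    """Naive bottom-up evaluation of two hypothetical rules:

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
    """
    ancestor = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        # Second rule: join parent(X, Z) with the ancestor(Z, Y) facts known so far.
        derived = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in ancestor
            if z == z2
        }
        new_facts = derived - ancestor
        if not new_facts:
            return ancestor  # fixpoint reached: nothing new can be deduced
        ancestor |= new_facts


# Hypothetical facts: ann -> bob -> cid -> dan.
PARENT = {("ann", "bob"), ("bob", "cid"), ("cid", "dan")}
ANCESTOR = ancestors(PARENT)
print(sorted(ANCESTOR))
```

Three stored facts yield six ancestor facts, including ("ann", "dan"), which is stored nowhere.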

Datalog facts

• A tuple inside a relation.
• An instance of a rule.
• Represents the dataset the program will operate on.

predicate name (list of constants).

• predicate name: the name of the rule/relation to which the fact belongs.
• list of constants: a fact can only operate on constants. No variables are allowed.

Example facts

• Considering the following rule:

Student(X, Y)

Some facts derived from this rule are:

• Student('Maroun', 'UPA').
• Student('Elie', 'USJ').
• Student('Tony', 'LAU').

Datalog queries

• A query is a question asked to a Datalog program.
• The program will issue zero or more answers to that question.

?- predicate name (list of constants/variables).

• predicate name: the name of the rule/relation to which the query is related.
• list of constants/variables: a query can have constants, variables, or both, depending on its type.

Example queries

• Considering the same rule introduced earlier:

Student(X, Y)

Some queries related to that rule are:

• ?- Student('Maroun', 'UPA'). yes
• ?- Student('Maroun', 'USJ'). no
• ?- Student('Maroun', Y). UPA
• ?- Student(X, 'USJ'). Elie
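The four example queries can be reproduced with a few lines of code. The sketch below is not how a Datalog engine is implemented; the Var class and the query function are hypothetical helpers that simply match a query pattern against the stored Student facts.

```python
# The Student facts from the example, stored as plain tuples.
FACTS = {("Maroun", "UPA"), ("Elie", "USJ"), ("Tony", "LAU")}


class Var:
    """Marks a query argument as a variable (hypothetical helper)."""

    def __init__(self, name):
        self.name = name


def query(facts, *args):
    """Answer ?- Student(args).

    Returns one binding dict per matching fact. An empty list means 'no';
    a list holding an empty dict means 'yes' for a query without variables.
    """
    answers = []
    for fact in sorted(facts):
        binding = {}
        for arg, value in zip(args, fact):
            if isinstance(arg, Var):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(arg.name, value) != value:
                    break
            elif arg != value:
                break  # a constant that does not match this fact
        else:
            answers.append(binding)
    return answers


print(query(FACTS, "Maroun", "UPA"))       # constants only: yes/no question
print(query(FACTS, "Maroun", Var("Y")))    # which university?
print(query(FACTS, Var("X"), "USJ"))       # which student?
```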

Deductive databases: When databases meet Datalog

• A database system that can make deductions.

• Comes up with facts not stored in the dataset.

• More powerful than relational databases.

• Deals with two forms of data:

• Data stored in relations.
• Data inferred at runtime.

Deductive databases usage

• Mainly used for research or academic purposes.

• Little adoption in real-life applications.

• Organizations prefer regular relational databases.

A better approach would be to rely on a regular relational database and inject a Datalog layer on top of it.

Functional and Technical context

Tools and technologies used

• Programming platform: Java EE 6 with Java Persistence API (JPA).

• Framework: JavaServer Faces with PrimeFaces components.

• Application server: GlassFish Application Server v3.

• IDE: NetBeans IDE 6.8.

• Database server: Oracle 10g + Oracle Multimedia.

• Datalog engine: IRIS reasoner.

• RDF engine: Sesame.

Oracle Multimedia

• Enables the database to store, manage and retrieve audio, video, images and other types of media.

• Provides the following objects to represent media content:

• ORDAudio
• ORDDoc
• ORDImage
• ORDImageSignature
• ORDVideo

• Offers two methods to manipulate these objects:
• Using PL/SQL stored procedures.
• Using the Java API.

• A notable feature is the ability to compare images stored in the database by color, texture, shape and location.

• This is achieved using the evaluateScore(sig1 IN ORDImageSignature, sig2 IN ORDImageSignature, weights IN VARCHAR2) method of the ORDImageSignature object.

• It evaluates the distance (the score) between the two given signatures according to the supplied weights. The bigger the distance, the less similar the images are.
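The idea of a weighted distance can be illustrated as follows. This is a conceptual sketch only and does not call or reproduce Oracle's implementation: the per-attribute distances, the weights and the function weighted_score are all assumptions chosen for illustration.

```python
def weighted_score(distances, weights):
    """Combine per-attribute distances into one score.

    Illustrative assumption: the score is the weighted average of the
    attribute distances, so a larger result means less similar images.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * distances[name] for name, w in weights.items()) / total


# Hypothetical numbers: color matters most, location is ignored.
WEIGHTS = {"color": 0.7, "texture": 0.2, "shape": 0.1, "location": 0.0}
DISTANCES = {"color": 10.0, "texture": 20.0, "shape": 30.0, "location": 40.0}
SCORE = weighted_score(DISTANCES, WEIGHTS)
print(round(SCORE, 2))
```

With these weights the large location distance does not influence the score at all, which is how the weights let a search focus on the attributes that matter.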

IRIS Reasoner

• An open-source Datalog reasoner.

• Parses entire Datalog programs written in human-readable formats.

• Evaluates queries over a knowledge base composed of facts and rules.

• Enables external data sources to be plugged in.

Sesame

• Java framework used to store and query RDF data.

• Hides all the complexities of RDF by providing an API that resembles the JDBC API.

• Uses repositories to store RDF schema and data.

• Repositories can be accessed remotely using HTTP.

The proposed solutions for the problems

• Create applications that rely on smaller volumes of data: Using Datalog and the IRIS reasoner to deduce knowledge.

• Create more intelligent applications: Using the IRIS reasoner, the application can derive knowledge that is not stored in the database.

• Using existing data sources: The application relies on a traditional relational database but extends it with logic programming and deduction features.

• Dealing with multimedia content: The application can dig into the content of an image for search and comparison purposes using Oracle Multimedia.

• Applying those capabilities in today’s applications: The solution applies those features within a very popular application type (social networks), showing that the work done can be practical and not just theoretical.

Overview of the application

• A prototype of a social network.

• Showcases the solutions mentioned above.

The idea behind the application

‘Knowing a person’s parents, all his family members can be deduced’

Knowing the mother and father, the application can eventually deduce the identity of:
• Brothers and sisters.
• Grandparents.
• Uncles and aunts.
• Cousins.
• And virtually any other family member.

The application’s brain

• The application relies on a limited set of Datalog rules to deduce many family members.

• Here are the rules:

male(?X).
female(?X).

father(?X,?Y).
mother(?X,?Y).

child(?X,?Y):-father(?Y,?X).
child(?X,?Y):-mother(?Y,?X).

brother(?X,?Y):- male(?X),father(?Z,?X),father(?Z,?Y),mother(?T,?X),mother(?T,?Y),not ?X=?Y.

sister(?X,?Y):- female(?X),father(?Z,?X),father(?Z,?Y),mother(?T,?X),mother(?T,?Y),not ?X=?Y.

paternalGrandFather(?X,?Y):- father(?X,?Z),father(?Z,?Y).
maternalGrandFather(?X,?Y):- father(?X,?Z),mother(?Z,?Y).

paternalGrandMother(?X,?Y):- mother(?X,?Z),father(?Z,?Y).
maternalGrandMother(?X,?Y):- mother(?X,?Z),mother(?Z,?Y).

uncle(?X,?Y):-male(?X),father(?Z,?Y),brother(?X,?Z).
uncle(?X,?Y):-male(?X),mother(?Z,?Y),brother(?X,?Z).

aunt(?X,?Y):-female(?X),father(?Z,?Y),sister(?X,?Z).
aunt(?X,?Y):-female(?X),mother(?Z,?Y),sister(?X,?Z).
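To see what these rules deduce, the sketch below evaluates a subset of them (brother, sister, the paternal grandparents, uncle and aunt) over a small made-up family. It is plain set manipulation, not the IRIS reasoner; the function derive_family and all sample names are hypothetical, and it assumes each person has at most one recorded father and mother.

```python
def derive_family(male, female, father, mother):
    """father and mother are sets of (parent, child) pairs, as in father(?X,?Y)."""
    father_of = {child: f for (f, child) in father}
    mother_of = {child: m for (m, child) in mother}

    def siblings(gender):
        # brother/sister rule: same father, same mother, and not ?X=?Y.
        return {
            (x, y)
            for x in gender
            for y in father_of
            if x != y
            and x in father_of and x in mother_of and y in mother_of
            and father_of[x] == father_of[y]
            and mother_of[x] == mother_of[y]
        }

    brother = siblings(male)
    sister = siblings(female)
    parent = father | mother  # covers both the father and the mother variant
    return {
        "brother": brother,
        "sister": sister,
        "paternalGrandFather": {(x, y) for (x, z) in father for (z2, y) in father if z == z2},
        "paternalGrandMother": {(x, y) for (x, z) in mother for (z2, y) in father if z == z2},
        "uncle": {(x, y) for (x, z) in brother for (p, y) in parent if p == z},
        "aunt": {(x, y) for (x, z) in sister for (p, y) in parent if p == z},
    }


# Hypothetical family: sam and ann have tom, bob and liz; tom and eve have leo.
MALE = {"sam", "tom", "bob", "leo"}
FEMALE = {"ann", "eve", "liz"}
FATHER = {("sam", "tom"), ("sam", "bob"), ("sam", "liz"), ("tom", "leo")}
MOTHER = {("ann", "tom"), ("ann", "bob"), ("ann", "liz"), ("eve", "leo")}
FAMILY = derive_family(MALE, FEMALE, FATHER, MOTHER)
print(sorted(FAMILY["uncle"]), sorted(FAMILY["aunt"]))
```

Only father and mother facts are stored, yet leo's uncle, aunt and paternal grandparents all come out of the rules.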

• These rules are all that is needed to deduce family members.

• IRIS will:

• Parse the Datalog program.
• Extract rules.
• Receive Datalog queries.
• Fetch tuples from Oracle database.
• Return answers to the application.
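The flow listed above (fetch tuples from the relational database, apply rules, return answers) can be mimicked in miniature. In this sketch SQLite stands in for the Oracle database and a plain Python function stands in for IRIS; the person table, its columns and the function names are assumptions, since the transcript does not show the actual database design.

```python
import sqlite3


def load_facts(conn):
    """Turn rows of the hypothetical person table into father/mother facts."""
    father, mother = set(), set()
    for name, f, m in conn.execute("SELECT name, father, mother FROM person"):
        if f is not None:
            father.add((f, name))  # father(f, name)
        if m is not None:
            mother.add((m, name))  # mother(m, name)
    return father, mother


def children_of(conn, parent):
    """Answer ?- child(?X, parent) using the two child rules of the program."""
    father, mother = load_facts(conn)
    # child(?X,?Y) :- father(?Y,?X).   child(?X,?Y) :- mother(?Y,?X).
    child = {(c, p) for (p, c) in father | mother}
    return sorted(x for (x, y) in child if y == parent)


CONN = sqlite3.connect(":memory:")
CONN.execute("CREATE TABLE person (name TEXT PRIMARY KEY, father TEXT, mother TEXT)")
CONN.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("sam", None, None),
        ("ann", None, None),
        ("tom", "sam", "ann"),
        ("bob", "sam", "ann"),
    ],
)
ANSWERS = children_of(CONN, "sam")
print(ANSWERS)
```

The relational table stores only who each person's parents are; the child relation exists only as the result of applying the rules at query time.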

The application’s architecture

The database design

Conclusion

• The proposed solution uses traditional relational databases and integrates the powerful Datalog language on top of them.

• It gave us the power of Datalog combined with the flexibility of relational databases.

• Working on this project has shown that deductive database concepts can be applied to real-life applications.

• In the future we should see more examples of Datalog being integrated into applications.

Thank you for your attention
