a knowledge based interface for distributed biological ... · the context of the access to...
TRANSCRIPT
A knowledge based interface fordistributed biological databases
Paolo Bresciani
�
and Paolo Fontana�
and Paolo Busetta
�
�
brescian,pfontana,busetta
�@itc.it.
(
�
)ITC-irst (TRENTO) and (�
)IASMAA (San Michele a.A.)
with the collaboration of Giorgio Valle and Stefano Toppo
CRIBI - University of PaduaNETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.1
Outline of the Talk
� Motivation for new approaches in biological DBaccess
� The current state of the art (2 examples)
� Our Knowledge Based approach:
� an example of interaction
� some technical details
� Extending to multiple DBs
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.2
Biological Database access
The formulation of the intended query for retrieving thedesired data is a problem for every database user.As a simple example in Biology, consider the task ofsearching for KDEL receptor :
� what does the user exactly mean with KDELreceptor? Is she looking for the description of that
functionality; or for any protein with that functionality; or for
any genomic sequence that is expressed in such a protein?
� moreover, does the user really know all theconsequences of looking for all (let’s say) theprotein having KDEL receptor functionality?
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.3
Biological Database access cont’
It may be very useful to know some relevant limitationson the form of the query, when already someconstraints are imposed:E.g., KDEL receptor function can NOT be exhibited byany protein in the cell nucleus.
“protein located in the nucleus and with KDELreceptor function”is inconsistent: submitting it to any biological DBresults in a useless interaction (loss of time andmoney).
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.4
Biological Database access cont’
It may be very useful to know some relevant limitationson the form of the query, when already someconstraints are imposed:E.g., KDEL receptor function can NOT be exhibited byany protein in the cell nucleus.
� “protein located in the nucleus and with KDELreceptor function”is inconsistent: submitting it to any biological DBresults in a useless interaction (loss of time andmoney).
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.4
Main sources of errors in queries
Current query systems do not provide any support toavoid (or limit) the source of conceptual errors inqueries.
Many sources of errors:
� Lack of knowledge on the domain
Limited knowledge on some parts of the domain
Terminology disagreement
Little understanding of the domain representationinside the database: terminology, taxonomy,relationships, constraints
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.5
Main sources of errors in queries
Current query systems do not provide any support toavoid (or limit) the source of conceptual errors inqueries.
Many sources of errors:
� Lack of knowledge on the domain
� Limited knowledge on some parts of the domain
Terminology disagreement
Little understanding of the domain representationinside the database: terminology, taxonomy,relationships, constraints
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.5
Main sources of errors in queries
Current query systems do not provide any support toavoid (or limit) the source of conceptual errors inqueries.
Many sources of errors:
� Lack of knowledge on the domain
� Limited knowledge on some parts of the domain
� Terminology disagreement
Little understanding of the domain representationinside the database: terminology, taxonomy,relationships, constraints
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.5
Main sources of errors in queries
Current query systems do not provide any support toavoid (or limit) the source of conceptual errors inqueries.
Many sources of errors:
� Lack of knowledge on the domain
� Limited knowledge on some parts of the domain
� Terminology disagreement
� Little understanding of the domain representationinside the database: terminology, taxonomy,relationships, constraints
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.5
The current solutions
The common way to deal with the problem is by
� being as much expert as possible in the domain
� being as much aware as possible of the designand implementation details of the DB.
This may be sometimes interesting (domain knowl-
edge), even if difficult, but also tedious (DB design and
implementation details) specially when using several
and changing DBs.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.6
Our solution
We introduce a concept-demonstrator of a knowledgebased Visual Query System. It has been applied inthe context of the access to biological databases, withthe following advantages for the user:
� allows to interactively and iteratively buildconsistent queries only;� allows to interactively explore the databasesemantics by gradually browsing only theinteresting parts of the conceptual model;� uses simple, but effective, features for queryrefinement and generalization.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.7
A QBE [Zloof] interface example(The SRS: Sequence Retrieval System)
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.8
A slightly better example(The muscle-trait DB — CRIBI-UniPD)
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.9
Problems and difficulties
� In the first case, matching strings must beprovided
In the second case, a more “guided” interface isavailable, but the selection still is among long listsof terms
In any case no semantic support is provided.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.10
Problems and difficulties
� In the first case, matching strings must beprovided
� In the second case, a more “guided” interface isavailable, but the selection still is among long listsof terms
In any case no semantic support is provided.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.10
Problems and difficulties
� In the first case, matching strings must beprovided
� In the second case, a more “guided” interface isavailable, but the selection still is among long listsof terms
In any case no semantic support is provided.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.10
The “flat files” legacy problemID HSA010063 standard; DNA; HUM; 1730 BP.
AC AJ010063;
SV AJ010063.1
DT 01-OCT-1998 (Rel. 57, Created)
DT 07-JAN-2000 (Rel. 62, Last updated, Version 2)
DE Homo sapiens telethonin gene
KW telethonin gene.
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP 1-1730
RA Pallavicini A.L.;
RL Submitted (06-AUG-1998) to the EMBL/GenBank/DDBJ databases.
RL Pallavicini A.L., Complesso interdipartim. Vallisneri Dipartimento di
RL Biologia, Universita di Padova, via G.Colombo 3, 35121, ITALY.
DR SWISS-PROT; O15273; TELT_HUMAN.
FH Key Location/Qualifiers
FT source 1..1730
FT /chromosome="17"
FT /db_xref="taxon:9606"
FT /organism="Homo sapiens"
FT /map="q12"
FT promoter 504..529
FT /gene="telethonin"
FT 5’UTR 520..529
FT /evidence=EXPERIMENTAL
FT /gene="telethonin"
FT mRNA join(520..639,887..1730)
FT /gene="telethonin"
FT CDS join(530..639,887..1280)
FT /db_xref="SWISS-PROT:O15273"
FT /gene="telethonin"
FT /product="telethonin"
FT /protein_id="CAA08987.1"
FT /translation="MATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLHEEDT
FT QRHETYHQQGQCQVLVQRSPWLMMRMGILGRGLQEYQLPYQRVLPLPIFTPAKMGATKE
FT EREDTPIQLQELLALETALGGQCVDRQEVAEITKQLPPVVPVSKPGALRRSLSRSMSQE
FT AQRG"
FT exon 520..639
FT /number=1
FT /gene="telethonin"
FT intron 640..886
FT /number=1
FT /gene="telethonin"
FT exon 887..1280
FT /number=2
FT /gene="telethonin"
FT TATA_signal 488..495
FT /gene="telethonin"
FT 3’UTR 1281..1730
FT /evidence=EXPERIMENTAL
FT /gene="telethonin"
FT polyA_signal 1707..1712
FT /evidence=EXPERIMENTAL
FT /gene="telethonin"
FT polyA_site 1730
FT /evidence=EXPERIMENTAL
FT /gene="telethonin"
FT STS 1451..1630
FT /note="mapped with RH GB4 on chromosome 17q12"
FT /gene="telethonin"
XX
SQ Sequence 1730 BP; 348 A; 486 C; 612 G; 280 T; 4 other;
gtggtgaggg tgactgggga ctaggcacta ggcctttggt gcaggcgcct gaggacktgg 60
ttgcactctc ccttctgggg atatgccctt gagcccaggc agaggagagc acagcccagg 120
gcaggacctg gcagccctgg tacagagccc agagggggca tcagttcctg ctggtcctgc 180
tctgtttaca gacaasctgc tgtcctccct gcaaagggga gtgggtgggg cagagggcaa 240
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.11
Terminology standardization
Fortunately some relevant steps ahead have beendone in the last few years.
In particular GeneOntology is one of the mostimportant efforts toward terminology standardization.It aims at providing a support for data-integration and
inter-operability among sequence data and data from functional
analyses.
This is crucial for the discovery of the functions of new sequences
by comparison with already studied and annotated sequences.
Molecular Function, Biological Process, and Cellular Component
are classified in three hierarchies.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.12
Terminology standardization
Fortunately some relevant steps ahead have beendone in the last few years.In particular GeneOntology is one of the mostimportant efforts toward terminology standardization.
It aims at providing a support for data-integration and
inter-operability among sequence data and data from functional
analyses.
This is crucial for the discovery of the functions of new sequences
by comparison with already studied and annotated sequences.
Molecular Function, Biological Process, and Cellular Component
are classified in three hierarchies.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.12
Terminology standardization
Fortunately some relevant steps ahead have beendone in the last few years.In particular GeneOntology is one of the mostimportant efforts toward terminology standardization.It aims at providing a support for data-integration and
inter-operability among sequence data and data from functional
analyses.
This is crucial for the discovery of the functions of new sequences
by comparison with already studied and annotated sequences.
Molecular Function, Biological Process, and Cellular Component
are classified in three hierarchies.NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.12
The Knowledge Based Approach
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.13
Intelligent Interface
Need of Intelligent Interfaces that must:
� be easy and intuitive to be used� be “natural” to be used and understood� require no knowledge of DB technologies or datarepresentation formalisms� possibly require only little or partial knowledge ofthe application domain� be capable to give some semantic advice to theusers
� reduce user’s cognitive effort.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.14
Semantics based approachfor query formulation support
An approach is needed that:
� is semantically well founded
� is based on a representation of the model/schemaof the DB and of the domain (biology)
� support specific kinds of reasoning (terminologicalreasoning)
a Knowledge Based approach
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.15
Semantics based approachfor query formulation support
An approach is needed that:
� is semantically well founded
� is based on a representation of the model/schemaof the DB and of the domain (biology)
� support specific kinds of reasoning (terminologicalreasoning)
� a Knowledge Based approach
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.15
The ingredients of our system
� Visual Interface for:
� listing domain concepts� representing queries� transforming queries
� A Conceptual Model representation:an Ontology (better, a Knowledge Base)(Tambis + part of GeneOntology)� A reasoner (the Description Logics (DL) ReasoneriFaCt [Horrocks – U.Manchester] + specific code).
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.16
Semantic checks
Some important features of the interface:
� Only consistent actions are allowed(actions that lead to consistent queries).
� Only relevant modifications are proposed(transformations that produce queriessemantically not equivalent to the original).
� Only close modifications are proposed(not all the consistent and relevant modificationsare proposed, but only those that lead to querieswith semantics close to the original one).
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.17
An example of interaction
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.18
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.19
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.20
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.21
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.22
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.23
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.24
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.25
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.26
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.27
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.28
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.29
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.30
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.31
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.31
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.31
KRR services
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.32
Knowledge Based Approach means . . .
Description Logics
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.33
Knowledge Based Approach means . . .Description Logics
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.33
DL are . . .
L � �
A � C
D � C � R � �
R �C � � � � � C
D � �
R �C � R �1 � R
�
n �C � R
�
n �C � � � � �
I � �
∆I � � I �
semantically sound (and complete) automaticreasoning services as:
� consistency-checking� subsumption� classification
KB = {C D . . . }NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.34
DL: FOL based semantics
��� � FA
�
γ
��� A
�
γ
�
� � � FP
�
α � β � � P
�
α � β �Infix Prefix Semantics
C �"! #$
C
� FC
�γ
�
C
%
D
� �!&
C D
�
FC
�γ
�('FD
�
γ
�
C
)
D
� #*
C D
�
FC
�
γ
�(+
FD
�
γ
�
,
R -C � �. .
R C� ,
x -FR
�
γ � x � / FC
�
x
�
0
R -C �21 #34R C
� 0
x -FR
�
γ � x � '
FC
�
x
�
......
...
R
51 �"6 !7 4 * 1 4
R
�
FR
�
β � α �
R 8 Q �29 #3 � #1 4
R Q
� 0
x FR
�
α x
� '
FQ
�
x β
�
......
...
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.35
The KB;;;;;;;;;;;;;; DISJOINT COVERING ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(FACT::disjoint NUCLEUS CELL ORGANELLE CYTOSOL RIBOSOME SPLICEOSOME
MEMBRANE ENDOPLASMIC-RETICULUM GOLGI-COMPLEX)
(FACT::implies CELLULAR-PART (:OR NUCLEUS ORGANELLE CYTOSOL RIBOSOME
SPLICEOSOME MEMBRANE ENDOPLASMIC-RETICULUM GOLGI-COMPLEX))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;; RELATIONSHIPS RESTRICTIONS ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(implies
(:AND protein (:SOME part-of nucleus))
(:ALL has-function (:OR nucleic-acid-binding transcription-factor-binding)))
(implies
(:AND protein (:SOME has-function (:OR nucleic-acid-binding transcription-factor-binding)))
(:ALL part-of nucleus))
(implies (:AND protein (:SOME has-function (:OR cell-adhesion signal-transduction))) (:ALL part-of cell-membrane))
(implies (:AND protein (:SOME part-of cell-membrane)) (:ALL has-function (:OR cell-adhesion signal-transduction)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;; RELATION DOMAIN AND RANGE ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(implies enzyme (:SOME has-function enzyme-function))
(implies enzyme (:ALL has-function enzyme-function))
(implies holoenzyme (:SOME has-function enzyme-function))
(implies holoenzyme (:ALL has-function enzyme-function))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;; CONCEPT DEFINITIONS ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(FACT::defprimconcept PROTEIN (:AND (:AT-LEAST 1 HAS-FUNCTION :TOP)
(:SOME HAS-FUNCTION BIOLOGICAL-FUNCTION)
(:ALL HAS-FUNCTION BIOLOGICAL-FUNCTION)
(:AT-LEAST 1 HAS-SEQUENCE :TOP)
(:ALL HAS-SEQUENCE PROTEIN-SEQUENCE)
; (:SOME TRANSLATED-FROM (:OR DNA MESSENGER-RNA))
; (:ALL TRANSLATED-FROM (:OR DNA MESSENGER-RNA))
(:SOME TRANSLATED-FROM MESSENGER-RNA)
(:ALL TRANSLATED-FROM MESSENGER-RNA)
;modificato (:OR DNA MESSENGER-RNA) da PF 08/03/2002
(:SOME HAS-SECONDARY-STRUCTURE PROTEIN-SECONDARY-STRUCTURE)
(:ALL HAS-SECONDARY-STRUCTURE PROTEIN-SECONDARY-STRUCTURE)
(:SOME HAS-TERTIARY-STRUCTURE TERTIARY-STRUCTURE)
(:ALL HAS-TERTIARY-STRUCTURE TERTIARY-STRUCTURE)
(:ALL HAS-QUARTERNARY-STRUCTURE QUARTERNARY-STRUCTURE)
(:SOME HOMOLOGOUS-TO (:OR PROTEIN NUCLEIC-ACID))
(:ALL HOMOLOGOUS-TO (:OR PROTEIN NUCLEIC-ACID))
(:SOME PART-OF (:OR EXTRA-CELLULAR CELLULAR-PART))
(:ALL PART-OF (:OR EXTRA-CELLULAR CELLULAR-PART))
MACROMOLECULAR-COMPOUND
(:SOME POLYMER-OF AMINO-ACID)
(:ALL POLYMER-OF AMINO-ACID)
(:ALL HAS-BOUND COFACTOR)))
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.36
Reasoning with DL: Refinement
Refine Biological-SpaceQ
:
x
;=< Protein
:
x
;?>A@ Enzyme
:
x
;>
has-function
:
x B w ;?>
Nucleic-Acid-Binding
:
w
;?>
is-in
:
x B y ;?>
Biological-Space
:
y
;?>
is-in
:
y B z ;>
Cell
:
z
;(AND Protein (NOT Ezime)
(SOME has-function Nucleic-Acid-Binding)
(SOME is-in (AND Biological-Space (SOME is-in Cell))
Ref=
MSS
�C
X
D
Q
E�
�
PROTEIN
% - - - % 0is-in - �
X
% 0
is-in -Cell
� � '
Q F Q E' G 0
Q
E E�
�
PROTEIN
% - - - % 0is-in - �
Y
% 0
is-in -Cell
� �
s -t - Y F X'
Q F
Q
E E F Q EH �
,
and:
MGS
Equivalences
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.37
Architecture
Socket Interface
KB Manager
iFact Reasoner
ODBC
PostreSQLUser
Interface
Configurationfiles
(eg. translationtables)
KB
MuscleTRAIT
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.38
Towards accessing more DBs
DB NDB 1
Wrapper NWrapper 1 ...
Queryplanner
Answerintegrator
Localschemataknowledge
Globalschema
integrator
Registerschema
Query
Mediator
Semanticinterface
KBQuery builder Reporter
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.39
Towards accessing more DBs
DB NDB 1
Wrapper NWrapper 1 ...
Queryplanner
Answerintegrator
Localschemataknowledge
Globalschema
integrator
Registerschema
Query
Mediator
Semanticinterface
KBQuery builder Reporter
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.39
Promising opportunities fromAgent Technologies
� accessing multiple DBs with one homogeneousinterface
� architectural flexibility (easily extensible)
� robustness (in case of redundancy)
� caching and notification (multiple users, repeated“similar” queries)
standard protocol for schemata interchange andfor communication (XML; DAML+OIL)
wrappers design and maintenance
query planning
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.40
Promising opportunities fromAgent Technologies
� accessing multiple DBs with one homogeneousinterface
� architectural flexibility (easily extensible)
� robustness (in case of redundancy)
� caching and notification (multiple users, repeated“similar” queries)
� standard protocol for schemata interchange andfor communication (XML; DAML+OIL)
� wrappers design and maintenance
� query planningNETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.40
Conclusions and FutureDevelopments
A VQS prototypical system, applied to the access tobiological databases, has been introduced.The use of KR reasoning tools adds interestingsemantic services, like:
� consistency checking of the queries;� intelligent incremental browsing and exploration ofthe database semantics;� effective features for relevant query refinementand generalization.
Some preliminary ideas on an agent basedarchitecture for accessing distributed DBs have beenpresented.
NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.41
Conclusions and FutureDevelopments
A VQS prototypical system, applied to the access tobiological databases, has been introduced.The use of KR reasoning tools adds interestingsemantic services, like:
� consistency checking of the queries;� intelligent incremental browsing and exploration ofthe database semantics;� effective features for relevant query refinementand generalization.
Some preliminary ideas on an agent basedarchitecture for accessing distributed DBs have beenpresented.NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.41
References
�
M. M. Zloof. Query-by-example: A database language. IBM System
Journal, 16(4):324-343, 1977�
T. Cartacci, S.K. Chang, M.F. Costabile, S. Levialdi and G. Santucci. A
graph-based framework for multiparadigmatic visual access to
database. IEEE Transactions on Knowledge and Data Engineering,
8(3):455-475, 1996.�
P. Bresciani, M. Nori and N. Pedot. A Knowledge Based Paradigm for
Querying Databases. Proc. DEXA 2000, LNCS #1873. Springer.�
P. Bresciani, M. Nori and N. Pedot. QueloDB: a Knowledge Based Visual
Query System. Proc. IC-AI 2000, vol. III, June 2000. CSREA Press.�
M. Ashburner at al. Creating the gene ontology resources: design and
implementation. Genome Research, 11(8):1425-1433, Aug. 2001.�
P. G. Baker, C. A. Goble, S. Bechhofer, N. W. Paton, R. Stevens and A.
Brass. An ontology for bioinformatics applications. Bionformatics,
15(6):510-520, 1999.NETTAB 2002. Bologna, July 13th, 2002. P. Bresciani: “A knowledge based interface for distributed biological databases” – p.42