![Page 1: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/1.jpg)
From Enterprise Information Integration to Community-Based Mediation
Alin Deutsch,
Yannis Katsis,
Michalis Petropoulos
Yannis Papakonstantinou
A presentation by
on joint works with
CSE DepartmentCSE Department
![Page 2: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/2.jpg)
Data Integration Requirements & Desiderata (high level)
• Provide application with integrated database– single point of (query/update) access to the
data• Provide distribution and heterogeneity
transparency– heterogenous formats, heterogenous
interfaces, different rates of change (static versus dynamic), autonomous sources
• Decouple application logic from integration• Easily add/change sources• Customize the delivery of content
![Page 3: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/3.jpg)
Integration Software
Most-Generic Integration System Architecture
. . .Information
SourceInformation
SourceInformation
Source
ClientApplication
ClientApplication
ClientApplication
![Page 4: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/4.jpg)
Information+ Service
Source
SIGMOD Community’s Architecture forUnified Access to Data & Services
Local CommonLocal CommonModel (XML) View Model (XML) View
+ Services+ Services
Mediator
Integrated (XML) Global View / Ontology + ServicesIntegrated (XML) Global View / Ontology + Services
Wrapper
Cache &Replication
(Web) ClientApplication
Information+ Service
Source
Wrapper
(Web) ClientApplication
(Web) ClientApplication
Local CommonLocal CommonModel (XML) View Model (XML) View
+ Services+ Services
![Page 5: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/5.jpg)
Approaches towards View-BasedData Integration
Local As View(LAV)
Global As View(GAV)
GLAV=GAV+LAV
IntegrationSpecification
Method
Info Model &Query Language
Relational (SQL)
XML (XQuery)
Object-Oriented
Warehousing(materialized
views)
On-Demand(virtual views)
Storage Method
![Page 6: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/6.jpg)
Enterprise Information IntegrationReaches Maturity
• Materialized View (Warehousing) approach well-adopted since mid/late 90s– GAV function role played by Extract-Transform-Load tools– Human Intervention Occasionally Needed in Cleaning Up
• Concordance tables for Object Identification
• Virtual View (Mediation) approach at early adoption – many years of research
• Distributed db’s, federated db’s, mediators– moving well into mainstream
• BEA AquaLogic (XML, Virtual, GAV view)• IBM DB2
Enterprise
![Page 7: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/7.jpg)
Current Enterprise InformationIntegration Deployments
IntegrationIntegrationAdminAdmin
Marketing
Local View MLocal View M
Integrated Global View V(M, S, E)Integrated Global View V(M, S, E)
Sales
Local View SLocal View S
Service
Local View ELocal View E
View Builder(design time)
Mediator QueryProcessor (run time)
GAV View V
Schemas Data
• Small Domain• Mostly Vertical Partition of Sources • Primarily Application-Driven View or Identity View• Integration Administrator/Developer in charge
Enterprise
![Page 8: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/8.jpg)
Opportunities and Needs Presented by “Motivated” Communities
• Emerging Myriads of Internet Communities of– Myriads of sources and clients– Source owners motivated to participate
• EII does not address needs– Expensive– Bottleneck of Single Integration Admin
• Make building corresponding portals similar to starting and participating in newsgroups
• Appropriate tools needed to enable source owner and client participation
Communities
![Page 9: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/9.jpg)
InformationSource 1
Local XML View S1
Client Application 1
Integrated XML View G
InformationSource n
Local XML View Sn
GLAV: V1 GLAV: Vn
GAV: V1a GAV: Vm
a
Application View V1a (G) Application View Vm
a (G)
Client Application m
Integrated View Owner’s Domain
Source Owner’sDomain
SourceOwner’sDomain
DataServices 1
DataServices n
A Community-Based Information Modeling Architecture
Data Services
![Page 10: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/10.jpg)
Visual Tools Matter! (example from the Enosys Query Builder)
C:\Enosys\projects\allPONS.qpr* - Enosys Query Builder
OPEN & VIEW OPEN & VIEW SOURCE SCHEMAS IN SOURCE SCHEMAS IN
XMLXML DRAG & DROP TO CREATE TARGET XML VIEW
TARGET SCHEMA
(XML VIEW)
AUTOMATICALLY GENERATED
MAPS
1
2
![Page 11: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/11.jpg)
C:\Enosys\projects\allPONS.qpr* - Enosys Query Builder
XML RESULTXQUERY BASED
ON DESIGN SPECS
RUN & TEST XQUERYRUN & TEST XQUERY3
![Page 12: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/12.jpg)
Architecture for Large-Scale Data Integration System and Design Tools
How can the user query and Browse the integrated data?
QURSED
What queries can my app issue?What integrated view services can I build?
CLIDE
How do I export my databaseservices functionality?
RIDE-Services
SourceDomain
WebDomain
ApplicationDomain
IntegrationDomain
Application
DataSource
DataSource
MediatorGlobal View
Schema
Developer
IntegrationEngineer
SourceOwner
Application
Web Forms& Reports
SourceSchema
…
WebService
WebService
WebService
SourceSchema …
How do I export my data?
RIDE
WebServices
Cache(Metadata)
![Page 13: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/13.jpg)
Dual Interactive Registration Problems
New App
?New Query
?
Source Services
Register Source Given GlobalRegister Source Given Global
Schema, Constraints &QueriesSchema, Constraints &QueriesGuide the client in
query/form writing
Apps
?Queries
?
Register Client Given SourcesRegister Client Given Sources
Guide the source owner in registering a new source and services
New Source and Services
Global ViewGlobal View
![Page 14: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/14.jpg)
Source Data Registration
• How do my source attributes map to global attributes– mappers & automatic
matchers• How do my data relate to
queries & other sources– Inconsistencies?– What takes to
contribute to queries?– How much should I
clean up?• Multiple ways of dealing
with redundancy
Apps
?Queries
?
Server SideServer Side
New Source
Global View
![Page 15: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/15.jpg)
How to achieve this Goal
Apps
?Queries
?
BeforeBefore
New Source
Apps
?Queries
?
NowNow
New Source
Look at all sources & queries
Decide how to register
your source
Follow the suggestions
of the interface
Global View Global View
Source Registration Tool
![Page 16: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/16.jpg)
Our Goal in Source Registration
Guide the source owner visually through the registration of the source
so as to avoid/warn about (potential) inconsistencies and contribute information to the answer of the queries
while exposing the minimum information possible and/or minimizing effort
![Page 17: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/17.jpg)
The Problem
?
Client Queries
? ?
Mediator(Global DB)
Sources (Actual Local DBs)
17
![Page 18: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/18.jpg)
The Contribution Problem
?
Client Queries
? ?
Sources (Actual Local DBs)
• What is the contribution of source S to the result of the query Q?
S
Q
18
Mediator(Global DB)
![Page 19: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/19.jpg)
The Problem
Client Queries
?
Sources (Actual Local DBs)
• What is the contribution of source S to the result of the query Q?
S
Q
19
Mediator(Global DB)
Q:Q: carscars
carsreviews
Q: cars JOIN reviewsQ: cars JOIN reviews
S is Self Sufficient w.r.t. Q
S is Now Complementary w.r.t. Q
![Page 20: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/20.jpg)
Relational Schemas: Local and Global
attributes
relations
?
• Relational Schemas
• Visual Representation
makeS1
CarmakeOriginSales
autoS2
IdModel
detailIdEngineBaseprice
Source 1Business Magazine
Source 2Car Magazine
GlobalCar Portal
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
Carmake
20
![Page 21: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/21.jpg)
Source Registration using GLAV Mappings
• Source Registration:Source Registration: Correspondence between a source schema and the global schema = Set of Mapping Constraints of the form
(U V)
• Open WorldOpen World
• Global and Local As View (GLAV)Global and Local As View (GLAV)
?
CQ=
over source schema
CQ= over global schema
21
![Page 22: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/22.jpg)
Target Constraints
• Constraints on the global schema = Set of Constraints of the form
(U V)
• Also Expresses Dependencies (PKs, Ref Integrity, …)Also Expresses Dependencies (PKs, Ref Integrity, …)
?
CQ=
over global schema
CQ= over global schema
22
![Page 23: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/23.jpg)
Visual Representation of Mappings (1)
?
• Visual Representation (IBM Clio)
brandCO
Business Magazine: Provides Carmake and Origin
U1(C, O) :- make(C, O, S)V1(C, O) :- brand(C, O)
(U1 V1)
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
makeS1
CarmakeOriginSales
OC
makeCOSOC
23
![Page 24: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/24.jpg)
Visual Representation of Mappings (2)
?
• Visual Representation (IBM Clio)
Car Magazine: Provides Model, Carmake and Baseprice
autoIMC
detailIEB
carMC?B
autoS2
IdModel
detailIdEngineBaseprice
Carmake
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
24
![Page 25: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/25.jpg)
Query Semantics
• Queries in UCQ=
• Set of Possible Global InstancesSet of global instances that satisfy all constraints
• Query Answers = Set of Certain AnswersThe tuples appearing in the answer to Q for any possible global instance
Possible globalinstances
Answer to Qfor any of the
possible globalinstances
CertainCertainAnswersAnswers
to Qto Q
Answer to Qfor any of the
possible globalinstances
CertainCertainAnswersAnswers
to Qto Q
Possible globalinstances
?
26
![Page 26: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/26.jpg)
Source Instance’s Contribution
Answer to Q - Answer to
Q
• For given instances of the sources
Contribution to Q of Source Instance
=
The tuples in answer of Q not provided by the other sources
27
![Page 27: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/27.jpg)
Source Registration’s Contribution
• Source Registration: Source Mappings
• Degrees of Source Registration’s Contribution
Self Sufficient
Now Complementary
Later Complementary
Unusable
More contribution
Less contribution
28
![Page 28: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/28.jpg)
Self Sufficient Registration: Example
?Baseprices of ModelsBaseprices of Models
carM3
BMW?
45K
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
Green Registration isSelf Sufficient
carModel
CarmakeDoors
Baseprice
Example
29
![Page 29: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/29.jpg)
Self Sufficient Registration: Definition
Answer to Q
Source instance
s.t.
The source has a non empty contribution in the absence
of the other sources
Answer to Q-
Self SufficientSelf Sufficient
X X X X
30
![Page 30: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/30.jpg)
Now Complementary Registration: Example
?
Baseprices of ModelsBaseprices of Modelsby German manufacturersby German manufacturers
carM3
BMW?
45K
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
brandBMW
Germany
Green Registration isNow Complementary
brandCarmakeOrigin =
‘Germany’
carModel
CarmakeDoors
Baseprice
Example
31
![Page 31: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/31.jpg)
Now Complementary Registration: Definition
Answer to Q
Not Self Sufficient
&
Source instances
s.t.
The source has a non empty contribution in combination
with the other existing sources
Answer to Q-
Now ComplementaryNow Complementary
32
![Page 32: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/32.jpg)
Later Complementary Registration: Example
?
Baseprices of ModelsBaseprices of Modelsby German manufacturersby German manufacturers
carM3
BMW?
45K
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
brandBMW
Germany
Green Registration isLater Complementary
brandCarmakeOrigin =
‘Germany’
carModel
CarmakeDoors
Baseprice
Example
33
![Page 33: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/33.jpg)
Later Complementary Registration: Definition
Answer to Q
Not Self Sufficient &Not Now Complementary
&
Potential future sources
& Source instances
s.t.
The source has a non empty contribution in combination
with the future sources
Answer to Q-
Later ComplementaryLater Complementary
34
![Page 34: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/34.jpg)
Unusable Registration: Example
?car
G
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
brandCarmake
OriginOrigin of CarmakesOrigin of Carmakes
Green Registration isUnusable
Example
35
![Page 35: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/35.jpg)
Unusable Registration: Definition
Answer to Q
Not Self Sufficient &Not Now Complementary &Not Later Complementary
The source has a empty contribution regardless of
what sources enter the system
Answer to Q- =
UnusableUnusable
36
![Page 36: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/36.jpg)
Subtleties for Unusable Registrations
?
Baseprices and DoorsBaseprices and Doorsof Modelsof Models
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
carM3
BMW?
45K
carM3
BMW2?
Green Registration isUnusable
carModel
CarmakeDoors
Baseprice
Example
37
![Page 37: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/37.jpg)
In presence of PK Unusable Example becomes Later Complementary
?
Baseprices and DoorsBaseprices and Doorsof Modelsof Models
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
Green Registration isLater Complementary
carM3
BMW?
45K
carM3
BMW2?
M3BMW
M3BMW
carM3
BMW2
45K
carModel
CarmakeDoors
Baseprice
Example
38
![Page 38: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/38.jpg)
Decidability Results
None Primary keys Primary keys + Referential Integrity Constraints
Self SufficientYes Yes No
Now complementary Yes Yes No
Later complementary Yes Yes ?
UnusableYes Yes ?
Target constraintsTarget constraints
DDeeggrreeeess
Overview: What is decidableOverview: What is decidable
39
![Page 39: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/39.jpg)
Issues
UniqueUnique client query client query MultipleMultiple client queries client queriesVsVs
Contribute to: - all queries? - one query? - specific queries? - some queries based on some ranking?
DataData independence independence DataData dependence dependenceVsVs
M1: cars, refPricesM2: reviewsQ: cars JOIN reviews JOIN refPrices
e.g.
DB1: cars, refPrices (Audis)DB2: reviews (Hondas)
(M2, Q) now-complementarybut Certain Answers for Instances DB1, DB2 =
![Page 40: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/40.jpg)
Putting it all together
ArchitectureArchitecture
?Query
Global Schema
LocalSchemas
Q
S1 Sn Sn+1
…M1 Mn Mn+1
S’
Guide the source owner visually through the registration of the source
so as to raise contribution to the answer of the queries
while exposing the minimum info possible and/or minimizing effort
4 categories:4 categories:
Self Sufficient / Now Complementary / Later Complementary / Unusable
Query Answering / Mappings / SchemasQuery Answering / Mappings / Schemas
Architecture
Goal
Registeredsources
Newsource
Mappings
…
Contribution
![Page 41: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/41.jpg)
Example 1
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema
carAppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
QueryQuery
Without primary keys in the target
Unusable BLUE: Map at least one of the groups
carAutoTrader
vin
cmodel
price
ad*
carId
id*
![Page 42: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/42.jpg)
Example 1
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*modelcondition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema QueryQuery
Unusable
= cmo
carAutoTrader
vin
cmodel
price
ad*
carId
id* car
AppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
model
Without primary keys in the target
![Page 43: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/43.jpg)
Example 1
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*modelcondition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema QueryQuery
= cmo
carAutoTrader
vin
cmodel
price
ad*
carId
id* car
AppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
model= price
LaterComplementary
Without primary keys in the target
![Page 44: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/44.jpg)
Example 2
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema
carAppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
QueryQuery
Unusable
carAutoTrader
vin
cmodel
price
ad*
carId
id*
With primary keys in the target
![Page 45: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/45.jpg)
Example 2
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema
carAppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
QueryQuery
= price
= vin
carAutoTrader
vin
cmodel
price
ad*
carId
id*
LaterComplementary
With primary keys in the target
![Page 46: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/46.jpg)
Lessons learned
To merge data with that of other sources (become complementary):
Pick a relation and provide…
…all its attributes asked by the query
…its primary key and one of its attributes asked by the query
In absence of primary keys
In presence of primary keys
The number of choices increases in presence of primary keys
Foreign keys on the target affect the suggestions
Target constraints make a differenceTarget constraints make a difference
![Page 47: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/47.jpg)
Large-Scale Data Integration Systems
How can the user query and Browse the integrated data?
QURSED
What queries can the mediator answer for me?
CLIDE
How do I export my databaseservices functionality?
RIDE-Services
SourceDomain
WebDomain
ApplicationDomain
IntegrationDomain
Application
DataSource
DataSource
MediatorGlobal View
Schema
Developer
IntegrationEngineer
SourceOwner
Application
Web Forms& Reports
SourceSchema
…
WebService
WebService
WebService
SourceSchema …
How do I export my data?
RIDE
![Page 48: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/48.jpg)
Running Example
SchemaComputers(cid, cpu, ram, price)NetCards(cid, rate, standard, interface)
Views
V1 ComByCpu(cpu) (Computer)*SELECT DISTINCT Com1.*FROM Computers Com1WHERE Com1.cpu=cpu
V2 ComNetByCpuRate(cpu, rate) (Computer,
NetCard)*SELECT DISTINCT Com1.*, Net1.*FROM Computers Com1, Network Net1WHERE Com1.cid=Net1.cidAND Com1.cpu=cpuAND Net1.rate=rate
Parameterized Views
DellDell CiscoCiscoSchema
Routers(rate, standard, price, type)
Views
V3 RouByTypeW() (Router)*
SELECT DISTINCT Rou1.*FROM Routers Rou1WHERE Rou1.type='Wired'
V4 RouByTypeWL() (Router)*
SELECT DISTINCT Rou1.*FROM Routers Rou1WHERE Rou1.type='Wireless'
Computersfor a given cpu
Computers & NetCardsfor a given cpu & rate
Wired Routers
Wireless Routers
![Page 49: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/49.jpg)
Running Example
• Global schema puts togetherthe Dell and Cisco schemas
• Resembles the schema of CNET.com portal
Column Associations• (Computers.cid, NetCards.cid)• (NetCards.rate, Routers.rate)• (NetCards.standard, Routers.standard)
Global Schema
V1
Application
V3V2
Dell Cisco
MediatorGlobal Schema
Developer
V4
![Page 50: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/50.jpg)
Sophisticated Mediators Make Feasibility Hard to Predict
Feasible Queries FQ• Equivalent CQ query rewritings using the views• Might involve more than one views• Order might matter
V4
Mediator
RouByTypeWL()
Routers.*
10.11b
50Wirele
ss
54.11g
120Wirele
ssA
B
V2
ComNetByCpuRate(‘P4’, ‘10’)
C
DComputers.* NetCards.*A123
P4512 400A123
10.11b
USB
B123
P41024
550B123
54.11g
USB
Feasible
ComNetByCpuRate(‘P4’, ‘54’)
Computers.* NetCards.* Routers.*A123
P4 512 400A123
10.11b
USB
10.11b
50Wireless
B123
P41024
550B123
54.11g
USB
54.11g
120Wireless
E
Query:Get all ‘P4’ Computers, together with their NetCards and their compatible ‘Wireless’ Routers
Query:Get all Computers
Infeasible
![Page 51: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/51.jpg)
Problem
1. Large number of sources2. Large number of views3. Mediator capabilities
Developer formulates an application query Is an application query feasible? If not, how do I know which ones are feasible?
Previous options:– The developer had to browse the view definitions
and somehow formulate a feasible query– Or formulate queries until a feasible one is found
(trial-and-error)
No system-provided guidance
![Page 52: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/52.jpg)
The CLIDE Solution
A query formulation interface, which interactively guides the user toward feasible queries by employing a coloring scheme
CLIDE
V1
Application
V3V2
Dell Cisco
MediatorGlobal Schema
Developer
V4
![Page 53: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/53.jpg)
QBE-Like Interfaces
Microsoft SQL-Server
![Page 54: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/54.jpg)
CLIDE Interface
• Table, selection, projection and join actions• Color-based suggestions• Feasibility Flag
Projection BoxesTable Boxes
Selection Boxes
Feasibility FlagTable Alias
![Page 55: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/55.jpg)
CLIDE Interface
Yellow required action– All feasible queries require this action
White optional action– Feasible queries can be formulated
w/ or w/o these actions
Snapshot 1
![Page 56: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/56.jpg)
CLIDE Interface
Snapshot 2
Blue required choice of action– At least one feasible (next) query cannot
be formulated unless this action is performed
V1
Mediator
ComByCpu(‘P4’)
cidcpu
rampric
eA123 P4 512 400B123 P4 1024 550
rampric
e512 400
1024 550A B
C
![Page 57: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/57.jpg)
CLIDE Interface
Join Lines:• Only yellow and blue are displayed• Must appear in Column Associations
Snapshot 3
![Page 58: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/58.jpg)
CLIDE Interface
Snapshot 4
![Page 59: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/59.jpg)
CLIDE Interface
Snapshot 5
• * any other constant• Red prohibited action
– Does not appear in any feasible query
– Lead to “Dead End” state
![Page 60: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/60.jpg)
CLIDE Interface
Snapshot 6
V4
Mediator
RouByTypeWL()Routers.*
10.11b
512Wirele
ss
54.11g
1024
Wireless
A
B V2
ComNetByCpuRate(‘P4’, rate)
D
E
Computers.* NetCards.*A123
P4512 400A123
10.11b
50
B123
P41024
550B123
54.11g
120
ramprice
rateinterface
price
512 400 10 USB 501024
550 54 USB 120
F
![Page 61: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/61.jpg)
CLIDE Facts
• Rapid Convergence
– At every step, yellow and blue actions lead to a feasible query in a minimum number of steps
• Completeness of Suggestions
– Every feasible query can be formulated by performing yellow and blue actions at every step
• Minimality of Suggestions
– At every step, only a minimal number of actions are suggested, i.e., the ones that are needed to preserve completeness
![Page 62: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/62.jpg)
Join ActionTableAction
SelectionAction
Interaction Graph
• Nodes are queries– One for each qCQ
• Edges are actions– Table, selection, projection and join actions
• Green nodes are feasible queries• Infinitely big structure
– All CQ queries– All possible combinations of actions formulating them
Com1.cid=Net1.cidCom1.cpu=‘P4’Com1 Com1.ram Rou1…… Com1.price ……
s
… … ………Net1 …
![Page 63: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/63.jpg)
Interaction Graph: Colorable Actions
• Colorable actions AC labeloutgoing edges of the current node
Net1
Com1.cpu=*
Com1.price=*
Rou1
Com1.ram=*
Com1.cid=*
Com2
Com1.cid
……
……
…
Com1.cpu
……
……
Current Node
![Page 64: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/64.jpg)
Interaction Graph: Colors
Com1.cpu=*
…
……
……
……
…
……
…
…
CurrentNode
Net1 Com1.cid=Net1.cid
Com2.cid=Net1.cid
Com2
Com2.cpu=‘P4’ Net1.rate=‘54Mbps’
Net1.rate=’54Mbps’…… … … …
… …
Com1.cpu=*Com1.cpu=* Rou1 Net1.rate=Rou1.rate ……… …
Net1.rate=’54Mbps’ …
Com1.cid=Net1.cid
Com1.cid=Net1.cid …Net1
Com1.cpu=*
Com1.price=*
Rou1
Com1.ram=*
Com1.cid=*
Com2
Com1.cid
Com1.cpu
• Yellow action – Every path from current node n to a
feasible node contains • Blue action
– At least one feasible query cannot be formulated unless this action is performed (minimality)
• Red action – No path to a feasible node contains
![Page 65: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/65.jpg)
Color Determined By a Finite Set of Feasible Queries
• Start by considering the closest feasible queries FQC
• FQC is sufficient to color actions in AC • Theorem: Set of Closest Feasible Queries is Finite
• How far can closest feasible queries FQC be?
Based on Maximally Contained Queries FQMC?
n
…
…
…
…
…
…
ClosestFeasibleQueries FQC
Challenge: Infinitely Many Feasible Queries
Radius
Infinitely manyfeasible queries
?…
…
![Page 66: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/66.jpg)
Color Algorithm
• Assuming fixed SELECT clause (projection list)• Covered extensively in literature
– MiniCon, Bucket, InverseRules
• FQMC is finite
Maximally Contained Query
Maximally Contained Queries FQMC
Query: Q1Get all Computers
Query: Q2Get all Computers with a given cpu
Query: Q3Get all Computerswith a given cpu & ram
Not Maximally ContainedMaximally Contained Query
Query: Q4Get all Computerswith a given ram
![Page 67: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/67.jpg)
Color Algorithm
• Compute maximally contained queries FQMC
• The radius pL is the longest path to a node n’ such that q(n’) in FQMC
All FQC queries are reachable via a path of length p pL
ClosestFeasibleQueries FQC
MaximallyContainedQueries FQMCn
…
…
…
…
…
…
Maximally Contained Queries FQMC
pL Radius
…
![Page 68: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/68.jpg)
Color Algorithm
• Theorem: All queries in FQMC are in FQC
• But not all queries in FQC are in FQMC
More on Closest Feasible Queries
ClosestFeasibleQueries FQC
MaximallyContainedFeasibleQueries FQMC
…
…
…
…
…
…
More feasible nodes
n
![Page 69: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/69.jpg)
Color Algorithm
• Naïve Approach
– Start from n and explore paths up to length pL
More on Closest Feasible Queries
ClosestFeasibleQueries FQC
MaximallyContainedFeasibleQueries FQMC
…
…
…
…
…
…
n
![Page 70: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/70.jpg)
Color Algorithm
• Collapse Aliases to compute FQC \ FQMC
• Check satisfiability
Collapse Aliases
ClosestFeasibleQueries FQC
MaximallyContainedFeasibleQueries FQMCn
…
…
…
…
…
…
![Page 71: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/71.jpg)
Color Algorithm
Coloring Non-Projection Actions• No interaction graph materialization• Use of containment mapping from current query to the closest feasible
ones• An action is colored
– Yellow, if is mapped into all queries in FQC
– Red, if is not mapped into any query in FQC
– Blue, if is mapped into at least one query qF in FQC, no other action in AP is mapped into qF, and is neither yellow nor red
Coloring Projection Actions• Never colored yellow• Can be colored blue only if
– the current query is feasible– it is not colored red
• Which ones are red?– Bring all projection atoms from views such that feasibility is preserved
– If action is not mapped into any query in FQC, then is red
![Page 72: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/72.jpg)
Other Back-End
Parameterized Views Back-End
CLIDE Implementation
ActionCurrentQuery
ClosestFeasibleQueries
SchemasViews
MiniConContainment
TestCollapseAliases
ColorActions
Front-End
Developer
MaximallyContainedQueries
OptimalMaximallyContainedQueries
ColoredActions
ColumnAssociations
MiniCon• Outputs redundant and non-minimal queries• Affects CLIDE’s rapid convergence and minimality propertiesContainment Test• Well-known NP-complete problem• Polynomial when query is acyclicCollapse Aliases / Color Actions• Reuse containment mappings created by MiniCon
![Page 73: From Enterprise Information Integration to Community-Based Mediation Alin Deutsch, Yannis Katsis, Michalis Petropoulos Yannis Papakonstantinou A presentation](https://reader035.vdocuments.us/reader035/viewer/2022062408/56649f345503460f94c51cac/html5/thumbnails/73.jpg)
CLIDE Performance
0
2
4
6
8
10
12
14
(# of Joins, # of Selections) in Current Query
Tim
e (
sec) 112 Views
168 Views
224 Views
280 Views
• Queries
A-span = 7B-span = 4Selections = 4,6,8,10
A
B1…
C1
B2 C1A
BK
B1…
C1
CL
…• Schema
…
Bi
… Ci
• Views
A
BK
B1…
C1
CL
…
… …
BiM
Bi1…
CiM
Ci1…
Chains of Stars