combining semantic and multimedia query routing techniques for unified data retrieval in a pdms*
DESCRIPTION
ISDSI 2009 – June 25th , 2009. Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS*. Claudio Gennaro 1 , Federica Mandreoli 2,4 , Riccardo Martoglia 2 , Matteo Mordacchini 1 , Wilma Penzo 3,4 , and Simona Sassatelli 2 - PowerPoint PPT PresentationTRANSCRIPT
Combining Semantic and Multimedia Combining Semantic and Multimedia Query Routing Techniques for Unified Query Routing Techniques for Unified
Data Retrieval in a PDMS*Data Retrieval in a PDMS*
Claudio GennaroClaudio Gennaro11, Federica Mandreoli, Federica Mandreoli2,42,4, Riccardo Martoglia, Riccardo Martoglia22,,
Matteo MordacchiniMatteo Mordacchini11, Wilma Penzo, Wilma Penzo3,43,4, and , and Simona SassatelliSimona Sassatelli22
11 ISTI – PI/CNR, Pisa ISTI – PI/CNR, Pisa22 DII - Università degli Studi di Modena e Reggio Emilia DII - Università degli Studi di Modena e Reggio Emilia
3 3 DEIS - Università degli Studi di BolognaDEIS - Università degli Studi di Bologna44 IEIIT – BO/CNR, Bologna IEIIT – BO/CNR, Bologna
Claudio GennaroClaudio Gennaro11, Federica Mandreoli, Federica Mandreoli2,42,4, Riccardo Martoglia, Riccardo Martoglia22,,
Matteo MordacchiniMatteo Mordacchini11, Wilma Penzo, Wilma Penzo3,43,4, and , and Simona SassatelliSimona Sassatelli22
11 ISTI – PI/CNR, Pisa ISTI – PI/CNR, Pisa22 DII - Università degli Studi di Modena e Reggio Emilia DII - Università degli Studi di Modena e Reggio Emilia
3 3 DEIS - Università degli Studi di BolognaDEIS - Università degli Studi di Bologna44 IEIIT – BO/CNR, Bologna IEIIT – BO/CNR, Bologna
* This work is partially supported by the Italian co-founded Project NeP4B* This work is partially supported by the Italian co-founded Project NeP4B* This work is partially supported by the Italian co-founded Project NeP4B* This work is partially supported by the Italian co-founded Project NeP4B
ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009 ISDSI 2009 – June 25th, 2009
Motivation ICTs over the Web have become a strategic asset Internet-based global market place where automatic cooperation
and competition are allowed and enhanced
The aim: development of an advanced technological infrastructure for SMEs to allow them to search for partners, exchange data and negotiate without limitations and constraints
The architecture: inspired by Peer Data Management Systems (PDMSs)
NeP4B Project(Networked Peers for Business)
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Peer1
Peer2
Peer3
Peer5
Peer4
Peer6
TXT MM
TXT MM
TXT MM
TXT MM
TXT MM
TXT MM
The Reference Scenario A peer: a single SME or a mediator
It is queried by exploiting a SPARQL-like query language similarity predicates (FILTER function LIKE )
It keeps its data in an OWL ontology Multimedia objects multimedia
attributes in the ontology (e.g. image)
Tile
pricecompany
Image
imagePeer1
domdom dom ran
size
dom
materialdom
origindom
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
SELECT ?companyWHERE { ?tile a Tile .
?tile company ?company . ?tile price ?price . ?tile origin ?origin . ?tile image ?image . FILTER ( (?price < 30) && (?origin = ‘Italy’) &&
LIKE (?image, ‘myimage.jpg’, 0.3))}LIMIT 100
Peer1
Peer2
Peer3
Peer5
Peer4
Peer6
TXT MM
TXT MM
TXT MM
TXT MM
TXT MM
TXT MM
The Reference Scenario (cont.) Peers are connected by means of semantic
mapping (with scores) Peers collaborate in solving users queries
How to effectively and efficiently answer a query?
By adopting effective and efficient query routing techniques Flooding is not adequate
Overloads the network (traffic & computational effort) Overwelms the querying peer (irrelevant results)
Q
Q’
Q’’
Queries are formulated on the peer’s ontology
Answers can come from any peer that is connected through a semantic path of mappings
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Combining Semantic and Multimedia Query Routing Techniques We leverage our distinct experiences on semantic [Mandreoli et al.
WIDM 2006, Mandreoli et al. WISE 2007] and multimedia [Gennaro et al. DBISP2P 2007, Gennaro et al. SAC 2008] query routing
We propose to combine our approaches in order to design an innovative mechanism for a unified data retrieval
We pursue: Effectiveness by selecting the semantically best suited subnetworks Efficiency by promoting the network’s zones where potentially
matching objects are more likely to be found, while pruning the others
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Two main aspects characterize our scenario: the semantic heterogeneity of the peers’ ontologies the execution of multimedia predicates
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Query Answering Semantics Peer pi has an ontology Oi ={Ci1 , … , Cim}
A semantic mapping: a fuzzy relation M(Oi,Oj) OiOj where each instance (C,C’) has a membership grade μ(C,C’) [0,1]
evaluation of f on a local data instance i: s(f,i) [0,1] s(f,i) = s( f(φ1, …, φn), i ) = sfunφ ( s(φ1,i), … , s(φn,i) )
Boolean semantics for relational predicates non-Boolean semantics for similarity predicates t-norm ( resp. t-conorm) for scoring conjunctions (resp. disjunctions)
Local query answers: Ans(f,pi) = { (i,s(f,i)) | s(f,i)>0 }
Local query execution
Query formula:f ::= <triple_pattern> <filter_pattern>
<triple_pattern> ::= triple | <triple_pattern> ∧ <triple_pattern>
<filter_pattern> ::= φ | <filter_pattern> <filter_pattern> |∧ <filter_pattern> ∨ <filter_pattern> |
(<filter_pattern>) φ is a relational (=,<,>, <=, >=) or similarity (~t) predicate
pi
Oi
f
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Ans(f,pi)
Query Answering Semantics (cont.)
f undergoes a chain of reformulations ff1…fm
s(f, Pp…pm ) = sfunr (s(f, p1), s(f1, p2), … , s(fm-1, pm))
sfunr is a t-norm
One-step query reformulation (pipj) according to the mapping M(Oi,Oj)
s(f,pj) = sfunc ( μ(C1 ,C’1), … , μ(Cn ,C’n) )
sfunc is a t-norm
Query answering semantics f submitted to p
P’ = {p1,…,pm} : set of accessed peers
Pp…pi : path used to reformulate f over each pi in P’
Ans (f, {p} U P’) = Ans(f,p) U Ans(f, Pp…p1 ) U … U Ans(f, Pp1…pm )
where : Ans(f, Pp…pi ) = (Ans(f,pi), s(f, Pp…pi ))
Multi-step query reformulation (pp1…pm = Pp1…pm )
pi pj
M(Oi,Oj)
Oi Oj
f
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
p p1 pm…
ff1
f2fm
(Ans(f,pj), s(f,pj))
(Ans(f,pm), s(f,Pp…pm))
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Peer1
Peer2
Peer3
Peer5
Peer4
Peer6
Semantic Query Routing
A Generalized Semantic Mapping relates each each concept C in Oi to the set of concepts Cs in Oj
s taken from the mappings in Ppi…pjs according to an aggregated score which expresses the semantic similarity between C and Cs.
Whenever pi forwards a query to one of its neighbors pj , the query might follow any of the semantic paths originating at pj, , i.e. in pj’s subnetwork
Main idea: introduction of a ranking approach for query routing which promotes pi’ neighbors whose subnetworks are the most semantically related to the query.
Preliminaries: pj
s : the set of peers in the subnetwork rooted at pj
Ojs: set of schemas {Ojk | pjk in pj
s}
Ppi…pjs : set of paths from pi to any peer in pj
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Peer2s
When a peer receives a query formula f, it exploits its SRI scores to determine a ranking for its neighborhood: Rp
sem (f)[pi] = sfunc(μ(C1,C1s), …, μ(Cn,Cn
s))
Each peer p maintains a matrix named Semantic Routing Index (SRI) containing the membership grades given by the generalized semantic mappings between itself and its neighborhood Nb(p)
Peer1
Peer2
Peer3
Peer5
Peer4
Peer6
Semantic Query Routing (cont.)
SRISRIPeer1Peer1 TileTile originorigin ……
Peer1Peer1 1.01.0 1.01.0 ……
Peer2Peer2 0.850.85 0.700.70 ……
Peer3Peer3 0.650.65 0.850.85 ……
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
SRI[i,j] represents how the j-th concept is semantically approximated by the subnetwork rooted at the i-th neighbor
SRI-based Query Processing
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Multimedia Query Routing
Main idea: introduction of a ranking approach for query routing which promotes pi’ neighbors whose subnetworks contain the highest number of potentially matching objects.
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
The execution of multimedia predicates is inherently costly (CPU and I/O)
Preliminaries Each peer’s object is classified w.r.t. its distance (dissimilarity) to some
reference objects (pivots)E.g.: object O, pivot P, with d(O, P) = 42
Each peer maintains a condensed description of such a classification of its objects by using histograms (Peer Indices)
0 10042
0 0 0 01 Bit-vector
0 0 0 01
0 1 0 00
0 0 0 01} 0 1 0 02
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Multimedia Query Routing (cont.) Each peer p also maintains a Multimedia Routing Index (MRI) containing
the aggregated description of the resources available in its neighbors’ subnetworks
Similarity-based Range Queries over metrics objects For each query Q, a vector representation QueryIdx(Q) is built by setting to
1 all the intervals covered by the requested range
MRI-based Query Processing
Rpmm(f)[pi]=
mink [QueryIdx(Q) * MRI(p,pis)]
Σ j=1…n mink [QueryIdx(Q) * MRI (p,pjs)]
When a peer p receives Q, it computes the percentage of potential matching objects in each neighbor’s subnetwork w.r.t the total objects in Nb(p):
Peer1
Peer2
Peer3
Peer5
Peer4
Peer6
Any MRI row MRI(p,pis) is built by summing up the Peer
Indices in the i-th neighbor’s subnetwork
MRIMRIPeer1Peer1 [0.0,0.2][0.0,0.2] [0.2,0.4][0.2,0.4] ……
Peer2Peer2 88 2929 ……
Peer3Peer3 300300 1,5211,521 ……
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Combined Query Routing
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Optimal aggregation algorithms can only work with monotone aggregation functions E.g. min, mean, sum…
Both the semantic and multimedia scores induce a total order They can be combined by means of a meaningful aggregate
function in order to obtain a global ranking:
We inspire to Fagin, Lotem & Naor: “Optimal Aggregation Algorithms for
Middleware”. Journal of Computer and System Sciences, 66: 47-58, 2003.
Rpcomb (f) = α Rp
sem (f) β Rpmm(f)
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Combined Query Routing (cont.)
SRIPeer1 MRIPeer1 min( )
Peer2 0.70 0.53 0.53
Peer3 0.65 0.76 0.65
Peer1
Peer2
Peer3
Peer5
Peer4
Peer6
E.g.
SELECT ?companyWHERE { ?tile a Tile .
?tile company ?company . ?tile price ?price . ?tile origin ?origin . ?tile image ?image . FILTER ( (?price < 30) && (?origin = ‘Italy’) &&
LIKE (?image, ‘myimage.jpg’, 0.3))}LIMIT 100
Final ranking: Peer3-Peer2 Peer3 roots the most promising subnetwork!
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Routing Strategies
Efficiency (Depth First Model) the best peer in one hop!
o it exploits the only information provided by Rpcomb (f)
Effectiveness (Global Model) the best known peer!
o it exploits the information provided by Up is visited Rpcomb (f)
The adopted routing strategy determines the set of visited peers and induces an order on it
When a peer p receives Q it computes the ranking Rpcomb(f) on its
neighbors
Different routing strategies relying on such ranking and having different performance priorities are possible:
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Experimental Evaluation
Simulation environment for SRI and MRI Peers belong to different semantic categories
Ontologies: consisting of a small number of classes Multimedia contents: hundreds of images taken from the Web,
characterized by two MPEG7 standard features (scalable color & edge histogram)
Network topology: randomly generated with the BRITE tool in the size of few dozens of nodes
Queries: on randomly selected peers Routing strategy: DF search Aggregation function: mean Stopping condition: number of retrieved results
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Experimental setting
Effectiveness Evaluation
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
We measure the semantic quality of the obtained results (satisfaction)
Outline
Motivation
Query Answering Semantics
Query Routing Semantic Query Routing Multimedia Query Routing Combined Query Routing
Routing Strategies
Experimental Evaluation
Conclusions and Future Works
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS
Conclusions & Future Works We presented a novel approach for processing queries
effectively and efficiently in a distributed and heterogeneous environment, like the one of the NeP4B Project
We proposed an innovative query routing approach which exploits the semantic of the concepts in the peers’ontologies and the multimedia contents in the peers’repositories
We experimentally prove the effectiveness of our techniques with a series of exploratory tests
(In the future) we will: perform new tests on larger and more complex scenarios integrate our techniques in a more general framework for
query routing (including latency, costs, etc.)
ISDSI’09 - C. Gennaro, F. Mandreoli, R. Martoglia, M. Mordacchini, W. Penzo, S. Sassatell - Combining Semantic and Multimedia Query Routing Techniques for Unified Data Retrieval in a PDMS