rdf conjunctive query selectivity estimation


Upload: inria-oak

Post on 23-Jun-2015


Page 1: Rdf conjunctive query selectivity estimation

RDF conjunctive query cardinality estimation

Owner: Stamatis/Damian  Presenter: Soudip

Page 2: Rdf conjunctive query selectivity estimation

Overview
•  This project provides methods to estimate the cardinality of (the result of) a conjunctive query
•  It requires a summary with statistics information, which can be provided either as a serialized summary or through a database connection containing the statistics and dictionary tables
•  This project also includes methods to generate, in a database, a dictionary-encoded version of the triples table and the triples table statistics, for both the plain and the dictionary-encoded triples tables
•  It is a refactored extraction (for code reusability) of the cardinality estimator of the RDFViewSelection project

Page 3: Rdf conjunctive query selectivity estimation

Few Details
•  Online repository:
   –  https://scm.gforge.inria.fr/svn/distriples/RDFOptim
•  Code size (Java)
   –  4041 LoC, 15 packages
•  List of people who contributed
   –  Present: Stamatis, Damian, Ioana
   –  Past: Julien Leblay
•  Current owner of the code (OAK member): Stamatis/Damian
•  Who is using the code now
   –  Fragmented Query Execution (Damian)
   –  CliqueSquare (Stamatis)
   –  Optimizer (https://scm.gforge.inria.fr/svn/distriples/trunk/Optimizer) (Stamatis/Zoi)

Page 4: Rdf conjunctive query selectivity estimation

Functional Architecture

[Figure: functional architecture diagram]

Main Modules
•  Importer: RDF Loader, Parser
•  Data Store (PostgreSQL DB): Data Table (S, P, O), Dictionary Table (Key, Value), Summary Table (6)
•  CQ Parser
•  DB Interfaces
•  Cardinality Estimator

Diagram labels: ConjunctiveQueryRDF; Triples Summary, Dictionary; CQ, Summary, Dictionary; Cardinality Info.

Summary tables (grouping columns: stored statistics)
•  S: Count(*), Count(P), Count(O), Min(P), Max(P), Min(O), Max(O)
•  P: Count(*), Count(S), Count(O), Min(S), Max(S), Min(O), Max(O)
•  O: Count(*), Count(S), Count(P), Min(S), Max(S), Min(P), Max(P)
•  S P: Count(*), Count(O), Min(O), Max(O)
•  P O: Count(*), Count(S), Min(S), Max(S)
•  S O: Count(*), Count(P), Min(P), Max(P)
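To make the summary statistics concrete, here is a minimal in-memory sketch of how the rows of the "P" summary could be computed from dictionary-encoded triples. It is an illustration only: the class and method names are hypothetical, the project itself keeps these statistics in PostgreSQL tables, and Count(S) and Count(O) are assumed here to be distinct counts, which the slides do not state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical statistics kept for one property value (one row of the "P" summary). */
final class PropertySummary {
    long count;                                     // Count(*): triples with this property
    final Set<Integer> subjects = new HashSet<>();  // backs Count(S), assumed distinct
    final Set<Integer> objects = new HashSet<>();   // backs Count(O), assumed distinct
    int minS = Integer.MAX_VALUE;
    int maxS = Integer.MIN_VALUE;
    int minO = Integer.MAX_VALUE;
    int maxO = Integer.MIN_VALUE;

    /** Folds one triple (subject code, object code) into the statistics. */
    void add(int s, int o) {
        count++;
        subjects.add(s);
        objects.add(o);
        minS = Math.min(minS, s);
        maxS = Math.max(maxS, s);
        minO = Math.min(minO, o);
        maxO = Math.max(maxO, o);
    }

    /** Groups encoded triples {s, p, o} by property and summarizes each group. */
    static Map<Integer, PropertySummary> summarize(List<int[]> triples) {
        Map<Integer, PropertySummary> byProperty = new HashMap<>();
        for (int[] t : triples) {
            byProperty.computeIfAbsent(t[1], k -> new PropertySummary()).add(t[0], t[2]);
        }
        return byProperty;
    }
}
```

The other five summaries would follow the same pattern with a different grouping key (S, O, or a pair of columns).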

Page 5: Rdf conjunctive query selectivity estimation

RDF  Loader  


–  Parses the content of the input RDF files
–  Extracts <subject, property, object> triples
–  Loads the triples into the DB
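The parse-and-extract steps can be sketched as follows. This is a simplified illustration, not the project's loader: it assumes one whitespace-separated triple per line in an N-Triples-like layout, does no escaping or validation, and leaves out the database load. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified line-based triple extraction (illustrative; not a full RDF parser). */
final class SimpleTripleParser {

    /** Returns {subject, property, object}, or null for blank and comment lines. */
    static String[] parseLine(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) {
            return null;
        }
        // Drop the statement terminator, if present.
        if (s.endsWith(".")) {
            s = s.substring(0, s.length() - 1).trim();
        }
        // Limit the split to 3 parts so that literals containing spaces stay whole.
        String[] parts = s.split("\\s+", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not a triple: " + line);
        }
        return parts;
    }

    /** Extracts all triples from the given lines, skipping blanks and comments. */
    static List<String[]> parseAll(List<String> lines) {
        List<String[]> triples = new ArrayList<>();
        for (String line : lines) {
            String[] t = parseLine(line);
            if (t != null) {
                triples.add(t);
            }
        }
        return triples;
    }
}
```

A production loader would normally rely on an existing RDF parsing library rather than string splitting.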

Page 6: Rdf conjunctive query selectivity estimation

Data  Store  


–  Stores the triples in the DB and creates 3 different tables
–  Data Table
   •  Stores the basic triples
–  Summary Table
   •  Stores different summaries of the triples
   •  6 different summary tables
–  Dictionary Table
   •  Stores the integer value corresponding to each entry in the data table
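As an illustration of dictionary encoding, the sketch below assigns a dense integer code to each distinct value and rewrites a triple as three integers. It is an in-memory stand-in under assumed names (`Dictionary`, `encode`, `decode`); in the project the mapping lives in the Dictionary table, and the slides do not say how the codes are assigned.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory dictionary mapping RDF values to integer codes and back. */
final class Dictionary {
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    /** Returns the existing code for a value, or assigns the next free one. */
    int encode(String value) {
        Integer code = codes.get(value);
        if (code == null) {
            code = values.size();
            codes.put(value, code);
            values.add(value);
        }
        return code;
    }

    /** Returns the value that was assigned the given code. */
    String decode(int code) {
        return values.get(code);
    }

    /** Encodes a <subject, property, object> triple as three integer codes. */
    int[] encodeTriple(String s, String p, String o) {
        return new int[] {encode(s), encode(p), encode(o)};
    }

    /** Number of distinct values seen so far. */
    int size() {
        return values.size();
    }
}
```

Integer codes keep the encoded triples table compact and let Min/Max statistics be expressed as integer ranges.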

 

Page 7: Rdf conjunctive query selectivity estimation

Cardinality Estimator Module


–  Takes as input
   •  A conjunctive query
   •  Data from the Summary and Dictionary tables
–  Outputs cardinality information for the input query

 

Page 8: Rdf conjunctive query selectivity estimation

CQ  Parser  


     

–  Taken as is from the Conjunctive Query project

 

Page 9: Rdf conjunctive query selectivity estimation

DB  Interfaces  


–  Contains two interfaces to load data from the DB into memory
   •  Summary Interface: gets data from the Summary table
   •  Dictionary Interface: gets data from the Dictionary table

 

Page 10: Rdf conjunctive query selectivity estimation

Cardinality Estimator


–  Takes a CQ and Summary data as input and produces the cardinality info for the input CQ as output
–  It uses 3 algorithms for calculating the cardinality: two use the static summary and one uses the MaxMin summary (https://scm.gforge.inria.fr/svn/distriples/RDFOptim/)
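The slides do not describe the three algorithms, so the sketch below is not the project's estimator. It only illustrates, under stated assumptions, how count and min/max statistics of the kind stored in the summary tables could feed an estimate for a join of two triple patterns on a shared variable: the textbook formula |A| * |B| / max(distinct(A), distinct(B)), plus an assumed min/max range check that returns 0 when the join attribute's code ranges cannot overlap. All names are hypothetical.

```java
/** Hypothetical statistics of one triple pattern with respect to the join attribute. */
final class PatternStats {
    final long count;     // Count(*): triples matching the pattern
    final long distinct;  // distinct values of the join attribute
    final int min;        // smallest dictionary code of the join attribute
    final int max;        // largest dictionary code of the join attribute

    PatternStats(long count, long distinct, int min, int max) {
        this.count = count;
        this.distinct = distinct;
        this.min = min;
        this.max = max;
    }
}

/** Textbook join-size estimate; an illustration, not the project's algorithm. */
final class JoinCardinalitySketch {

    /** True if the [min, max] code ranges of the join attribute intersect. */
    static boolean rangesOverlap(PatternStats a, PatternStats b) {
        return a.min <= b.max && b.min <= a.max;
    }

    /** Estimates |A join B| assuming uniformly distributed join values. */
    static double estimateJoin(PatternStats a, PatternStats b) {
        if (a.count == 0 || b.count == 0 || !rangesOverlap(a, b)) {
            return 0.0;
        }
        long maxDistinct = Math.max(a.distinct, b.distinct);
        if (maxDistinct == 0) {
            return 0.0;
        }
        return (double) a.count * b.count / maxDistinct;
    }
}
```

For example, two patterns with 100 and 50 matching triples and 10 and 25 distinct join values give an estimate of 100 * 50 / 25 = 200 result tuples when their ranges overlap.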

 

Page 11: Rdf conjunctive query selectivity estimation

Thank  you!!