
Page 1: Presentazione di PowerPoint - Informatics Europe · Searching the Web of Data requires demand-driven service composition Composition abstractions should emphasize few elements: service

Search Computing

November 7, 2011

Stefano Ceri ... and the SeCo Project Team

Adnan Abid, Mamoun Abu Helu, Davide Barbieri, Daniele Braga, Marco Brambilla, Alessandro Bozzon, Alessandro Campi, Davide Chicco, Emanuele Della Valle, Piero Fraternali, Nicola Gatti, Giorgio Ghisalberghi, Michael Grossniklaus, Davide Martinenghi, Marco Masseroli, Maristella Matera, Chiara Pasini, Silvia Quarteroni, Marco Tagliasacchi, Luca Tettamanti, Salvatore Vadacca, Serge Zagorac


Prof. Stefano Ceri · Database Management

Motivating Examples – What Search Engines can’t do

“Where can I find a theater close to Union Square, San Francisco, showing a recent thriller movie, close to a good steak house?”


Search For a Solution Using All Keywords


Split the task, and search for theaters first


But there’s no thriller!

Try another theater: Found! (The Next Three Days), close enough to Union Square...


Independent search for steak house


Done! Close enough!

(data integration and ranking in the user’s brain)


VISION


The Search Computing Project

ERC-funded project

5 years – Started in 2009, now at month 36

Build theories, methods and tools to support search-oriented multi-dimensional queries

– Given a multi-domain query

– Build global solutions by integrating data produced by search services

– Rank global solutions according to a global rank function and output results in rank order

– Support user-friendly interfaces for query definition and result browsing, which allow adding search domains while the search process proceeds and possibly changing the relative weight of each ranking


Search Computing = Search Service Composition

Searching the Web of Data requires demand-driven service composition

Composition abstractions should emphasize few elements: service invocations, fundamental operations, precedences, global constraints on execution

Data composition should be search-driven – producing few top results very fast

[Figure: two compositions, Pipe and Parallel, over the services Trulia.com (real estate), Walkscore.com (walkability), Metro.net (public transit) and LocalCensus.com (demographics). Annotations: “GOOD, 30 results, 10 calls” and “GOOD, 30 results, 5 seconds, 50 calls”.]
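The difference between a pipe and a parallel composition can be sketched in a few lines. This is only an illustration under stated assumptions: the three service functions are hypothetical in-memory stand-ins (not the real Trulia or Walkscore APIs), the data is made up, and the call counter reflects this toy setup only; it does not reproduce the call counts shown in the figure. Which composition is cheaper in practice depends on selectivity and on how each service paginates its results.

```python
from typing import Dict, List

calls = {"count": 0}  # number of simulated service invocations

# Made-up stand-in data; not real service output.
LISTINGS = [{"address": f"{n} Main St", "price": 300 + 10 * n} for n in range(1, 6)]
WALK_SCORES = {f"{n} Main St": 50 + 8 * n for n in range(1, 9)}


def search_listings() -> List[Dict]:
    """Hypothetical search service: returns all listings in one call."""
    calls["count"] += 1
    return [dict(item) for item in LISTINGS]


def walk_score_for(address: str) -> int:
    """Hypothetical lookup service: one call per address."""
    calls["count"] += 1
    return WALK_SCORES[address]


def all_walk_scores() -> Dict[str, int]:
    """Hypothetical bulk service: one call returns every score."""
    calls["count"] += 1
    return dict(WALK_SCORES)


def pipe_join() -> List[Dict]:
    """Pipe: each output of the first service is fed as input to the second."""
    return [dict(item, walk=walk_score_for(item["address"]))
            for item in search_listings()]


def parallel_join() -> List[Dict]:
    """Parallel: both services are invoked independently, then joined on address."""
    listings, scores = search_listings(), all_walk_scores()
    return [dict(item, walk=scores[item["address"]])
            for item in listings if item["address"] in scores]
```

Both functions return the same combined records; they differ only in how the second service is driven, which is the kind of choice a query plan has to make.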


Modular software view of search applications

New generation software for building focused search applications

Covering the functionalities of vertical search systems (e.g. “expedia”, “amazon”) on more focused application domains (e.g. localized real estate or leisure planning, sector-specific job market offers, support of biomed research, ...)

Should be easy-to-build, easy-to-query, easy-to-maintain, easy-to-scale...


TECHNOLOGICAL FRAMEWORK


Search Computing architecture: overall view

[Figure: architecture diagram. Components (each with a cache): Front End, Query Analysis, Query To Domain Mapper, Query Planner, Query Engine (operators OP 1, OP 2, ..., OP N), WS-Framework, Result Transformation, Domain Framework, Domain Repository, Service Repository. External actors: Final User, WS World. Main query flow: High-Level Query, Sub-queries, Concrete Query Plan, Low-level queries, Merged Results, Results. Legend: main query flow, <Uses> relation.]

High level query: “Where can I attend a DB scientific conference close to a beautiful beach reachable with cheap flights?”

Sub query 1: “Where can I attend a DB scientific conference?”

Sub query 2: “place close to a beautiful beach?”

Sub query 3: “place reachable with cheap flight?”

Low level query 1: ConfSearch(“DB”,placeX,dateY)

Low level query 2: TourSearch(“Beach”,PlaceX)

Low level query 3: Flight(“cost<200”,PlaceX,DateY)

Query plan: services invocations and operators execution

Presented results:

ESWC-Crete-Olympic

CAISE-Hammamet-Alitalia

TOOLS-Malaga-EasyJet
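A minimal sketch of how the three low-level queries could be combined into presented results. Everything here is an assumption made for illustration: `conf_search`, `tour_search` and `flight_search` are hypothetical stubs named after the low-level queries on the slide, the data (dates, prices, the extra conference) is made up, and the nested-loop join stands in for whatever plan the real engine would produce.

```python
from typing import List, Tuple

# Made-up stub data; a real system would call remote search services.
CONFERENCES = [("ESWC", "Crete", "d1"), ("CAISE", "Hammamet", "d2"),
               ("TOOLS", "Malaga", "d3"), ("CONF-X", "Place-X", "d4")]
BEACH_PLACES = {"Crete", "Hammamet", "Malaga"}
FLIGHTS = [("Olympic", "Crete", "d1", 180), ("Alitalia", "Hammamet", "d2", 150),
           ("EasyJet", "Malaga", "d3", 90), ("Carrier-X", "Place-X", "d4", 250)]


def conf_search(topic: str) -> List[Tuple[str, str, str]]:
    """Stub for ConfSearch: returns (conference, place, date) triples."""
    return list(CONFERENCES) if topic == "DB" else []


def tour_search(feature: str, place: str) -> bool:
    """Stub for TourSearch: does the place offer the requested feature?"""
    return feature == "Beach" and place in BEACH_PLACES


def flight_search(max_cost: int, place: str, date: str) -> List[str]:
    """Stub for Flight: carriers reaching place on date for less than max_cost."""
    return [carrier for carrier, p, d, cost in FLIGHTS
            if p == place and d == date and cost < max_cost]


def answer_query() -> List[str]:
    """Join the three sub-results on the shared place and date variables."""
    results = []
    for conf, place, date in conf_search("DB"):
        if not tour_search("Beach", place):
            continue  # sub query 2 fails for this place
        for carrier in flight_search(200, place, date):
            results.append(f"{conf}-{place}-{carrier}")
    return results
```

With the stub data above, `answer_query()` yields the three combinations listed as presented results; the fourth conference is dropped because its place has no beach.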


Search Computing architecture: incremental prototyping

Prototype 1: Core behaviour of the system.

• Query engine

• Domain repository

• Service repository

• Result presentation

Prototype 2: Vertical solutions

• ER Domain description

• Query planner

• Application design tools

Prototype 3: Ontology-driven search

• Ontological query interpretation

• Ontological description & annotation of services

Prototype 4: NL or keyword queries

[Figure: the architecture diagram of the overall view (Front End, Query Analysis, Query To Domain Mapper, Query Planner, Query Engine, WS-Framework, Result Transformation, Domain Framework, Domain and Service Repositories, caches), with an added Admin Interface. Flows: High-Level Query, Sub-queries, Concrete Query Plan, Low-level queries, Merged Results, Results. Legend: <Uses> relation.]


LIQUID QUERY INTERFACE


Liquid query definition

It consists of subsetting and parametrizing the resource graph...

[Figure: resource graph with entities Concert, Artist, Exhibition, Restaurant, Hotel, Movie, Metro Station, Theatre, Photo, Landmark, News, Piece, ShoppingCenter, ...; the highlighted subset is Photo, Concert, Metro Station, Restaurant, News, Exhibition, Artist, Hotel. Legend: inputs, outputs; GR = global ranking.]


Liquid query definition

... And then characterizing the user interaction

Plus:

• Parametrization of global ranking

• Data visualization options

• .. and so on

[Figure: the selected subgraph (Photo, Concert, Metro Station, Restaurant, News, Exhibition, Artist, Hotel) with an “Expand” interaction.]


Exploration of the Service Space

Entity Selection


Exploration of the Service Space

Entity Selection

Service Selection


Exploration of the Service Space

Entity Selection

Service Selection

Query !!


Result Presentation

Tabular Representation

Ranking Bar

Local Order

Filter

Projection


Result Presentation (Map)

Exploration options from a given state

Related Entities


Result Presentation (Atom View)

Association

Real Estate Service

Doctor Service


Result Visualization – Combinations on Maps

SERVICE REGISTRATION


Rationale of Service Registration

• Providing a “Semantic Resource Framework” (SRF) where concepts of the real world are mapped to entities and interconnected by relationships

• Along the idea of the “web of objects” instead of the “web of pages”

[Figure: resource graph with entities Concert, Artist, Exhibition, Restaurant, Hotel, Movie, Metro Station, Theatre, Photo, Landmark, News, Piece, ShoppingCenter, ...]


Behind the scenes...


Service Framework in SeCo

• A Service Description Framework coupled with a Semantic Annotation Framework

Service Description Framework (SDF):

– Service Mart. Conceptual representation: group services by core entities

– Access Pattern. Logical representation: i/o fields and transitions as domain entities/relationships

– Service Interface. Physical representation: as shipped by data provider

Semantic Annotation Framework:

– Domain Diagram. Entities/relationships “mentioned” in SDF

– Reference KB. Capturing of service semantics via Knowledge Base lookup


Semantic Framework: Domain Diagram and Access Patterns

[Figure: domain diagram. Domain concepts: Actor, Movie, Theatre, Film_Director, Prize. Access patterns: MovieByTitle, ActorByTitle, TheatrebyMovie, TheaterByFLD, PrizeByDirector. Other labels: Current Position, Current Date. Legend: domain concept, access pattern.]


SECO ENGINE


The Query Processor in the Big Picture


Query Processor

Workbench testing tool

[Figure: query processor pipeline. Labels: SeCoQL, Logical Level, Physical Level, Panta Rhei. Example plan nodes: IN, Restaurant, Movie, Theatre, OUT.]


SeCoQL

An old drama movie showing tonight in a theatre close to a good restaurant

NightOut(Piccadilly, London, UK)


First step: from conjunctive queries to logical plans

Generation of a logical plan

[Figure: logical plan with nodes IN, Restaurant, Movie, Theatre, OUT.]


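Second step: from logical to physical query plans

Then the planner generates a physical, executable query plan, expressed in Panta Rhei

[Figure: physical plan with nodes Movie, Theater, Restaurant and annotations (1,10,R) and (1,1,T), shown next to the logical plan with nodes IN, Restaurant, Movie, Theatre, OUT.]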


Workbench

Query Lifecycle

Panta Rhei Plan

Query Execution

Report


THEORY


Proximity Rank Join (Martinenghi-Tagliasacchi, VLDB 2010)

[Figure: join of Hotels (Stars) with Restaurants (Rating).]

Too many results!

Order by

- individual scores

- proximity from query

- inter-object proximity

and return the top-K results
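The three ordering criteria can be combined into a single score, as in the brute-force sketch below. This is not the algorithm of the cited paper: it simply scores every hotel-restaurant pair and keeps the best k, whereas a rank join aims to avoid enumerating all combinations. The scoring formula, the equal unit weights and the Euclidean distance are assumptions made for the example.

```python
import heapq
import math
from itertools import product
from typing import List, Tuple

Point = Tuple[float, float]
Item = Tuple[str, float, Point]  # (name, individual score, location)


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def combo_score(hotel: Item, restaurant: Item, query: Point) -> float:
    """Higher is better: reward individual scores, penalize distances.
    Equal unit weights are an assumption of this sketch."""
    return (hotel[1] + restaurant[1]                 # individual scores
            - dist(hotel[2], query)                  # proximity from query
            - dist(restaurant[2], query)
            - dist(hotel[2], restaurant[2]))         # inter-object proximity


def top_k(hotels: List[Item], restaurants: List[Item],
          query: Point, k: int) -> List[Tuple[Item, Item]]:
    """Naive join: score every pair and return the k best, best first."""
    pairs = product(hotels, restaurants)
    return heapq.nlargest(k, pairs,
                          key=lambda p: combo_score(p[0], p[1], query))
```

For example, with hotels at (0, 0) and (10, 0), restaurants at (0, 1) and (10, 1), and the query at (0, 0), the pair located next to the query wins even when the distant objects have higher individual scores.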


Ranking with uncertain scoring functions

(Soliman, Ilyas, Martinenghi, Tagliasacchi, Sigmod 2011)

ID   Rating  Stars
τ1   2       6
τ2   7       5
τ3   4       7
τ4   5       2

Resulting order for different weights (wR, wH):

(0.1, 0.9): τ3, τ1, τ2, τ4

(0.7, 0.3): τ2, τ3, τ4, τ1

(0.4, 0.6): τ2, τ3, τ1, τ4

How to select the right weights?

SELECT *

FROM Restaurants R, Hotels H

WHERE R.Location = H.Location

ORDER BY wR· R.Rating + wH· H.Stars

LIMIT K
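The effect of the weights on the ORDER BY clause can be checked with a small sketch. The tuple values and the three weight pairs are the ones shown on the slide; the function name `rank`, the rounding step and the tie-breaking by tuple order are choices made for this example, not part of the cited work.

```python
from typing import Dict, List, Tuple

# Joined tuples from the slide: id -> (restaurant rating, hotel stars)
TUPLES: Dict[str, Tuple[int, int]] = {
    "τ1": (2, 6),
    "τ2": (7, 5),
    "τ3": (4, 7),
    "τ4": (5, 2),
}


def rank(w_r: float, w_h: float, k: int = 4) -> List[str]:
    """Order tuples by w_r * Rating + w_h * Stars and return the top k ids."""
    def score(item: Tuple[str, Tuple[int, int]]) -> float:
        rating, stars = item[1]
        # Rounding avoids spurious differences from floating-point noise.
        return round(w_r * rating + w_h * stars, 9)

    # sorted() is stable, so tied tuples keep their original order.
    ordered = sorted(TUPLES.items(), key=score, reverse=True)
    return [tuple_id for tuple_id, _ in ordered[:k]]
```

Calling `rank(0.1, 0.9)`, `rank(0.7, 0.3)` and `rank(0.4, 0.6)` gives three different orderings of the same four tuples, which is why the choice of weights matters. With (0.4, 0.6), τ2 and τ3 tie at 5.8, so their relative order depends on the tie-breaking rule.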


CURRENT STANDING


Results after 36 months

Concepts

– Service marts, join methods, Panta Rhei, liquid query

Research results

– LNCS: Search Computing Challenges and Directions (2010)

– LNCS: New Trends in Search Computing (2011)

– Publications (incl. VLDB, WWW, SIGMOD, TODS)

– US Patent filed (top-k method, random & sequential services)

Site: www.search-computing.eu

Blog: http://blog.search-computing.com/

Temporary research positions (3 PhD, 5 post-MS, 3 post-doc)


Accesses to Web Site & Blog (2010)

Visits: 20% USA, 18% Italy, 6% UK, 4% India, 4% Canada


Accesses to Web Site & Blog (2011)

Visits: 27% USA, 10% India, 8% Italy, 4% each for Germany, UK and France


Events in 2011

Functionality Demo at WWW 2011 (Bangalore)

Engine Demo at ACM-Sigmod (Athens)


Events in 2011-12

Workshops:

– ExploreWeb (Brambilla, Fraternali, Schwabe) http://exploreweb.search-computing.org/

– DBRank Workshop (Chakrabarti, Martinenghi)

– Very Large Data Search (VLDS) (Brambilla, Casati, Ceri) http://vlds2011.search-computing.net/

– Ordering and Reasoning (OrdRing) (Bozzon, Della Valle, Horrocks) http://ordring2011.search-computing.org/

– DataView 2011 (Bozzon, Comai, Norrie) http://dataview.como.polimi.it/2011/

Third LNCS Book?

Planned VLDB Journal special issue on “Web search over structured information and crowds” (tentative title; Brambilla/Ceri/Halevy, Sept. 2012)


Future Research Directions

NLP or keyword-based queries

– focus on subquery partitioning & mapping to domains

Social dimension

– crowd-searching using social platforms

Verticals

– joint work with Diadem (Gottlob) on London real-estate

– regional platforms for “quality of life”


And finally...

Questions?