Ad hoc data integration for mobile GIS applications

Ramya Venkateswaran ([email protected])


Post on 13-Mar-2016


Page 1: Ad hoc data integration for mobile GIS applications

AD HOC DATA INTEGRATION FOR MOBILE GIS APPLICATIONS

Ramya Venkateswaran ([email protected])

Page 2

Contents
1. Scenario
2. Research Objective
3. Introduction: Overview of the GenW2 project
4. Motivation: Why is Ad hoc Data Integration needed?
5. State of the Art
6. Research Questions: Discuss 3 research questions
7. Methods: TourGuide and friends
8. Next Steps: Data Enrichment and Quality control

Page 3

1. Scenario

Page 4

Scenario of Usage

I will be vacationing in Paris and I want to visit some of the famous palaces, history-related places and other tourist locations in Paris.

Recommendations from:
• People
• Tourist guides
• Albums & images
• Tourist & travel websites
• Other sources?

Page 5

Scenario of Usage

I'd still like to go to Paris...

Recommendations from:
• People
• Tourist guides
• Albums & images
• Tourist & travel websites
• Other sources?

TourGuide

Page 6

2. Research Objective

Page 7

Objective of my research: Ad hoc Data Integration

Data integration
• Flavour-based integration
• Ad hoc DI vs. traditional DI
• TourGuide

Data enrichment
• POI enrichment
• Website credibility

Data quality control
• Completeness
• Correctness
• Credibility
• User feedback

Page 8

3. Overview and Introduction

Page 9

Overview of the GenW2 Project
• Short for: Generalization for portrayal in Web and Wireless mapping
• Develop new methods for web and wireless mapping
• Focus on ad hoc integration of heterogeneous information and on-the-fly map generalization in a mobile context

Page 10

The GenW2 Framework

[Figure: GenW2 framework diagram. Labels: User, Query, Parser, Parsed Query, Ruleset & Association Component, Spatio-Temporal Event Handler, Privacy Controller and Firewall, Information Retrieval Component, Data Integrator, Intelligent Ranker, Unranked dataset, Ranked dataset, Filter & Relevance Component, Generalization, Visualization, Result, Internal Database, Static datasets, Facts DB, MRDB, Data sources, Web, Data; numbered markers 1, 2, 3.]


Page 12

The GenW2 Framework

[Figure: detail of the GenW2 framework. Labels: Parser, Parsed Query, Data Integrator (Web Interaction Component, Wrapper Generator, Data Transformer), Intelligent Ranker, Unranked dataset, Ranked dataset, Data, Data sources: Static datasets, Facts DB, MRDB, Web (Image metadata, Webservices, Webpages); numbered markers 1, 2, 3.]

Page 13

Types of Data sources
• MRDB
• Facts DB
• Image metadata
• Webservices
• Web pages
• Static datasets

Page 14

4. Motivation: Why is Ad hoc Data Integration needed?

Page 15

Motivation
• So many data sources and so little structure
• Web as a database: too much information to ignore!
• Ad hoc integration: need-based, according to scenario and flavour, unlike search engines
• Importance of recording certain facts that can enrich the MRDB and the integration process

Page 16

5. State of the Art

Page 17

Relevant Domains

[Diagram: four domains relevant to Ad hoc Data Integration: Recommendation Systems, Information Filtering, Information Retrieval, Collaborative Filtering]

Page 18

State of the Art: Ad hoc Data Integration

Data integration
• Flavour-based integration
• Ad hoc DI vs. traditional DI
• TourGuide

Data enrichment
• POI enrichment
• Website credibility

Data quality control
• Completeness
• Correctness
• Credibility
• User feedback

Page 19

Integration, IR and decision systems
• Different concepts and methods in data integration
• Data integration from multiple sources
• Geospatial data mining and integration (Knoblock et al., 2001; Michalowski et al., 2004)
• Mashup of web data for overall importance of landmarks (Grabler et al., 2008)
• SPIRIT: design, techniques and implementation (Purves et al., 2007; Jones et al., 2002; Bucher et al., 2005)
• Geo-parsing, geocoding and IR techniques (Clough et al., 2005)

Page 20

Integration, IR and decision systems
• Methods for marking tourist locations and a guide that is 'context aware' (Abowd et al., 2004)
• Activity-based models of decisions that are shaped by activity-travel behaviour and that also predict activities (Arentze and Timmermans, 2004)
• Voluntary information from a community, collaborative semantics, recommendation systems (Schlieder, 2007)

Page 21

Data Enrichment
• Methods and algorithms for the provision of auxiliary data and its use for controlling an automated adaptive generalization process (Neun, 2007)

Page 22

Data quality and assessment
• Framework for efficient and accurate integration of geospatial data from a large number of sources
• Positional accuracy, completeness (Thakkar et al., 2007)
• VGI (Volunteered Geographic Information)
• Trust models for gazetteers (Keßler et al., 2009)

Page 23

Observations from literature
• Considerable work and methods for traditional data integration; a variety of methods in IR and GIR
• Less work and fewer methods for data integration from multiple and dynamic sources (focus on semantics rather than data and context) and for recording reusable facts
• Considerable work on user modeling, activities and activity recommendation
• Data enrichment work for improving generalization

Page 24

Challenges
• Datasets are dynamic and heterogeneous rather than static
• Auxiliary data
• Determining parameters (user categories, activities, habits etc., not a single user or set of preferences)
• Point of complete integration
• Methods to test and evaluate the effectiveness

Page 25

6. Research Questions

Page 26

RQ1 – Flavour Based Integration

Given an activity and unrelated data that is heterogeneous and dynamic, what is an effective method of data integration, so that the results are streamlined towards information about events and places for a set of users?
• Flavour-based data integration from various sources
• Ad hoc DI vs. traditional DI
• TourGuide: an example of web data integration

Page 27

RQ2 – Data Enrichment

How can the Generalization for portrayal in Web and Wireless mapping (GenW2) framework record and exploit valuable reusable information obtained from the preceding data integration?
• Facts DB
• Activity-Location pairs
• Data source credibility (Keßler et al., 2009)
• User feedback

Page 28

RQ3 – Quality of data

What are the different metrics that can be used to control and/or assess the quality of the integrated data?
• Measurement of quality?
  • Quality of data by completeness (Thakkar et al., 2007)
  • Quality of data by correctness (Thakkar et al., 2007)
• Another metric for quality assessment
  • Quality of data by collective user feedback
  • Credibility rank of information sources (Keßler et al., 2009)
• Evaluation methodology

Page 29

7. Methods

Page 30

Flavour Based Data Integration

[Diagram: Flavour Based Data Integration shown with four related domains: Recommendation Systems, Information Filtering, Information Retrieval, Collaborative Filtering]

Page 31

Definition - Flavour Based Data Integration

Related domains: Recommendation Systems, Information Filtering, Information Retrieval, Collaborative Filtering

“The central idea here is to base personalized recommendations for users on information obtained from other, ideally likeminded, users.” (Billsus and Pazzani, 1998).

“use the opinions of a community of users to help individuals in that community more effectively identify content of interest from a potentially overwhelming set of choices” (Resnick and Varian, 1997).

“a field of study designed for creating a systematic approach to extracting information that a particular person finds important from a larger stream of information” (Canavese, 1994).

“the goal of an information [retrieval] system is for the user to obtain information from the knowledge resource which helps her/him in problem management” (Belkin, 1984).


Page 37

Keyphrases in FBDI
• Systematic approach to extracting information
• Obtain information from one or many knowledge resources
• Recommendations for user groups or user categories
• Opinions of a community of users
• Keyword, flavour or activity such as tourism, history, sport, culture, shopping etc.

Page 38

Definition of FBDI
• FBDI is an activity-based, systematic approach to extracting and integrating information from multiple knowledge sources depending on the habits of certain user groups or user categories, and is capable of learning over time.
• Flavour = typical activities of a certain user group
• Examples: tourism, shopping, sports, historical excursions, cultural excursions etc.


Page 42

Adaptive tour guide for Paris
• Flavour-based integration with the web as data source
• Only the web as the database (Grabler et al., 2008)
• Integration of data on:
  • Tourism
  • Transport
  • User feedback
  • User rating
  • Facebook profile
  • Dopplr profile
• Scheduler

Page 43

Data Integrator
• Example of web data integration
• Functional components (Baumgartner et al., 2009)
  • Web interaction component: Lonelyplanet, wikitravel, virtualtourist, tripadvisor and the official tourist website
  • Wrapper generator: OpenKapow RoboMaker
  • Data transformer: DOM parser for RSS and XML formats


Page 47

Web data extraction
• Semi-automatic wrappers
  • Academic: XWRAP (Liu et al., 2000), Lixto (Baumgartner et al., 2001), Wargo (Pan et al., 2002)
  • Commercial: RoboMaker (Kapow Technologies), WebQL (QL2 Software Inc.)
• Automatic wrapper induction: WIEN (Kushmerick et al., 1997), Stalker (Muslea et al., 2001), DEByE (Laender et al., 2000)


Page 50

Data Integrator
• Example of web data integration
• First part: Google
• Second part: functional components (Baumgartner et al., 2009)
  • Web interaction component: Lonelyplanet, wikitravel, virtualtourist, tripadvisor and the official tourist website
  • Wrapper generator: OpenKapow RoboMaker
  • Data transformer: DOM parser for RSS and XML formats
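The data transformer step (a DOM parser for RSS and XML) could look roughly like the sketch below. It is not the TourGuide implementation: it uses Python's standard `xml.dom.minidom`, and the feed content, the `example.org` URLs, the helper names `text_of` and `rss_to_records`, and the output field names are all assumptions made for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical RSS feed, standing in for the output of a wrapper.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Hypothetical Paris feed</title>
    <item>
      <title>Louvre</title>
      <link>http://example.org/louvre</link>
      <description>Museum in a former royal palace</description>
    </item>
    <item>
      <title>Versailles</title>
      <link>http://example.org/versailles</link>
      <description>Palace and gardens</description>
    </item>
  </channel>
</rss>"""


def text_of(node, tag):
    """Return the text of the first <tag> element below node, or ''."""
    elements = node.getElementsByTagName(tag)
    if not elements or elements[0].firstChild is None:
        return ""
    return elements[0].firstChild.data.strip()


def rss_to_records(xml_text):
    """Turn every <item> of an RSS document into a flat dictionary."""
    dom = parseString(xml_text)
    return [
        {
            "name": text_of(item, "title"),
            "url": text_of(item, "link"),
            "description": text_of(item, "description"),
        }
        for item in dom.getElementsByTagName("item")
    ]


records = rss_to_records(SAMPLE_RSS)
```

Flat records of this kind are one possible form for the unranked dataset that is handed on to the ranker.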


Page 52

Intelligent Ranker and Scheduler
• Third step of integration.
• Applies different profiles to the data, such as Facebook and Dopplr.
• Arranges the data in ranked form depending on matches with user interests and activities.
• Brute-force cumulative ranking algorithm:
  • 3 – Explicitly mentioned
  • 2 – Description match
  • 1 – Suggested by other users
• Merges data from public transport.
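One plausible reading of the cumulative ranking is that each piece of evidence for a place adds its weight (3, 2 or 1) to that place's score. The sketch below follows that reading only. The weights come from the slide, while the function name, the evidence format and the sample places are assumptions.

```python
from collections import defaultdict

# Weights from the slide: 3 = explicitly mentioned,
# 2 = description match, 1 = suggested by other users.
WEIGHTS = {"explicit": 3, "description": 2, "suggested": 1}


def rank_places(evidence):
    """Sum the weights per place and sort by descending total score.

    evidence is an iterable of (place_name, match_type) pairs
    (an assumed input format).
    """
    scores = defaultdict(int)
    for place, match_type in evidence:
        scores[place] += WEIGHTS[match_type]
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical evidence gathered from a user profile and from other users.
evidence = [
    ("Louvre", "explicit"),
    ("Louvre", "suggested"),
    ("Versailles", "description"),
    ("Eiffel Tower", "suggested"),
    ("Versailles", "suggested"),
]
ranked = rank_places(evidence)
```

With this sample input the Louvre scores 3 + 1 = 4, Versailles 2 + 1 = 3 and the Eiffel Tower 1.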


Page 54

Facts DB
• Location information from the MRDB and map LOD with place
• Activity-Location pairs
• Facts DB structure

Page 55

Facts DB Structure
• High-level structure
• Lower-level structure: a Database Object maps to more locations
• Limit to two levels
• Inverse page lookup

Example entry:
Activity: Shopping
LocationFrom: 47°22′40″N, 8°32′25″E
LocationTo: 47.3671°N, 8.5409°E
Name: Bahnhofstrasse
Rank: 3
User Feedback: Shop for watches, jewelry, clothes
Database Object
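As a rough illustration, the example entry could be held in a small record type like the one below. The field names mirror the column headers on the slide. The class name `Fact`, the tuple representation of locations, the helper `dms_to_decimal` and the lookup by activity are assumptions, not the actual Facts DB design.

```python
from dataclasses import dataclass


def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


@dataclass
class Fact:
    activity: str
    location_from: tuple  # (latitude, longitude) in decimal degrees
    location_to: tuple
    name: str
    rank: int
    user_feedback: str


# The Bahnhofstrasse row from the slide; LocationFrom is given there in
# degrees, minutes and seconds, so it is converted first.
fact = Fact(
    activity="Shopping",
    location_from=(dms_to_decimal(47, 22, 40), dms_to_decimal(8, 32, 25)),
    location_to=(47.3671, 8.5409),
    name="Bahnhofstrasse",
    rank=3,
    user_feedback="Shop for watches, jewelry, clothes",
)

# Activity-Location pairs: a simple lookup from activity to its facts.
facts_by_activity = {}
facts_by_activity.setdefault(fact.activity, []).append(fact)
```

Here 47°22′40″N converts to about 47.3778°N and 8°32′25″E to about 8.5403°E.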

Page 56

Data Quality
• Evaluation through completeness and correctness
• Example: shopping stores in Bahnhofstrasse
  • Extract lat-lng
  • Shop name, website, details and contact details
  • Shop opening and closing times
  • Evaluate against manually collected data for completeness and correctness
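A minimal sketch of such an evaluation against manually collected data follows. The definitions are assumptions chosen for illustration and may differ from those in Thakkar et al. (2007): completeness is taken as the share of reference shops that were extracted, and correctness as the share of extracted attribute values that match the reference. Shop names and opening times are invented.

```python
def completeness(extracted, reference):
    """Share of reference shops that also appear in the extracted data."""
    if not reference:
        return 1.0
    found = set(extracted) & set(reference)
    return len(found) / len(reference)


def correctness(extracted, reference):
    """Share of extracted attribute values that agree with the reference."""
    checked = 0
    matched = 0
    for name, attributes in extracted.items():
        truth = reference.get(name)
        if truth is None:
            continue  # shop not in the reference, so it cannot be checked
        for key, value in attributes.items():
            if key in truth:
                checked += 1
                if value == truth[key]:
                    matched += 1
    return matched / checked if checked else 0.0


# Hypothetical manually collected reference data.
reference = {
    "Shop A": {"opens": "09:00", "closes": "18:30"},
    "Shop B": {"opens": "09:30", "closes": "19:00"},
    "Shop C": {"opens": "10:00", "closes": "18:00"},
    "Shop D": {"opens": "09:00", "closes": "20:00"},
}
# Hypothetical extraction result: Shop D is missing, Shop B opens too late.
extracted = {
    "Shop A": {"opens": "09:00", "closes": "18:30"},
    "Shop B": {"opens": "10:00", "closes": "19:00"},
    "Shop C": {"opens": "10:00", "closes": "18:00"},
}

completeness_score = completeness(extracted, reference)
correctness_score = correctness(extracted, reference)
```

In this example three of four reference shops are found (completeness 0.75) and five of six checked values match (correctness about 0.83).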

Page 57

8. Next Steps

Page 58

Next Steps
• Formalizing parameters and methods for integration (Link)
• Improve scoring algorithm for places
• Structure of Facts DB for efficient storage and retrieval
• Develop quality control methods, such as considering user feedback and credibility

Page 59

Open Questions
• At what point is the data integrated?
• When is it complete? Qualitative vs. quantitative
• Error recovery and correction mechanism in the Facts DB?
• Mapping of a place's score to LOD?

Page 60

Milestones

Year 1: Fall 2008, Spring 2009
Year 2: Fall 2009, Spring 2010
Year 3: Fall 2010, Spring 2011

• Literature review
• Develop overall framework
• Start to develop research questions and focus area
• Literature review
• Develop research questions
• Define use cases
• Make a prototype of one use case: TourGuide
• Develop concept and methods for RQ1
• Implement parts of TourGuide
• Develop user tests for input to RQ2 and RQ3
• Continue work on RQ1. Formalise parameters.
• Analyse input from user tests and combine with other parameters for RQ2
• Continue work with RQ2 and start RQ3
• Formalise parameters for data quality control
• Perform evaluation of data, define and implement quality assessing/controlling parameters for FBDI
• Finalize publications
• Thesis write-up

Page 61

Summary: Expected contributions
• Working system and framework for ad hoc data integration that will work for certain flavours
• Methodology of flavour-based data integration (RQ1)
  • Structure
  • Algorithm for efficient data source selection depending on “flavour”
  • Algorithm for scoring different places depending on a number of parameters
• Concept and structure of the Facts DB, which will work with data from the MRDB for enrichment (RQ2)
• Improved and adapted parameters and a mechanism for checking the quality of the integrated data, with some test cases (RQ3)


Page 63

Thank you! Questions/Feedback?

Ramya Venkateswaran ([email protected])

Demo and slides at http://www.geo.uzh.ch/~ramya/kolloquium/