Functions of a Web Warehouse

Kai Cheng, Yahiko Kambayashi, Seok Tae Lee
Graduate School of Informatics, Kyoto University, Japan

and

Mukesh Mohania
Western Michigan University, USA

13-16 November 2000

ICDL 2000

Table of Contents

Survival from “Information Explosion”
Warehouse-Mediated Content Delivery
Community-Oriented Web Warehouses
Technical Issues
Warehouse Enhanced Web Caching
Related Work
Concluding Remarks

Survival from “Information Explosion”

Web Traffic Doubled Every 3-6 Months

Exponential Growth of the Web
– 1 Billion Pages, January 2000
– 2 Billion Pages, June 2000
– 100 Times Increase in the Next 2 Years

Information Overload for both Nets and Users
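As a rough consistency check on these figures, the sketch below computes the doubling time each claim implies, assuming smooth exponential growth. The function name `doubling_time_months` and the 5-month gap between January and June 2000 are illustrative assumptions, not part of the talk.

```python
import math

def doubling_time_months(growth_factor: float, months: float) -> float:
    """Doubling time implied by growing `growth_factor`-fold in `months` months,
    assuming smooth exponential growth."""
    return months * math.log(2) / math.log(growth_factor)

# 1 billion pages (January 2000) to 2 billion pages (June 2000): 2x in about 5 months.
observed = doubling_time_months(2, 5)

# Projected 100-fold increase over the next 2 years (24 months).
projected = doubling_time_months(100, 24)

print(f"observed: {observed:.1f} months, projected: {projected:.1f} months")
```

Under these assumptions both values (about 5.0 and 3.6 months) fall inside the quoted 3-6 month doubling range.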

Scale up the Web and Internet

More Bandwidth
– Never Keeps Pace with the Traffic Growth

More Server Capacity
– How to Deal with “Hot-Spots”?

Site Replication
– Only Benefits Replicated Servers

Our Approach

Tame the Chaotic Info. Streams
– Saving Redundant Data Transfers

Unite the Individual Users
– Sharing Findings and Efforts of Each Other

Warehouse-Mediated Content Delivery

Direct Delivery
– QoS: Server, Network Overloaded
– Personalized Services Unrealistic
– Information Hunting Difficult

[Figure: direct delivery; label: Internet]

Indirect Content Delivery

[Figure: Web Warehouse diagram. Labels: WWW, Input, Resource Discovery, Buffering, Transformation, Analysis, Notification, Storage, Output, Clustering, Searching, Navigation, Filtering]

Community-Oriented Web Warehousing

The Community of Users
– People with Special Information Needs/Interests

[Figure labels: Sharing, Contribution]

Examples of User Communities

Sports Fans
Patients
Businessmen
Researchers

Real/Cyber Communities

(a) Real Communities: Dependent on Location

(b) Cyber Communities: Independent of Location

Technical Issues

Functions of a Web Warehouse
Web Caching vs. Web Warehousing
Data Warehousing vs. Web Warehousing
Dynamic Hierarchical Web Warehouses

Functions of a Web Warehouse

Buffering
– Resource Discovery -> Storage -> Reusing

Transformation
1. Transcoding: Format A -> Transform -> Format B
2. Summarizing: Content A -> Transform -> Content B

Content Analysis
– Data/Information -> Analysis -> Knowledge

Notification
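The four functions can be pictured as stages that a discovered document passes through. The following is a minimal sketch under stated assumptions, not the authors' system: the class `WebWarehouse`, its method names, the keyword-subscription scheme, and the example URL are all hypothetical. Summarizing is reduced to truncation, content analysis to a bag of words, and transcoding is left out.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    url: str
    text: str

@dataclass
class WebWarehouse:
    store: dict = field(default_factory=dict)          # buffering: url -> Document
    subscriptions: dict = field(default_factory=dict)  # keyword -> list of users
    outbox: list = field(default_factory=list)         # pending (user, url) notices

    def buffer(self, doc):
        """Buffering: keep a discovered document for later reuse."""
        self.store[doc.url] = doc

    def summarize(self, url, max_words=5):
        """Transformation (summarizing): here, naive truncation."""
        return " ".join(self.store[url].text.split()[:max_words])

    def analyze(self, url):
        """Content analysis: here, just the set of lowercase words."""
        return {w.lower().strip(".,") for w in self.store[url].text.split()}

    def subscribe(self, user, keyword):
        self.subscriptions.setdefault(keyword.lower(), []).append(user)

    def ingest(self, doc):
        """Buffer a document, analyze it, and queue notices for interested users."""
        self.buffer(doc)
        found = self.analyze(doc.url)
        for keyword, users in self.subscriptions.items():
            if keyword in found:
                self.outbox.extend((user, doc.url) for user in users)

url = "http://example.org/a"
wh = WebWarehouse()
wh.subscribe("alice", "tennis")
wh.ingest(Document(url, "Tennis results from the open tournament today."))
summary = wh.summarize(url, max_words=3)
```

Keeping each function as a separate method mirrors the slide's separation of concerns: a stored document can be summarized or analyzed again later without another fetch.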

Research Program

[Figure labels: Web Caching, Content Analysis, Transformation, Warehousing]

From Web Caching to Web Warehousing

              Web Caching     Web Warehousing
Object        Data            Information
Objective     Reusing         Sharing
Storage       Bounded         Bound-Free
Population    Responses       Web View
Model         FS Dependent    Hypermedia

From Data Warehousing to Web Warehousing

   Items         Data WH                 Web WH
1  Objective     Decision Support        Information Sharing
2  Model         RDB/OORDB               Hypermedia
3  Population    View Materialization    Resource Discovery, Content Localization
4  Resource      Operational Data        Web Documents
5  Data Type     Structured              Semi-/Un-structured
6  Tie to Web    DWH Web                 WWH Web

Warehouse as Shared Information Repository

Real Communities
– Centralized Management of Warehouses
– Unicast Data Transfer

Cyber Communities
– Distributed Management of Warehouses
– Multicast Data Transfer

Hierarchy of Web Warehouses

[Figure: hierarchy of warehouses. Labels: HP Design; Sports, with Skiing and Tennis; member lists “Mr. A, Ms. C, Mrs. D …” and “Mr. A, Mr. D …”]

Dynamic Formation of Web Warehouses (Split)

[Figure: a Sports warehouse used by A and B splits into separate Tennis and Skiing warehouses]
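One way to picture the split is as a partition of a parent warehouse's members and documents by subtopic. The sketch below is an assumption-laden illustration, not the authors' mechanism: the `Warehouse` fields, the `split` function, the per-user interest sets, and the URLs are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Warehouse:
    topic: str
    members: dict = field(default_factory=dict)    # user -> set of subtopics of interest
    documents: dict = field(default_factory=dict)  # url -> subtopic
    children: list = field(default_factory=list)

def split(parent, subtopics):
    """Create one child warehouse per subtopic and hand over the matching
    members and documents (hypothetical policy)."""
    for sub in subtopics:
        child = Warehouse(topic=sub)
        child.members = {
            user: {sub}
            for user, interests in parent.members.items()
            if sub in interests
        }
        child.documents = {
            url: tag for url, tag in parent.documents.items() if tag == sub
        }
        parent.children.append(child)
    return parent.children

sports = Warehouse(
    "Sports",
    members={"A": {"Tennis"}, "B": {"Skiing"}},
    documents={"http://example.org/t1": "Tennis", "http://example.org/s1": "Skiing"},
)
tennis, skiing = split(sports, ["Tennis", "Skiing"])
```

Keeping the children attached to the parent preserves the hierarchy, so a user interested in both subtopics could still be served from the Sports level.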

Dynamic Formation of Web Warehouses (Union)

[Figure: the Painting and Drawing warehouses, used by A and B, merge into a single Painting & Drawing warehouse]
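The union can be sketched as a merge of member sets and document sets, so that overlapping documents are stored once. This is a hypothetical illustration: the `Warehouse` fields, the `union` function, and the URLs are assumptions, and the talk does not state when a union should be triggered.

```python
from dataclasses import dataclass, field

@dataclass
class Warehouse:
    topic: str
    members: set = field(default_factory=set)
    documents: set = field(default_factory=set)

def union(a, b):
    """Merge two warehouses into one that serves both user groups."""
    return Warehouse(
        topic=f"{a.topic} & {b.topic}",
        members=a.members | b.members,
        documents=a.documents | b.documents,  # shared URLs are kept only once
    )

painting = Warehouse(
    "Painting", {"A"}, {"http://example.org/color", "http://example.org/sketch"}
)
drawing = Warehouse(
    "Drawing", {"B"}, {"http://example.org/sketch", "http://example.org/pencil"}
)
merged = union(painting, drawing)
```

Here the two warehouses hold four document entries between them, but the merged warehouse stores three, since one URL is common to both.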

Current Status: Content-Sensitive Caching

[Figure labels: Web Caching, Content-Sensitive Caching, Warehousing]

Content-Sensitive Cache Replacement Policy

Cache Replacement: Keep? Replace?

Traditional Caching
– Long Time Observation -> Replacement Decision
– 60% One-Access Objects: How to Differentiate?

Content-Sensitive Caching
– LRU-SP+
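To see why one-access objects are a problem for a purely recency-based policy, consider a plain LRU cache, sketched below. This is a textbook LRU, not LRU-SP+; the class name, the `evicted_unused` counter, and the request trace are made up for illustration.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache that also counts objects evicted without ever
    being requested again while cached (one-access objects)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hits_per_object = OrderedDict()  # url -> hits since insertion
        self.hits = 0
        self.evicted_unused = 0

    def request(self, url):
        """Return True on a cache hit, False on a miss."""
        if url in self.hits_per_object:
            self.hits += 1
            self.hits_per_object[url] += 1
            self.hits_per_object.move_to_end(url)  # mark as most recently used
            return True
        if len(self.hits_per_object) >= self.capacity:
            # Evict the least recently used entry.
            _, hit_count = self.hits_per_object.popitem(last=False)
            if hit_count == 0:
                self.evicted_unused += 1
        self.hits_per_object[url] = 0
        return False

cache = LRUCache(capacity=2)
trace = ["a", "x1", "a", "x2", "x3", "a"]
results = [cache.request(url) for url in trace]
```

In this trace the one-access objects `x2` and `x3` push the repeatedly requested object `a` out of the cache, so its third request misses. LRU cannot tell the two kinds apart on first sight because it has no evidence beyond past accesses.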

LRU-SP+: Content-Sensitive Size-Adjusted & Popularity-Aware LRU

Daily Indexing: Cache Content -> Indices

Indices -> Popular Topics

How Similar? New Document <-> Popular Topics

Benefit/Size Model: “Observed” Pop. + “Inherent” Pop.

Implement this Model
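The slide does not give the scoring formula, so the sketch below is only one plausible reading of a benefit/size model. It takes observed popularity as the hit count, estimates inherent popularity as the Jaccard similarity between a document's terms and a set of popular-topic terms, and evicts the object with the lowest benefit per byte. The function names, the `weight` parameter, the use of Jaccard similarity, and the sample data are assumptions and should not be taken as the authors' LRU-SP+ definition.

```python
import re

def terms(text):
    """Lowercase alphabetic terms of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def inherent_popularity(doc_text, popular_topics):
    """Jaccard similarity between a document's terms and popular-topic terms."""
    doc_terms = terms(doc_text)
    if not doc_terms or not popular_topics:
        return 0.0
    return len(doc_terms & popular_topics) / len(doc_terms | popular_topics)

def benefit_per_size(observed_hits, doc_text, size_bytes, popular_topics, weight=1.0):
    """Hypothetical score: (observed + weight * inherent popularity) / size."""
    benefit = observed_hits + weight * inherent_popularity(doc_text, popular_topics)
    return benefit / size_bytes

def choose_victim(cache, popular_topics):
    """Return the URL with the lowest benefit/size score."""
    return min(
        cache,
        key=lambda url: benefit_per_size(
            cache[url]["hits"], cache[url]["text"], cache[url]["size"], popular_topics
        ),
    )

popular = {"tennis", "open", "final"}  # e.g. derived from a daily cache index
cache = {
    "u1": {"hits": 0, "text": "tennis open final", "size": 1000},
    "u2": {"hits": 0, "text": "garden tools sale", "size": 1000},
    "u3": {"hits": 3, "text": "weather report", "size": 1000},
}
victim = choose_victim(cache, popular)
```

Both `u1` and `u2` are new and unreferenced, but `u1` matches the popular topics and so scores higher. The policy evicts `u2`, which is the distinction a purely access-based policy cannot make for objects seen only once.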

Related Work

LSAM’s Proxy Cache (Push)
– Multicast-Based Virtual Cache
– Affinity Groups and Push Channels

INTELSAT’s Wormhole Content Delivery
– Warehouse-Kiosk Model
– Satellite-Based Delivery Platform

Concluding Remarks

Proposed to Cope with the Scaling Problems by Web Warehouse-Mediated Content Delivery

Discussed the Basic Functions of a Web Warehouse: Buffering, Transformation, Notification and Content Analysis

Introduced our Current Work: Warehouse-Enhanced Web Caching