icps 2006, lyon, france, june 26, 2006 a decentralized content-based aggregation service for...

28
ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content- based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish Parashar The Applied Software Systems Laboratory Rutgers, The State University of New Jersey Piscataway, NJ USA ICPS 2006, Lyon, France

Upload: myra-york

Post on 06-Jan-2018

215 views

Category:

Documents


0 download

DESCRIPTION

ICPS 2006, Lyon, France, June 26, Motivation Ubiquity of devices with embedded computing and communications capabilities Emergence of pervasive Grid environments and applications Realize information-driven, context-aware, computationally intensive end-to-end distributed applications System, application, information uncertainty! Programming abstractions and middleware services Scalable, self-managing, adaptive Based on content Asynchronous and decoupled (opportunistic) interactions Efficient and flexible information access, aggregation, assimilation

TRANSCRIPT

Page 1: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26,

2006

A Decentralized Content-based Aggregation Service for Pervasive Environments

Nanyan Jiang, Cristina Schmidt, Manish ParasharThe Applied Software Systems Laboratory

Rutgers, The State University of New JerseyPiscataway, NJ USA

ICPS 2006, Lyon, France

Page 2: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

2

Outline Motivation, requirements, challenges Meteor: Content-based

interaction/messaging middleware Content-based decentralized aggregation

service Deployment and evaluation Related work Conclusion

Page 3: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

3

Motivation Ubiquity of devices with embedded computing and

communications capabilities Emergence of pervasive Grid environments and

applications Realize information-driven, context-aware, computationally

intensive end-to-end distributed applications System, application, information uncertainty! Programming abstractions and middleware services

Scalable, self-managing, adaptive Based on content Asynchronous and decoupled (opportunistic) interactions Efficient and flexible information access, aggregation,

assimilation

Page 4: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

5

Data Aggregation and Assimilation in Pervasive Grid Environments Large volumes of heterogeneous and

continuous data streams Centralized, batch-processing solutions not

feasible Dynamic system behavior

Availability, quality of data Dynamic application requirements

Page 5: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

6

Project Meteor: Middleware Stack A self-organizing tiered overlay

network Bottom tier is a structured overlay, e.g.,

Chord, CAN, etc. Content-based routing engine (Squid)

Flexible content-based routing and querying with guarantees and bounded costs

Decentralized information discovery Associative Rendezvous Messaging

Symmetric post programming primitive Content-based decoupled interaction

with programmable reactive behavior Decentralized content-based in-

network aggregation service

AR messaging

Content-based Routing

Self-organizing Overlay

Pervasive Grid Environment

Pervasive Application

Agg

rega

tion

Page 6: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

7

Associative Rendezvous (AR)

Content-based decoupled interactions: All interactions are based on content, rather than names or

addresses The participants (e.g. senders and receivers) communicate

through an intermediary, the rendezvous point The communication is asynchronous. The participants can be

decoupled both in space and time. Programmable reactive behaviors:

The reactive behaviors at the rendezvous points are encapsulated within messages => flexibility, expressivity, and multiple interaction semantics

Page 7: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

8

The Semantics of Associative Rendezvous Interactions

Messages: (header, action, data) Symmetric post primitive: does not

differentiating between interest/data Associative selection

match between interest and data profiles

Reactive behavior Execute action field upon matching

profile credentials

message context

TTL (time to live)

Action

[Data]

Header

• store• retrieve• notify• delete

Profile = list of (attribute, value) pairs:Example: <(sensor_type, temperature), (latitude, 10), (longitude, 20)>

C1

C2

post (<p1, p2>, store, data)

notify_data(C2)

notify(C2)

(1)

(3)

(2) post(<p1, *>, notify_data(C2) )

<p1, p2> <p1, *>

match

Page 8: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

11

Content-based Aggregation Services Extends the AR programming abstraction to enable content-

based aggregation services AR post(<L, speed, 3600>, retrieve(AVERAGE, 300))

Region L: (35-60, 40-80) find the average speed in the stretch of road specified by region

L every 5 minutes for the next 1 hour

Compute the traffic bottleneck along Route 95 from East Brunswick to NYC in the next one hour

Estimate the time needed to enter NYC

1. Anyone exceed speed limit2. Where is the maximum speed on road3. Minimum speed on road and where

Page 9: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

12

Self-organizing Overlay (Chord) A self-organizing P2P ring overlay Nodes and data have unique identifiers (keys), from a circular key

space (0 to 2m) Each node maintains a routing table, called “finger table” A key is stored at the first node whose identifier is greater or equal

with the key The request is routed to the neighbor node closer to the destination Routes in O(log n) hops

5

1

2

3

4

6

7

0

Finger = the successor of (this node id + 2i-1) mod 2m, 0 <= i <= m

3 + 1 53 + 2 5 3 + 4 0

1 + 1 31 + 2 31 + 4 5

0 + 1 10 + 2 3 0 + 4 5

Routing from node 1 to node 6

Page 10: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

13

SquidTON: Reliability and Fault Tolerance Pervasive Grid systems are dynamic, with nodes joining, leaving

and failing relatively often => data loss and temporarily inconsistent overlay structure => the system cannot offer guarantees

Build redundancy into the overlay network Replicate the data

SquidTON = Squid Two-tier Overlay Network Consecutive nodes form unstructured groups, and at the same time are

connected by a global structured overlay (e.g. Chord) Data is replicated in the group

3325

22

82

86

9194

128132150

157

141

231

249240

185

192

1710

Groups of nodes Group identifier

250231

249 240

185192

128

132

150

157141

161

8286

9194

3325

22

1710

102

41

Page 11: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

14

Content-based Routing (Squid)

Route based on semantic headers Use a dimension-reducing mappings to map from

multidimensional information space => 1-dimensional index space can route based on keywords, partial keywords, wildcards and

ranges Builds on an underlying DHT lookup mechanism

Offers guarantees: all destinations matched by the content will be identified with bounded costs (messages, intermediate nodes, etc.)

Semantic Header(kw1, kw2, …, kwD)

SFCPoint in a D-dimensional space

Point in a 1-dimensional index space

Rendezvous Peers

(P1, P2, …Pk, …)

Page 12: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

15

Squid – Definitions Keyword tuple (used to specify the destination)

List of d keywords, wildcards and/or ranges Example:

(temperature, celsius), (temp*, *), (*, 10, 20-25)

Simple keyword tuple Contains only complete keywords

Complex keyword tuple Contains wildcards and/or ranges

Sensor type

Celsius

Simple keyword tuple

Temperature

Uni

ts

Latitude

Sens

or ty

pe

Longitude

2025

10

Complex keyword tuple

Page 13: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

16

Content Indexing: Hilbert SFC f: Nd N, recursive generation

Properties: Digital causality Locality preserving Clustering

1

0

10 00 01 10 11

11

10

01

0000 11

01 10

0000

0010

0001

0011

0100

01010110

011110001001

101010111100

11111110

1101

Cluster: group of cells connected by a segment of the curve

Page 14: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

17

Content Indexing and Routing

4

4 70

3

Generate the clusters associated with the content profile

altit

ude

longitude

0

13

33

47

51

40

(*, 4)Content profile 3:

Rout to the nodes that store the clusters

Content profile 2

Content profile 3

(4-7,0-3)Content profile 2:

Matching messages

Send the results to the requesting node

2

1 7

7

(2,1)Content profile 1:

Content profile1

Page 15: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

18

Squid: Routing Optimization

More than one cluster are typically stored at a node

Not all clusters that are generated for a query exist in the network => optimize!

SFC generation recursive => clusters generation is recursive => the process of cluster generation can be viewed as a prefix tree (trie)

Optimization: embed the tree into the overlay, and prune nodes during the construction phase

Page 16: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

19

Decentralized In-network Aggregation Post AR messages with aggregation behaviors in the system

Content indexing: from n-dimension to 1-dimension using Hilbert SFC Content-based routing: prefix trie construction Decentralized in-network aggregation

Use aggregation tries for back-propagating and aggregating matching data items

Post (<p1, p2, …, pN>, action (aggrBehavior))

Area in the N-dimensional semantic space

Segments in the 1-dimensional index space

Matched data itemsTrie-based In-network

AggregationTrie-based Content Routing

Indexing using SFC

Trie constructionBack-propagation of matched data

Page 17: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

20

Trie Construction111

110

101

100

011

010

001

000

speed

Long

itude

0

1

1100

01 10

00

01

01

10

11

0000 0001

0011

0100

0101 0110

0111 1000

1001 1010

1011

11001101

1110 1111

0010

post(<011, 010-110>, retrieve (AVG, 300))

000101

011011

011111001010

0010*

0*

011*

011100

0111*

011011

011100

011111

001011

001010

Recursive resolution of message profile to construct prefix trie

Page 18: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

21

Trie-based Content Routing

Embed the tree into overlay network Node 100001 is responsible for storing the subtree

routed at 011*

000000

000100

001001

001111

100001

1110000*

00

011*0010*

0110110111*

001010001001

Page 19: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

22

Trie-based In-Network Aggregation (I)

AR message was issued at node 111000 First recursion level results in clusters with prefix 0 At node 000000, further level recursion, the cluster 011 and 0010 are

identified and padded with 0, for example; Node receive the sub-queries generate next level recursion, …

1110000*

<profile_entry> <query>(011,010-110)</query> <TTL>20</TTL> <source>111000</source> <prefix_list> <prefix>0</prefix> <prefix>0010</prefix> <prefix_list><profile_entry>

000000

001001

0010*

100001

001111001010

011*

<profile_entry> ...... <prefix_list> <prefix>0</prefix> <prefix>0010</prefix> <prefix>001010</prefix> <prefix_list><profile_entry>

Page 20: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

23

Trie-based In-Network aggregation (II)

Caching to improve latencies Managing system dynamics

Joins, Leaves, Fails Load balancing Building geographical awareness into the overlay

1110000*

<message> <profile>(011,010-110)</profile> ...... <aggr_val>...<aggr_value> <source>111000</source> <prefix_list> <prefix>0</prefix> <prefix>0010</prefix> <prefix_list><message>

000000

001001

0010*

100001

001111001010

011*

<profile_entry> ...... <source>111000</source><profile_entry>

Page 21: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

24

Miscellaneous Issues and Optimizations Varied Link Latencies Caching to reduce routing latencies Load balacing Addressing churn

Join, leave Failures

Leaf peers, intermediate peers, source peer Geographical locality (to an extent)

Page 22: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

25

Meteor: Implementation Overview Current implementation builds on JXTA Chord, Squid and AR Messaging layer are

implemented as event-driven JXTA services Deployments

Local area network 64 1.6 GHz Pentium IV nodes with 100 Mbps Ethernet

interconnection network Orbit sensor network testbed

400 802.11 radio nodes with 802.11 wireless connections PlanetLab wide-area testbed

Page 23: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

26

Experimental Evaluation Aggregation services:

Complex queries issued from each of the three deployment environments

Post(34-444,temp*, retrieve(count)) Effectiveness of in-network aggregation

Aggregate range queries Number of messages processed in the network

Robustness in case of single peer failures

Note that the performance of Squid and AR have been separately evaluated

Page 24: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

27

Scalability of the Aggregation Services

Scalability of Meteor aggregation queries

0

500

1000

1500

2000

2500

3000

4 8 16 32 64

Number of peers

time

(ms) Wired LAN

Orbit Wireless Testbed

Wide-area PlanetLab

Aggregate query: post(<(100-300, 75-125), temperature>, retrieve(avg, 300))

Page 25: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

28

Effectiveness of In-Network Aggregation Simulation results

500 nodes AR aggregate message:

Post(<100-230, 35-120>, retrieve(avg))

Fig (a) number of messages with/without in-network aggregation

Fig (b) actual number of messages per peer

Page 26: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

29

Robustness of Aggregation Service

0

10

20

30

40

50

60

70

Time (milliseconds)

Num

ber

of in

clud

ed r

esul

ts

Time to aggregate all

Single node failure

Aggregate trie reconstruction

Reconstructed trie Assume single node failure An intermediate node fails

at about 205 time ticks On-going aggregation

operations are lost at this node

About 100 ms required to correct the aggregation trie in a LAN environment

Page 27: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

30

Related works Aggregation in P2P environments

Astrolabe, Cone, PHT Aggregation in sensor networks

Cougar, TAG Meteor

Content-based aggregation abstraction Asynchronous and decoupled

No persistent state maintained in the network Trie reused across aggregations Guarantees

Page 28: ICPS 2006, Lyon, France, June 26, 2006 A Decentralized Content-based Aggregation Service for Pervasive Environments Nanyan Jiang, Cristina Schmidt, Manish

ICPS 2006, Lyon, France, June 26, 2006

31

Summary Content-based, decoupled interactions and

aggregations are essential for pervasive Grid applications

Meteor provides unified programming abstractions for content-based decoupled interactions and decentralized in-network aggregation

Recurrent and spatially constrained queries Scalable and efficient trie-based design and implementation

Deployments on LAN, Orbit and PlanetLab Scalability, performance, robustness