
Post on 22-Dec-2015


RACE: Time Series Compression with Rate Adaptivity and Error Bound for Sensor Networks

Huamin Chen, Jian Li, and Prasant Mohapatra

Presenter: Jian Li

Networks Lab @ UC Davis [email protected]

Agenda

Motivation

Background

RACE Algorithm

Numerical Evaluation

Conclusion

Motivation

Sensor networks:

Limited energy source

Limited link bandwidth, possibly time-varying

Monitoring process:

Continuous data generation and dissemination

Data rate may be large and time-varying

How to disseminate efficiently?

Compression and aggregation

Data Quality: Impact factors

Sampling frequency

Number of sampling nodes

Data dissemination:

Compression

Aggregation

Why Compress?

How to get a “properly small” data rate?

Lower the sampling frequency

Reduce the number of sensors

Lossy/lossless compression

A lower sampling frequency is not equivalent to (lossy) compression of higher-precision raw data. E.g., can detailed features along the timeline be retained?

Lossy compression can adapt to various link constraints.

But what about the error bound?

Volatile physical process:

The data rate of a time series can vary over a large range

Different compressibility at different time instances

Given a target output data rate, lossy compression cannot guarantee an error bound

Consistency of data quality?

Multihop network transmission

Multiple time series compression

So, our goal is …

Adaptive compression:

Compress time series into a CBR/LBR flow

Trade-off: network capacity vs. data quality

Improve data quality:

Exploit different compressibility along the timeline to achieve a certain error bound

Consistency of data quality across the compression of multiple time series

Error norm of time series

Data Quality: Error Norm

Normalized data element

Normalized data error

ei =

Haar Wavelet Transformation

Compute neighboring elements’ average and difference:

Average: trend of the time series

Difference: details of the time series

An example: for the original time series [2, 6, 5, 11], the transformation output is [6, -2, -2, -3].
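The averaging and differencing steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `haar_transform` is hypothetical, and it assumes the convention implied by the example above, where each pair (a, b) yields the average (a + b) / 2 and the difference (a - b) / 2, with the input length a power of two.

```python
def haar_transform(series):
    """Return [overall average, detail coefficients from coarse to fine].

    Assumes len(series) is a power of two (an assumption of this sketch).
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(series)
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        level_details = [(a - b) / 2 for a, b in pairs]  # details of this level
        averages = [(a + b) / 2 for a, b in pairs]       # trend passed upward
        details = level_details + details                # coarser levels go first
    return averages + details


print(haar_transform([2, 6, 5, 11]))  # [6.0, -2.0, -2.0, -3.0]
```

Applied to the 16-element series used later in the deck, the same function yields the coefficient list quoted on the wavelet coefficient tree slide.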

Wavelet coefficient tree

Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9]

Output coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]

Data Element Reconstruction

and C_j is an individual coefficient.

Reconstruction: example

Calculation: +(5.25) +(0) -(-2.25) +(-0.5) +(-1) = 6
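The calculation above walks one root-to-leaf path of the coefficient tree. A small Python sketch of that walk follows. The sign rule is inferred from the example (add a coefficient when the element lies in the left half of the range it covers, subtract when it lies in the right half), and the array layout (node j has children 2j and 2j + 1, index 0 holds the overall average) is an assumption of this sketch rather than something stated on the slides. `reconstruct_element` is a hypothetical name.

```python
def reconstruct_element(coeffs, i):
    """Rebuild data element i from the coefficients on its root-to-leaf path."""
    n = len(coeffs)
    value = coeffs[0]        # overall average always enters with a plus sign
    node, lo, hi = 1, 0, n   # `node` covers data positions [lo, hi)
    while node < n:
        mid = (lo + hi) // 2
        if i < mid:          # element in the left half: add the detail
            value += coeffs[node]
            node, hi = 2 * node, mid
        else:                # element in the right half: subtract the detail
            value -= coeffs[node]
            node, lo = 2 * node + 1, mid
    return value


coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
print(reconstruct_element(coeffs, 4))  # 6.0, the fifth element of the series
```

Only log2(n) + 1 coefficients are touched per element, which is why cutting a subtree affects only the elements beneath it.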

Magnitude-based zeroing

Given a threshold a, if the magnitude of a coefficient C_j is below a (|C_j| < a), then this coefficient leaf is cut off and does not participate in the reconstruction process.
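As a rough illustration of the rule above, the sketch below zeroes every detail coefficient whose magnitude falls below the threshold. Two choices here are assumptions, not taken from the slides: the comparison uses the absolute value, and the overall average at index 0 is always kept. `magnitude_zero` is a hypothetical name.

```python
def magnitude_zero(coeffs, threshold):
    """Zero detail coefficients whose magnitude is below `threshold`."""
    kept = [coeffs[0]]  # assumption: the overall average is never dropped
    kept += [c if abs(c) >= threshold else 0 for c in coeffs[1:]]
    return kept


coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
print(magnitude_zero(coeffs, 1))
```

Note that this rule looks at each coefficient in isolation, so it says nothing about the error that results on the reconstructed elements.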

RACE Algorithm

Generating gradient error tree

Error-based zeroing (i.e., compression process)

Smoothing error bound via patching process

Gradient Error Tree

Gradient error G(V):

V is a coefficient in the wavelet coefficient tree

G(V) is defined as the maximum error incurred when the subtree rooted at node V is cut off

Gradient error tree:

Computed from the corresponding wavelet coefficient tree
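The slide's formula for G(V) did not survive in this transcript, so the sketch below evaluates the verbal definition directly: for each node it collects the signed contribution of the subtree to every data element beneath it, and takes the largest magnitude. It uses absolute (unnormalized) error, the same array layout as before (children of node j at 2j and 2j + 1), and hypothetical function names; all of these are assumptions of the sketch, which may differ from the authors' formulation.

```python
def subtree_contributions(coeffs, node):
    """Signed amount the subtree at `node` adds to each data element it covers."""
    c = coeffs[node]
    if 2 * node >= len(coeffs):          # finest level: covers two elements
        return [c, -c]
    left = subtree_contributions(coeffs, 2 * node)
    right = subtree_contributions(coeffs, 2 * node + 1)
    return [c + x for x in left] + [-c + x for x in right]


def gradient_errors(coeffs):
    """Map each detail node to the max error caused by cutting its subtree."""
    return {node: max(abs(x) for x in subtree_contributions(coeffs, node))
            for node in range(1, len(coeffs))}


coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
G = gradient_errors(coeffs)
print(G[2], G[3])  # 3.75 4.75
```

Cutting a subtree removes exactly these contributions from the reconstruction, so the largest magnitude is the worst-case error on any element under that node.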

Gradient Error Tree: an example

Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9]

Coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]

Error-based zeroing

Using the error bound as the threshold value, apply magnitude-based zeroing to the wavelet coefficient tree, with the magnitudes taken from the gradient error tree

Use the symbol “t” to represent a zeroed subtree

Error-based zeroing

Example: threshold = 2 results in 8 symbols to encode

Error-based zeroing

Example: threshold = 4 results in 6 symbols to encode
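The two threshold examples can be reproduced with a short top-down traversal. The sketch below is one plausible reading of the procedure, not the authors' implementation: a subtree is replaced by the symbol "t" when the maximum error of cutting it does not exceed the threshold, the overall average is always emitted, and symbols are listed in pre-order. The `<=` comparison, the symbol order, and the function names are assumptions; the error measure is the absolute error evaluated directly from the verbal definition of G(V).

```python
def max_cut_error(coeffs, node):
    """Largest absolute error on any data element if the subtree is removed."""
    def contributions(j):
        c = coeffs[j]
        if 2 * j >= len(coeffs):
            return [c, -c]
        return ([c + x for x in contributions(2 * j)] +
                [-c + x for x in contributions(2 * j + 1)])
    return max(abs(x) for x in contributions(node))


def error_based_zeroing(coeffs, threshold):
    """Return the symbol list: kept coefficients plus 't' per zeroed subtree."""
    symbols = [coeffs[0]]                # overall average is always kept

    def visit(node):
        if max_cut_error(coeffs, node) <= threshold:
            symbols.append("t")          # whole subtree collapses to one symbol
            return
        symbols.append(coeffs[node])
        if 2 * node < len(coeffs):
            visit(2 * node)
            visit(2 * node + 1)

    if len(coeffs) > 1:
        visit(1)
    return symbols


coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
print(len(error_based_zeroing(coeffs, 2)))  # 8
print(len(error_based_zeroing(coeffs, 4)))  # 6
```

With the example coefficients, thresholds 2 and 4 give 8 and 6 symbols, in agreement with the counts quoted in the examples. Because zeroed subtrees never overlap, each data element is affected by at most one cut, so its error stays within the threshold.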

Important Properties

Error bound additivity:

Multihop network transmission

Multiple time series aggregation

Patch-ability:

Exploiting the varying compressibility of the input stream along the timeline

Smoothing the error range of the output stream

Numerical evaluation

Data set:

Real-world data from the TAO project (http://www.pmel.noaa.gov/tao)

Including air temperature and subsurface temperature at different depths

Air temperature characteristics

Adaptive Compression: Max normalized error

Adaptive Compression: Smoothed max normalized error

Preservation of statistical interpretation

How well is the multivariate correlation preserved?

The cross-correlation between variables x and y is defined as:

where d is the temporal delay between x and y.
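The slide's cross-correlation formula is not preserved in this transcript. As a stand-in, the sketch below uses a common normalized form: x[i] is paired with y[i - d], both series are centered on their means, and the sum is divided by the product of the two full-series deviation norms. This choice of normalization, the handling of the non-overlapping ends (they are skipped), and the name `cross_correlation` are all assumptions and may differ from what the authors used.

```python
from math import sqrt


def cross_correlation(x, y, d=0):
    """Normalized cross-correlation of x[i] with y[i - d] at delay d."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Only indices where both x[i] and y[i - d] exist contribute.
    numerator = sum((x[i] - mean_x) * (y[i - d] - mean_y)
                    for i in range(n) if 0 <= i - d < n)
    norm_x = sqrt(sum((v - mean_x) ** 2 for v in x))
    norm_y = sqrt(sum((v - mean_y) ** 2 for v in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / (norm_x * norm_y)


series = [3, 4, 3, 2, 6, 8, 9, 7]
print(cross_correlation(series, series, 0))  # close to 1 for identical series
```

To check how well compression preserves the relationship between two variables, one could compare this value at several delays d before and after compressing both series.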

Data sets

Subsurface temperatures at depths 25m and 50m

Cross-correlation under different compression ratios

Conclusion

Rate adaptive compression scheme

Improves the error bound, achieving a soft guarantee

Preservation of multivariate correlation