RACE: Time Series Compression with Rate Adaptivity and Error Bound for Sensor Networks
Huamin Chen, Jian Li, and Prasant Mohapatra
Presenter: Jian Li
Networks Lab @ UC Davis [email protected]
Agenda
Motivation
Background
RACE Algorithm
Numerical Evaluation
Conclusion
Motivation
Sensor networks: limited energy source
Limited link bandwidth, which may be time-varying
Monitoring process: continuous data generation and dissemination
Data rate may be large, and time-varying
How to disseminate efficiently? Compression and aggregation
Data Quality: Impact factors
Sampling frequency
Number of sampling nodes
Data dissemination: compression
Aggregation
Why Compress?
How to get a “properly small” data rate?
Lower the sampling frequency
Reduce the number of sensors
Lossy/lossless compression
A low sampling frequency is not equivalent to (lossy) compression of higher-precision raw data. E.g.: can detailed features along the timeline be retained?
Lossy compression can adapt to varying link constraints.
But, how about Error Bound?
Volatile physical processes: the data rate of a time series can vary over a large range
Different compressibility at different time instants
Lossy compression cannot guarantee an error bound, given a target output data rate
Consistency of data quality? Multihop network transmission
Multiple time series compression
So, Our goal is …
Adaptive compression: compress a time series into a CBR/LBR flow
Trade-off: network capacity vs. data quality
Improve data quality: exploit different compressibility along the timeline to achieve a certain error bound
Consistent data quality across multiple compressed time series
Data Quality: Error Norm
Error norm of a time series: aggregates the per-element errors defined below
Normalized data element: each element is scaled into a common range, so that different series are comparable
Normalized data error: e_i = |d_i − d̂_i| over the normalized elements, where d̂_i is the value reconstructed after compression
Haar Wavelet Transformation
Compute neighboring elements’ averages and differences
Average: trend of the time series
Difference: details of the time series
An example: for the original time series [2, 6, 5, 11], the transformation output is [6, -2, -2, -3].
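The averaging/differencing pass can be sketched in a few lines of Python (a minimal illustration; the `haar` helper name is hypothetical, not from the paper):

```python
def haar(data):
    """One-dimensional Haar transform for a power-of-two-length series.

    Repeatedly replaces the series with pairwise averages, collecting
    pairwise half-differences as detail coefficients. Output order:
    [overall average, coarsest difference, ..., finest differences].
    """
    coeffs = []
    while len(data) > 1:
        avgs = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        coeffs = diffs + coeffs          # coarser levels end up first
        data = avgs
    return data + coeffs                 # data now holds the overall average

print(haar([2, 6, 5, 11]))  # [6.0, -2.0, -2.0, -3.0]
```

Running it on the slide's example reproduces [6, -2, -2, -3]: the first pass yields averages [4, 8] and differences [-2, -3]; the second pass yields the overall average 6 and the top-level difference -2.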
Wavelet coefficient tree
Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9]
Output coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
Data Element Reconstruction
A data element is reconstructed by walking the coefficient tree from the root to the element’s leaf, adding a coefficient when the element lies in the left subtree and subtracting it when the element lies in the right subtree:
d_i = Σ ±C_j over the coefficients C_j on the root-to-leaf path of element i (+ for a left branch, − for a right branch),
and C_j is an individual coefficient.
Reconstruction: example
Calculation: +(5.25) + (0) − (−2.25) + (−0.5) + (−1) = 6 (the fifth element of the series)
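The same signed-path rule, applied to every leaf at once, is the full inverse transform. A minimal sketch (the `inverse_haar` helper name is hypothetical, and the coefficient ordering is the one used in the examples above):

```python
def inverse_haar(coeffs):
    """Invert the Haar transform: start from the overall average and,
    level by level, split each average a with detail d into (a + d, a - d)."""
    data = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        pos += len(data)
        data = [v for a, d in zip(data, diffs) for v in (a + d, a - d)]
    return data

coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
series = inverse_haar(coeffs)
# The fifth element retraces the slide's path:
# 5.25 + 0 - (-2.25) + (-0.5) + (-1) = 6
print(series[4])  # 6.0
```

With no coefficients dropped, this recovers the original 16-element series exactly.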
Magnitude-based zeroing
Given a threshold a: if |C_j| < a, then this coefficient leaf is cut off
and does not participate in the reconstruction process.
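A minimal sketch of magnitude-based zeroing (the `magnitude_zero` helper name is hypothetical; the overall average is always retained):

```python
def magnitude_zero(coeffs, a):
    """Zero every detail coefficient whose magnitude is below threshold a.
    The overall average (coeffs[0]) is always retained."""
    return coeffs[:1] + [c if abs(c) >= a else 0 for c in coeffs[1:]]

coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
kept = magnitude_zero(coeffs, 2)
print(sum(1 for c in kept if c != 0))  # 3: only 5.25, -2.25, -3.25 survive
```

The zeroed coefficients simply contribute nothing during reconstruction, which is what makes the scheme lossy.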
RACE Algorithm
Generating the gradient error tree
Error-based zeroing (i.e., the compression process)
Smoothing the error bound via a patching process
Gradient Error Tree
Gradient error G(V): V is a coefficient node in the wavelet coefficient tree
G(V) is defined as the maximum error incurred when the subtree rooted at node V is cut off
Gradient error tree: computed from the corresponding wavelet coefficient tree
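One formulation consistent with the cut-off definition above (the recurrence is inferred, not taken verbatim from the paper) computes G bottom-up over the heap-ordered coefficient tree: a node's gradient error is its own magnitude plus the larger of its children's gradient errors.

```python
def gradient_errors(coeffs):
    """Bottom-up gradient errors over a Haar coefficient tree.

    coeffs[0] is the overall average; detail coefficients coeffs[1..n-1]
    form a binary tree in heap order (children of node j are 2j and 2j+1).
    G(j) bounds the reconstruction error incurred if the subtree rooted
    at j is cut off: G(j) = |c_j| + max over children of G(child).
    """
    n = len(coeffs)
    G = [0.0] * n
    for j in range(n - 1, 0, -1):
        kids = [k for k in (2 * j, 2 * j + 1) if k < n]
        G[j] = abs(coeffs[j]) + max((G[k] for k in kids), default=0.0)
    return G

coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
G = gradient_errors(coeffs)
print(G[2], G[3])  # 3.75 4.75: cutting either coarse subtree costs more than 2
```

On the example series this gives G = 1.0 or 1.5 for the four mid-level subtrees and 3.75 / 4.75 for the two coarse ones, which matches the symbol counts on the zeroing slides that follow.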
Gradient Error Tree: an example
Time series: [3, 4, 3, 2, 6, 8, 9, 7, 2, 3, 1, 2, 10, 8, 7, 9]
Coefficients: [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
Error based zeroing
Using the error bound as the threshold value, apply magnitude-style zeroing to the wavelet coefficient tree, guided by the gradient error tree
Use the symbol “t” to represent a zeroed subtree
Error based zeroing
Example: threshold = 2 results in 8 symbols to encode
Error based zeroing
Example: threshold = 4 results in 6 symbols to encode
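Putting the two steps together, error-based zeroing can be sketched as a top-down walk that keeps a coefficient only when its gradient error exceeds the bound, and otherwise emits one “t” symbol for the whole cut subtree. All helper names are hypothetical, and the cut rule (cut when G(V) ≤ bound) is inferred from the symbol counts quoted above:

```python
def gradient_errors(coeffs):
    """G(j) = |c_j| + max child G, over the heap-ordered detail tree."""
    n = len(coeffs)
    G = [0.0] * n
    for j in range(n - 1, 0, -1):
        kids = [k for k in (2 * j, 2 * j + 1) if k < n]
        G[j] = abs(coeffs[j]) + max((G[k] for k in kids), default=0.0)
    return G

def error_zero(coeffs, bound):
    """Encode top-down: keep a coefficient if its gradient error exceeds
    the bound, otherwise emit one 't' symbol for the whole cut subtree."""
    G = gradient_errors(coeffs)
    symbols = [coeffs[0]]                # overall average is always kept
    stack = [1]
    while stack:
        j = stack.pop()
        if j >= len(coeffs):
            continue
        if G[j] <= bound:
            symbols.append('t')          # whole subtree zeroed
        else:
            symbols.append(coeffs[j])
            stack.extend((2 * j, 2 * j + 1))
    return symbols

coeffs = [5.25, 0, -2.25, -3.25, 0.5, -0.5, 0.5, 0.5,
          -0.5, 0.5, -1, 1, -0.5, -0.5, 1, -1]
print(len(error_zero(coeffs, 2)))  # 8 symbols, as on the threshold = 2 slide
print(len(error_zero(coeffs, 4)))  # 6 symbols, as on the threshold = 4 slide
```

With threshold 2, four mid-level subtrees collapse to “t”, leaving 4 coefficients + 4 markers; with threshold 4, the whole left coarse subtree collapses as well, leaving 3 + 3.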
Important Properties
Error-bound additivity: multihop network transmission
Multiple time series aggregation
Patch-ability: exploiting the varying compressibility of the input stream along the timeline
Smoothing the error range of the output stream
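Error-bound additivity follows from the triangle inequality: if one hop introduces error at most e1 and a downstream hop at most e2, the end-to-end error is at most e1 + e2. A toy sketch, with simple quantization standing in for lossy recompression at each hop (not the RACE scheme itself):

```python
def quantize(xs, step):
    """Round each value to a grid of the given step; per-element error <= step / 2."""
    return [round(x / step) * step for x in xs]

data = [3.1, 4.7, 2.2, 6.9, 8.4, 1.3]
hop1 = quantize(data, 0.5)   # first hop, error bound 0.25
hop2 = quantize(hop1, 1.0)   # second hop, further error bound 0.5
worst = max(abs(a - b) for a, b in zip(data, hop2))
print(worst <= 0.25 + 0.5)   # True: the per-hop bounds add end to end
```

The same argument lets intermediate nodes recompress an already-compressed stream while still quoting a total error bound to the sink.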
Numerical evaluation
Data set: real-world data from the TAO project (http://www.pmel.noaa.gov/tao), including air temperature and subsurface temperature at different depths
Air temperature characteristics
Preservation of statistical interpretation: how well is multivariate correlation preserved?
The cross-correlation between variables x and y at delay d is defined as:
r_xy(d) = Σ_i (x_i − x̄)(y_{i+d} − ȳ) / √( Σ_i (x_i − x̄)² · Σ_i (y_{i+d} − ȳ)² )
where d is the temporal delay between x and y.
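That definition transcribes directly into Python (the `cross_corr` helper name is hypothetical):

```python
def cross_corr(x, y, d):
    """Sample cross-correlation between x[i] and y[i + d] at lag d."""
    pairs = [(x[i], y[i + d]) for i in range(len(x)) if 0 <= i + d < len(y)]
    xs, ys = [p[0] for p in pairs], [p[1] for p in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in pairs)
    den = (sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys)) ** 0.5
    return num / den

x = [1, 3, 2, 5, 4, 6]
y = [0] + x[:-1]          # y is x delayed by one step
print(round(cross_corr(x, y, 1), 6))  # 1.0: perfectly correlated at lag 1
```

Comparing r_xy(d) computed on the raw series against the same quantity on the decompressed series measures how well RACE preserves the multivariate structure.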
Conclusion
Rate-adaptive compression scheme
Improved error bound, achieving a soft guarantee
Preservation of multivariate correlation