![Page 1: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/1.jpg)
Model-driven Data Acquisition in Sensor Networks
Amol Deshpande (1,4), Carlos Guestrin (4,2), Sam Madden (4,3), Joe Hellerstein (1,4), Wei Hong (4)
(1) UC Berkeley, (2) Carnegie Mellon University, (3) MIT, (4) Intel Research - Berkeley
![Page 2: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/2.jpg)
Sensor networks and distributed systems
A collection of devices that can sense, actuate, and communicate over a wireless network
Available resources: 4 MHz, 8-bit CPU; 40 Kbps wireless; 3 V battery (lasts days or months)
Sensors for temperature, humidity, pressure, sound, magnetic fields, acceleration, visible and ultraviolet light, etc.
Analogous issues in other distributed systems, including streams and the Internet
![Page 3: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/3.jpg)
Real deployments
Great Duck Island
Redwoods
Precision agriculture
Fabrication monitoring
[Photo: Leach's Storm Petrel]
![Page 4: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/4.jpg)
Example: Intel Berkeley Lab deployment
[Figure: floor plan of the lab (server room, lab, kitchen, copy, elec, phone, quiet, storage, conference, offices) showing the positions of sensor nodes 1 to 54]
![Page 5: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/5.jpg)
Analogy: Sensor net as a database
[Figure: a SQL-style query is posed to TinyDB, which distributes the query to the network and collects the query answer or data at every time step]
Declarative interface: sensor nets are not just for PhDs; decreases deployment time
Data aggregation: can reduce communication
![Page 6: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/6.jpg)
Limitations of existing approach
[Figure: TinyDB distributes a SQL-style query and collects data at every time step; the process is redone every time the query changes]
Query distribution: every node must receive the query
Data collection: every node must wake up at every time step
Data loss is ignored
No quality guarantees
Data inefficient, because correlations are ignored
![Page 7: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/7.jpg)
Sensor net data is correlated
Spatial-temporal correlation
Inter-attribute correlation
Data is not i.i.d., so missing data shouldn't be ignored
Observing one sensor gives information about other sensors (and future values)
Observing one attribute gives information about other attributes
![Page 8: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/8.jpg)
Model-driven data acquisition: overview
[Figure: a SQL-style query with desired confidence is posed against a probabilistic model; the model yields a data gathering plan; the model is conditioned on new observations to give a posterior belief, which is reused for a new query. The plots show probability densities over the range 10 to 30.]
Strengths of model-based data acquisition:
Observe fewer attributes
Exploit correlations
Reuse information between queries
Directly deal with missing data
Answer more complex (probabilistic) queries
![Page 9: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/9.jpg)
Probabilistic models and queries
User’s perspective: Query
SELECT nodeId, temp ± 0.5°C, conf(.95)
FROM sensors
WHERE nodeId in {1..8}

System selects and observes subset of nodes. Observed nodes: {3,6,8}

Query result

| Node | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Temp. (°C) | 17.3 | 18.1 | 17.4 | 16.1 | 19.2 | 21.3 | 17.5 | 16.3 |
| Conf. | 98% | 95% | 100% | 99% | 95% | 100% | 98% | 100% |
![Page 10: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/10.jpg)
Probabilistic models and queries
Joint distribution $P(X_1,\dots,X_n)$, learned from historical data
Probabilistic query. Example: value of $X_2 \pm \epsilon$ with prob. $> 1-\delta$
Prob. below $1-\delta$? Observe attributes
Example: observe $X_1 = 18$, giving $P(X_2 \mid X_1 = 18)$
Higher prob., could answer query
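The conditioning step on this slide can be sketched for a two-variable Gaussian. This is a minimal illustration, not the talk's implementation: the means, variances, covariance, ε, and δ are made-up numbers, and the function names are hypothetical.

```python
import math

def condition_bivariate(mu1, mu2, var1, var2, cov12, x1_obs):
    """Mean and variance of X2 given X1 = x1_obs for a bivariate Gaussian."""
    mean = mu2 + cov12 / var1 * (x1_obs - mu1)
    var = var2 - cov12 ** 2 / var1
    return mean, var

def prob_within(var, eps):
    """P(|X - E[X]| <= eps) for a Gaussian X with the given variance."""
    return math.erf(eps / math.sqrt(2.0 * var))

# Assumed prior over two strongly correlated temperature sensors.
mu1, mu2 = 20.0, 20.0
var1, var2, cov12 = 4.0, 4.0, 3.9
eps, delta = 1.0, 0.05

# Confidence in "X2 within eps of its mean" before any observation.
prior_conf = prob_within(var2, eps)

# Observe X1 = 18 and condition: the belief about X2 tightens.
post_mean, post_var = condition_bivariate(mu1, mu2, var1, var2, cov12, 18.0)
post_conf = prob_within(post_var, eps)

print(f"prior confidence {prior_conf:.3f}, posterior confidence {post_conf:.3f}")
```

With these assumed numbers the prior confidence is below 1-δ while the posterior confidence is above it, so a single observation of X1 is enough to answer the query about X2.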
![Page 11: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/11.jpg)
Dynamic models: filtering
Joint distribution at time t
Condition on observations
Fewer obs. in future queries
Example: Kalman filter
Learn from historical data
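A one-dimensional Kalman filter shows why conditioning now can mean fewer observations later. The sketch below assumes a scalar random-walk state; the transition coefficient, noise variances, and starting belief are illustrative values, not parameters from the talk.

```python
def kalman_predict(mean, var, a=1.0, process_var=0.5):
    """Advance the belief one time step under x_t = a * x_{t-1} + noise."""
    return a * mean, a * a * var + process_var

def kalman_update(mean, var, obs, obs_var=0.1):
    """Condition the belief on a noisy observation of the state."""
    gain = var / (var + obs_var)
    return mean + gain * (obs - mean), (1.0 - gain) * var

# Assumed belief at time t: mean 20, variance 4.
belief = (20.0, 4.0)

# Condition on an observation of 18 at time t.
belief = kalman_update(*belief, obs=18.0)
after_obs_var = belief[1]

# Roll the belief forward to time t+1 without observing anything.
belief = kalman_predict(*belief)
next_step_var = belief[1]

print(f"variance after observation {after_obs_var:.3f}, one step later {next_step_var:.3f}")
```

The variance at time t+1 is still well below the original prior variance, which is the sense in which an observation carries over to future queries.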
![Page 12: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/12.jpg)
Supported queries
Value query: $X_i \pm \epsilon$ with prob. at least $1-\delta$
SELECT and Range query: $X_i \in [a,b]$ with prob. at least $1-\delta$; e.g., which sensors have temperature greater than 25°C?
Aggregation: average $\pm \epsilon$ of subset of attribs. with prob. $> 1-\delta$
Combine aggregation and selection: probability that > 10 sensors have temperature greater than 25°C?
Queries require solution to integrals
Many queries computed in closed form
Some require numerical integration/sampling
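The two cases above, closed form versus sampling, can be sketched as follows. A range query on a single Gaussian attribute reduces to a difference of CDF values, while a count query is estimated here by Monte Carlo. The sensor means and standard deviations are assumed numbers, and the sampler treats sensors as independent purely for brevity; a model that exploits correlations would sample from the joint distribution instead.

```python
import math
import random

def normal_cdf(x, mean, std):
    """CDF of N(mean, std^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def range_prob(mean, std, a, b):
    """Closed form: P(a <= X <= b) for X ~ N(mean, std^2)."""
    return normal_cdf(b, mean, std) - normal_cdf(a, mean, std)

def prob_more_than_k_above(means, stds, threshold, k, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(more than k sensors exceed threshold).

    Assumption: sensors are sampled independently (not true of real deployments).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        count = sum(1 for m, s in zip(means, stds) if rng.gauss(m, s) > threshold)
        if count > k:
            hits += 1
    return hits / n_samples

# Range query: is the temperature in [20, 24] given belief N(22, 1)?
p_range = range_prob(mean=22.0, std=1.0, a=20.0, b=24.0)

# Count query: do more than 1 of 3 sensors exceed 25 degrees?
p_count = prob_more_than_k_above([26.0, 24.0, 25.5], [1.0, 1.0, 1.0],
                                 threshold=25.0, k=1)

print(f"range query prob. {p_range:.4f}, count query prob. approx. {p_count:.3f}")
```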
![Page 13: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/13.jpg)
Model-driven data acquisition: overview
[Figure: the overview diagram repeated, with the focus on the data gathering plan]
What sensors do we observe?
How do we collect observations?
![Page 14: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/14.jpg)
Acquisition costs
Attributes have different acquisition costs
Exploit correlation through probabilistic model
Must consider networking cost
[Figure: network of nodes 1 to 6, asking which nodes are cheaper to observe]
![Page 15: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/15.jpg)
Network model and plan format
Assume known (quasi-static) network topology
Define traversal using (1.5-approximate) TSP
$C_t(S)$ is expected cost of TSP (lossy communication)
[Figure: network of nodes 1 to 12 with a traversal path]
Cost of collecting subset $S$ of sensor values: $C(S) = C_a(S) + C_t(S)$
Goal: Find subset $S$ that is sufficient to answer query at minimum cost $C(S)$
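The cost of a candidate subset can be sketched as an acquisition term plus a traversal term. Two assumptions are made here for illustration: the link costs are given directly as expected costs (so lossy links are already folded into the numbers), and the tour is found by brute force rather than by the 1.5-approximate TSP named on the slide, which only works for very small subsets. All names and numbers are hypothetical.

```python
from itertools import permutations

def traversal_cost(subset, link_cost, base=0):
    """Cheapest tour base -> every node in subset -> base, by brute force.

    Stand-in for the 1.5-approximate TSP; feasible only for tiny subsets.
    """
    nodes = sorted(subset)
    if not nodes:
        return 0.0
    best = float("inf")
    for order in permutations(nodes):
        path = (base,) + order + (base,)
        cost = sum(link_cost[u][v] for u, v in zip(path, path[1:]))
        best = min(best, cost)
    return best

def total_cost(subset, acq_cost, link_cost):
    """C(S) = Ca(S) + Ct(S): acquisition cost plus traversal cost."""
    return sum(acq_cost[i] for i in subset) + traversal_cost(subset, link_cost)

# Assumed expected link costs between the base station (0) and sensors 1 to 3.
link_cost = [
    [0, 2, 4, 6],
    [2, 0, 1, 5],
    [4, 1, 0, 3],
    [6, 5, 3, 0],
]
# Assumed per-sensor acquisition costs.
acq_cost = {1: 1.0, 2: 1.0, 3: 2.5}

c_12 = total_cost({1, 2}, acq_cost, link_cost)
c_3 = total_cost({3}, acq_cost, link_cost)
print(c_12, c_3)
```

In this toy network, visiting two nearby sensors is cheaper than visiting one distant sensor, which is why the plan has to weigh networking cost alongside how informative each sensor is.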
![Page 16: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/16.jpg)
Choosing observation plan
Is a subset $S$ sufficient? $X_i \in [a,b]$ with prob. $> 1-\delta$
If we observe $S = s$: $R_i(s) = \max\{ P(X_i \in [a,b] \mid s),\ 1 - P(X_i \in [a,b] \mid s) \}$
Value of $S$ is unknown: $R_i(S) = \int P(s)\, R_i(s)\, ds$
Optimization problem:
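A plausible form of this problem, assuming it combines the cost $C(S)$ from the previous slide with the confidence threshold $1-\delta$ used in the queries (the symbols follow the slides; the exact statement is an assumption):

```latex
\min_{S \subseteq \{1,\dots,n\}} \; C(S) = C_a(S) + C_t(S)
\quad \text{such that} \quad R_i(S) \ge 1 - \delta
```

Read this way, a subset is "sufficient" when its expected confidence $R_i(S)$ reaches the requested level, and the plan is the cheapest such subset.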
![Page 17: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/17.jpg)
BBQ system
[Figure: the overview diagram annotated with the choices made in BBQ]
Queries: value, range, average
Probabilistic model: multivariate Gaussians; learn from historical data
Conditioning on new observations: equivalent to Kalman filter; simple matrix operations
Inference: simple matrix operations
Data gathering plan: exhaustive or greedy search; factor 1.5 TSP approximation
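A greedy search over observations can be sketched for a Gaussian model and a value query. For Gaussians the posterior variance does not depend on the observed values, so the confidence of a plan can be computed before any sensor is read. The selection rule below (largest confidence gain per unit cost) is an illustrative heuristic rather than the rule used in BBQ, the per-sensor costs ignore traversal order, and the covariance matrix is made up.

```python
import math

def condition_on(cov, j):
    """Covariance after observing variable j exactly (rank-one update)."""
    n = len(cov)
    return [[cov[r][c] - cov[r][j] * cov[j][c] / cov[j][j] for c in range(n)]
            for r in range(n)]

def confidence(var, eps):
    """P(|X - E[X]| <= eps) for a Gaussian with the given variance."""
    if var <= 1e-12:
        return 1.0
    return math.erf(eps / math.sqrt(2.0 * var))

def greedy_plan(cov, target, eps, delta, cost):
    """Add sensors until the value query on `target` reaches confidence 1 - delta."""
    chosen = []
    while confidence(cov[target][target], eps) < 1.0 - delta:
        current = confidence(cov[target][target], eps)
        best, best_score = None, 0.0
        for j in range(len(cov)):
            if j in chosen or cov[j][j] <= 1e-12:
                continue
            gain = confidence(condition_on(cov, j)[target][target], eps) - current
            if gain / cost[j] > best_score:
                best, best_score = j, gain / cost[j]
        if best is None:
            break  # no remaining sensor improves the confidence
        chosen.append(best)
        cov = condition_on(cov, best)
    return chosen, confidence(cov[target][target], eps)

# Assumed covariance: sensor 1 is strongly correlated with the target, sensor 0.
cov = [
    [4.0, 3.9, 2.0],
    [3.9, 4.0, 2.0],
    [2.0, 2.0, 4.0],
]
# Assumed costs: reading the target itself is expensive, its neighbours are cheap.
cost = [10.0, 1.0, 1.0]

plan, conf = greedy_plan(cov, target=0, eps=1.0, delta=0.05, cost=cost)
print(plan, round(conf, 4))
```

Here the plan reads only the cheap, highly correlated sensor 1 and never touches the expensive target sensor. An exhaustive search would instead enumerate every subset and keep the cheapest sufficient one.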
![Page 18: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/18.jpg)
Experimental results
Redwood trees and Intel Lab datasets
Learned models from data
Static model
Dynamic model: Kalman filter, time-indexed transition probabilities
Evaluated on a wide range of queries
[Figure: Intel Berkeley Lab floor plan with sensor nodes 1 to 54, as in the deployment example]
![Page 19: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/19.jpg)
Cost versus Confidence level
![Page 20: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/20.jpg)
Obtaining approximate values
Query: True temperature value ± epsilon with confidence 95%
![Page 21: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/21.jpg)
Approximate range queries
Query: Temperature in [T1,T2] with confidence 95%
![Page 22: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/22.jpg)
Comparison to other methods
![Page 23: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/23.jpg)
Intel Lab traversals
![Page 24: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/24.jpg)
BBQ system
[Figure: the BBQ system diagram repeated]
Extensions
More complex queries
Other probabilistic models
More advanced planning
Outlier detection
Dynamic networks
Continuous queries
…
![Page 25: Model-driven Data Acquisition in Sensor Networks](https://reader036.vdocuments.us/reader036/viewer/2022062807/56815044550346895dbe44de/html5/thumbnails/25.jpg)
Conclusions
Model-driven data acquisition
Observe fewer attributes
Exploit correlations
Reuse information between queries
Directly deal with missing data
Answer more complex (probabilistic) queries
Basis for future sensor network systems