Secure Outsourced Aggregation via One-way Chains
Suman Nath, Microsoft Research
Haifeng Yu, National Univ. of Singapore
Haowen Chan, Carnegie Mellon University
Wide-area Shared Sensing
Lets users query sensors through the Web
[Figure: SensorBase. Labels: Sensors, Gateway, Internet, Aggregator, Portal]
Unique Characteristics
Diverse queries
– Min/max, Count/sum/mean, Random Sample, Top-K, Quantiles, Frequent Readings, etc.
Push-based data collection
– Large number of sensors (e.g., >100K in SciScope)
– Query rate higher than data rate
Outsourced aggregation (e.g., SensorMap, SciScope)
– Scalability (network load at portal)
– Network proximity
– Privacy, economy
Unlike wireless sensor-nets
Our goal: enable portal to verify whether an aggregate reported by aggregator is correct
Malicious Aggregator
A malicious/compromised/lazy aggregator can report incorrect aggregation result
Example: FloodWatch, using an aggregation service provider
– Water level readings: 9 10 8 10 11 12
– Flood warning if level >= 10ft
– Malicious aggregator reports "Maximum water level: 3ft"
Related Work
Outsourced database [Li’06, Narasimha’05, Pang’05]
– Does not consider aggregation queries
SIA [Chan’07]
– Only one central aggregator; multiple rounds
SHIA [Chan’06]
– Only Count; pull-based model
Proof-sketch [Garofalakis’07]
– Only Count; aggregators can safely cheat
Not suitable for wide-area sensing
Our Contribution
SECOA: a family of optimally secure aggregation protocols
– Supports a strict superset of aggregates supported by previous work (e.g., SIA, SHIA, Proof-sketch)
  • Min/max, Count/Sum/Mean, Top-K Readings, Random Sample, Top-K Groups, Frequent Items, Popular Items, etc.
– Supports a push-based model
We use conceptually simple one-way chains
– We provide optimizations for up to 105x speedup
Evaluation with prototype and real dataset
Outline
Problem Statement
System Model
Secure Algorithms
– Max
– Beyond Max
Evaluation
System Model
Portal knows the list of sensors
– Each sensor shares a symmetric key with portal
Sensors/portal loosely time synchronized
Sensors/Aggregators/Portal can do RSA
Sensor readings are integers
[Figure: Sensors, Gateway, Internet, Aggregator, Portal; the aggregator passes on "Aggregates + Verification object"]
Attack Model
Byzantine aggregator
– Can fabricate, replay, duplicate, ignore readings
Malicious aggregators can collude
Sensors are trusted
– A sensor lying about its own reading is fundamentally impossible to prevent
– Most aggregates we consider are robust against a small number of malicious sensors
Cryptographic Primitive
Message Authentication Code (MAC)
– MAC function: takes message m and key k, outputs a MAC
– MAC verifier: takes message m, the MAC, and key k
– Provides integrity and authenticity of message m
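The MAC function and MAC verifier above can be sketched with Python's standard `hmac` module. This is only an illustration: HMAC-SHA-256, the key, and the message encoding are assumptions, since the slides do not name a MAC construction.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> bytes:
    """MAC function: message m and key k in, MAC out (HMAC-SHA-256 assumed)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """MAC verifier: recompute the MAC with key k and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


# Hypothetical key shared by one sensor and the portal, and a hypothetical encoding.
key = b"sensor-17-shared-key"
reading = b"water_level=5"
tag = make_mac(key, reading)

ok = verify_mac(key, reading, tag)                   # untouched message
tampered = verify_mac(key, b"water_level=10", tag)   # altered message
```

An aggregator that does not hold the sensor's key cannot produce a valid MAC for a reading the sensor never sent, which is what the verifier checks.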
One-way Chain
Uses one-way function F, e.g., MD5, SHA-1, RSA
Given F and F^k(s), one can compute F^i(s) (i>k), but not F^i(s) (i<k)
SEAL (Self Authenticating Value) at position k: F^k(s)
– Position 0: F^0(s) = s
– Position 1: F^1(s) = F(s)
– Position 2: F^2(s) = F(F^1(s))
– Position 3: F^3(s) = F(F^2(s))
SEAL folding: Combine multiple SEALs into one
– Folded SEALs can be verified
– E.g., XOR of MD5 SEALs, Multiplication of RSA SEALs
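A minimal sketch of a hash-based one-way chain with XOR folding follows. SHA-256 stands in for the MD5/SHA-1 options on the slide, and the seed value and function names are hypothetical.

```python
import hashlib


def F(x: bytes) -> bytes:
    """One-way function (SHA-256 assumed here in place of MD5 or SHA-1)."""
    return hashlib.sha256(x).digest()


def roll(seal: bytes, steps: int) -> bytes:
    """Roll a SEAL forward by `steps` positions: position k -> position k + steps."""
    for _ in range(steps):
        seal = F(seal)
    return seal


def fold_xor(seals: list[bytes]) -> bytes:
    """Fold several equal-length SEALs into one by XOR."""
    out = bytes(len(seals[0]))
    for s in seals:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out


seed = b"hypothetical-seed-s"   # position 0
seal_2 = roll(seed, 2)          # SEAL at position 2
seal_3 = roll(seed, 3)          # SEAL at position 3

# Anyone holding the SEAL at position 2 can reach position 3,
# but going from position 3 back to 2 would require inverting F.
forward = roll(seal_2, 1)
folded = fold_xor([seal_2, seal_3])
```

Rolling forward is cheap for anyone, while rolling backward is infeasible, which is what lets a SEAL at position k vouch for "value at least k".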
Outline
Problem Statement
System Model
Secure Algorithms of SECOA
– Max
– Beyond Max
Evaluation
Secure Max (Sensor/Aggregator)
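Water levels: flood warning if max > 4
– Malicious aggregator can inflate result and report 10
– Malicious aggregator can deflate result and report 2
Sensors
– Three sensors with Value = 5, Value = 4, Value = 2
– Each sensor sends its value with a MAC and a SEAL at the position of its value on its own one-way chain (chains 0..5, 0..4, 0..2)
Aggregator output: Value = 5
– Inflation-free proof: the MAC on 5
– Deflation-free proof (folded SEAL): the SEALs at positions 4 and 2 are rolled forward to position 5, then all three are folded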
Secure Max (Portal)
Aggregator reports (5, MAC, folded SEAL)
Portal first checks if the MAC is valid
Portal then computes a reference SEAL
– Rolls each sensor's chain (0..5) to position 5 and folds the results
Checks if the reference SEAL = folded SEAL
Theorem: the algorithm is optimally secure
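The Secure Max exchange can be sketched end to end as below, using the readings 5, 4, 2 from the slides. Everything else is an assumption made for illustration: SHA-256 as F, XOR as the fold, HMAC-SHA-256 as the MAC, a chain seed derived from the shared key, the dictionary report layout, and the sensor ids `a`, `b`, `c`. A deployed protocol would also bind the seed and MAC to the current epoch to stop replay, which this sketch omits.

```python
import hashlib
import hmac


def F(x):
    return hashlib.sha256(x).digest()


def roll(seal, steps):
    for _ in range(steps):
        seal = F(seal)
    return seal


def fold(seals):
    out = bytes(32)
    for s in seals:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out


def mac(key, sensor_id, value):
    return hmac.new(key, f"{sensor_id}:{value}".encode(), hashlib.sha256).digest()


def seed(key):
    # Assumption: the chain seed is derived from the key shared with the portal.
    return hmac.new(key, b"seed", hashlib.sha256).digest()


def sensor_report(sensor_id, key, value):
    # The sensor reveals the SEAL at the position equal to its reading.
    return {"id": sensor_id, "value": value,
            "mac": mac(key, sensor_id, value), "seal": roll(seed(key), value)}


def aggregate(reports):
    top = max(reports, key=lambda r: r["value"])
    # Roll every SEAL forward to the max position, then fold them into one.
    folded = fold([roll(r["seal"], top["value"] - r["value"]) for r in reports])
    return {"id": top["id"], "value": top["value"], "mac": top["mac"], "seal": folded}


def portal_verify(keys, result):
    key = keys.get(result["id"])
    expected_mac = mac(key, result["id"], result["value"]) if key else b""
    if not hmac.compare_digest(expected_mac, result["mac"]):
        return False  # inflation: no sensor vouched for this value
    reference = fold([roll(seed(k), result["value"]) for k in keys.values()])
    return hmac.compare_digest(reference, result["seal"])  # deflation check


keys = {"a": b"key-a", "b": b"key-b", "c": b"key-c"}
reports = [sensor_report(i, keys[i], v) for i, v in {"a": 5, "b": 4, "c": 2}.items()]

honest = aggregate(reports)
deflated = aggregate([r for r in reports if r["id"] != "a"])  # hides the reading 5
inflated = dict(honest, value=10)                              # claims 10 without a MAC
```

In this sketch the deflating aggregator fails because it would need sensor `a`'s SEAL at position 4, and it only holds the one at position 5, which cannot be rolled backward.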
Distributed Aggregator
Challenge: Roll folded SEALs forward?
[Figure: hierarchy of Sensors, Aggregators, and Portal]
– Local max: 5 (folded SEAL at position 5)
– Local max: 3 (folded SEAL at position 3)
– Global max: 5 (folded SEAL at position 5)
– The other SEAL is folded at position 3; fold at position 5??
Homomorphic Function Requirement
Necessary and sufficient condition:
– F(x · y) = F(x) · F(y) and F(x · y) = F(y · x)
  • Homomorphic function
– Example: F = RSA encryption, · = multiplication
– (More expensive than MD5, but can be made cheaper with clever optimizations)
[Figure: chains at positions 0..3; panels "Rolling → folding" and "Rolling → folding → rolling"]
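The homomorphic requirement can be checked numerically with textbook RSA. The modulus, exponent, and seeds below are toy values chosen for illustration only and are far too small to be secure.

```python
N = 61 * 53   # toy RSA modulus (assumption, illustration only)
E = 17        # toy public exponent


def F(x: int) -> int:
    """Textbook RSA encryption used as the one-way function."""
    return pow(x, E, N)


def roll(x: int, steps: int) -> int:
    for _ in range(steps):
        x = F(x)
    return x


def fold(x: int, y: int) -> int:
    """Fold two RSA SEALs by modular multiplication."""
    return (x * y) % N


s1, s2 = 42, 1234   # hypothetical chain seeds of two sensors

# A lower aggregator folds both SEALs at position 3 ...
folded_at_3 = fold(roll(s1, 3), roll(s2, 3))
# ... and an upper aggregator rolls the folded SEAL on to position 5.
rolled_after_folding = roll(folded_at_3, 2)

# Reference: roll each SEAL to position 5 first, then fold.
folded_after_rolling = fold(roll(s1, 5), roll(s2, 5))
```

Both orders give the same value because (x · y)^e ≡ x^e · y^e (mod n). An XOR fold of hash SEALs has no such property, which is why the distributed setting calls for RSA.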
Outline
Problem Statement
System Model
Secure Algorithms
– Max
– Beyond Max
Evaluation
Secure Count
Adapt Alon-Matias-Szegedy Algorithm
– Each sensor i picks a random value v_i (aka sketch), s.t. x is chosen with probability 2^-x
– Max v = max_i(v_i)
– Est. Count = 2^v (increase accuracy with more sketches)
Other aggregates: Count Distinct, Sum, Mean
Problem: high overhead
– Example: 100K sensors, 300 sketches
  • 510 million rolling operations, 30 million folding operations
  • A single query: 7 hours for RSA, 9 minutes for MD5
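The reduction from Count to Max can be sketched as follows. Each sensor draws x with probability 2^-x, only the maximum per sketch is kept, and the exponent is averaged over several sketches. The function names, the seeded generator, and the sensor and sketch counts are illustrative assumptions. The raw 2^v estimate is returned without any bias correction, so it should only be expected to land within a constant factor of the true count.

```python
import random


def draw_sketch(rng: random.Random) -> int:
    """Return x >= 1 with probability 2^-x (flip a fair coin until it comes up 0)."""
    x = 1
    while rng.getrandbits(1):
        x += 1
    return x


def estimate_count(num_sensors: int, num_sketches: int, rng: random.Random) -> float:
    """Average the per-sketch maxima, then report 2^(average max)."""
    total = 0
    for _ in range(num_sketches):
        total += max(draw_sketch(rng) for _ in range(num_sensors))
    return 2 ** (total / num_sketches)


rng = random.Random(0)                       # fixed seed for repeatability
estimate = estimate_count(1000, 300, rng)    # true count is 1000

# The folding figure on the slide: one fold per sensor per sketch.
fold_ops = 100_000 * 300                     # 30 million
```

Because each sketch only needs a maximum, every sketch can be protected with the Secure Max mechanism, which is where the per-sketch rolling and folding cost comes from.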
Reducing Rolling Cost
Folded Rolling: exploit homomorphism of RSA
– Aggressively fold
[Figure: chains at positions 0..4 being folded; panels "At aggregators" and "At the portal"]
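One plausible reading of folded rolling is sketched below: rather than rolling every SEAL to the target position separately, fold SEALs as soon as they sit at the same position and roll the running fold once per step. Toy RSA parameters, seeds, and positions are assumptions for illustration, and the slides do not spell out the exact procedure.

```python
N = 61 * 53   # toy RSA modulus (illustration only)
E = 17


def F(x):
    return pow(x, E, N)


def roll(x, steps):
    for _ in range(steps):
        x = F(x)
    return x


def naive(seals, target):
    """Roll each (position, value) SEAL to `target` on its own, then fold."""
    folded, rolls = 1, 0
    for pos, val in seals:
        for _ in range(target - pos):
            val = F(val)
            rolls += 1
        folded = folded * val % N
    return folded, rolls


def folded_rolling(seals, target):
    """Sweep positions upward, folding SEALs in as their position is reached."""
    by_pos = {}
    for pos, val in seals:
        by_pos[pos] = by_pos.get(pos, 1) * val % N
    acc, rolls = 1, 0
    for pos in range(min(by_pos), target + 1):
        acc = acc * by_pos.get(pos, 1) % N   # fold in SEALs waiting at this position
        if pos < target:
            acc = F(acc)                     # one roll moves the whole fold forward
            rolls += 1
    return acc, rolls


seeds = [42, 1234, 777, 2024]   # hypothetical sensor seeds
positions = [0, 2, 4, 2]        # hypothetical readings
seals = [(p, roll(s, p)) for s, p in zip(seeds, positions)]

naive_result, naive_rolls = naive(seals, 4)
fast_result, fast_rolls = folded_rolling(seals, 4)
```

The sweep spends one roll per position step regardless of how many sensors there are, whereas the naive version spends one roll per sensor per step.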
Reducing Folding Cost
Portal still needs to fold many sensors per query
Tree (at portal): Index sensors as a tree (e.g., B-Tree)
– Logarithmic folding
[Figure: chains of Sensor1, Sensor2, Sensor3 indexed in a tree and matched against a Query]
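A sketch of logarithmic folding, under the assumption that the portal keeps precomputed folds of sensor values at the internal nodes of a tree. A binary segment tree is used here instead of the B-Tree mentioned on the slide, with toy RSA-style multiplication as the fold. The class and method names are hypothetical.

```python
N = 61 * 53   # toy modulus for the multiplicative fold (illustration only)


class FoldTree:
    """Binary tree whose internal nodes store the fold of the leaves below them."""

    def __init__(self, leaves):
        self.size = len(leaves)
        self.tree = [1] * self.size + [v % N for v in leaves]
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] * self.tree[2 * i + 1] % N

    def fold_range(self, lo, hi):
        """Fold leaves lo..hi-1; returns (folded value, number of nodes combined)."""
        result, used = 1, 0
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:                      # lo is a right child: take it, move right
                result = result * self.tree[lo] % N
                used += 1
                lo += 1
            if hi & 1:                      # hi is exclusive: step left, take that node
                hi -= 1
                result = result * self.tree[hi] % N
                used += 1
            lo //= 2
            hi //= 2
        return result, used


leaves = list(range(2, 18))                 # 16 hypothetical per-sensor values
tree = FoldTree(leaves)
folded, used = tree.fold_range(3, 13)       # query covering sensors 3..12
```

A query over a contiguous block of sensors then combines a number of precomputed nodes that grows with the logarithm of the sensor count rather than with the size of the block.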
Other Aggregates
Top-K Readings
– Finds K sensors with maximum values
– One pass solution challenging
  • An aggregator may not know the global top-K
  • Locally produced proofs must be combined globally
Top-K Groups
– Group sensors (based on dynamic properties) and find k groups with maximum values
– Significantly more complicated than top-k readings
  • Portal does not know grouping, so verification is hard
Details in paper
Other Aggregates
Uniformly random sample: Top-K
– Many other statistical aggregates from random sample
Most popular items: Top-K Groups
– Use item name as the group ID, AMS sketch as the group value
Items occurring above a threshold: Top-K Groups
– Use item name as the group ID, AMS sketch as group value, report groups above threshold
Outline
Problem Statement
System Model
Secure Algorithms
– Max
– Beyond Max
Evaluation
End-to-end Performance
Prototyped in SensorMap, using Crypto++ library
Dataset: 16,106 stream gauge sensors from USGS
2.5GHz Pentium desktops

Query             KB/query   Computation time (ms/query)
                  Portal     Sensor    Aggregator   Portal
Max               0.5        0.84      11.97        1.05
Count             3          35.97     158.9        1.11
Top-10 Readings   1.5        1.09      10.9         1.12
Top-10 Groups     1.6        0.78      8.2          80.9

320KB without in-network aggregation
Effect of Optimizations
[Figure: computation costs (for Count); panels "At Portal" and "At Aggregator"]
Additional results in the paper
Conclusion
SECOA: a framework for outsourced aggregation
– Supports a large number of diverse queries
– Supports push-based model
– Optimally secure
– Supports hierarchical aggregators
– Has small computation/communication overhead
Future work: design a system without a centralized portal
Thank You
Backup slides
Distributed Aggregator
Challenge: Roll folded SEALs forward?
[Figure: hierarchy of Sensors, Aggregators, and Portal; SEALs computed by sensors]
– Lower aggregators: max: 5 and max: 3
– Top aggregator: max: 5
– Folded at position 3; fold at position 5??
One-pass Top-K
Solution: i’th top value has SEAL over all sensors excluding top i-1 values
[Figure: values 80, 61, 12, 10 with SEALs F80, F61, F12 and values 75, 26, 20, 18 with SEALs F75, F26, F20; merged list 80, 75, 61, 26, 20, 18, 12, 10 with SEALs F80, F75, F61]
Optimally secure
Cost proportional to the top value and independent of k
Top-K Readings
Challenge for a one-pass algorithm
– An aggregator may not know the globally top-k items
– Locally produced SEALs must be combined
– Solutions in the paper
Top-K Groups
Significantly more difficult than Top-K Readings
– 2nd Top value should exclude all items in the top group
– The portal may not know the group membership!
Solution in the paper