Can Censorship Measurements Be Safe(r)?
Ben Jones and Nick Feamster, Princeton University
2
Alice wants to measure censorship
Alice → Facebook
3
What if someone is watching?
Alice → Facebook
Governments have the ability and motive to retaliate against users
4
Defining censorship and surveillance
• Censorship
– Practice of restricting access to content
– Triggering censorship means automated response
• Surveillance
– Practice of capturing and storing traffic to identify users who measure censorship
– Triggering surveillance means manual response
– Surveillance system must discard traffic
5
Can we evade surveillance?
• This is hard:
– Lives are at stake
– Risk is difficult to define
– We do not know what surveillance systems can do
• Existing solutions do not address this
• Our solution: maybe we can reduce risk
6
Can we evade surveillance?
[Figure: scatter of uninteresting traffic (-) and censorship measurements (+), separated by the surveillance classifier]
We could mimic uninteresting traffic
7
Can we evade surveillance?
[Figure: scatter of uninteresting traffic (-) and censorship measurements (+), separated by the surveillance classifier]
We could manipulate uninteresting traffic
8
Outline
• Mimicking uninteresting traffic
• Manipulating uninteresting traffic
9
Mimicking uninteresting traffic
• Goal: can we get our traffic discarded?
• Crazy idea: can we measure censorship by mimicking malware?
• Why mimic malware?
– Presence of malware will not differentiate users
– Surveillance will not care about malware traffic
– Surveillance will want to discard malware
– Cheap for surveillance to discard malware
SYN scanning mimicry
10
[Diagram: the measurement client probes scanning targets A and B]
1: <SYN, port 22>
2: <SYN/ACK, port 22>
3: <SYN, port 22>
4: <SYN/ACK, port 22>
SYN scanning censorship measurement
11
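[Diagram: the measurement client probes a scanning target and BBC.com; the censorship system sits on the path]
<SYN, port 80> to the scanning target
<SYN, port 80> to BBC.com, via the censorship system
<SYN/ACK, port 80>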
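The probe on this slide can be approximated without raw sockets. The sketch below is an assumption, not the authors' tool: it uses an ordinary TCP connect (a full handshake rather than a bare SYN) and classifies the outcome the way the diagrams suggest, with a SYN/ACK meaning the target is reachable and a RST or silence hinting at interference. The function name `probe` and its return labels are hypothetical.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Attempt a TCP handshake and report how it ended.

    Returns "syn-ack" if the handshake completed, "rst" if the
    connection was reset or refused, "timeout" if nothing came back,
    and "error" for any other socket failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "syn-ack"   # handshake completed: target reachable
    except ConnectionRefusedError:
        return "rst"       # closed port, or a reset injected on the path
    except socket.timeout:
        return "timeout"   # no reply: packets may have been dropped
    except OSError:
        return "error"     # e.g. no route to host
    finally:
        sock.close()
```

Comparing the result for a control target with the result for a possibly censored one (BBC.com on the slide) is what turns a single probe into a measurement. A "rst" alone does not prove censorship, since an ordinary closed port produces the same answer.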
12
How can we evaluate this?
• Goal: boost confidence in our measurements
• We create and test against reference systems
– Why is this a faithful representation?
• We can get close to what surveillance systems would use to detect malware
– Assume the use of COTS malware detection
– We use a similar engine (Snort/Cisco)
– We use similar rules (most rules are not user generated)
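To make "test against a reference system" concrete, here is a toy stand-in for such a system. It is not Snort's detection logic or rule syntax. It is a hypothetical threshold detector that flags a source once it has sent SYNs to many distinct targets within a time window. The name `flag_scanners`, the 60-second window, and the threshold of 20 are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def flag_scanners(syn_events, window=60.0, threshold=20):
    """Return the set of sources that look like SYN scanners.

    syn_events: iterable of (timestamp, src_ip, dst_ip, dst_port),
    sorted by timestamp. A source is flagged once it has contacted
    at least `threshold` distinct (dst_ip, dst_port) pairs within
    any `window`-second span.
    """
    recent = defaultdict(deque)  # src_ip -> deque of (timestamp, target)
    flagged = set()
    for ts, src, dst, port in syn_events:
        queue = recent[src]
        queue.append((ts, (dst, port)))
        # Drop events that have fallen out of the sliding window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        if len({target for _, target in queue}) >= threshold:
            flagged.add(src)
    return flagged
```

For the mimicry idea, being flagged is the desired outcome: measurement traffic that a detector like this labels as scanning is traffic a surveillance system would, under the talk's assumptions, want to discard.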
13
Outline
• Mimicking uninteresting traffic
• Manipulating uninteresting traffic
14
Measuring DNS censorship
Client X
Surveillance system
Censorship system
<SRC=X, DNS Query>
<DST=X, DNS Response>
What if everyone measured censorship?
15
Measuring DNS censorship
Client X
Client Y
Client Z
Client AS
Surveillance system
Censorship system
<SRC=X, DNS Query>
<SRC=Y, DNS Query>
<SRC=Z, DNS Query>
<DST=X, DNS Response>
<DST=Y, DNS Response>
<DST=Z, DNS Response>
16
Evaluating manipulation
• Can we detect censorship?
– Yes; we get to conduct a direct measurement
• Can we actually spoof?
– Good news: difficult to detect spoofing at the edge
– CAIDA Spoofer project showed that 77% of users can spoof within their own /24
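The /24 figure suggests where cover addresses (clients Y and Z in the diagram) would come from. The sketch below only selects candidate source addresses from the client's own /24. It sends nothing and performs no spoofing. The function name `cover_sources` and the `seed` parameter are hypothetical, and choosing neighbours uniformly at random is an assumption rather than the authors' stated method.

```python
import ipaddress
import random


def cover_sources(client_ip, count, seed=None):
    """Pick `count` distinct addresses from the client's /24.

    The client's own address and the network and broadcast
    addresses are excluded. Raises ValueError if `count` exceeds
    the number of available neighbours.
    """
    network = ipaddress.ip_network(client_ip + "/24", strict=False)
    neighbours = [str(host) for host in network.hosts()
                  if str(host) != client_ip]
    rng = random.Random(seed)  # seedable for repeatable experiments
    return rng.sample(neighbours, count)
```

From the surveillance system's vantage point, queries bearing these sources and the client's real query all originate in the same /24, which is the ambiguity the manipulation approach relies on.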
17
Summary
• Our contributions
– Modeled censorship and surveillance
– Showed that we may be able to reduce risk
• This is hard, but important
– Plenty of room for future work
– Feedback appreciated
• Questions?: [email protected]
18
Ethics
• Autonomy
– Do not condone spoofing from other home users
– How to accurately educate users?
• Beneficence
– Reduce legal and physical risk
– May interrupt user Internet service
– Load equivalent to open resolver measurement
• Justice
• Respect for law and public interest
– Spoofing is a violation of AUPs
– Censorship measurement may be illegal
19
Differences
Difference | Surveillance cost | Censorship cost
Long-term storage | Store as much as possible | No persistent storage needed
Triggering | Human intervention | Automatic intervention
False positive | People get hurt | No cat videos
User attribution | Try to discard automated traffic | Do not care about traffic source
20
Evaluating mimicry
21
Assumptions
• Surveillance systems must discard traffic
– 2009: NSA/GCHQ tapped 5920 Gbps, but only had 690 Gbps backhaul
• Surveillance systems will use COTS components when possible
• If a large number of users measure censorship, the surveillance system cannot arrest anyone
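The first assumption can be checked with the two figures on the slide. Assuming the backhaul link is the limit on how much tapped traffic can be carried away for retention (an assumption made here for the arithmetic, not a claim from the slide), the retainable fraction is:

```latex
\[
\frac{690\ \text{Gbps}}{5920\ \text{Gbps}} \approx 0.117
\qquad\Longrightarrow\qquad
1 - 0.117 \approx 0.883
\]
```

Under that assumption, roughly 88% of the tapped traffic would have to be discarded at the tap.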
SYN scanning mimicry
22
Measurement client
<SYN, port 22>
Scanning target A
Scanning target B
<SYN, port 22>
<SYN/ACK, port 22>
SYN scanning mimicry
23
Measurement client
<SYN, port 80>
Scanning target
BBC.com
<SYN, port 80>
Censorship system
<RST, port 80>