Two tales of privacy in OSNs
Claudia Diaz KU Leuven ESAT/COSIC
April 5, 2013
Based on joint paper with Seda Gürses, to appear in IEEE S&P Magazine
Outline
• Two narratives: the activist and the consumer
• Two ways of framing privacy
  – Understanding and improving social privacy in OSNs
  – PETs for social networks: evading surveillance and censorship
• Comparison of approaches
  – Surveillance and social privacy problems treated as unrelated
    • Both abstract away the complexity of the privacy problem
  – Challenges for integration of approaches?
  – Points for discussion
The positive narrative
• Social media as an enabler of social change, for citizens to contest ruling institutions, to foster democracy and human rights, …
  – Conveniently, the companies providing these social media services originate from the USA
• One line of criticism of this narrative: the role of social media is exaggerated; more credit is due to organization and events on the ground
The negative narrative
• How governments exploit social media:
  – Social media blocked during civil unrest to prevent communication
  – Social media used to disseminate misinformation or propaganda
  – Social media used to spy on people
    • Information can be used to, e.g., identify (and arrest or kill) dissenters
• Collusion between SM companies and governments
  – The “surveillant assemblage”
• Link to privacy technologies:
  – How to design technologies with which people can interact socially online while being free from surveillance and interference (e.g., censorship)?
Other perspectives on the problem of privacy in OSNs
• Safety, protection from crime
  – The bad guys: malware, scammers, online thieves, predators, stalkers
  – The good guys: regulators, industry, and law enforcement
  – Technologies: data security, software security, authentication/identification, access control, monitoring
• Data protection
  – Purposes for which information is used
  – Informed consent
  – Subject access rights (e.g., deletion)
• Social privacy
  – OSNs are spaces to socialize – unsurprisingly, all the privacy issues of social relationships reappear, plus new ones
Social privacy issues
• Context collision (family, friends, colleagues)
• Unintended (or “unexpected”) information disclosures
• Information taken out of context
• “Inappropriate” comments or content
• Reasons:
  – misconfiguration of privacy settings (not usable)
  – open settings overriding more restrictive settings
  – software bugs
  – unintended mistakes (uploading the wrong picture or video)
  – bad decisions: regrets (angry, not thinking)
• Other issues
  – coercion (to provide a password)
  – notice and choice (informed consent) model: difficulty of reading/understanding privacy policies
Social privacy research
• Understand social privacy issues from a user/community perspective, and their interrelation with technology design
• Improve OSN design based on user values
  – system is intuitive, easy to use
  – behaves according to user expectations
  – has appropriate privacy defaults
  – provides meaningful privacy controls
  – helps users make better privacy decisions (e.g., “nudges” the user towards better behavior)
  – supports users and communities in developing “privacy practices”
Privacy practices
• “actions that users collectively or individually take to negotiate their boundaries with respect to disclosure, identity and temporality in technologically mediated environments” (Palen and Dourish)
• “privacy is a social construct that reflects the values and norms of everyday people” (boyd)
• Tensions between privacy and publicity
• Negotiating boundaries between the private and the public
• Negotiating acceptable and unacceptable forms of behavior
• OSN architecture influences practices (boyd): persistence, replicability, visibility and searchability of content
Practices and strategies
• Use of settings (blocking content towards certain people who may criticize or make fun of it)
• Etiquette: bad taste to comment on pictures that were uploaded years before
• Indicating who the audience is (through the use of language, based on topic)
  – Social steganography: “encoded” messages that mean different things to different people, obscure references, inside jokes
• Separate profiles (in one or several OSNs)
• Regular deletion of content
• Account deactivation while offline
• How does OSN design impact these practices and strategies?
Increasing transparency and improving privacy-relevant decision-making
• Privacy is about “people being able to make informed decisions wrt information disclosure”
• System behaves according to their expectations
• Users have meaningful controls
• Users are nudged to be protective of their privacy (make it easy to be more private)
First decision: to join the OSN
• Do users read the privacy policies?
  – Mostly not, and even fewer understand them
  – Warning: privacy policies used as a disclaimer to then do whatever they want with the data! Once the user accepts the policy, she consents to its terms
• How to improve the readability of privacy policies?
  – easy to find and interpret, to the point, standardized?
Make privacy information salient
• Privacy policies of websites, apps, etc.
slide: Lorrie Cranor
Make it easy to segregate audiences
• Access control policies were designed for sysadmins
  – Now everyone must be able to configure privacy settings (a type of AC policy)
• Goal: reduce the cognitive load on the user
• Better interface designs for grouping friends
  – closer to the user’s mental models
• Automated grouping of friends
  – leverage user attributes, social graph properties (e.g., clustering), past interactions
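The automated-grouping idea above can be illustrated with a minimal sketch: friends whose contact lists overlap heavily probably belong to the same social circle. The graph, names, and similarity threshold below are hypothetical, and real systems use far more signals (attributes, interaction history) and more robust clustering algorithms.

```python
# Sketch: group a user's friends by overlap of their contact sets.
# Friends whose neighbourhoods share many contacts go in one circle.
# All data and the 0.3 threshold are illustrative assumptions.

def jaccard(a, b):
    """Jaccard similarity of two contact sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def group_friends(contacts, threshold=0.3):
    """Greedy clustering: a friend joins the first circle whose seed
    member shares enough contacts with them, else starts a new circle."""
    circles = []  # list of (seed, members)
    for friend, their_contacts in contacts.items():
        for seed, members in circles:
            if jaccard(contacts[seed], their_contacts) >= threshold:
                members.append(friend)
                break
        else:
            circles.append((friend, [friend]))
    return [members for _, members in circles]

# Hypothetical ego network: friend -> set of that friend's contacts.
contacts = {
    "alice": {"bob", "carol", "dave"},
    "bob":   {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave"},
    "erin":  {"frank", "grace"},
    "frank": {"erin", "grace"},
}
print(group_friends(contacts))
```

The point of such automation is exactly the slide's goal: proposing candidate circles so the user does not have to build audience groups by hand.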
Make the audience visible
• Current privacy settings, access control settings
• Consequences of sharing
Make it easy to select privacy
• Settings that default to privacy
• Usable privacy controls and tools
• Add friction to privacy-reducing options
  – More clicks, scrolling, delay
  – Example dialog: “Are you sure you want to make your photo public?” [No] [Yes]
slide: Lorrie Cranor
Regret studies
• Series of studies: interviews, diary study, surveys
  – Focus on American users of Facebook and Twitter
  – Data collected from over 3000 social network users
    • Interviews with Pittsburgh residents
    • Large survey samples from Amazon Mechanical Turk
• Research questions
  – How common is it to have social network regrets?
  – What do users regret doing on social networks?
  – Why do users take regrettable actions?
  – What are the consequences of these regrettable actions?
  – How do users avoid or repair regrets?
  – How are regrets different on social networks versus in conversations?
slide: Lorrie Cranor
Overview of findings
• Most social network users reported regrets
  – 57% of FB users reported FB regrets
  – 51% of Twitter users reported Twitter regrets
  – 79% of Twitter users reported conversational regrets
• Serious consequences
  – Relationship breakup, job loss
  – Less serious consequences still very upsetting
• Underlying causes often included being angry or upset, not thinking, or forgetting who might read their posts
• Most regrets occurred within one day of posting
Y. Wang, S. Komanduri, P.G. Leon, G. Norcie, A. Acquisti, L.F. Cranor. I regretted the minute I pressed share: A Qualitative Study of Regrets on Facebook. SOUPS 2011. http://cups.cs.cmu.edu/soups/2011/proceedings/a10_Wang.pdf Other papers under review
slide: Lorrie Cranor
What do people regret on FB?
• Posts about
  – Personal information/issues about themselves or others
  – Sex
  – Relationships
  – Profanity
  – Alcohol and drug use
  – Jokes
  – Lies
  – Information about work or company
• Friending and unfriending
• Photo tagging
• Using Facebook applications
slide: Lorrie Cranor
Why did they do it?
• Excited state
  – Negative: angry, bad mood, venting
  – Positive: happy, sharing good news
• Didn’t think
• Unintended audience
• Thought it was cool or funny
• Under the influence of alcohol or drugs
• Didn’t mean to post it
slide: Lorrie Cranor
Timer nudge (stop and think)
slide: Lorrie Cranor
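The mechanism behind a “stop and think” timer nudge can be sketched as a grace period during which a queued post may still be cancelled before it goes out. Class and method names below are illustrative, not taken from the cited study.

```python
# Sketch of a timer nudge: a post is queued and only published after a
# grace period, during which the user may cancel it. Times are passed
# explicitly so the sketch is testable; a real UI would use wall-clock time.
import time

class TimerNudge:
    def __init__(self, grace_seconds=10):
        self.grace_seconds = grace_seconds
        self.pending = {}  # post_id -> (text, publish deadline)

    def submit(self, post_id, text, now=None):
        now = time.time() if now is None else now
        self.pending[post_id] = (text, now + self.grace_seconds)

    def cancel(self, post_id):
        """User thought better of it; drop the post if still pending."""
        return self.pending.pop(post_id, None) is not None

    def publish_due(self, now=None):
        """Publish (return) every post whose grace period has elapsed."""
        now = time.time() if now is None else now
        due = [pid for pid, (_, t) in self.pending.items() if t <= now]
        return [(pid, self.pending.pop(pid)[0]) for pid in due]

nudge = TimerNudge(grace_seconds=10)
nudge.submit("p1", "angry rant", now=0)
nudge.submit("p2", "good news!", now=0)
nudge.cancel("p1")                  # rephrased or withdrawn in time
print(nudge.publish_due(now=10))    # only the uncancelled post goes out
```

This matches the study's observation that, given a delay, users rephrase, correct, or cancel posts they would otherwise regret.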
Sentiment nudge (content feedback)
slide: Lorrie Cranor
Profile picture nudge (audience feedback)
slide: Lorrie Cranor
Preliminary results
• Timer nudge
  – Overall perceived as useful
  – Users reported rephrasing/correcting/canceling posts
• Profile picture nudge
  – Overall perceived as useful
  – Made users more aware of their audience and number of FB friends
  – Reminded users to use the appropriate privacy settings
• Sentiment nudge
  – Positive sentiment nudge was deemed useless
  – Negative sentiment nudge annoyed people: missing context, misinterpreting sarcastic comments, judgmental, censoring
  – Need smarter sentiment analysis algorithms and better messaging
slide: Lorrie Cranor
Social privacy: methodology
• Research often based on user studies
  – Qualitative (small-scale) studies based on user interviews
  – Quantitative (larger-scale) studies that extract statistics
• The studies help:
  – understand user expectations and concerns
  – study the impact of different design options
• Big issue: how representative is the user sample?
  – of collectives with specific needs/situations
  – of users in other countries
Back to surveillance and censorship concerns…
Research in cryptography and computer security: Privacy Enhancing Technologies (PETs) for OSNs
PETs methodology
• Model the system, make assumptions explicit (e.g., trust assumptions, available building blocks)
• Identify the threat model (knowledge, access, capabilities)
• Identify the information to protect (e.g., content, traffic data) and the type of security property (e.g., confidentiality, availability)
• Perform a security analysis of the system to test whether the security properties hold, and under which circumstances (assumptions)
Accessing censored sites
• Use Tor (or other anonymous communication networks) to access blocked OSN sites
  – even better if the circumvention is undetectable
Protecting content
• Use encryption: diversity of tools
  – Note: the main difference with settings is the protection from the OSN provider
  – FlyByNight:
    • Facebook app that protects user data by storing it encrypted
    • Relies on FB for key management
  – Scramble:
    • Browser plug-in that encrypts content prior to uploading
    • Key management done out of band
• Issues:
  – usability, flexibility of interface
  – key distribution (network effect – critical mass needed)
• Bonus
  – encrypting the content makes censorship of content more difficult
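The encrypt-before-upload workflow can be illustrated with a dependency-free sketch. A one-time pad stands in for the cipher only to keep the example self-contained; real tools such as Scramble use standard ciphers, and the out-of-band key sharing is assumed, not shown.

```python
# Illustration of encrypt-before-upload: the OSN provider only ever
# stores ciphertext, while the key travels out of band to the audience.
# One-time pad (XOR with a random key as long as the message) is used
# purely to keep the sketch dependency-free; it is not what real tools do.
import secrets

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    assert len(key) == len(plaintext)  # one-time-pad requirement
    return bytes(p ^ k for p, k in zip(plaintext, key))

decrypt = encrypt  # XOR is its own inverse

post = b"meet at the square at noon"
key = secrets.token_bytes(len(post))   # shared with the audience out of band

uploaded = encrypt(post, key)          # what the provider stores and sees
assert decrypt(uploaded, key) == post  # the audience recovers the post
```

This is the core of the “protection from the OSN provider” point above: unlike privacy settings, which the provider enforces and can see through, encryption removes the provider from the trusted set entirely.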
OSNs may not like encrypted content
• Q: should the law establish a right to encrypt the content users store/share in a service?
  – Or should the OSN provider have the right to say “If you use my service, I must be able to look into your content”?
    • Compare to offline services (e.g., having a conversation in a (semi-)public space)
  – Issues:
    • “inappropriate” content (censorship?)
    • conflict with business model
Steganography?
• Not possible for the OSN provider to tell that the content is encrypted
• NOYB (None Of Your Business)
  – substitutes (shuffles) user attribute values (age, location, etc.)
  – only users with the right keys can ‘undo’ the shuffle and retrieve the real attribute values
• FaceCloak
  – symmetric key (shared only with the audience of the content) used to encrypt the user’s information in Facebook
  – encrypted data is stored on the FaceCloak server, and replaced in Facebook by random text fetched from Wikipedia
  – the random text acts as an index to the encrypted data on the server
• Issues
  – misrepresentation of user interests towards the OSN provider (who still performs profiling on the noisy information) and towards other users who might not be using the system
  – undetectability of the tool: double-edged sword
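NOYB-style attribute shuffling can be sketched as a keyed pseudorandom permutation over attribute values: the provider sees plausible but wrong values, and only key-holders can invert the permutation. This is a toy sketch of the idea, not NOYB's actual construction, and the data is made up.

```python
# Toy sketch of NOYB-style shuffling: attribute values are permuted
# across users with a key-seeded PRG. The provider stores plausible but
# misassigned values; key-holders invert the permutation to recover them.
import random

def shuffle_attributes(values, key):
    """Permute values with a permutation derived deterministically from key."""
    idx = list(range(len(values)))
    random.Random(key).shuffle(idx)
    return [values[i] for i in idx]

def unshuffle_attributes(shuffled, key):
    """Invert the key-derived permutation to recover the original order."""
    idx = list(range(len(shuffled)))
    random.Random(key).shuffle(idx)
    out = [None] * len(shuffled)
    for pos, i in enumerate(idx):
        out[i] = shuffled[pos]
    return out

ages = [23, 31, 45, 19]  # hypothetical 'age' values of four users
stored = shuffle_attributes(ages, key="shared-secret")  # what the OSN sees
assert unshuffle_attributes(stored, key="shared-secret") == ages
```

Note how this also illustrates the misrepresentation issue listed above: the shuffled values are individually plausible, so both the provider's profiling and non-users' reading of the profile operate on wrong data.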
Protecting relationships and interactions
• Even if content is encrypted, valuable intelligence can be extracted by analyzing the social graph and the fine-grained interactions of users
• Is anonymity an option for online social networking?
• Obfuscation of relationships/interactions with dummy traffic
  – content encrypted: hard to distinguish encrypted content from random data (dummies)
  – dummy traffic is expensive: how to optimize dummy traffic generation?
• Metrics to assess the effectiveness of dummy traffic (e.g., mutual information)
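The mutual information metric mentioned above quantifies how many bits the adversary's observations leak about the true communication pattern; perfect dummy traffic would drive it to zero. A minimal computation, with made-up joint distributions over (true recipient, adversary observation):

```python
# Minimal mutual information computation: how many bits the observed
# traffic (O) leaks about the true recipient (R). The joint distributions
# below are fabricated for illustration.
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits, from a dict {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# With good dummy traffic, the observation is independent of the recipient:
covered = {("bob", "burst"): 0.25, ("bob", "quiet"): 0.25,
           ("carol", "burst"): 0.25, ("carol", "quiet"): 0.25}
# Without dummies, the observation determines the recipient exactly:
leaky = {("bob", "burst"): 0.5, ("carol", "quiet"): 0.5}

print(mutual_information(covered))  # 0.0 bits leaked
print(mutual_information(leaky))    # 1.0 bit leaked
```

Optimizing dummy traffic generation can then be framed as minimizing this leakage under a bandwidth budget.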
Alternative centralized architectures
• Hummingbird
  – privacy-enhanced alternative to Twitter
  – relies on a set of crypto protocols
  – “protects tweet contents, hashtags and follower interests from the (potentially) prying eyes of the centralized server”
• Use the OSN as a “dumb” data store for encrypted blobs
  – Client software stores and retrieves blocks, and organizes info for presentation to the user
• No protection against traffic analysis
Distributed architectures
• The adversary model in centralized OSNs is very strong:
  – global, potentially active
  – protection against traffic analysis is very hard
• Distributed architectures?
  – Diaspora, Safebook, PeerSoN
  – Challenges:
    • information availability, synchronization, security of client software
    • adversary and traffic-analysis guarantees difficult to model
Integration of the different approaches to privacy in OSNs (?)
Challenges for integration: who?
• Who defines what the privacy problem is?
  – experts: based on their technical knowledge (techno-centric)
    • Plus: what is technically possible? How can information be abused?
    • Limitation: how do these technical risks map to social/political analyses of surveillance practices?
      – risk of over-relying on techno-centric assumptions about how surveillance functions and what may be the most appropriate strategies to counter it
    • Limitation: technical tools do not behave as predicted in different contexts (social practices)
    • Limitation: no emphasis on usability, user needs
  – users: based on their perceptions and experiences (user-centric)
    • Plus: takes into account user perspective, context
    • Limitation: biased samples (often people in the US or EU); would a dissenter in Egypt have the same concerns as a college student in the US?
    • Limitation: no insight into organizational practices
    • Limitation: users have a limited understanding of the technical infrastructure and may take the technology as a given (hard to imagine alternatives)
  – regulation: based on legal norms (organization-centric)
    • Limitation: compliance with data protection regulation does not necessarily imply privacy protection
Challenges for integration: what data?
• What information is in the scope of the ‘privacy problem’?
  – social privacy: emphasis on user-generated content, volitional actions (no implicit data)
    • how to communicate to users issues derived from implicit data?
  – PETs: in principle, all data is in scope (volitional and implicit)
    • BUT: risks only with respect to the adversary (not ‘friends’)
    • Content-agnostic: does not take into account the semantics of the content (semantics and context are, however, very relevant for social privacy)
Challenges for integration: how are privacy problems identified/defined?
• Social privacy
  – focus on concrete harms in the user’s (social) environment
  – intuitive causality between disclosures and consequences
• PETs
  – focus on risks that might lead to ‘abstract harms’ (worst-case scenarios)
    • individual harms: being arrested, put under surveillance, inferences of sensitive information, intrusion, manipulation
    • societal harms: discrimination, surveillance society, information asymmetry, upsetting existing checks and balances of power between individuals, state and private sector
• Issues
  – No information (transparency) about what is actually being done with the data
  – How to communicate abstract harms to users?
Further points for discussion
• Incentives of OSN providers wrt:
  – social privacy? surveillance? censorship?
• Is privacy always about information concealment (in social privacy / surveillance / censorship)?
  – Counter-example: saying “I do not want to be disturbed”
• Censorship in PETs and in social privacy research
  – Privacy as establishing norms of respect vs. privacy as being able to break the norms
• Paradox of control
  – signaling that security is broken: false sense of security?
• Relationship (feedback) between social privacy and surveillance
  – Surveillance -> social privacy problems: changes of settings policies, bugs
  – Social privacy problems -> surveillance: what others reveal about you, social tagging improving identification of anonymous protesters
Conclusion
• Researchers in different subfields of CS frame the OSN privacy problem in very different ways – so does the media
• The different privacy problems are tackled as if they were completely unrelated
  – abstract away the complexity in order to reduce the problem to one that can be more easily addressed
  – some questions are left unaddressed
• We argue that the different privacy problems are entangled, rather than unrelated
  – a more holistic approach is needed
  – integration of approaches is extremely challenging