sso project paper

Upload: kartik-rishi

Post on 03-Apr-2018


  • 7/28/2019 SSO Project Paper


    Centralizing the Decentralized: The Value Implications of Single Sign-on Services

    Abstract

    The internet's decentralized nature is its greatest strength

    for continued growth and value; however, a developing trend is the

    use of points of authority that house your identity and

    interface with other web services to authenticate your

    identity. This industry falls under the title of Single

    Sign-On (SSO) services that allow you to log in on

    many different sites. We take a look at major SSO

    integrators and see how they utilize SSO to provide

    value to users and see how they benefit from having

    that system in place. We also take a look at the data-

    use policies of SSO providers to understand how the

    industry in general treats users and their data. After that

    we follow up with a study on the usage of SSOs

    through the lens of actual users and, by combining all

    this data, we develop a set of best practices for users to help

    them be more informed on how their data is used and

    how they can service their own personal values and

    interests.

    Keywords

    VSD, Single Sign-on Services, SSO, Values, SSO

    Integrators, SSO Providers, data, services, privacy,

    best practices

    Copyright is held by the author/owner(s).

    INFO 444, Autumn 2012

    School of Information

    University of Washington

    Kartik Rishi

    School of Information

    Informatics - HCI

    [email protected]

    Scott Kuehnert

    School of Information

    Informatics - HCI

    [email protected]

    Teresa Lam

    School of Information

    Informatics - HCI

    [email protected]

    Augustus Yuan

    School of Information

    Informatics - HCI

    [email protected]


    Introduction

    The internet as we know it is growing at an incredible pace, and with it, new services are popping up

    everywhere with a new solution to any and all of our

    old problems. Need to shop for clothes online? There is

    a website for that. Want to listen to a variety of music?

    There is a web application for that too! Are you

    interested in having a discussion with your friends and

    family? You bet there is a way to do it online! With the

    expanding role of the internet in our day-to-day lives,

    we develop manifestations of ourselves throughout the

    internet via user accounts tagged to emails that you

    may not even remember the passwords for! Wouldn't it

    be nice if all you had to remember was one account, one email, one password?

    The premise behind a Single Sign-On (SSO) service is

    that a user only has to establish their account in one

    place and is able to utilize it on many other sites! The

    user no longer has to provide different credentials for

    different sites, to ultimately establish a connection to

    their identity on that site and to access the service that

    it provides. In this day and age where users provide so

    much information about themselves, a single site can

    develop a significant idea of who the user is and, in

    that process, become an SSO Provider, whose new

    service is establishing the user's unique

    identity anywhere. On the other side of the

    relationship are SSO Integrators: sites

    that offer a service that the user wants and that

    communicate with SSO Providers to

    conveniently authenticate the user and let

    them continue on with what they intended to do.
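
    The Provider-Integrator relationship described above can be sketched as a toy model. This is not Facebook Connect or any real SSO protocol: it assumes a hypothetical provider that signs a small assertion with a secret it shares with the integrator, whereas real systems typically rely on browser redirects and standardized protocols. The class names, field names, and the shared-secret scheme are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json


class ToyProvider:
    """Hypothetical SSO Provider: stores the user's profile and vouches for it."""

    def __init__(self, shared_secret: bytes):
        self._secret = shared_secret
        self._profiles = {}  # user_id -> profile dict

    def register(self, user_id: str, profile: dict) -> None:
        self._profiles[user_id] = profile

    def issue_assertion(self, user_id: str, requested_fields: list) -> str:
        """Sign an assertion that carries only the requested profile fields."""
        profile = self._profiles[user_id]
        shared = {k: profile[k] for k in requested_fields if k in profile}
        payload = json.dumps({"user_id": user_id, "shared": shared},
                             sort_keys=True).encode()
        signature = hmac.new(self._secret, payload, hashlib.sha256).hexdigest()
        return base64.urlsafe_b64encode(payload).decode() + "." + signature


class ToyIntegrator:
    """Hypothetical SSO Integrator: trusts assertions signed by the provider."""

    def __init__(self, shared_secret: bytes):
        self._secret = shared_secret

    def log_in(self, assertion: str) -> dict:
        """Verify the signature and return the identity data that was shared."""
        encoded, signature = assertion.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
        expected = hmac.new(self._secret, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(signature, expected):
            raise ValueError("assertion was not issued by the trusted provider")
        return json.loads(payload)


# Usage: the user has one account at the provider and shares only their name.
secret = b"demo-secret"  # assumption: exchanged out of band
provider = ToyProvider(secret)
provider.register("u1", {"name": "Alex", "email": "alex@example.com",
                         "friends": ["u2"]})
integrator = ToyIntegrator(secret)
session = integrator.log_in(provider.issue_assertion("u1", ["name"]))
```

    In this sketch the integrator never sees a password, and it receives only the fields listed in `requested_fields`. That list is where the question of how much data is actually transferred becomes concrete.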

    What does this present to the user in terms of benefits?

    The user is now able to consolidate their various user

    accounts into one convenient account that allows

    access to various services. On top of that, because their information is shared, their preferences and trends

    carry over, making the services that SSO Integrators

    provide very personal to the user. To further

    personalize services, Integrators can also utilize

    geographic and friend data to provide content that is

    dynamic and far more relevant to your immediate

    location and your friends. SSO also presents the

    opportunity for users to have various SSO Integrators

    work with each other to improve the level of service

    provided, simply because the user has a global

    identity shared among all of them.

    Our research began with a story about a man named

    Bogomil Shopov, an online IT marketing and

    community management professional from Bulgaria.

    This individual was able to purchase 1,500,000 entries

    of first and last name, email, and private Facebook

    profile IDs for $5 USD

    (http://talkweb.eu/openweb/1819). That's five bucks,

    straight and simple. This brings us to the negative side

    of SSO services: while a user's data may

    be integrated with various other sites, what data is

    truly transferred and what actually happens to it?

    Our team intends to explore the various aspects of

    Single Sign-On services, since an understanding of those

    services can give us insight into the users that utilize

    them. We will begin our study by identifying some of

    the direct and indirect stakeholders involved with SSO

    services, to determine the key players and motivations

    behind how these systems are set up. Following that we

    will take a glimpse into some well-established web

    services that are SSO Integrators and how they utilize

    data provided by SSO Providers to service users. From


    there we will expand to establish who the top three

    SSO Providers are and then discuss each one in detail to understand how their system works and synthesize

    their data use policies. By establishing a profile and

    understanding of the top three Providers, we intend to

    compare and contrast their approach to SSO and come

    to an understanding of what user values are implicated

    by how those systems were developed. After some

    insight into common Integrators and Providers, we

    intend to develop a common understanding of how

    users approach SSO services in their day-to-day life to

    establish a better idea of the relevance of the

    technology and prevalence in daily life. Upon

    developing an understanding of a broad user base, we intend to analyze how common users utilize SSO

    services and what that also means in terms of values

    implicated.

    Now you may be asking, what's our true purpose

    behind all of this work? We hope to analyze both our

    technical study of SSO services and an empirical

    measurement of the penetration of SSO technology in

    our peer groups and develop a strong understanding of

    what values are truly at stake for users in this Provider-

    Integrator-User relationship. Once we understand what

    those values are, we intend to develop a best practice

    guideline that users can quickly read to

    understand key aspects of SSO services and how they

    can better protect themselves. With those guidelines,

    users can exert more control over their information by

    knowing more about how it spreads, and can improve

    their leverage in the Provider-Integrator-User

    relationship.

    Methodologies & Stakeholders

    The basis of our work will be rooted in the principles of Value Sensitive Design (VSD), a tripartite

    methodology, consisting of iteratively applied

    conceptual, empirical, and technical investigations; an

    emphasis on considering indirect as well as direct

    stakeholders (that is, people who are affected by a

    technical system but don't use it directly, as well as

    those who do); and an interactional theory of the

    relationship between values and technology. (Borning)

    To begin, the direct stakeholders include the users and

    providers of SSOs. The indirect stakeholders include

    SSO integrators such as deal sites like Groupon and

    LivingSocial, data-aggregation services, and marketing agencies.

    The benefits for users are that they get to use one

    service to sign into various different websites. This

    saves them time from having to create a new account

    each time they visit a new website. In addition, users

    only have to memorize one username and password

    rather than multiple ones which can get confusing at

    times. They also benefit from personalized ads which

    can be helpful for users. The harms for the users

    include the possibility of third party websites obtaining

    information from the user that they did not wish to

    provide. Another harm is that SSO integrators have

    permission to access all the information that users

    provide in the social network, which could be more than

    what users want to provide to these sites.

    As for the SSO integrators, they benefit by creating

    more personalized ads targeted to users, which in turn

    increases the likelihood of a user buying a product on

    the site. They also perform analytics and conduct

    customer research. The SSO provider benefits from users


    continuing to use their social media website, which

    increases their traffic, which means they can earn more money. Users benefit from simplicity, time efficiency,

    and personalization. Conflicting value tensions include

    lack of privacy and consent. SSO integrators benefit

    from gaining valuable information while SSO providers benefit

    from popularity.

    SSO Integrators

    In this section, we will be investigating how certain

    websites integrate Single Sign-On from social media

    sites and use it to their advantage. Single Sign-On

    services such as Facebook Connect can carry a lot of

    data from a user's Facebook account into the service. Data such as interests, gender, likes, and friends in a

    user's network become a lot more transparent for the

    Integrator, and while they may use this information for

    the user's gain, they may also use it for their own gain as

    well. For this reason, this section will look more deeply

    into the privacy/data use policies stated by SSO

    Integrators regarding the data they collect from users

    and how they provide benefits in exchange.

    One example of an information technology that makes

    use of SSO specifically is StackExchange, a large

    network comprised of 90 Q&A sites which are all linked

    together. We were interested in StackExchange

    because, despite having its own StackExchange account

    that gives you access to all 90 sites, it also

    integrates a variety of other social media sites to allow

    you to connect with the different sites, including:

    Google

    Facebook

    Yahoo

    MyOpenID

    LiveJournal

    WordPress

    Blogger

    Verisign

    ClaimID

    ClickPass

    Google-Profile

    AOL

    This raises privacy questions for us about how

    much data StackExchange is collecting from all these

    sites, and what it is using the data for.

    StackExchange's privacy policy states that they

    will tell you how they are using the data and they will

    make the notice in clear and conspicuous language

    when you are asked to first provide [StackExchange] with personal information and that they will notify

    [the user] before [StackExchange] uses the information

    for something other than the purpose for which it was

    originally collected. StackExchange uses this

    information to their benefit, however, in ways they

    have listed in their privacy policy such as allowing the

    user to register to [StackExchange] websites, online

    communities, and other services, communicate with

    users effectively, and evaluate quality of their services.

    StackExchange also uses this information to help

    employers find or contact users who post profiles on

    the Careers site, and transfer information to others as

    described in this policy to satisfy our legal, regulatory,

    compliance, or auditing requirements. In exchange,

    the user gets access to many of the services

    StackExchange offers, including its huge network, all of

    it extremely accessible through one simple sign-on.

    Another example includes deal sites such as

    Jackthreads, PLNDR, and Zappos, which focus primarily

    on marketing clothes at very low prices.


    They, too, have Single Sign-On services that allow

    users to connect via Facebook, or other social media websites. The major benefit they gain from this is they

    get access to your social media profile and anything

    you allow through their application. Here is an example

    picture that is used by Groupon, a website focused on

    delivering coupon deals to its users:

    Groupon's business model revolves around providing

    deals for a diverse range of local activities (including restaurants, events, fitness, health, education, etc.) to

    its users. Groupon profits mainly

    by making deals with different businesses to

    advertise them so that those businesses can get more

    customers. With so many businesses doing different

    Figure 1: A prompt that informs

    the user of all the data pieces that

    Groupon requests from Facebook

    during the Facebook Connect

    session.


    things and so many users, information about the users

    is extremely helpful for Groupon.

    Specifically, Groupon can use a variety of information

    you make available publicly to target which coupons

    they want to send to you. They use the information for

    things such as maintaining the website, providing

    personalized ads, evaluating you for certain offers, and

    performing analytics for customer research. In their

    privacy policy they state that if you want to limit the

    information they obtain, you may manage the sharing

    of certain Personal Information with [Groupon] when

    you connect through social networking platforms or

    applications and that adjusting permissions of that personal information is dependent on the privacy policy

    of the social networking platform. In this situation,

    Single Sign-On has the main advantage of personalizing

    the Groupon experience and has less of an emphasis on

    convenience.

    One final example of how Single Sign-On is utilized is in

    Wolfram|Alpha, a computational knowledge engine that

    uses your Facebook account to deliver very precise

    analytics including habits, charts, graphs, and statistics.

    Wolfram|Alpha mentions that the main purpose they

    use the information is to help enhance and refine

    [Wolfram's] content and that information collected

    about you through your experience and queries is used

    to better understand the entire population that is

    utilizing our website and how we might improve our

    services to improve the collective experience. They

    also make it explicitly clear that personally identifiable

    information Wolfram|Alpha is allowed to access is

    affected by the privacy settings you have established at

    the TPS and that the linkage between any TPS and

    Wolfram|Alpha is completely voluntary, and our ability

    to access your information at the TPS requires that

    linkage, you have a choice whether or not to disclose such information. It goes to show just how much

    information Wolfram|Alpha has at its disposal and

    many third party sites can potentially benefit from it.

    The user also benefits from this because he/she can

    gain knowledge of the different habits he/she exhibits

    and can focus on fixing them if necessary. Some

    snippets of Wolfram|Alpha analytics have been

    included:

    For example, in the above image, you can see the user's

    activity during the week. We can see in the second

    Figure 2: An example of the

    analytics that Wolfram|Alpha

    derives from Facebook data.


    graph that there is a lot of time spent on Facebook

    around 2-3 AM on Friday morning. We can also see a large variety of application usage in the first graph.

    Wolfram|Alpha also makes the information very

    accessible for the user by providing different ways to

    download the data. They also monetize this

    by allowing users to obtain raw data from

    Wolfram|Alpha if they purchase the Pro plan.

    In the terms of use, Wolfram|Alpha explicitly states

    that they will not attempt to associate individual

    Wolfram|Alpha inputs with individual human users, and

    will not release individual or aggregated lists of inputs,

    or any personally identifiable information, to any third

    party, except in response to lawful court orders. We will

    not attempt to assert intellectual property rights over

    anything given as input to Wolfram|Alpha simply on the

    basis of its having been given to us as input. However,

    by generating content through Wolfram|Alpha, the user is agreeing that Wolfram|Alpha can store [user's data] in

    log files, and use [user data] to generate the results.

    Overall, we can see that different SSO integrators go

    about using data collected through Single Sign-On in

    various ways. In StackExchange, a majority of the use

    is for the user's convenience: with so many different

    Q&A sites, having one account that allows you to

    access all of them is extremely convenient and makes

    all the Q&A sites easily accessible. In Groupon, the data

    collected through Single Sign-On is used to create a

    very personalized experience for the user, and target

    specific coupons based on the user's data. Finally,

    Wolfram|Alpha makes collected data very accessible to

    the user, and also uses the data to better their own

    Figure 3: The prompt that shows

    that you can download analytic

    information if you subscribe to

    Wolfram|Alpha Pro


    website or search engine. All of the SSO integrators we examined

    state somewhere in their privacy policies that they will not openly reveal user data to third parties

    unless they are required to by court order, and primarily use

    the data for the convenience of the user. Next we take

    a look into how users approach the use of SSOs through a

    detailed empirical investigation.

    Empirical Investigation on SSO Users

    We decided to do a survey for our empirical

    investigation to gain insight on what users felt when it

    came to their privacy online. We wanted to determine

    whether users actually cared about the privacy of their

    information online or not. In addition, we wanted to see to what extent the participants are willing to give

    up their privacy for other values.

    Procedure

    For our study, we gave our participants a survey that

    consisted of 23 questions. These questions asked them

    for information about their demographics, Single Sign-

    On services, how much time they spent on the

    computer and Internet, as well as privacy and security

    related questions. We put our survey up online at

    various websites including Amazon Mechanical Turk

    (Amazon Mechanical Turk, 2005) and Reddit. The

    majority of our responses came from Amazon

    Mechanical Turk, which is essentially a paid

    crowdsourcing service that connects companies to a large

    body of people willing to do small tasks for a small sum

    of money. These tasks are typically those that are

    difficult for computers to accomplish while easy for

    humans due to the difference in comprehension. This

    platform provides access to a large variety of

    individuals.

    Participants

    We had a total of 170 participants for our survey. Of the 170 participants, only 142 had valid responses to

    the survey questions. We analyzed and based our study

    on the 142 responses. Of the 142 people that took our

    survey, 48 were female and 94 were male. This means

    that about a third of our respondents were female

    and two thirds were male. We

    had a wide variety of age groups take our survey.

    Around 60% of the survey responders ranged from

    ages 22-30. As for location, about 82% of our data

    came from India.

    Demographics

    Age        n      %
    18-21      17     12.0%
    22-25      43     30.3%
    26-30      41     28.9%
    31-40      28     19.7%
    41-50      9      6.3%
    51-60      3      2.1%
    61-70      1      0.7%

    Gender     n      %
    Male       94     66.2%
    Female     48     33.8%

    Country    n      %
    India      116    81.7%
    USA        17     12.0%
    Pakistan   2      1.4%
    Other      7      4.9%

    Figure 4: A series of tables

    displaying the demographic

    information for the empirical

    investigation survey we conducted.
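
    The percentages in the demographic tables follow directly from the counts and the 142 valid responses. A minimal sketch of that arithmetic is below; the function name and dictionary layout are illustrative choices, not part of the original analysis.

```python
def percentages(counts: dict) -> dict:
    """Convert raw counts to percentages of the group total, rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts copied from the demographic tables above.
age = {"18-21": 17, "22-25": 43, "26-30": 41, "31-40": 28,
       "41-50": 9, "51-60": 3, "61-70": 1}
gender = {"Male": 94, "Female": 48}
country = {"India": 116, "USA": 17, "Pakistan": 2, "Other": 7}

for name, table in (("Age", age), ("Gender", gender), ("Country", country)):
    print(name, percentages(table))

# Share of respondents aged 22-30, the "around 60%" mentioned in the text.
share_22_30 = round(100 * (age["22-25"] + age["26-30"]) / sum(age.values()), 1)
```

    Each of the three tables sums to 142, which is a quick consistency check that every valid response was counted exactly once per table.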


    Results

    One of the questions that we asked in our survey was "Why do you use Single Sign-On services?" and we had

    a lot of consistent answers from our participants. A

    male around the age of 26-30 from India responded to

    the question by saying "It's easy and convenient."

    Another response, from a male who is also around the

    age of 26-30, states "It provides security as one time

    login and logout. Also [there's] no need to remember

    all the passwords every time." We used a website

    called Many Eyes (Many Eyes, 2007) which is a

    graphical tool that uses techniques to create a

    graphical network representation of patterns of

    reference in collaborative discourse (Wikipedia, 2011). One of the options on this site is a graphical

    representation called a tag cloud which counts the

    frequency of the words within our data. Below is a tag

    cloud for the responses to the question "Why do you

    use Single Sign-On services?" The word "easy" is the

    biggest word, which means that it is the most frequent

    word in the responses.
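
    The counting behind a tag cloud can be sketched in a few lines. This is only an approximation of the idea: how Many Eyes tokenizes text or filters common words is not documented here, so the regular expression and the stopword list are assumptions. The sample input is limited to the two responses quoted above.

```python
import re
from collections import Counter

# Assumption: a small hand-picked stopword list; a real tool may use another.
STOPWORDS = {"a", "all", "also", "and", "as", "it", "it's", "no", "of",
             "one", "the", "there's", "to"}


def word_frequencies(responses, stopwords=STOPWORDS):
    """Count how often each non-stopword appears across all responses."""
    counts = Counter()
    for response in responses:
        words = re.findall(r"[a-z']+", response.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts


responses = [
    "It's easy and convenient.",
    "It provides security as one time login and logout. "
    "Also there's no need to remember all the passwords every time.",
]
freq = word_frequencies(responses)
print(freq.most_common(5))
```

    A tag cloud then draws each word at a size proportional to its count. With only the two quoted responses, "easy" appears once; across all the survey responses it was the most frequent word.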

    One of the relationships that we explored was the

    usage of SSOs vs. Privacy violated in the future. We asked the participants to answer the question:

    Do you ever worry that your privacy might be

    violated in the future?¹ Please mark the scale

    from 1-5:

    1- Not Worried At All

    2- Somewhat Worried

    3- Neutral

    4- Worried

    5- Extremely Worried

    Of the 142 participants, 47 responded "1 - Not Worried At All". Of those 47 respondents, 32 use SSO services.

    This means that 68.1% of those who are not worried

    about their privacy being violated in the future use SSOs. As for

    those who answered "5 - Extremely Worried", 10 out

    of the 19 use SSOs, which is 52.6%. There is a 15.5

    percentage point difference in SSO use between those who answered a "1" and

    those who answered a "5". It

    is worth noting that out of the participants who

    1: While this question may seem

    quite broad, the context around it is

    a series of questions related to

    internet usage and SSOs, so there

    exists some implicit framing to the

    question

    Figure 5: This is a word cloud

    comprised of user responses to the

    question, "Why do you use Single

    Sign-On services?" The larger the

    word, the more frequently that

    word occurred in survey

    responses.


    use Single Sign-On services, there are more

    participants who are not worried about their privacy being violated in the future than participants who are extremely worried

    about their privacy being violated.
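
    The comparison above rests on two conditional rates: the share of SSO users within each worry level. A short sketch of the arithmetic, using only the counts reported in the text (the variable names are illustrative):

```python
def rate(sso_users: int, group_size: int) -> float:
    """Percentage of a response group that uses SSO services."""
    return 100 * sso_users / group_size


not_worried = rate(32, 47)        # answered 1, Not Worried At All
extremely_worried = rate(10, 19)  # answered 5, Extremely Worried
gap = not_worried - extremely_worried

print(round(not_worried, 1), round(extremely_worried, 1), round(gap, 1))
```

    The gap is a difference between two percentages, so it is expressed in percentage points. The two groups also differ in size (47 versus 19 respondents), which is worth keeping in mind when reading the comparison.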

    Another relationship that we explored was the usage of

    SSOs vs. Privacy violated in the past. Of the 142

    participants in our sample, 11 said yes to having had

    their privacy violated in the past. Of the 11

    participants, 5 said they used SSO, which is 45.5%.

    Unfortunately, we were not able to determine whether

    SSOs played a part in violating the participants' privacy

    in the past, since about half of the users that had

    their privacy violated in the past used SSOs and the other half did not.
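
    One way to see why 5 out of 11 is inconclusive is to put an interval around the proportion. The sketch below uses the Wilson score interval, which was not part of the original analysis; it is shown only as one common choice for small samples, with z = 1.96 assumed for an approximate 95% level.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


# 5 of the 11 participants with a past privacy violation used SSOs.
low, high = wilson_interval(5, 11)
print(f"{100 * low:.0f}% to {100 * high:.0f}%")
```

    Under these assumptions the interval spans roughly 21% to 72%, which is wide enough to be consistent with SSO users being either under- or over-represented among those whose privacy was violated.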

    We also looked at the relationship between the users

    who had their privacy violated in the past and whether

    that affected whether they worried about having their

    privacy violated in the future. Only 11 out of the 142

    participants actually had their privacy violated in the

    past. 72.7% of the 11 participants answered either "5 -

    Extremely Worried" or "4 - Worried" about their

    privacy being violated in the future. This suggests that

    people who had their privacy violated in the past are

    more concerned about their future privacy. This makes

    sense because normally people who had a bad

    experience in the past would end up being more

    worried and cautious in the future.

    Limitations

    We had some limitations because some of our

    questions could have been too broad or ambiguous for

    the user. For example, we did not restrict the question

    "Do you ever worry that your privacy might be violated

    in the future?" to online privacy. We did believe that users

    could determine that it was about online privacy because of the

    wording and flow of our previous questions, but it's possible that not everyone understood it to mean just

    online. Also, our data came mostly from participants in

    India, who may provide different answers than users

    from the US because of cultural differences.

    Future Work

    In the future, we would try and have more females

    take the survey to get a 50/50 male and female ratio.

    In addition, the majority of the sample for our survey

    was from India, but for the future we would like the

    majority to be from the USA for consistency. We would

    also ask more detailed questions to get richer data from our participants. Some of the wording

    in our survey could also be improved in the

    future.

    Now that we have set up the foundation of knowledge

    in both the SSO Integrators and the SSO Users, we

    have an idea of the essential "front-end" of this

    industry. Next we take an in-depth look into the back-end,

    or in other words, a look into how SSOs work from

    the provider perspective.

    SSO Providers

    A central goal of this research is to be useful to users of

    Single Sign-On (SSO) services for making decisions

    about what data they share and with whom. In order to

    get an overview of the abilities of SSO providers with

    respect to data usage, identify problem areas for users,

    and draft best practices for users to follow when

    deciding whether or not to use an SSO service, we have

    performed analyses of the data use policies of each of

    three large SSO providers. This analysis forms the core of our

    technical investigation for this project.


    We observed that data use policies tend to be hard to

    read because of a variety of factors including their size, the vocabulary used in them, and their overall

    complexity. So, part of the motivation for this research

    was to expose details of those policies in a way that's

    easy to understand for users of those services.

    Another reason we performed this analysis was to

    inform the creation of best practices for users to follow

    when deciding whether or not to use a Single Sign-On

    service.

    Methods

    We began this portion of our research by brainstorming some ideas. Before beginning our formal research, we

    sketched out a few questions we had pertaining to the

    data use policies of SSO providers. These included

    questions such as "How and when do SSO providers

    collect user data?" "How and when do they share user

    data?" "What sorts of control do users have over the

    sharing and collection of their data?"

    We then gathered the data use policies of the top three

    Single Sign-On providers across the web: Facebook,

    Google, and Twitter. (Gigya)

    The next task was to come up with a list of categories,

    ones that we consider to be potentially of interest to users,

    into which to classify sections of the data use policies. Our

    research and early brainstorms guided the creation of

    high level categories such as "allows for collection of

    user data" and "allows for sharing of user data." We

    used those high level categories in a first-pass reading

    of the data use policies for each major SSO provider in

    which we identified general regions of text that relate

    to the high level categories. Then, we used the insights

    gained from the first readings to produce a more

    detailed list of allowances that may be of interest to users. The word "allowance" is used to refer to

    practices that are allowed by a company's data use

    policy.

    The list includes abilities that we consider to be

    concerning, reassuring, or neutral (bad, good, or

    neither for the user). "Concerning" in this case means

    potentially causing harm to users. "Reassuring" in this

    case means potentially protecting users from harm.

    "Harm" is defined to be any occurrence that is

    detrimental to a valued quantity (such as physical

    health, income, reputation, mobility, etc.). Groups like http://knowprivacy.org and

    http://www.privacychoice.org/ served as inspiration for

    our policy analysis, and some of the practices on the

    list (such as "Allows users to delete data" and "Notifies

    users when government requests access to their data")

    came from those websites. (Know Privacy,

    PrivacyChoice) The final list of practices of interest can

    be seen in the appendix under item Appendix A. The

    list is broken into data collection, data sharing, ad

    targeting, user control, and SSO. However, the

    majority of the allowances on the list are related to the

    first two categories, data collection and sharing,

    because we are primarily concerned with the values of

    privacy and security. User control and SSO refer to the

    value of informed consent.

    During a second read-through, we tagged specific

    clauses in the data use policies that relate to

    allowances on the list with numbers such as [1],

    [2], and [3], and placed the tags within a table

    next to the allowance they relate to. The final result is a

    table that shows each instance of a given allowance


    within each data use policy, and the clauses that relate

    to that allowance. For example, the cell for the

    intersection of the allowance "allows collection of IP

    address" and the SSO provider "Google" may contain

    "[3], [4], [7]", indicating that clauses marked [3],

    [4], and [7] in Google's data use policy relate to the

    given allowance.
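
    The table described above is essentially a mapping from (allowance, provider) to a list of clause tags. The sketch below shows one possible representation; the function names and the nested-dictionary layout are assumptions, and the only data filled in is the Google example given in the text.

```python
# allowance -> provider -> list of clause numbers tagged in that policy
allowance_table = {}


def tag_clause(allowance: str, provider: str, clause_number: int) -> None:
    """Record that a numbered clause in a provider's policy relates to an allowance."""
    allowance_table.setdefault(allowance, {}).setdefault(provider, []).append(
        clause_number)


def cell(allowance: str, provider: str) -> str:
    """Render one table cell, e.g. '[3], [4], [7]' (empty string if no tags)."""
    tags = allowance_table.get(allowance, {}).get(provider, [])
    return ", ".join(f"[{n}]" for n in tags)


def occurrences(provider: str) -> int:
    """Total number of tagged clauses for a provider across all allowances."""
    return sum(len(by_provider.get(provider, []))
               for by_provider in allowance_table.values())


# The example from the text: clauses [3], [4], and [7] of Google's policy.
for clause in (3, 4, 7):
    tag_clause("allows collection of IP address", "Google", clause)

print(cell("allows collection of IP address", "Google"))
print(occurrences("Google"))
```

    Totals like `occurrences` count how many clauses allow a practice, which is a count of policy text rather than of company behavior.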

    It is important to note that the number of times an

    allowance occurs in a data use policy does not

    necessarily reflect the degree to which a company

    performs a given action. It's tempting to see the

    quantity of references as an indicator of a company's

    actions in that area. Instead, it's more useful to think

    about the number of references as a measure of the

    number of ways a company may possibly allow for a

    given practice. Just because an allowance exists doesn't

    mean the company uses it. For example, a company may

    reserve the ability to share personal data with

    governments who request it, but never exercise that

    ability on a users account.

    In the case that a data use policy mentioned the

    collection or sharing of "Basic," "Personal," or

    "Sensitive" information, the meaning of the words in

    the context of the particular policy was parsed as

    necessary for entry into the table. For example, the

    definition of "Basic" information in the Facebook data

    use policy is: "basic info includes your

    User ID, as well as your friends' User IDs (or your friend

    list) and your public information" (Facebook). All

    clauses referencing the collection of basic info were

    broken into the categories that relate to user ID, friend

    information, and public information.

    Results

    The results show some interesting trends. The first is

    that Google and Facebook have somewhat inverse

    priorities in their data use policies. Facebook is more

    oriented toward the sharing of data than the collection of data,

    whereas Google's data use policy references the

    collection of data more than the sharing of data.

    Google's data use policy has many references to the

    types of data the company may collect from users and

    when, but the policy only mentions the sharing of that

    information with third parties in a few limited

    circumstances. On the other hand, Facebook's data use

    policy allows for the sharing of data in multiple places,

    and only discusses data collection a handful of times in

    the beginning of the data use policy. This relationship

    can be seen in the following stack histogram:

    This stacked histogram shows the number of occurrences

    of clauses within each privacy policy that allow for a

    practice on our list of allowances (Appendix A).

    The blue bars represent data from

    Facebook's data use policy, the red bars represent data

    from Google's data use policy, and the green bars

    represent data from Twitter's data use policy. The x-axis is

    hidden, but each bin is an allowance from the list, in

    the same order as presented in

    Appendix A. The full histogram can be viewed in

    Appendix C. Since the list was

    broken into data collection, data sharing, ad targeting,

    user control, and SSO, the histogram was drawn in

    clusters representing data collection, data sharing, and

    user control and consent, indicated by magenta, cyan,

    and yellow regions respectively.

    Figure 6: A visual example of the

    codifying of an existing data use

    policy and how it fits in the

    categories we established.

    The complementary focuses of Facebook's and Google's

    data use policies are evident in the distribution of values

    near the beginning and the end of the histogram.

    Facebook is blue, and Google is red. Notice how Google

    has more values near the beginning (where the bins

    represent data-collection allowances), and Facebook

    has more values near the middle and end (where the

    bins represent data-sharing allowances). The circled

    blue bar on the far left of the graph represents the

    category for "Allows collection of data generated

    on/with the website (such as game characters, scores,

    application usage etc)." Since Facebook is a service

    that largely revolves around content generation and

    use of third-party applications, it's unsurprising that

    there are many places in the data use policy for

    Facebook that refer to the ability to collect data

    generated with use of the website.
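
    The tally behind the histogram can be reproduced with a few lines of code. The sketch below is illustrative only and was not part of our original analysis. It assumes clause counts obtained by counting the bracketed tags in each Appendix B cell, with "None" counted as zero; `clause_counts` and `category_totals` are hypothetical names.

```python
# Number of tagged clauses per allowance, obtained by counting the
# bracketed tags in each Appendix B cell ("None" counts as 0).
# Within each category the values follow the row order of Appendix B.
clause_counts = {
    "Data collection": {
        "Facebook": [3, 4, 1, 1, 1, 2, 8, 0, 2, 2],
        "Google": [5, 2, 0, 3, 3, 2, 3, 3, 2, 1],
        "Twitter": [6, 3, 0, 4, 1, 4, 3, 1, 1, 2],
    },
    "Data sharing": {
        "Facebook": [6, 7, 3, 9, 1, 5, 4, 3, 5, 0, 3, 2, 0],
        "Google": [2, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0],
        "Twitter": [3, 3, 1, 5, 3, 5, 5, 2, 4, 0, 1, 1, 0],
    },
}


def category_totals(counts):
    """Sum the clause counts of every allowance in a category, per provider."""
    return {
        category: {provider: sum(values) for provider, values in providers.items()}
        for category, providers in counts.items()
    }


totals = category_totals(clause_counts)
for category, per_provider in totals.items():
    print(category, per_provider)
```

    With these counts, Facebook's sharing total (48) is twice its collection total (24), while Google's collection total (24) is four times its sharing total (6). As noted earlier, these totals count clauses that allow a practice, not how often a company actually engages in it.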

    The second trend is that the policies appear to focus

    more on the companies' abilities than on the

    users' abilities. This is evident in the large proportion of

    concerning practices relative to reassuring practices that

    each company's policy allows for. The reassuring

    practices largely reflect users' abilities, such as the

    ability to delete their data or the ability to opt in or out

    of data collection/sharing, whereas the concerning

    practices largely reflect abilities of the services, such as

    the ability to collect data or share it with third parties.

    The following pie charts illustrate the ratio of

    concerning, neutral, and reassuring allowances

    contained within the data use policies of the top three

    SSO providers:

    Figure 8: A display of those

    allowances compared over the

    three SSO Providers and grouped

    according to the buckets they fall

    under. An expanded version is

    available in the appendix.


    Another finding is that none of the top three SSO

    providers' data use policies mention two allowances

    that we deemed to be of interest to users. Those

    allowances were "Allows sharing of data that third

    parties share with the provider about you with third

    parties" and "Notifies users when government requests

    access to their data." Those categories were inspired by

    the privacy policy analysis tools on privacychoice.org.
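
    A check like this is easy to automate. The following sketch is illustrative and not part of our original analysis; it covers only the data-sharing rows of Appendix B, uses shortened allowance labels, and the names `sharing_counts` and `unaddressed` are hypothetical.

```python
# Clause counts per provider, in the order (Facebook, Google, Twitter),
# for the data-sharing allowances, counted from the tags in Appendix B.
# Allowance labels are shortened here for readability.
sharing_counts = {
    "personally identifiable information": (6, 2, 3),
    "information about contacts/friends": (7, 0, 3),
    "other users sharing information about you": (3, 0, 1),
    "profile information": (9, 1, 5),
    "IP address": (1, 0, 3),
    "location data": (5, 0, 5),
    "data generated on/with the website": (4, 1, 5),
    "potentially sensitive information": (3, 1, 2),
    "uploaded media": (5, 0, 4),
    "data that third parties share with the provider": (0, 0, 0),
    "untagged data": (3, 1, 1),
    "requires receivers to follow guidelines/rules": (2, 0, 1),
    "notifies users of government requests": (0, 0, 0),
}


def unaddressed(counts):
    """Return the allowances that no provider's policy has a clause for."""
    return [name for name, per_provider in counts.items() if not any(per_provider)]


for name in unaddressed(sharing_counts):
    print(name)
```

    Every row in the other categories of Appendix B has at least one tagged clause, so the two rows found here are the only allowances left unaddressed by all three providers.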

    Figure 9: A series of pie charts

    showing the breakdown of

    concerning, neutral, and reassuring allowances in their

    respective policies.

    Figure 10: A zoom on the

    allowance chart bringing focus to

    the lack of any policy addressing

    those categories.


    One final observation stood out.

    The only SSO provider in the top three to mention

    single sign-on services explicitly in its data use policy

    was Facebook. Google and Twitter may have clauses

    that apply generally enough to cover Single Sign-On

    usage, but they never directly address SSO in their

    data use policies.

    Conclusion

    This form of analysis has its advantages and

    disadvantages.

    One of the most important drawbacks of our approach is

    that it gives us no insight into how data is actually used

    by these companies, only how data may be used.

    Viewers of the results may be misled into thinking

    companies with higher scores associated with a certain

    allowance engage more in the allowed activity.

    The upshot is that very different documents may be

    directly compared with a common metric. This is

    potentially very helpful for anyone interested in

    understanding and comparing data use policies, and

    this gives us a framework to aggregate and compare

    data pertaining to many companies at once. The data is

    quantitative, so many quantitative analysis techniques

    can be used to tease results out of the data. For

    example, we could look for correlation between the

    presence of one type of allowance and the presence of

    another type of allowance within the policies if we had

    enough data to perform the statistics confidently.
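
    As a sketch of what such an analysis might look like (we did not run it), the phi coefficient measures the association between two yes/no variables, here the presence or absence of two allowances across a set of policies. `phi_coefficient` is a hypothetical helper, and the presence flags are read from Appendix B in the order Facebook, Google, Twitter.

```python
from math import sqrt


def phi_coefficient(a, b):
    """Phi coefficient between two equal-length sequences of 0/1 flags.

    Returns None when the coefficient is undefined, which happens when
    either flag is present in all policies or in none of them.
    """
    if len(a) != len(b):
        raise ValueError("both sequences must cover the same policies")
    pairs = list(zip(a, b))
    n11 = sum(1 for x, y in pairs if x and y)          # both present
    n10 = sum(1 for x, y in pairs if x and not y)      # only the first
    n01 = sum(1 for x, y in pairs if not x and y)      # only the second
    n00 = sum(1 for x, y in pairs if not x and not y)  # neither
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None
    return (n11 * n00 - n10 * n01) / denom


# Presence flags per policy, ordered (Facebook, Google, Twitter).
shares_ip_address = (1, 0, 1)
shares_location = (1, 0, 1)
collects_ip_address = (1, 1, 1)

print(phi_coefficient(shares_ip_address, shares_location))
print(phi_coefficient(collects_ip_address, shares_location))
```

    With only three policies the result is not meaningful: two allowances that happen to share the same pattern score a perfect 1.0, and an allowance present in all three policies has no defined coefficient. A much larger sample of policies would be needed before drawing conclusions.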

    An issue with our study in particular is that we missed

    some allowances that users may be interested in. For

    example, the length of time companies hold on to user

    data before deleting it is of concern for some people,

    but we did not cover it. Other researchers may want to

    look into holes in our allowances, find out how

    confidently we can equate terms across privacy

    policies, and investigate whether or not there is a

    correlation between the number of times a privacy

    policy mentions an ability and the number of ways that

    ability is used in practice.

    Figure 11: Another observation of

    a lack of a service addressing a

    specific category.


    Best Practices & Conclusion

    At this point we have taken an in-depth look at the three

    main aspects of SSO services: Integrators, Users, and

    Providers. We have established an understanding of how

    SSO Integrators utilize SSO systems in a practical

    manner to provide better services for users, and of

    how they treat users' data. Through our

    survey we gathered more

    information on the prevalence of SSOs in a typical

    user's life and on users' views of how values such as

    privacy are treated. Finally, we took a close look at

    the way SSO providers approach their services and how

    data from users is treated. While the information

    provided can be used to develop critical thoughts on

    various aspects of SSOs and even internet usage, our

    original goal was to serve users by informing them,

    based on our research into SSO services, of

    how they can better serve themselves when dealing

    with their data online.

    BE MINDFUL OF THE VALUE OF YOU

    We cannot stress enough the value of an individual

    and, in particular, of their identifying information. A

    trend that we have come to see is that individuals tend

    not to value the privacy and security of their data until

    something harms those values. We urge

    users to take their online identity seriously to avoid leaks

    of their data to undesired third parties.

    STAY UP-TO-DATE

    During our investigation we experienced a change to

    the privacy and data use policies of Facebook, one of

    our SSO Providers. While we adjusted our work, we

    realized that it is incredibly important for users to stay

    on top of changes to the privacy and data use policies

    they agree to. While the changes that we encountered

    for Facebook were minimal, it's very easy for services

    to change their stance quickly. If anything is apparent

    from the data we gathered, and more so from the purpose of

    this paper, it is that these services don't work to inform users

    of the details of their policies.

    MANAGE THE ACCESS TO YOUR INFORMATION

    Over extended periods of time, a user is likely to

    establish many different connections between their SSO

    Providers and various SSO Integrators. While some

    may be valuable to the user and their day-to-day life,

    others aren't necessary for the user to maintain

    a connection with. We advise that for those services that

    are used less often, it's useful to disconnect or shut

    down accounts so that those services no longer have

    active access to your data and you have one less

    service to manage.

    EVALUATE THE VALUE OF SERVICES USED

    We investigated how three well-known services utilized

    SSO systems and what they provided in terms of value

    to users. While these services do indeed offer great

    value and protection of user data, that is not the case

    with others. Therefore we advise that you take time to

    evaluate on your own whether or not a service you

    intend to sign up with provides the right value for you

    and how they handle your information by reading their

    policies. In addition, you can take some time to

    investigate online any possible violations those

    services have had with regard to user data.

    MANAGE DIFFERENT KINDS OF DATA

    Although this should be fairly straightforward, it's

    something that should always be kept in mind. The

    purpose of this best practice is for you to keep in mind

    what kinds of information you have available and to


    whom. While information like your personal email and

    your name may not be that important or potentially

    insecure, having your address or social security

    information shared around can be quite detrimental to

    the security of your identity. Conduct a self-audit of what

    information you can find about yourself that can be

    harmful and work towards eliminating that data from

    the internet as best you can.

    While our investigations can be picked through for

    further conclusions, we believe that we have established

    a fair foundation for informing users on quite a few

    aspects of their online identities. As points of authority

    on the internet grow further and start showing up in

    other aspects of our lives, an

    informed user base is paramount to the overall security of

    individuals on the internet.

    References

    "Facebook Data Use Policy." Facebook. N.p., 08 June

    2012. Web. 06 Dec. 2012.

    "Groupon: Privacy Statement." Privacy Statement.

    Groupon, 13 Sept. 2012. Web. 05 Dec. 2012.

    "Stack Exchange, Inc. Official Privacy Policy." Stack

    Exchange, Inc., 28 June 2012. Web. 05 Dec. 2012.

    "Which Identities Are We Using to Sign in Around the

    Web?" Gigya. N.p., n.d. Web. 06 Dec. 2012.

    "Wolfram|Alpha Privacy Policy." Wolfram|Alpha.

    Wolfram|Alpha, 05 Mar. 2009. Web. 05 Dec. 2012.

    "Your Privacy. Simplified." PrivacyChoice. N.p., n.d.

    Web. 06 Dec. 2012. http://www.privacychoice.org/


    Appendix

    Appendix A

    This is an expanded list of the categories on which we compared the SSO Providers with each other. They are listed by the

    bucket they fall into and are codified by whether they are Concerning, Reassuring, or Neutral.

    Concerning

    Reassuring

    Neutral

    Data Collection

    Allows collection of personally identifiable information (name, birthday, address, phone, email,

    gender)

    Allows collection of information about contacts/friends

    Allows collection of information others have shared about you

    Allows collection of profile information (such as user ID, personal description, likes, interests, etc)

    Allows collection of IP address

    Allows collection of location data

    Allows collection of data generated on/with the website (such as game characters, scores, application

    usage etc)

    Allows collection of browsing history/health history/religion/political orientation (Potentially sensitive information)

    Allows collection of uploaded media (images, video, text, etc)

    Allows collection of data that third parties share with the provider about you

    Data Sharing

    Allows sharing of personally identifiable information (name, birthday, address, phone, email, gender)

    Allows sharing of information about contacts/friends


    Allows sharing of information others have shared about you

    Allows sharing of profile information (such as user ID, personal description, likes, interests, etc)

    Allows sharing of IP address

    Allows sharing of location data

    Allows sharing of data generated on/with the website (such as game characters, scores, application usage etc)

    Allows sharing of browsing history/health history/religion/political orientation (Potentially sensitive information)

    Allows sharing of uploaded media (images, video, text, etc)

    Allows sharing of data that third parties share with the provider about you

    Allows sharing of untagged data (unassociated with users' profiles) with third parties

    Requires that receivers of data follow certain guidelines/rules

    Notifies users when government requests access to their data

    Ad Targeting

    Uses data to target users with advertisements (but does not share that data with advertisers)

    User Control

    Allows for opt-out and opt-in for data collection/sharing

    Allows users to delete their data

    SSO

    Specifically mentions single sign on in the data use/privacy policy

    Appendix B

    The bracketed numbers in the provider columns represent locations in the corresponding privacy policies (included with the appendix of this document with item numbers: _____) in which the specific

    clauses related to a data collection practice exist. To view specific clauses, reference the privacy

    policy and look for a number highlighted in yellow with the value of interest. The text that comes after the number is the clause referenced.

    Allowances | Provider 1: Facebook | Provider 2: Google | Provider 3: Twitter

    Data collection

    Allows collection of personally identifiable information (name, birthday, address, phone, email, gender) | [1][13][16] | [1][2][3][4][6] | [1][3][7][12][13][15]
    Allows collection of information about contacts/friends | [3][15][24][25] | [2][3] | [11][13][15]
    Allows collection of information others have shared about you | [24] | None | None
    Allows collection of profile information (such as user ID, personal description, likes, interests, etc) | [12] | [1][2][3] | [2][5][13][16]
    Allows collection of IP address | [8] | [3][4][7] | [21]
    Allows collection of location data | [8][11] | [2][9] | [6][15][17][19]
    Allows collection of data generated on/with the website (such as game characters, scores, application usage etc) | [2][3][6][7][9][11][19][24] | [3][5][10] | [14][15][16]
    Allows collection of browsing history/health history/religion/political orientation (Potentially sensitive information) | None | [3][5][10] | [20]
    Allows collection of uploaded media (images, video, text, etc) | [5][14] | [3][2] | [7]
    Allows collection of data that third parties share with the provider about you | [10][37] | [3] | [13][22]

    Data Sharing

    Allows sharing of personally identifiable information (name, birthday, address, phone, email, gender) with third parties | [13][19][29][30][36][44] | [2][16] | [9][25][26]
    Allows sharing of information about contacts/friends with third parties | [16][19][29][30][36][38][44] | None | [15][25][26]
    Allows other users to share information about you with third parties | [24][32][33] | None | [15]
    Allows sharing of profile information (such as user ID, personal description, likes, interests, etc) with third parties | [15][19][26][29][30][36][38][41][44] | [2] | [4][9][22][25][26]
    Allows sharing of IP address with third parties | [44] | None | [24][25][26]
    Allows sharing of location data with third parties | [19][28][29][30][44] | None | [15][17][19][25][26]
    Allows sharing of data generated on/with the website (such as game characters, scores, application usage etc) with third parties | [16][29][30][44] | [2] | [9][14][15][25][26]
    Allows sharing of browsing history/health history/religion/political orientation (Potentially sensitive information) with third parties | [29][30][44] | [16] | [25][26]
    Allows sharing of uploaded media (images, video, text, etc) with third parties | [15][19][29][30][44] | None | [14][15][25][26]
    Allows sharing of data that third parties share with the provider about you with third parties | None | None | None
    Allows sharing of untagged data (unassociated with users' profiles) with third parties | [12][42][45] | [18] | [29]
    Requires that receivers of data follow certain guidelines/rules | [45][46] | None | [29]
    Notifies users when government requests access to their data | None | None | None

    Advertisements

    Targets users with specific advertisements (but does not share that data with advertisers) | [4][43] | [11] | [20]

    User Control

    Allows for opt-out or opt-in for data collection and sharing | [20][22][27][34][39] | [12][13] | [10][18][23]
    Allows users to delete their data | [21][23][31][40] | [14][15] | [30][31]

    SSO

    Specifically mentions single sign on in the data use policy | [35][36] | None | None


    Appendix C


    Appendix D

    SSO Survey for Users

    1) Are you male or female?
       Male / Female

    2) What age group do you fall under?
       17 and under / 18-21 / 22-25 / 26-30 / 31-40 / 41-50 / 51-60 / 61-70 / 71 and over

    3) What country do you live in? ______________________

    4) How many hours a week do you spend on a computer? ________

    5) How many hours a week do you spend on the internet? ________

    6) What percent of the time do you use the internet for personal and business uses? (Your responses should sum to 100.)
       ______% Personal
       ______% Business

    7) Please estimate the number of hours you spend per week on the following services:
       Services | Number of Hours
       Email | ________
       Facebook | ________
       Twitter | ________
       Google Account | ________
       Other: __________ | ________
       Other: __________ | ________
       Other: __________ | ________

    8) What are your primary uses for the internet?
       Shopping / Research / Communication / News / Other (please list) __________________

    9) Has your privacy ever been violated on the Internet?
       Yes / No

    10) If yes, please briefly describe the most recent time that your privacy was violated.
        ___________________________________________________________________

    11) Do you ever worry that your privacy might be violated in the future? Please mark the scale from 1-5.
        1 - Not worried at all / 2 - Somewhat worried / 3 - Neutral / 4 - Worried / 5 - Extremely worried

    12) Please briefly describe a situation where your privacy might be violated online.

    13) Are you familiar with Single Sign-On Services (SSO)? (For example: Facebook Connect or Google Accounts)
        Yes / No

    14) Do you use Single Sign-On services?
        Yes / No

    15) If yes, which SSO do you use?
        Facebook Connect / Google Account / Twitter / Other (please list) ___________________

    16) If yes, why do you use Single Sign-On service?
        __________________________________________________________________________

    17) Please describe how you think Single Sign-On services work.
        __________________________________________________________________________

    18) Do you have any privacy or security concerns related to your use of Single Sign-On services?
        Yes / No

    19) Do you have multiple identities online?
        Yes / No

    20) How many email addresses do you have? ______

    21) Do you typically link your payment/credit card information to your personal identity online?
        Yes / No

    22) How do you typically pay for things you purchase online?
        Direct credit card / PayPal / Google Wallet / Other (please list) _____________

    23) Where did you hear about this survey?
        Facebook / Reddit / Search Engine / Other (please list) _____________

    Appendix E

    For access to other pertinent data points please reference:

    Data | Link

    RAW Survey Data | https://docs.google.com/spreadsheet/ccc?key=0AqKTq25pswcgdG1ZTldvRVE3TnNMdEg3M0IyamNNSlE

    Facebook Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMcW5mclFmR3ZKb2M

    Twitter Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMMVYxSVZ6MW10SEk

    Google Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMR00xT2tfcFVid2c