ethical issues with customer data collection

Post on 15-Apr-2017


Missouri University of Science and Technology

Ethical Issues with Customer Data Collection

Submitted To:

Dr. David Spurlock

Submitted By:

Tatiana Cardona

Sulagna Mandal

Hari Nadathur

Pranav Godse

ABSTRACT

The paper discusses topics related to:

Data mining: the collection and analysis of large amounts of data.

Data mining is a branch of computer science that deals with processing large-scale data to extract previously unknown, interesting patterns. The objective is to extract the relevant information from large volumes of data.

Ethical issues related to data mining and how they affect web miners and web users.

Data mining poses a threat to information privacy when an individual's or group's personal information is freely available. Individuals should be told the purpose of the data collection, who receives the data, its implications, and related information. Ethical data mining, however, is acceptable: it refers to the use of individual data in accordance with privacy rules and set standards.

The fine line between ethical and unethical use of data mining.

Although the impact of web-data mining should be a concern for every web user, there is no reason for people to panic. The technique is not yet being used to its full potential, and there is no clear indication that web data is being misused to an extent that offends people.

DATA MINING

Data Mining involves six stages:

1. Detection: In this stage, any noticeable difference in the data patterns is detected. This stage is crucial, since the quality of the data collected affects the output.

2. Dependency modelling: Relationships between variables are identified, such as the buying trends of a particular age group, the effect of a tax reduction on savings, or the effect of a discount on the sale of goods.

3. Clustering: Clustering is the process of partitioning a set of data (or objects) into meaningful sub-classes, called clusters. This helps in understanding the natural grouping or structure in a data set.

4. Classification: Data is classified according to its level of sensitivity. Classification helps determine what baseline security controls are appropriate for safeguarding the data.

[Figure: Data mining]

Anonymous data collected (generally used to recognize group patterns).

Content mining: information collected through web navigation history (e.g. through cookies).

Structure mining: information collected by identifying the IP address and relating it to the provider company (ISP) in order to obtain more specific data (e.g. names, addresses, phone numbers).

Data collected from a user (generally used to characterize his or her behavior).

Usage mining: information collected when the user gives information in order to access certain benefits (e.g. login information).

5. Regression: A statistical approach to forecasting the change in a dependent variable (sales revenue, for example) on the basis of change in one or more independent variables (population and income, for example).

6. Summarization: Summarization is a key data mining concept that involves techniques for finding a compact description of a dataset. It gives data consumers a generalized view of disparate bulks of data.
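
The regression stage can be illustrated with a minimal sketch. It assumes a single independent variable (income) and uses made-up figures; the function name `fit_line` and all of the data are hypothetical and are not taken from any source cited in this paper.

```python
# Illustrative only: the figures below are made up for this sketch.
def fit_line(xs, ys):
    """Ordinary least squares with one independent variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations: average income vs. sales revenue (both in thousands).
income = [30, 40, 50, 60, 70]
revenue = [110, 130, 150, 170, 190]

slope, intercept = fit_line(income, revenue)

# Forecast the dependent variable (revenue) for a new income level.
forecast = slope * 80 + intercept
print(f"revenue = {slope:.1f} * income + {intercept:.1f}; forecast at 80: {forecast:.1f}")
```

With more than one independent variable (population and income together, for instance), the same idea extends to multiple regression, which is usually done with a statistics library rather than by hand.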

METHODS OF DATA MINING

1. WEB TRACKING

Web tracking refers to companies tracking consumers' behavior across the web without their consent and without providing them any recognizable value.

Behavioral audience targeting, like content targeting, sponsored advertorials, pre-rolls and every other

ad-product available in digital environments, serves content creators. Keeping content creators in

business serves consumers, giving them a myriad of digital environments to explore.

In many cases, however, advertisers misuse behavioral data, and this is unethical. Third-party companies with no direct relationship to the consumer track those consumers across numerous websites, create profiles of their behavior, and profit from information they have not asked permission to collect. This is the ethically problematic issue.

Reasons for web tracking:

To Boost Marketing Capabilities

Law Enforcement and Intelligence

Web Analytics

2. SURVEYING

A survey is a research method for collecting information from a selected group of people using standardized questionnaires or interviews. There are numerous survey research methods for learning customer preferences and likes, such as:

In-Person Interviews

Telephone interviews

Online Questionnaires

BIG DATA PERSPECTIVES

Large collections of data have drawn attention from several perspectives.

As a technology innovation: To accomplish its purpose, data mining must develop to address three concerns effectively. First, progress in storage alternatives (Volume). Second, easy access in real time (Velocity). Third, the fact that current data is mostly unstructured, which makes it difficult to establish its exact use given the large amount of data and the many possibilities for analysis (Variety).

As a commercial value: The use of data generates value through the identification of complex patterns in real time (the foundation of market research) and the prediction of quality issues.

As a matter of privacy: Protecting privacy requires a balance between the use of data and the following factors. Recollection: sets of data analyzed independently may have no privacy implications, but combined they can threaten privacy. Security: personal data can be hacked and stolen. High volume and velocity: data must be analyzed autonomously, leaving no time to wait for consent. Significance: organizations are far from being able to use all the data they collect.
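
The point that separate datasets can threaten privacy once combined can be sketched in a few lines. This is a minimal illustration under stated assumptions: both datasets, every field name, and the helper `link` are hypothetical, and real record linkage is considerably more involved.

```python
# Illustrative only: both datasets and every field name are hypothetical.

# Dataset A: "anonymous" purchase records that contain no names.
purchases = [
    {"zip": "65401", "birth_year": 1990, "gender": "F", "item": "unscented lotion"},
    {"zip": "65401", "birth_year": 1985, "gender": "M", "item": "headphones"},
]

# Dataset B: a directory that contains names but no purchase data.
directory = [
    {"name": "Alice Example", "zip": "65401", "birth_year": 1990, "gender": "F"},
    {"name": "Bob Example", "zip": "65401", "birth_year": 1985, "gender": "M"},
    {"name": "Carol Example", "zip": "65409", "birth_year": 1990, "gender": "F"},
]

# Fields that are harmless on their own but identifying in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")

def link(records, reference):
    """Attach a name to each record whose quasi-identifiers match exactly one person."""
    index = {}
    for person in reference:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])
    linked = []
    for record in records:
        key = tuple(record[field] for field in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:  # a unique match re-identifies the record
            linked.append({"name": names[0], "item": record["item"]})
    return linked

reidentified = link(purchases, directory)
for row in reidentified:
    print(row["name"], "bought", row["item"])
```

Neither dataset reveals who bought what by itself; the join on shared attributes does.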

These perspectives outline the direction in which data mining is evolving, and it is safe to assume that there is no going back in the way information is used.

Considering the three points of view together, it is likely that progress in the first two (technology innovation and commercial value) is linked to the use of big data as a matter of privacy, based on how personal information is analyzed and how consumer relationships are built, bearing in mind the security implications of individuals' social interaction through personal technological devices.

ETHICAL CHALLENGES IN CUSTOMER DATA HANDLING

Information privacy is defined as the relationship between collection and dissemination of data, technology,

the public expectation of privacy, and the legal and political issues surrounding them.

Data mining poses a threat to information privacy when an individual's or group's personal information is freely available. Individuals should be told the purpose of the data collection, who receives the data, its implications, and related information.

The following are some issues in the application of data mining for commercial value, along with the ethical challenges they raise:

The social graph: Derived from social networking (information given voluntarily), this is the picture built of group-level interactions and the nature of the bonds that bring these people together.

Ethical challenge: Ambiguity. The group picture is uncertain because it may label as friends people with weak social ties, which is not representative of physical-world life.

Ownership of data: Instead of being collected by government entities or traditional large companies, data is collected by high-technology companies such as Facebook, Google, and Twitter, among others.

Ethical challenge: Some data owners have promised not to sell the data for now, but as data mining grows in value as a technology, this can change in the future through changes in data-use policies.

Data memory: Data collected and stored can be recalled and analyzed in the future.

Ethical challenge: Stored information about an individual's life can bring back past behaviors (e.g. a Facebook timeline can be a disadvantage for a person who used to party very frequently and is now searching for a job). Data memory "may remove the ability for individuals to forget and be forgotten".

Passive data collection: Automatic data collection through passive technologies (e.g. mobile location information).

Ethical challenge: Passive collection increases the amount of data collected and the number of variables to take into account in the analysis. Individuals, however, are often not aware of it, and even if they authorized the collection at first, systems do not ask again each time data is collected.

Respecting privacy in a public world: The use of technology has become necessary, and it is easily accessible, offering benefits at low cost (e.g. free apps). However, the use of certain technological devices involves the collection of information through servers.

Ethical challenge: Individuals can refrain from giving information; however, because the use of technology has become a necessity and an important factor in social interaction, the paradox is that deciding not to give information can mean being excluded from the community.

Although these ethical issues are currently challenges in the application of this technology, laws and regulations are gradually being updated in response to concerns about individual privacy. It is therefore important to highlight that data mining is an emerging practice and is still in an adjustment phase. For now, self-regulation is a very important aspect for companies to take into consideration when dealing with big data.

Below are some recommendations that should be followed in support of ethical data mining:

1. Verify the data source for authenticity.

2. Consider and respect customers' expectations.

3. Develop better customer relations.

4. Emphasize ethical data mining.

5. Control unregulated data access and software.

6. Take corrective action against offenders.

CASE STUDIES- CONS

Target Corporation Case:

Target Corporation, a large-scale retailer of consumer goods, assigns every customer a Guest ID number tied to their credit card, name, or email address, and stores a history of that customer's purchases along with other demographic information collected from them or obtained from other sources.

Lots of people buy lotions, but one of Target's employees noticed that women on the baby registry were buying larger quantities of unscented lotion around the beginning of their second trimester.

An angry man went into a Target store outside of Minneapolis, demanding to talk to a manager: “My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

The manager, having no idea about the issue, looked at the mailer, which was addressed to the man's daughter and contained advertisements for maternity clothing, nursery furniture, and pictures of smiling infants.

The manager apologized and then called a few days later to apologize again. This time, however, the man said, “I had a talk with my daughter. It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

Despite the accuracy of Target Corporation's data analysis, the teenage girl's private life was exposed, which amounts to unethical use of customer behavior data.

LinkedIn Lawsuit:

Recently LinkedIn CEO Jeff Weiner admitted that the social networking site was guilty of sending too

many emails to some users.

The "Add Connections" service in LinkedIn lets users import contacts from their email accounts and send invitations to connect on the site. LinkedIn sends an email invitation to the contact, and if the person does not respond within a certain amount of time, LinkedIn follows up with two more reminder emails.

The suit claims that LinkedIn repeatedly “spammed” those contacts with unwanted emails despite LinkedIn

members not providing their consent to send the additional emails.

LinkedIn said in an email to its users that anyone who used the service between Sept. 17, 2011, and Oct.

31, 2014, is eligible to file a claim.

The amount that each user will receive will depend on how many people come forward, but LinkedIn said

each person could earn up to $1,500.

LinkedIn says it has revised its disclosures to clarify that two reminder emails will be sent as part of its

"Add Connections" feature. The company says it will, by year's end, also offer an option to users to cancel

a connection invitation, thereby halting any additional reminder emails from being sent out.

This case is a classic example of ownership of data and passive data collection, which pose ethical challenges to customers' privacy on the web.

ARGUMENTS TO SUPPORT DATA MINING-PROS

The following arguments respond to the ethical issues discussed above. They are drawn from an experiment conducted with professionals who apply web-data mining practices in a business context. Their views are as follows:

Web-data mining itself does not give rise to new ethical issues.

Professionals argue that there is nothing new about web-data mining practices as it is just an

extension of old situations to new situations created by computer and information technology. One

first has to clear up the uncertainties, which have to do with understanding what data mining is.

Most of the possible dangers come from group profiling, and since group profiling has been done

before data mining techniques were known, the issues could be considered to be old news.

There are laws to protect private information.

This argument cannot be made with conviction, as the law is never fully sufficient with respect to privacy problems. For instance, current privacy laws only offer protection against the misuse of identifiable personal data; there is no legal protection against the misuse of anonymized data used as if it were personal data. The growing number of online privacy policies is an example of self-regulating efforts. Such policies, however, are not found on every site. Thus, there are still many sites that a person concerned about online privacy should not visit. In addition, it is not always easy for a web user to thoroughly read the privacy statements on every site he or she visits.

Many individuals simply choose to give up their privacy, so why not use this data?

As people can refuse to give out information about themselves, they possess some power to control

their relationship with organizations. Many individuals simply choose to give up their privacy and

what can be wrong with collecting this public data from the web that is voluntarily given? It is there

for the taking.

Most collected data is not of a personal nature, or is used for anonymous profiles.

So why should there be a privacy problem? An argument often heard is: “Our software is used to identify crowd behavior of visitors to web sites. Therefore, if we don’t know who you are, how can we be invading your privacy?”

Web-data mining leads to fewer unsolicited marketing approaches.

Data mining techniques will provide more accurate and more detailed information, which can lead to better and fairer judgements. So, web-data mining leads to fewer unwanted marketing approaches.

Therefore, why would people complain?

Personalization leads to individualization instead of de-individualization.

Most customers like to be recognized, and treated as a special customer. So it is not considered a

violation of privacy to analyze usage interaction.

CONCLUSION

Although many ethical challenges surround data mining, they can be attributed to the fact that data mining is an emerging technology and the market is still adjusting to its capabilities; there is no immediate threat to users. It is by no means clear that companies are using unexpected and non-obvious associations, classifications, clusters, and profiles based on web data as grounds for decision-making.

The solutions discussed previously can contribute to the responsible and well considered

development and application of web-data mining.

The laws and regulations associated with it are bound to evolve depending on how it is perceived.

There are things that can be done to guide this technique in a socially acceptable direction.

As ethical issues will grow as rapidly as the technology, ethical considerations should be an

integrated and essential part of this development process instead of something at its side.

This is a joint responsibility of both web miners and web users.

Some methods to avoid web tracking:

1. Ensure that the website is safe before sharing any information or filling out any registration forms

(by checking the website’s privacy policy and commentaries).

2. Ensure that your online accounts in the different websites are configured for providing optimal

privacy levels.

3. Use an email provider that has a reliable dedication to the protection of customer privacy.

4. Enhance the privacy of your browser through various add-ons and extensions.

REFERENCES

Earley, S. (2014). Big Data and Predictive Analytics: What's New? IT Professional, 13-15. Retrieved November 16, 2015, from

http://ieeexplore.ieee.org.libproxy.mst.edu/stamp/stamp.jsp?tp=&arnumber=6756866

van Wel, L., & Royakkers, L. (2004). Ethical issues in web data mining. Ethics and Information Technology,

6(2), 129-140. Retrieved November 17, 2015, from

http://link.springer.com/article/10.1023/B:ETIN.0000047476.05912.3d

Nunan, D., & Di Domenico, M. (2013). Market research and the ethics of big data. International Journal of Market Research, 55(4), 505-520. doi:10.2501/IJMR-2013-015

GENERAL REFERENCES

Moftakhari, M., Ethical issues in data Mining. 23 pages. http://ickm2014.bilgiyonetimi.net/wp-

content/uploads/2015/01/mandana.pdf

Carr, N. (2010). Tracking is an assault on liberty, with real dangers. The Wall Street Journal.

http://www.wsj.com/articles/SB10001424052748703748904575411682714389888

Harper, J. (2010). It’s modern trade: Web users get as much as they give. The Wall Street Journal.

http://www.wsj.com/articles/SB10001424052748703748904575411530096840958

CASE STUDIES:

Hill, K. (2012). How Target figured out a teen girl was pregnant before her father did. Forbes Tech.

http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-

before-her-father-did/

Roberts, J. (2015). LinkedIn will pay $13M for sending those awful mails. Fortune.

http://fortune.com/2015/10/05/linkedin-class-action/