social media investigations

Data Mining And Internet Profiling: Approaches to Successful Online Social Media Investigations Shellee Hale

Upload: shelleehale

Post on 19-May-2015


DESCRIPTION

Data Mining and social media investigations and profiling.

TRANSCRIPT

Page 1: Social Media Investigations

Data Mining And Internet Profiling:

Approaches to Successful Online Social Media Investigations

Shellee Hale

Page 2: Social Media Investigations

Shellee Hale - President of Camandago, Inc.

• WA Licensed Private Investigator

• CAS – Certified Anti-Terrorism Specialist

• CEMS – Certified Emergency Management Specialist

• Specializes in:
– Cyber Tracing
– Dataveillance
– Cyber Warfare Threat Profiling

• Constituent for the Overseas Security Advisory Council (OSAC)
– Federal Advisory Committee with a U.S. Government charter to promote security cooperation between U.S. private sector interests worldwide and the U.S. Dept. of State

• InfraGard Member

• Seattle FBI Citizens Academy Alumni Association

Page 3: Social Media Investigations

Dataveillance

• Dataveillance is the systematic use of digital personal data in the investigation or monitoring of the actions or communications of one or more persons

• Web based search

• Social Media

Page 4: Social Media Investigations

Search Engines

• Search engines are algorithmic information retrieval systems that allow searching of massive web-based databases. A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented in a list and are often called hits. The information may consist of web pages, images and other types of files.

• Google, • Bing, • Yahoo,• AOL Search, • AlltheWeb.com, • Ask Jeeves, • Excite, • Lycos, • Alta Vista

Page 5: Social Media Investigations

Google Hacks,

Tara Calishain, Rael Dornfest Publisher: O'Reilly Media

http://oreilly.com/catalog/9780596004477

• http://www.googleguide.com/advanced_operators_reference.html

• http://www.googleguide.com/

• http://tothepointresearch.com/

• http://www.tothepc.com/archives/10-bing-search-tips-features-for-better-searching/

• http://malektips.com/bing_search_engine_help_and_tips.html

• http://www.websightcreations.com/search_engine_tips.html

• http://www.cnet.com/4520-10165_1-6415447-1.html

• http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=136861

Search Engine Tutorials

Page 6: Social Media Investigations

Search Techniques

• + and -

• Quotations

• Keyword order and lowercase

• Truncation (*)

• Allinbody [allinbody:keyword] (also Allintitle, Allinurl)

Boolean logic

• Enclose OR statements in parentheses.

• Always use CAPS. Most engines require that the operators (AND, OR, AND NOT/NOT) be capitalized.

• http://www.internettutorials.net/boolean.asp
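The two Boolean rules above (parenthesize OR statements, capitalize operators) can be sketched as a small query builder. This is only an illustration: the function names `phrase`, `or_group` and `build_query` and the sample search terms are hypothetical, and each search engine has its own exact syntax.

```python
def phrase(term):
    """Wrap multi-word terms in quotation marks so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def or_group(terms):
    """Join alternatives with a capitalized OR and enclose them in parentheses."""
    return "(" + " OR ".join(phrase(t) for t in terms) + ")"


def build_query(required, any_of=(), excluded=()):
    """Combine required terms, one optional OR group, and NOT-excluded terms."""
    parts = [phrase(t) for t in required]
    if any_of:
        parts.append(or_group(any_of))
    query = " AND ".join(parts)
    for term in excluded:
        query += f" NOT {phrase(term)}"
    return query


# Hypothetical search terms, for illustration only.
query = build_query(["jane doe"], any_of=["seattle", "tacoma"], excluded=["obituary"])
print(query)
```

Keeping the OR group in its own helper makes it harder to forget the parentheses when several alternatives are combined with other operators.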

Page 7: Social Media Investigations

Archive.org

• History of a website

• Doesn’t include adult content

• Not a complete archive

• You can remove yourself from the Wayback Machine with robots.txt

• Always copy URLs, because sometimes you can’t backtrack. Google updates its results constantly, and with more than 20 billion websites out there, you may never find the same info again.

• Take screenshots of content, or consider making use of CAMTASIA, a screen recorder and editing software program.

• Camtasia Studio
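As a rough illustration of the robots.txt point on this slide, the sketch below uses Python's standard `urllib.robotparser` to check what a robots.txt file permits. The rules shown are an assumed example, and `ia_archiver` is used as the archive crawler's name on the assumption that this is how it has historically identified itself; whether an archive honors these rules is up to the archive, so treat this only as a way to read a site's stated policy.

```python
from urllib.robotparser import RobotFileParser

# Assumed example robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are placeholders.
print(is_allowed(ROBOTS_TXT, "ia_archiver", "http://example.com/page.html"))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/page.html"))
```

For an investigator, a site that blocks archiving is a hint that historical copies may be missing, which is one more reason to capture content yourself.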

Page 8: Social Media Investigations

Meta Search Engines

• Search engines that search other search engines and directories. They extract the best of the searches from various popular search engines and directories and include the information in their own search results.

• Dogpile, • WebCrawler, • Excite, • MetaCrawler, and • Ixquick

Page 9: Social Media Investigations

Directories

• Directories are catalogs of websites that human editors compile and organize into subject categories, rather than indexes built automatically by spiders.

• Yahoo Directory

• LookSmart

• Open Directory (DMOZ.org)

• Wikipedia

Page 10: Social Media Investigations

Gateways

Collections of databases and informational sites that are assembled, reviewed and recommended by specialists, and that are used to access this material.

Subject-Specific Databases (Vortals) are devoted to a single subject, e.g. WebMD

Invisible Web

• Large portion of the Web that search engine spiders cannot index - 60-80% of web material

• Password-protected sites

• Documents behind firewalls

• Archived material

• The contents of certain databases

• Information that isn't static but assembled dynamically in response to specific queries

Page 11: Social Media Investigations

Verifying Sources

• Unlike scholarly books and journal articles, web sites are seldom reviewed or refereed. It's up to you to check for bias and to determine objectivity. Try to assess the stability of the pages you reference.

• Understand the legitimacy of the web address: .edu, .gov and .mil are the most reliable sources, compared with .com, .net and .org. Countries have specific codes: .ca, .uk, etc.

• Look closely at the page sponsor, last date updated, and the authority of the author(s) if possible.

• Research Information on domain ownership whois.net/

• Verify inbound links checkbacklinks.net

• Check web traffic Alexa.com
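A minimal sketch of the web-address check described above, assuming the URL includes a scheme such as `http://`. The function name `classify_domain` and the category labels are hypothetical, and the result is only a first-pass hint: the ending of a domain says who was allowed to register it, not whether a given page is accurate.

```python
from urllib.parse import urlparse

RESTRICTED = {"edu", "gov", "mil"}   # registration limited to specific institutions
OPEN = {"com", "net", "org"}         # anyone can register


def classify_domain(url):
    """Classify a URL by the last label of its host name."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in RESTRICTED:
        return "restricted registration"
    if tld in OPEN:
        return "open registration"
    if len(tld) == 2 and tld.isalpha():
        return "country code"
    return "other"


# Placeholder URLs for illustration.
for url in ["https://www.state.gov/travel", "http://example.co.uk/", "http://whois.net/"]:
    print(url, "->", classify_domain(url))
```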

Page 12: Social Media Investigations

Information Aggregators

These are tools which pull in information from multiple sources, and consolidate that information into a smaller and more easily digested number of streams

• RSS Feeds (Google Reader, Bloglines) Pull blogs into a single stream of information

• Spokeo - Big Brother Of Social Networking http://www.pandia.com/sew/620-spokeo.html

• 123people.com – Gateway to Paid databases. Shows available websites around a specific name.

• Pipl - The most comprehensive people search on the web

• yoName – Searches Social Networks

Real time news interceder.net

Email alerts

Brizzly, Seesmic Web, HootSuite, Dabr, Slandr, etc.

Real-time news
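To illustrate what an RSS aggregator does when it pulls several blogs into a single stream, here is a small sketch using only the Python standard library. The two feeds are made-up inline samples (no network access), and the names `parse_items` and `merge_feeds` are hypothetical; real feeds vary in structure and may need a more tolerant parser.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up sample feeds standing in for two different blogs.
FEED_A = """<rss><channel><title>Blog A</title>
<item><title>Older post</title><link>http://a.example/1</link>
<pubDate>Mon, 18 May 2015 10:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss><channel><title>Blog B</title>
<item><title>Newer post</title><link>http://b.example/1</link>
<pubDate>Tue, 19 May 2015 08:30:00 +0000</pubDate></item>
</channel></rss>"""


def parse_items(rss_xml, source):
    """Extract title, link and publication time from each <item> in one feed."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def merge_feeds(feeds):
    """Merge items from all feeds into one stream, newest first."""
    merged = []
    for source, rss_xml in feeds.items():
        merged.extend(parse_items(rss_xml, source))
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)


for entry in merge_feeds({"Blog A": FEED_A, "Blog B": FEED_B}):
    print(entry["published"].isoformat(), entry["source"], entry["title"])
```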

Page 13: Social Media Investigations
Page 14: Social Media Investigations

Analyzing Social Media Networks with NodeXL: Insights from a Connected World

Derek Hansen, Ben Shneiderman, Marc A. Smith 

Social Media Networks

Page 15: Social Media Investigations
Page 16: Social Media Investigations

Website TOS, Privacy Laws And Proposed Regulations

• Social media is a key component in profiling a subject of investigation. The pool of information about each individual can form a distinctive “social signature,” but there are limitations to the info you can access on a social network due to privacy settings and anonymity.

Page 17: Social Media Investigations

Issues With Anonymity

• We have a right to it, but websites are not allowing it via TOS. You can be anonymous online, but how can you be anonymous when sites are asking for real info?

– If you go into Facebook and set up a profile, their TOS say that is you. You have to have a valid email address, but how would the site know whether someone is using a random email address and name?

– It is not illegal for internet users to impersonate or create a false identity online.

• Popularity of a site comes with vulnerability to attack.

– We are seeing an increase in spoofing, e.g. reset-password emails giving someone else ownership of your account.

– Be advised that accounts under a person's name can be a result of spoofing and were not necessarily created by that user.

In the context of network security, a spoofing attack is a situation in which one person or program successfully masquerades as another by falsifying data and thereby gaining an illegitimate advantage.

Page 18: Social Media Investigations

The Privacy Debate

• We want privacy, yet we expose private details of our lives online.

• Once you post something, you are leaving a digital footprint that is owned by the site.

• Facebook has been receiving a lot of bad press. Users fear how their data might be used. Privacy policies and TOS are constantly being changed.

• We are seeing 2 different agendas in terms of advocates in online privacy

1. We put pressure on websites to protect our information, and we do reserve that right.

2. But at the same time, because of the vast scope of information on social media, the government wants a backdoor to get info for investigations and terrorism research.

– This will leave personal info vulnerable to hackers...

Consider This…

There are different privacy laws in every country.

Check the TOS and privacy policy on each website. They may allow backdoors.

Page 19: Social Media Investigations

Privacy Settings

• It's important to understand privacy laws and settings for major social networks to understand limitations, and how to potentially work around them. Users can select their own privacy settings, and there are few ways to get around them.

Privacy Settings: Profile can be viewable by

• No one
• Everyone
• Friends of Friends
• Friends Only
• Custom

Facebook Profiles Offer

• Phone numbers
• Email addresses
• Photos, which provide a history and timeline
• Status updates, which offer current whereabouts etc.

Page 20: Social Media Investigations

Public Tweets:

• Your updates appear in Twitter’s public timeline, a flowing river of every member’s status.

• Anyone can see your Twitter updates.

• Your Twitter updates can be indexed by search engines.

Protected Tweets:

• People will have to request to follow you, and each follow request will need approval.

• Your profile and Tweets will only be visible to users you've approved.

• Protected profiles' Tweets will not appear in Twitter search.

• @replies sent to people who aren't following you will not be seen.

• You cannot share static page URLs with non-followers.

Page 21: Social Media Investigations

Default Settings: By default, people on MySpace can see when you’re online. Your profile and photos are also set to be viewable by everyone.

Privacy Options: MySpace’s privacy options are very limited, but changing three key settings can provide you with some important privacy protection:

• Online Now
• Profile Viewable by
• Photos

Page 22: Social Media Investigations

• If you have an email address you want to put a face to, you can also find who owns an email address by searching the email address in the Facebook search window.

• Anyone can create a fake profile so use this to your advantage. Some users will allow friends of friends to access part if not all of a profile. Befriend a friend of someone you are investigating.

Tips & Tricks

• How do you get in and see info if it's been deleted?

• Tweleted allowed you to recover deleted Twitter messages.

• If user A quotes user B, who then removes the tweet, it will still show up in user A's quotes.

RESOURCE: How To Protect Your Privacy on Facebook, Myspace, and LinkedIn

http://www.mint.com/blog/moneyhack/howto-protect-your-privacy-on-facebook-myspace-and-linkedin/

Page 23: Social Media Investigations

Properly Documenting Social Media Investigations

• Always copy URLs, because sometimes you can't backtrack. Google updates its results constantly, and with more than 20 billion websites out there, you may never find the same info again.

• Take screenshots of content (e.g. craigslist ads).

• Consider making use of Camtasia, a screen recorder and editing software program:
– Take screencaps on the fly
– Draw attention with arrows, add text
– Organizational tools: search for your captures by date, website, or a custom flag that you create and assign
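One way to put the "copy the URL and take a screenshot" advice into practice is to write a small log entry for every capture. The sketch below is an assumption-laden illustration, not a tool mentioned in the talk: the `make_log_entry` function, the field names, and the idea of storing a SHA-256 hash of the screenshot file (so later changes to the file can be detected) are all additions for this example.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_log_entry(url, screenshot_bytes, note=""):
    """Record where and when content was captured, plus a hash of the screenshot."""
    return {
        "url": url,
        "captured_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "screenshot_sha256": hashlib.sha256(screenshot_bytes).hexdigest(),
        "note": note,
    }


# Placeholder URL and bytes; in practice the bytes would be read from the
# saved screenshot file, e.g. open(path, "rb").read().
entry = make_log_entry("http://example.com/ad/123", b"placeholder image bytes",
                       note="craigslist ad")
print(json.dumps(entry, indent=2))
```

Appending each entry to a running file gives a dated trail of every page visited, which helps when the original content can no longer be found.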

Page 24: Social Media Investigations

Centrifuge Systems

• Centrifuge has created a powerful approach to analysis called “Interactive Analytics”. Our next generation approach provides groundbreaking visualizations accessible from any browser and any operating system.

• “Interactive Analytics” (IA) is based on extensive work with the US Intelligence Community and brings together three innovations in analytics today: Interactive Data Visualization, Unified Data Views and Collaborative Analysis.

http://www.centrifugesystems.com/

Page 25: Social Media Investigations

AskSam.com

A free-form database designed for users rather than programmers (like a CMS)

Easy to turn anything into a searchable database: email messages, word processing documents, text files, spreadsheets, addresses, Web pages, and more.

http://www.asksam.com/

Page 26: Social Media Investigations

Factors in Predicting Online Deception

DECEPTION: Any intentional control of information in a message to create a false belief in the receiver of the message

Page 27: Social Media Investigations

Frequency Of Lying

“Electronic mail is a godsend. With e-mail we needn’t worry about so much as a quiver in our voice or a tremor in our pinkie when telling a lie. Email is a first rate deception-enabler.”

~Keyes (2004) The Post-Truth Era

How do different media affect lying and honesty?

• 1.75 lies identified in a 10 minute exchange

• Range from 0 lies to 14 lies

• Self-presentation goal (‘likeable’) increases deception

Page 28: Social Media Investigations

True Personality vs. Embellished Identity

Changing pronouns, as benign as it seems, is the queen mother of linguistic violations and is a very strong indication that deception might be present!

For instance, “our house” vs. “my house”.

• In many cases, if a person does not start with “I”, the statement is more likely to be lacking credibility.

• Consider that:
– The others were not significant enough to mention
– There is emotional distance
– The author may be trying to conceal someone's presence in the story
– The author is under a tremendous amount of stress

• Examples:
– We went to the store
– He and I went to the store
– I went to the store with him

Pronouns to watch for:

• I • me • my • mine • you • your(s) • him • his • he • she • her • hers • it • its • they • them • their • theirs • us • we • our(s) • myself • yourself • himself • herself • ourselves • themselves

http://www.likasoft.com/document-search/
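The pronoun list on this slide lends itself to a simple count. The sketch below tallies which of those pronouns appear in a statement and whether the statement opens with "I". The function name `pronoun_profile` is hypothetical, and the output is only a way to spot shifts between statements (for example from "we" to "I"); it is not a lie detector.

```python
import re
from collections import Counter

# Pronouns taken from the slide, with "your(s)" and "our(s)" expanded.
PRONOUNS = {
    "i", "me", "my", "mine", "you", "your", "yours", "him", "his", "he",
    "she", "her", "hers", "it", "its", "they", "them", "their", "theirs",
    "us", "we", "our", "ours", "myself", "yourself", "himself", "herself",
    "ourselves", "themselves",
}


def pronoun_profile(statement):
    """Return (pronoun counts, whether the statement starts with 'I')."""
    words = re.findall(r"[a-z']+", statement.lower())
    counts = Counter(word for word in words if word in PRONOUNS)
    starts_with_i = bool(words) and words[0] == "i"
    return counts, starts_with_i


# The three example sentences from the slide.
for text in ["We went to the store",
             "He and I went to the store",
             "I went to the store with him"]:
    counts, starts_with_i = pronoun_profile(text)
    print(text, "->", dict(counts), "| starts with I:", starts_with_i)
```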

Page 29: Social Media Investigations

Online Deception

The ambiguity of the Internet allows complete anonymity, providing the user with the ability to create false and misleading profiles and identities online, thus hiding their true identity.

• Gender swapping online, with men playing women

• Adults posing as children, etc.

• Lies or exaggerations of one’s physical appearance, personality or characteristics, or even slight exaggerations of a genuine characteristic, such as denying being a smoker, drinker, etc.

One can have ‘as many electronic personas as one has time and energy to create’ (Donath, 1999).

Page 30: Social Media Investigations

• STUDY: Research from the University of Texas at Austin suggests that users express their true personality, not an embellished identity, over online social networks such as Facebook.

• The Texas researchers collected 236 profiles of college-aged users of Facebook in the United States and StudiVZ, the equivalent in Germany. The users filled out questionnaires about their personality and also about who they'd like to be. Strangers browsed and rated the online profiles, and the study authors compared the ratings with the users' questionnaires.

• FINDINGS:

• Networks such as Facebook are more “genuine mediums for social interactions than vehicles for self-promotion,”

• But whether honesty on Facebook comes naturally or is necessitated by your audience is up for debate. “You don't have full control over it. Other people can write things on your wall and tag you in unflattering photos, etc.,” stated Professor Hancock.

CASE STUDY ON DECEPTION ON FACEBOOK

Page 31: Social Media Investigations

Detecting Deception

• Inconsistencies in actions or words do not necessarily indicate a lie, just as consistency is not necessarily a guarantee of the truth.

• However, a pattern of inconsistencies or unexplainable behavior normally indicates deceit.

Page 32: Social Media Investigations

Repeat questions

• Should not be exact repetitions of an earlier question.

• The investigator must rephrase or otherwise disguise the previous question.

• Repeat questions also need to be separated in time from the original question so the information cannot easily be remembered.

Control Questions

• Developed from recently confirmed or known information that is not likely to have changed.

– If the answer to a control question is not given as expected, it may be an indicator of deceit.

Topical Examples:

• Last day of school, vacation dates

• School events, pop culture trivia

• Video game trivia

Techniques For Identifying Deceit

Example:

Q1 – What was the score of the baseball game?

A1 – Well, first of all, you wouldn’t believe how much the tickets cost; then I had to get something to eat, which is a total waste of money....

Page 33: Social Media Investigations

Internal Inconsistencies

• Frequently when someone is lying, an investigator will be able to identify inconsistencies in the timeline, the circumstances surrounding key events, or other areas within the questioning.

– For example, someone spends a long time explaining something that took a short time to happen, or a short time telling of an event that took a relatively long time to happen.

Example:

Q1 – What was the score of the baseball game?

A1 – Well, first of all, you wouldn’t believe how much the tickets cost; then I had to get something to eat, which is a total waste of money....

Page 34: Social Media Investigations

“Placement” and “Access”

• Based on a person’s job, geographical location, age, etc., investigators should have a basic idea of the breadth and depth of information that such a person should know.

– When answers show that someone does not have the expected level of information (too much or too little or different information than expected), this may be an indicator of deceit.

Repeated Information

• Often if someone plans on lying about a topic, they will memorize or practice exactly what they are going to say.

• If they always relate an incident using exactly the same wording, or answer ‘repeat’ questions identically (word for word) to the original question, it may be an indicator of deceit.

Example:

In an extreme case, if someone is interrupted in the middle of a statement on a given topic, they will have to start again at the beginning in order to “get the story straight.”
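The "Repeated Information" indicator, answers that come back word for word, can be approximated by measuring how similar two written answers are. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the function names and the 0.95 threshold are arbitrary assumptions chosen for illustration, and a high score is only a prompt for closer review.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def answer_similarity(first, second):
    """Return a similarity ratio between 0.0 and 1.0 for two answers."""
    return SequenceMatcher(None, normalize(first), normalize(second)).ratio()


def flag_rehearsed(first, second, threshold=0.95):
    """Flag a pair of answers that are nearly identical in wording."""
    return answer_similarity(first, second) >= threshold


# Made-up answers to an original question and a rephrased repeat question.
original = "I left work around six and drove home."
repeat_same = "I left work around six  and drove home."
repeat_varied = "After my shift ended I took the car back to my place."

print(flag_rehearsed(original, repeat_same))
print(flag_rehearsed(original, repeat_varied))
```

Truthful people usually rephrase when asked again, so it is the near-identical pair that deserves a second look, in line with the slide's point.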

Page 35: Social Media Investigations

Incongruent Appearance and Incongruent Language

• If someone’s online appearance does not match their story, it may be an indication of deceit.

• If the type of language, including sentence structure and vocabulary, does not match the story, this may also be an indicator of deceit.

Example:

If the suspected liar does not use the proper technical vocabulary to match an otherwise familiar story, this may be an indicator of deceit.

Page 36: Social Media Investigations

Questions?