Internet Search and Analysis in Intelligence and Investigations

Tuesday, January 11, 2011, 7:30 AM – 8:45 AM

Ed Appel, Proprietor, iNameCheck

Presentation

• Quick Internet Overview
• Online Sources & Methods
• Legal, Policy & Privacy Guidelines
• Policy & Regulatory Issues

The Internet Is Essential for Investigations and Intelligence

• Accessible data
• Who’s online: 80%
• 30%+ power users
• Crime & misbehavior
• Due diligence
• Intelligence
• Vetting
• Investigations

Internet Growth

The numbers, however precise, show that Internet growth is rapid.

Pew found all age groups online in significant percentages.

[Charts: Millions of Users (Source: InternetWorldStats); IP traffic in Petabytes/month (Source: Cisco)]

US: #2 in world (after China) – 239,893,600 users of 310.2M population – 77.3%, per InternetWorldStats.com

The Internet Universe

November 3, 2003 Map of the Internet

MIT Internet Map 2007

San Diego Supercomputer Center I-Map 2008

An increasingly complex, interconnected galaxy of nodes is portrayed in these Internet maps by leading technologists.

San Diego Supercomputer Study of Internet Links

The map of the Internet, as built and described in a Nature Communications paper, shows the locations of Internet systems on the hyperbolic plane. Image courtesy of Dmitri Krioukov, SDSC/CAIDA

A billion or more people use the Internet daily, according to recent SDSC research.

What’s on the Internet?

• Social Networking
• News & Blogs
• Maps & Locations
• Games & Hobbies
• Photos
• Video, Film, Music
• Libraries
• E-Commerce
• Advertising
• Private Websites
• Porn, Exploitation
• Illegal Sites
• Illegal Activities
• Illicit Activities
• Forbidden Activities
• Fantasy
• Humor
• Juvenile Delinquency

Wireless: Major Growth Area

What’s on the Internet?

• Public Records (Real Estate, Courts, Licenses, Businesses, Arrests, Liens, etc.)
• Residences, Building Occupants
• Telephones, Email, Mailing Addresses
• Genealogy, Births, Deaths
• Educational Institutions & Alumni
• Business & Executive Profiles
• Associations & Volunteer Organizations
• Private data vendors (Accurint, IRB, TLO)

Self-Descriptions in Online Profiles

• Yedo Da Meth Lover, 26, Colbert, Washington (MySpace)
• Lowlife, 26, Brownsville/Austin, TX, “Death to the New World Order” (MySpace)
• Crack Monkey, 21, Somerset, NJ, Rider grad (MySpace)
• Hacker Club (Facebook)
• Lynn, N. Seattle ecstasy dealer (MySpace)
• Angela, meth addict (MySpace)

Illicit Behavior Online: People We Trusted

A Florida Assistant US Attorney was arrested in 2007 as he arrived in Detroit carrying a doll, earrings and Vaseline, after trying in Internet chats to arrange to have sex with a 5-year-old. He committed suicide in his cell in 2007.

A DHS press spokesman, caught trying to induce a “14-year-old girl” (an undercover detective) to have sex, pled no contest in 2006.

An Army Chief Warrant Officer, Director of the Army School of Information Technology, was arrested in 2010 for collecting and sharing child pornography over the Internet.

A US military contractor in Baghdad hacked girls’ computers, extorted them for nude photos & sex tapes, and tried to meet some for sex while on leave. He had over 4,000 victims when arrested and was serving a 30-year sentence as of 2010.

Case Examples

A computer forensic analyst, part of the IT security department of a Fortune 500 firm, was found publicizing himself online as a profane, offensive “leader” of 5,000 players in a popular, worldwide massively multiplayer online fantasy sci-fi game. This led to the discovery that he was playing the game all day, during both work and off-hours.

A new chief of research was found to have been disciplined by the FDA – 3 years prohibition from government contracting – for admitted scientific misconduct. While the FDA database did not show the 10-year-old sanctions, three FDA newsletters online reported them.

One lesson: What you don’t know about what’s online can hurt you.

Case Examples

About 1,000 US Navy personnel were using their Navy.mil email addresses as their MySpace user names. Many postings contain unsuitable material, including operational security issues.

A computer security specialist who pled guilty to operating a massive botnet that stole IDs and money was hired by a Santa Monica Internet search firm while he was awaiting sentencing. The firm failed to Google the convict.

Spc. Bradley Manning Accused in Wikileaks Case

“Wikileaks” chief suspect Spc. Bradley Manning, 22, of Potomac, MD, was arrested in Kuwait and incarcerated at Quantico Marine Base. He was charged in July 2010 with leaking classified videos of US air strikes in Iraq to the Wikileaks website in April 2010. An online chat acquaintance, Adrian Lamo (previously convicted of computer hacking), told authorities and the press that Manning provided thousands of classified documents to Wikileaks. Julian Assange, Wikileaks’ founder, claimed the leaker exposed US military misdeeds. US government leaders defended the actions depicted and voiced fear that US troops and informants would be killed based on the secrets leaked. The classified documents posted by Wikileaks, about 75 MB in total, numbered in the thousands.

Adrian Lamo ~2001

Julian Assange, Wikileaks

Bradley Manning was reportedly despondent over losing a lover and disciplined for striking a soldier

Leaked videos included US air strikes that killed civilians, including a Reuters reporter & driver

Manning’s charges include illegally transferring classified data to his PC, placing unauthorized software on military computers and delivering national defense info to an unauthorized party

Internet Searching is Useful For:

• Cyber vetting – virtual neighborhoods
• Criminal & corporate investigations
• IP & asset protection (insider threat)
• Compliance
• Competitive intelligence
• Legal support
• Research (any topic)

Likely Findings

• History of malicious online activities: ~3-6%
• Derogatory information, e.g. past bad acts
– Arrests, convictions, lawsuits, bankruptcies, firing
• Misuse of “anonymous” virtual identity online
• Most likely: Verification of qualifications and eligibility for the position sought in vetting

Sources & Methods for Internet Searching

• Systems & Tools
• Search Engines & Metasearch
• Websites with Databases: “Dark Web”
• Automated Searching

Analysis is critical for the information to have value.

Systems

• Search on the right computer
– Use a separate system for searching (malware risk)
– Keep anti-virus, firewall, anti-malware up to date
• Protect your anonymity – you can be detected
• Protect the subject – don’t leave a trail
• Use fast systems, applications, enough memory

Applications

• Browser: Internet Explorer, Firefox, Chrome, Safari, Opera
• Browser settings, search engine integration
• PDF printer (e.g. Adobe Acrobat)
• Database or folders – retrievable files
• Search tools (internal, Internet)

Manual Searches

• Big 5 Search Engines – Live & Cached Results
– Google (YouTube) – PageRank: 100 factors
– Yahoo! 4B pages
– Microsoft (Bing)
– Ask (MyWebSearch) 3% of searches
– AOL (MapQuest)
• Popular (Social & Sales) websites – eBay, Facebook, MySpace, Craigslist, Amazon

Other Search Engines

• All the Web - "live search" looks for terms as you type them
• AltaVista - A Yahoo property that's not what it used to be
• Exalead - Search engine from France
• FreeSearch - U.K. search engine
• Gigablast - Looks similar to Google, smaller database
• IceRocket
• Lycos
• Mamma (really a metasearch engine)
• Openfind - Emphasizes Chinese-language results
• WiseNut - Includes "Wise Guides" (topic groups)

Contemporary (“Web 2.0”) Search Tools

Twitter.com, Trackle.com, Monitter.com and Friendfeed.com help find people & provide “right now” results

Specialized Searching (Examples)

• Blogs: blogsearch.google.com, icerocket.com, sphere.com, technorati.com, blogdigger.com
• IP addresses: SamSpade.org, whois.com, networksolutions.com, domaintools.com
• Reverse phone/address: Whitepages.com, anywho.com, verizon.com
• Public records: brbpub.com (county)
• Government: usa.gov

More Searches

• Advanced search (Boolean logic)
• Special features: images, videos, maps, news, blogs
• Country-based searching
• Translations (rough)
• Tracking: Google.com/alerts (emails)
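As a rough illustration of the Boolean logic mentioned above, the sketch below assembles a query string from required terms, alternatives, and excluded terms. The helper name `build_query` and the operator spelling (quoted phrases, `OR`, a leading minus sign) are assumptions for illustration; each search engine documents its own advanced-search syntax.

```python
def build_query(all_of=(), any_of=(), none_of=()):
    """Assemble a Boolean-style search string (hypothetical helper)."""

    def quote(term):
        # Quote multi-word terms so they are treated as exact phrases.
        return f'"{term}"' if " " in term else term

    # Terms that must all appear (implicit AND).
    parts = [quote(t) for t in all_of]
    # Alternatives: at least one should appear.
    if any_of:
        parts.append("(" + " OR ".join(quote(t) for t in any_of) + ")")
    # Terms to exclude from results.
    parts.extend("-" + quote(t) for t in none_of)
    return " ".join(parts)


query = build_query(
    all_of=["Jack Doe"],
    any_of=["Nevada", "Purdue"],
    none_of=["obituary"],
)
print(query)  # "Jack Doe" (Nevada OR Purdue) -obituary
```

Keeping the three groups separate makes it easy to rerun the same search with one more excluded term when results are cluttered by an unrelated namesake.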


Tracking

• Google and other tools (Trackle.com) allow one to track:
– Changes in websites
– Appearance of terms on indexed pages
– Appearance of terms in Twitter & other places
– Blogs & news references to a term
• Tracking is important in protection of assets and following activities of rivals & adversaries
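One simple way to notice changes in a website or the appearance of a tracked term is to fingerprint the page text on each visit and compare it with the previous fingerprint. The sketch below is a minimal illustration, not how Google Alerts or Trackle work internally; the function names are hypothetical, and it assumes the page text has already been retrieved and saved by some other means.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Return a SHA-256 fingerprint of the page text."""
    # Collapse runs of whitespace so layout-only differences are ignored.
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, page_text: str) -> bool:
    """True if the page text no longer matches the stored fingerprint."""
    return fingerprint(page_text) != previous_fingerprint


def mentions(page_text: str, terms) -> list:
    """Return the tracked terms that appear in the page (case-insensitive)."""
    lowered = page_text.lower()
    return [term for term in terms if term.lower() in lowered]


stored = fingerprint("Acme  Corp\nnews")
print(has_changed(stored, "Acme Corp news"))         # False
print(has_changed(stored, "Acme Corp news update"))  # True
print(mentions("Acme Corp News", ["acme", "widget"]))  # ['acme']
```

Storing only the fingerprint and the date of each visit keeps the tracking log small while still showing when a page changed.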


Leveraging Search Engine Findings

• Identify websites that may hold more on topic
– Colleges, associations, groups, social sites
– Local press, hobbies, sports, high schools
• Identify subject’s activities that may lead to further searching
• Identify subject’s family and closest friends, who may post about the subject

Cached Web Pages

Archive.org: Website content no longer online (Wayback Machine)

Metasearch Engines

Notice that results differ in order & number.

• Dogpile (http://www.dogpile.com/): Google, Yahoo, Bing, Ask
• ixquick (http://www.ixquick.com/): 11 sites
• Metasearch (http://www.metasearchengine.com/): 27 sites
• Excite (http://www.excite.com/): Google, Yahoo, Bing, Ask
• Infospace (http://www.infospace.com/): Google, Yahoo, Bing, Ask, Twitter
• Addictomatic (http://addictomatic.com/): Metasearch engine (23 sites)
• Metacrawler (http://www.metacrawler.com/): 9 or more sites
• Search3 (http://www.search3.com/): Google, Twitter, Bing, in columns
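Because engines return different results in a different order, an investigator who runs the same query on several of them ends up merging the lists by hand. The sketch below shows one possible way to do that: links found by more engines are listed first, with ties broken by the best position any engine gave them. The ranking rule, the function name `merge_results`, and the sample engine names and URLs are all assumptions for illustration, not how any of the metasearch sites above work.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one de-duplicated list.

    results_by_engine maps an engine name to its ordered list of result URLs.
    """
    engines_for = defaultdict(set)  # URL -> engines that returned it
    best_rank = {}                  # URL -> best (lowest) position seen
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            key = url.rstrip("/")   # crude normalization: ignore trailing slash
            engines_for[key].add(engine)
            best_rank[key] = min(rank, best_rank.get(key, rank))
    # Most widely returned first, then best position, then alphabetical.
    return sorted(engines_for, key=lambda u: (-len(engines_for[u]), best_rank[u], u))


sample = {
    "engine_a": ["http://example.com/a", "http://example.com/b"],
    "engine_b": ["http://example.com/b/", "http://example.com/c"],
}
print(merge_results(sample))
# ['http://example.com/b', 'http://example.com/a', 'http://example.com/c']
```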

Invisible Web

Many online databases are not accessible to Google.

Variations in Name Searches: Examples

• Use different versions of a name:
– “John J. Doe” (full name in quotes)
– “Jack Doe” (nickname in quotes)
– “Jack Doe” Nevada (name in quotes + geographic location)
– “Jack Doe” IBM (name in quotes + job/industry/hobby)
– “Jack Doe” Purdue (name in quotes + school)
• Address – reverse address – J. Doe may work better than John Doe
• Phone Numbers
• Email Addresses
– [email protected]
– doe
– jjdoe@
– @jacksbar (used with smaller companies)
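The name variations above can be generated systematically so that none is forgotten. The sketch below is a minimal illustration under stated assumptions: the function name `name_queries` and its parameters are hypothetical, and the output is plain query text to paste into whichever search engine is being used.

```python
def name_queries(first, last, middle="", nicknames=(), context_terms=()):
    """Build quoted name-search variants (hypothetical helper)."""
    names = []
    if middle:
        # Full name with middle initial, e.g. John J. Doe
        names.append(f"{first} {middle[0]}. {last}")
    names.append(f"{first} {last}")
    # Nickname forms, e.g. Jack Doe
    names.extend(f"{nick} {last}" for nick in nicknames)
    # Initial-only form, which may work better for address lookups
    names.append(f"{first[0]}. {last}")

    # Each name alone in quotes, then each name combined with a context
    # term (location, employer, school, hobby).
    queries = [f'"{name}"' for name in names]
    queries += [f'"{name}" {term}' for name in names for term in context_terms]
    return queries


for q in name_queries("John", "Doe", middle="J", nicknames=["Jack"],
                      context_terms=["Nevada", "IBM", "Purdue"]):
    print(q)
```

With four name forms and three context terms, this produces 16 queries, which is a reminder of how quickly a thorough manual search grows.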


Quick Anatomy of Google

• Google (YouTube) constantly spiders the Internet, hits pages about once every 30 days
• Caches & indexes about 10 billion pages, more than any other search engine
• Presents search results instantly, showing live and cached data links
• Presents results in “PageRank” order based on popularity (note: ads influence results)

The Internet: 506M websites, 56B pages

Google has about 18% of pages indexed
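The 18% figure follows from the two numbers on this slide: about 10 billion indexed pages out of roughly 56 billion pages in total. A short check of the arithmetic, using a hypothetical helper name:

```python
def index_coverage(indexed_pages, total_pages):
    """Return the share of pages indexed, as a percentage."""
    return 100.0 * indexed_pages / total_pages


# Figures from the slide: ~10 billion indexed, ~56 billion total.
coverage = index_coverage(10e9, 56e9)
print(f"{coverage:.1f}%")  # 17.9%
```

Put the other way round, about 82% of pages would not turn up in a Google search on these figures, which is why the deck warns against relying on search engines alone.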


Searching Online Databases: Contents May Not Be Indexed by Search Engines

• PeopleFinders, zabasearch
• WhitePages.com, Anywho.com
• USA.gov
• USTaxCourt.gov
• BlogSearch.google, IceRocket, Sphere
• Yahoo message boards
• Whois, SamSpade.org
• Nsopr.gov
• SSNValidator.com
• USAF-locator.com
• Bop.gov/inmate
• AMA-assn.org, bms.org (MDs)
• RipoffReport.com
• RagingBull.com

Finding Search Tools

• Library of Congress: http://www.loc.gov/rr/ElectronicResources/subjects.php?subjectID=69
• List of Search Engines: http://www.pandia.com/powersearch
• Yahoo List: http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Search_Engines_and_Directories/

Search Automation

• Metasearch
• Copernic: www.copernic.com
• Corporate datamining tools
• Proprietary Software

Better COTS products are needed

Boolean Logic, Search Techniques Optimize Queries

Step-by-Step Approach

1. Search engines
– Individual (e.g. Google, Yahoo)
– Meta (DogPile, Metasearchengine)
2. Social Networks/Blog sites
3. Copernic
4. Automated searches
5. Follow-up searches

Keeping Up With The Internet

• Keep a spreadsheet with links to best sources
• Don’t rely on search engines alone
• Find new sites & drop those no longer useful
• Research what works best
• Use experts in Internet searching - outsource
• Train & equip internal Internet searchers

Procedures

• Plan – include subject-specific sites & terms
• Capture content, print into PDFs
• Include details (URLs, dates, specifics)
• Provide source for each item reported
• Log the process, if evidence results
• Do not include inappropriate data (Title VII)
• Include caveats about reliability in reports

Controversial Methods

• “Friending” subjects – in real or false identity
• Social engineering to elicit info about subject
• Emailing subject under a false identity
• “Pretexting” as the subject to elicit data from a company or someone who knows subject
• Identifying an anonymous emailer using hidden code
• “Lurking” in chat rooms

Large Scale Internet Intelligence

• Use automated search tools
• Capture & store online activities for reference
• Filter and scan results to find relevant data
• Analyze and report results along with other investigative sources
• Identify users: link real names to online IDs
• Be careful in using Internet data to ensure accuracy and fairness

Analyzing Search Results

• Attribution: Who uses a virtual identity, posts
• Verification: Proving or confirming online data
– Ultimate confirmation: admission of subject
• Filter non-identifiable, irrelevant references
• Evaluating the seriousness of findings
• How much searching is enough?

Preserving Online Evidence

• Print relevant web pages (PDF files)
• Maintain securely (encryption, digital signatures)
• Keep long enough to meet legal obligations (then delete completely)

If you are not using computer forensic tools: if the content can become evidence, keep a log and notes to support testimony about collection.
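A collection log can be as simple as one record per captured page, with the URL, the capture time, and a hash of the saved file so that later alteration can be detected. The sketch below is one possible layout, not a forensic standard: the function names and field names are assumptions, and a plain SHA-256 hash is not a digital signature, so it supplements the secure storage mentioned above and does not replace it.

```python
import hashlib
import json
from datetime import datetime, timezone


def log_capture(log, url, saved_bytes, note=""):
    """Append one collection record to the log and return it."""
    entry = {
        "url": url,
        "captured_at_utc": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(saved_bytes).hexdigest(),
        "size_bytes": len(saved_bytes),
        "note": note,
    }
    log.append(entry)
    return entry


def verify_capture(entry, saved_bytes):
    """True if the saved file still matches the hash recorded at capture."""
    return hashlib.sha256(saved_bytes).hexdigest() == entry["sha256"]


collection_log = []
pdf_bytes = b"%PDF-1.4 sample"  # stand-in for the contents of a saved PDF
record = log_capture(collection_log, "http://example.com/profile", pdf_bytes,
                     note="Profile page printed to PDF")
print(json.dumps(record, indent=2))
print(verify_capture(record, pdf_bytes))  # True
```

Recording the time in UTC avoids ambiguity if the log is later reviewed in a different time zone.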

Using Search Results

• Integrate into other reporting – with clear indication of source
• Remember: subject may not have posted item
• Fairness may demand verification of the data by the subject
• In vetting, it’s best to interview the subject about any questionable postings

Is Internet Vetting Legal? Is Internet Information “Private?”

• Internet data is public, not private: plain view, published information
• No restriction on using published information
• Must abide by all legal requirements for other types of investigative information
• No current legal requirements for
– Advising the subject
– Using Internet searching, if not outsourced

Caveat: This does not constitute legal advice

Legal & Privacy Gold Standard

• Notice, consent: add to current forms
• Attribution, verification, subject interview, redress
• Assessing results as intelligence:
– Virtual ID might be used by someone else
– Online data may be fabricated, fantasy, altered
– Basis for subject interview, adjudication

Meets FCRA & other legal requirements

Cyber Vetting Guidelines

• IACP-PERSEREC Project: Guidelines
– Cyber Vetting for Law Enforcement
– Cyber Vetting for National Security
– Cyber Posting for both above
• Nationwide series of focus groups, research
• Baseline considerations for establishing enterprise policies and procedures

PERSEREC: Defense Personnel Security Research Center, Monterey, CA
IACP: International Association of Chiefs of Police

IACP Cyber Vetting Guidelines

Developing a Cybervetting Strategy for Law Enforcement, December 2010, IACP
[Companion study for national security]

http://www.iacpsocialmedia.org/Portals/1/documents/CybervettingReport.pdf

Key Policy Issues

• Trained Internet investigators
• Outsourced (can address EEO issue)
• Internet search policies & procedures
– Liability if Internet searching is done improperly
• Defining sufficiency - completeness
• Utilizing results of searching

Issues with Private Investigators

• Licensing of cyber investigators
– Training
• Legal and ethical guidelines for cyber vetting
• Watching the watchers: regulators online
• Keeping up with the Internet

Forthcoming Book:

Internet Searches for Vetting, Investigations and Open-Source Intelligence

By Edward J. Appel
Taylor & Francis

http://www.taylorandfrancis.com/books/details/9781439827512/

Scheduled publication January 14, 2011

…contains more details on topics discussed here, e.g. how to do cybervetting and investigations ethically & legally

Questions?

Contact Information:
Ed Appel, Proprietor, iNameCheck

(301) 524-[email protected]