spiders, farms, and bubbles: how to become an expert internet searcher
SPIDERS, FARMS, AND BUBBLES
HOW TO BECOME AN EXPERT INTERNET SEARCHER
WHAT WE WILL COVER TODAY
• How Google works (briefly)
• Why that can limit your access to information you want to find
• Tricks, tips, and new ways of thinking about finding information online
HOW DOES GOOGLE WORK?
SPIDERS EXPLAINED
IT STARTS WITH UNDERSTANDING THE INTERNET
• It’s big…approximately 299 million registered domain names globally as of fall 2015.¹
• It’s diverse…representing a connection of many networks (silos).
• It’s dynamic.
DOES GOOGLE SEARCH THE INTERNET?
THREE PARTS OF A SEARCH ENGINE
MAJOR POINTS
• Spiders explore the Internet through links
• Spiders build lists of words and where the words are found on websites
• Search engines look through the index created by the spiders, not the Internet
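The three points above can be sketched in a few lines of Python. This is only a toy illustration, not how Google is actually built: the `PAGES` dictionary, the `.example` addresses, and the function names are all made up for the example. It shows a spider following links, building a word index, and a search that reads only the index.

```python
from collections import defaultdict, deque

# Hypothetical miniature "web": address -> (page text, outgoing links).
PAGES = {
    "a.example": ("autism research news", ["b.example"]),
    "b.example": ("food allergy research", ["a.example"]),
    "c.example": ("unlinked autism page", []),  # no page links here
}

def crawl_and_index(start):
    """Follow links from `start` and build a word -> set-of-addresses index."""
    index = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        for word in text.split():
            index[word].add(url)          # record where each word was found
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)            # the spider only reaches linked pages
                queue.append(link)
    return index

def search(index, word):
    """Answer a query from the index alone, never from the live pages."""
    return sorted(index.get(word, set()))

index = crawl_and_index("a.example")
print(search(index, "autism"))  # c.example is absent: no link leads to it
```

The page at `c.example` mentions autism but never appears in the results, because no link leads the spider to it.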
WHAT DOES THIS MEAN FOR YOUR SEARCH?
FARMS AND BUBBLES
NOT EVERYTHING IS FINDABLE BY SEARCH ENGINES
• Not findable by search engines:
– Dynamic pages (accessed through a web form)
– Content that requires authentication
– Non-HTML text content
– Unlinked content
• All search engines exclude material from their index of web pages
HIDDEN WEB
Estimated to represent 99% of the Internet*
Consists largely of database content (like library databases)
*A rough figure: the size of the hidden web is very hard to estimate
THE PAGES YOU SEE ARE RANKED BY AN ALGORITHM
• For Google, there are over 200 factors in the ranking algorithm³
• Includes a judgement of quality according to PageRank, which weighs the number and quality of links pointing to a website
• Includes personalization, or results tailored to your previous search behaviors
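To make the link-based quality judgement concrete, here is a minimal sketch of the classic published PageRank idea. It is a simplification for illustration only and is not Google's current ranking algorithm. The page names in `LINKS`, the damping value of 0.85, and the iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}            # start with equal scores
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its score over every page.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share         # each link passes on some score
        rank = new_rank
    return rank

# Hypothetical four-page web: everything links to "hub", nothing links to "lonely".
LINKS = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "lonely": ["hub"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, "hub" ends up with the highest score because three pages link to it, and "lonely" ends up lowest because nothing links to it. A link from a highly ranked page is also worth more than a link from a low-ranked one.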
FILTER BUBBLES
The information fed to you by search engines and social media that reflects your personal interests and views, your geographic location, age, and ethnicity²
Cognitive hidden web
INTERESTED PARTIES WILL ALWAYS TRY TO BEAT THE SYSTEM
• Search Engine Optimization (SEO)
• Tampers with the objectivity of search results
CONTENT FARMS
Vast amounts of low-quality text based on analysis of search queries and ranking optimization
Examples include eHow, wikiHow, Answers.com
Written to boost advertising revenue
Google tweaked its algorithm in 2011 and may again with Knowledge-Based Trust
AN ALGORITHMIC CULTURE
Research has shown that many people equate relevancy in search rankings with reliability.³
Google-conditioned expectation of simple search
MAN AGAINST MACHINE
TIPS FOR THOUGHTFUL INTERNET SEARCH
PUT YOURSELF BACK IN THE DRIVER’S SEAT
AVOID SATISFICING WHEN IT MATTERS
• People make good-enough decisions rather than exploring all possible options.⁴
• For most, ease of access trumps quality of content.
• Consciously decide when you must resist satisficing.
Google Instant Search by geek & poke is licensed CC 3.0
IS AUTISM RELATED TO FOOD ALLERGIES?
Performed a Google search of “Is autism a food allergy?”
KNOW WHEN GOOGLE IS GOOD…AND WHEN IT IS BAD
• Ready reference versus complex information problems
• Alternate search engines
– Specific types of information
– US vs. international
– Info Google isn’t designed to find or prefer
ALTERNATE SEARCH ENGINES
• Bielefeld Academic Search Engine http://www.base-search.net/
• Deep web business search www.biznar.com
• Curated results for students www.sweetsearch.com
• Carrot2 Clustering, federated search http://search.carrot2.org/stable/search
• Mamma metasearch https://mamma.com/
• Statistics http://www.statista.com/
• Social media search http://www.socialmention.com/
• Blog directory/search http://regator.com/
• Search results with subject facets, global focus http://www.exalead.com/search/web/
ALTERNATE SEARCH
Did a PubMed search for “autism food allergy” and limited results to reviews from the last 5 years
Clicked link for similar articles
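The search on the slide was done through the PubMed website. As a rough sketch of the same idea in script form, the example below builds a request URL for NCBI's E-utilities search service. Using E-utilities at all is an assumption of this example and not part of the original demonstration. The function name is made up, and "5 years" is approximated as 5 × 365 days.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here; the slide used the website).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_review_search_url(topic, years=5, max_results=20):
    """Build a PubMed search URL limited to recent review articles."""
    params = {
        "db": "pubmed",
        # Restrict results to articles tagged as reviews.
        "term": f"({topic}) AND review[Publication Type]",
        "datetype": "pdat",        # filter on publication date
        "reldate": years * 365,    # only the last N days
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)

url = pubmed_review_search_url("autism food allergy")
print(url)
```

The script only prints the URL; it does not send the request.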
START AT THE SOURCE
• Consider first what sources of information or pieces of evidence might be found.⁵
• Directories
• Associations
• Government sources
• Library resources
• Citation trails
HOW TO LOOK FOR SOURCES
DIRECTORIES AND SEARCH ENGINES
• Open Directory Project
• Library of Congress E-Resources
• SMU Research Guides and other LibGuides
• Directory of Open Access Journals
• Web of Science
• Internet Archive
• SimilarSiteSearch.com
GOOGLE TRICKS
• Keyword search for topic and…
– “LibGuides”
– “database”
– “government”
– “association” or “organization”
– “directory”
• related:URL for similar sites
– Ex. related:http://censusreporter.org/
• site:URL to search within a site
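The tricks above amount to simple query patterns. The sketch below just assembles the query text you would type into the search box. The function names are made up for this example, and the keyword "poverty" in the usage lines is a hypothetical search term.

```python
# Source-type words from the slide, to be paired with a topic.
SOURCE_WORDS = ["LibGuides", "database", "government",
                "association", "organization", "directory"]

def source_queries(topic):
    """Pair a topic with each source-type word, e.g. 'autism LibGuides'."""
    return [f"{topic} {word}" for word in SOURCE_WORDS]

def related_query(url):
    """Query text for finding sites similar to `url`."""
    return f"related:{url}"

def site_query(url, keywords):
    """Query text for searching for `keywords` within one site."""
    return f"site:{url} {keywords}"

for query in source_queries("autism"):
    print(query)
print(related_query("censusreporter.org"))
print(site_query("censusreporter.org", "poverty"))
```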
SMU RESEARCH GUIDES
START AT THE SOURCE
Did a search for “autism” in the Open Directory Project
START AT THE SOURCE
Performed Google search of “LibGuides autism”
USE ADVANCED GOOGLE SEARCH FUNCTIONS
– https://www.google.com/advanced_search
– Custom time frame
– Quotation marks for exact phrases (ex. autism “gluten intolerance”)
– Minus symbol to omit words (ex. salsa recipes -tomato)
– site:URL to find keywords within a specific site
– related:URL to find similar sites
– filetype: to find only a specific file type
– link:URL to find sites that link to that URL
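As a small illustration of how these operators combine, the sketch below assembles a query string and a matching search URL. The helper names `build_query` and `google_url` are made up for this example. The URL pattern with a `q` parameter is the ordinary Google results address, used here as an assumption and not as an official API.

```python
from urllib.parse import urlencode

def build_query(keywords="", exact=(), exclude=(), site=None, filetype=None):
    """Combine plain keywords with the operators listed above."""
    parts = [keywords] if keywords else []
    parts += [f'"{phrase}"' for phrase in exact]   # quotation marks: exact phrase
    parts += [f"-{word}" for word in exclude]      # minus sign: omit a word
    if site:
        parts.append(f"site:{site}")               # search within one site
    if filetype:
        parts.append(f"filetype:{filetype}")       # only a specific file type
    return " ".join(parts)

def google_url(query):
    """Turn the query text into a results-page URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})

q1 = build_query("autism", exact=["gluten intolerance"])
q2 = build_query("salsa recipes", exclude=["tomato"])
print(q1)              # autism "gluten intolerance"
print(q2)              # salsa recipes -tomato
print(google_url(q2))
```

Note that the minus sign must be a plain hyphen placed directly before the word, with no space in between.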
GOOGLE CUSTOM TIME FRAME
Google assumes you want the most recent results
Set a custom time frame from the results screen
START AT THE SOURCE
Performed Google search of “related:autismspeaks.org”
QUESTION YOUR MOTIVES
• Be aware of your own confirmation bias in your search terms.
• People underestimate the value of what they do not know and overestimate the value of what they do know.⁴
QUESTION YOUR MOTIVES
Did a Google search of “autism cure by diet”
Contrast with a search of “autism food myth”
TURN OFF PERSONALIZATION WITH GOOGLE ACCOUNTS
• Go to https://support.google.com/accounts/answer/465?hl=en&rd=1
• “Pause” the saving of future searches
• Delete search activity from your account
• Delete past searches in your Google account and on your browser
OR
• Click the globe icon in the upper right-hand corner of Google search results
ALTERNATIVES
• Log out of Google
• Search in incognito mode
• Use another search engine that does not personalize your results, such as Mamma or Dogpile
REFERENCES
1. Verisign. (2015). Domain name industry brief. Retrieved from http://www.verisign.com/en_US/innovation/dnib/index.xhtml
2. Pariser, E. (2011). Beware online filter bubbles [Video file]. Retrieved from https://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles?language=en
3. Hargittai, E., Fullerton, L., Menchen-Trevino, E., & Thomas, K. (2010). Trust online: Young adults’ evaluation of web content. International Journal of Communication. Retrieved from http://ijoc.org/index.php/ijoc/article/view/636/423
4. Kahneman, D. (2011). Thinking, fast and slow. New York: Farrar, Straus and Giroux.
5. Stebbins, L. (2015). Finding reliable information online. London: Rowman & Littlefield.