How does a web search engine work?


Upload: tamsin-parrish

Post on 16-Dec-2015


TRANSCRIPT

  • Slide 1
  • how does a web search engine work?
  • Slide 2
  • Slide 3
  • Search: Google (started 1998, now worth $365 billion), Bing, Amazon. Web, images, news, maps, books, shopping, apps, videos, music and much more! Sometimes people tell you what they want: they type a query into Google. Sometimes you have to guess and offer things to them: Amazon gives you shopping ideas.
  • Slide 4
  • Our objective: search engines help people find things, which saves them time and effort. As a search engine, your job is to make sure they can find things quickly and easily!
  • Slide 5
  • Overview: 1. Collect the things that you want to search (take a snapshot of the whole internet). 2. Figure out what those things are about (words in text documents, speech in videos, notes in sound). 3. Allow people to find what they want from your collection: decide which of the things you have are relevant. It is like finding a needle in a haystack!
  • Slide 6
  • 3 core parts. Crawling the web: a spider crawls across all the pages on the internet. Indexing: like a librarian categorising books. Retrieving: letting people find the stuff they want.
  • Slide 7
  • But it gets complicated: the internet is huge (trillions of webpages); useless information (old, poorly written, advertising, duplicates); inappropriate stuff; different languages; spam.
  • Slide 8
  • Search engines are expensive. Lots of computers do your work for you, but you need to tell them what work to do (programming). They all have to work together: what if some break? They run 24/7/365 and serve mobile phones, iPads, computers etc. in every corner of the world, which needs lots of fast internet connections.
  • Slide 9
  • Slide 10
  • servers
  • Slide 11
  • cooling
  • Slide 12
  • Let's take a little look at how each part works.
  • Slide 13
  • 1. Crawling: start with a bunch of websites you know about and just follow the links. Imagine if you kept clicking all the links forever. How long would it take to get back to the page you started on, if you were clicking on a different link each time? Could you cover all the pages on the internet? Is it equally likely that you will cover all pages? What about more popular pages, for example bbc.com, facebook.com etc.?
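The crawling questions on this slide can be explored on a toy example. The sketch below is only an illustration and is not taken from the slides: the five-page `LINKS` graph, the function names `crawl` and `random_surfer`, and the number of clicks are all made-up assumptions. It follows links outward from one known page, then simulates someone clicking random links and counts where they end up.

```python
import random
from collections import Counter, deque

# Hypothetical toy web: each page maps to the pages it links to.
LINKS = {
    "home": ["news", "shop"],
    "news": ["home", "blog"],
    "shop": ["home"],
    "blog": ["home"],
    "orphan": ["home"],  # links out, but no page links to it
}

def crawl(seeds, links):
    """Breadth-first crawl: start from known pages and follow each link once."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

def random_surfer(start, links, clicks, seed=0):
    """Click a random link `clicks` times and count visits to each page."""
    rng = random.Random(seed)
    visits = Counter({start: 1})
    page = start
    for _ in range(clicks):
        page = rng.choice(links[page])
        visits[page] += 1
    return visits

crawled = crawl(["home"], LINKS)
visits = random_surfer("home", LINKS, clicks=10_000)
print(crawled)                # pages reachable from "home", in crawl order
print(visits.most_common())   # heavily linked pages are visited most often
```

In this toy graph the page called "orphan" is never found, because nothing links to it, and "home" collects the most visits because every other page links back to it. That hints at answers to the slide's questions: following links does not cover pages equally, and well-linked pages are reached far more often.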
  • Slide 14
  • Slide 15
  • 2. indexing activity...
  • Slide 16
  • 3. retrieval activity...
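The indexing and retrieval steps can be sketched in a few lines. This is a minimal illustration under assumptions, not the activity from the slides: the three example pages in `PAGES` and the functions `build_index` and `search` are hypothetical. Indexing records which pages contain each word (like a librarian's catalogue), and retrieval looks up the query words and ranks pages by how many of them they contain.

```python
from collections import defaultdict

# Hypothetical mini-collection: page name -> page text.
PAGES = {
    "zoo": "the tiger is a big cat",
    "pets": "my cat likes the sofa",
    "poem": "tyger tyger burning bright",
}

def build_index(pages):
    """Indexing: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, query):
    """Retrieval: rank pages by how many distinct query words they contain."""
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for name in index.get(word, ()):
            scores[name] += 1
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda name: (-scores[name], name))

index = build_index(PAGES)
print(search(index, "big cat"))  # "zoo" matches both words, "pets" only one
print(search(index, "tiger"))    # the "poem" page spells it "tyger" and is missed
```

Counting matching words is a deliberately crude way to rank; it is used here only because it is easy to follow.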
  • Slide 17
  • Challenges. Words like "the", "a", "and", "what": useless! tiger/tigers, bengali, tyger, big cat: plurals, spelling mistakes, synonyms. What people search for often does not exactly match what people write, even though it means the same! How easy to read, or how helpful, is a web page? How much is the page really about the search topic?
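Some of these challenges can be softened by cleaning up words before they are indexed or searched. The sketch below is a rough illustration under assumptions: the stop-word set uses the four words from the slide, while the `VARIANTS` table, the plural rule, and the function name `normalize` are hypothetical and far cruder than what a real search engine would use.

```python
# Words too common to help with finding anything (the examples from the slide).
STOP_WORDS = {"the", "a", "and", "what"}

# Hypothetical hand-written table mapping spelling variants to one term.
VARIANTS = {"tyger": "tiger"}

def normalize(text):
    """Lower-case the text, drop stop words, strip plurals, unify variants."""
    terms = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        if not word or word in STOP_WORDS:
            continue
        # Naive plural rule: drop a trailing "s" from longer words.
        if len(word) > 3 and word.endswith("s"):
            word = word[:-1]
        terms.append(VARIANTS.get(word, word))
    return terms

print(normalize("What tigers and the Tyger!"))
```

After this step "tigers" and "Tyger" both become "tiger", so a search for one can match a page that uses the other. The plural rule is intentionally simplistic and will damage some words (for example "glass"), and synonyms that span several words, such as "big cat", are not handled at all.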