search engine history and algorithm

Upload: satrajit-nag

Post on 08-Apr-2018


  • 8/7/2019 Search Engine History and Algorithm

    1/18

    Seminar on

    Arijit Roy

    B.Tech, 6th Semester

    Enrolment No: 0810407

    Department of CSE


    ` Search engines

    ` Types of search engine

    ` How they work

    ` Algorithms used

    ` Pros and cons

    5/2/2011 search engine history and algorithm


    A program that searches documents for specified keywords and returns a list of the documents where the keywords were found.

    A web search engine is designed to search for information on the World Wide Web and FTP servers.

    The search results are generally presented in a list of results and are often called hits. The information may consist of web pages, images, information and other types of files.
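The definition above can be illustrated with a minimal sketch. It assumes the documents are already held in memory as plain strings and that a "keyword" is a whitespace-separated word; the function name `search` and the sample data are hypothetical, chosen only for illustration.

```python
def search(documents, keywords):
    """Return the names of the documents in which every keyword was found."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for name, text in documents.items():
        words = set(text.lower().split())
        # A document is a hit only if all specified keywords occur in it.
        if all(k in words for k in wanted):
            hits.append(name)
    return hits


# Hypothetical sample collection.
documents = {
    "page1": "web search engines index pages",
    "page2": "gopher servers and ftp archives",
    "page3": "search the ftp archives",
}
hits = search(documents, ["search", "ftp"])
```

A real engine would not scan every document per query; it would consult a prebuilt index, but the input and output are the same: keywords in, a list of matching documents out.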


    Search engines look through their own databases of information in order to find what it is that you are looking for.

    Search engine is the popular term for an Information Retrieval (IR) system.

    Although search engine is really a general class of programs, the term is often used to specifically describe systems like Google, AltaVista and Excite that enable users to search for documents on the World Wide Web and USENET newsgroups.


    1970 ARPANET first published.

    1984 DNS (Domain Name System) introduced.

    1990 Archie, the first Internet search tool, developed by Alan Emtage.

    1993 With the rise of Gopher, two new search programs, Veronica and Jughead, were developed.

    1994 WebCrawler, the first full-text crawler-based search engine.

    1998 Google and MSN launched.

    2000 Yahoo was providing search services based on Inktomi's search engine.

    2009 Microsoft's rebranded search engine, Bing, was launched.


    ` Crawler-Based Search Engines

    ` Human-Powered Directories

    ` Hybrid Search Engines, or Mixed Results


    Web crawlers are a central part of search engines, and details on

    their algorithms and architecture are kept as business secrets.


    Crawler based search engines:

    [Figure: architecture of a crawler-based search engine (Swoogle). Labels recoverable from the diagram: the Web; Discovery (Swoogle bot, bounded web crawler); candidate URLs; document cache; Semantic Web metadata; Analysis (SWD classifier, ranking); Index (IR indexer, SWD indexer); Search services (web server, web service); humans and machines.]
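The diagram labels mention a bounded web crawler working from candidate URLs. Since real crawler designs are not published, the following is only a generic sketch: it takes "bounded" to mean a cap on the number of pages visited, which is an assumption, and it walks a small in-memory link graph instead of fetching real pages. The names `bounded_crawl`, `fetch_links` and `LINKS` are hypothetical.

```python
from collections import deque


def bounded_crawl(seed_urls, fetch_links, max_pages):
    """Breadth-first crawl that stops after visiting max_pages pages."""
    candidates = deque(seed_urls)   # candidate URLs waiting to be visited
    seen = set(seed_urls)           # URLs already queued, to avoid revisiting
    visited = []
    while candidates and len(visited) < max_pages:
        url = candidates.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                candidates.append(link)
    return visited


# Hypothetical link graph standing in for the Web.
LINKS = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}

crawled = bounded_crawl(["a"], lambda url: LINKS.get(url, []), max_pages=3)
```

In a real system `fetch_links` would download the page and extract its hyperlinks, and the visited pages would be handed to the indexer.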


    Human powered directories:


    Submit your website if you want it to be included.

    Select an appropriate category for your website to be potentially listed under.

    Submit a short description of your website to the directory.

    A search looks for matches only in the descriptions submitted.


    A search algorithm is an algorithm for finding an item with specified properties among a collection of items.
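In its simplest form, this definition corresponds to a linear scan. The sketch below is illustrative only; the name `find_first` and the sample items are hypothetical, and the "specified properties" are expressed as a predicate function.

```python
def find_first(items, has_property):
    """Return the first item for which has_property(item) is true, else None."""
    for item in items:
        if has_property(item):
            return item
    return None


# Hypothetical collection of items.
pages = ["index.html", "about.html", "logo.png"]
first_image = find_first(pages, lambda name: name.endswith(".png"))
```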

    Search engine algorithms are some of the most tightly kept secrets in the world.

    Search engines change their algorithms many times each month to fight off "spam".

    If people knew the exact algorithm, then they could manipulate rankings as they please until the search results became so irrelevant that the search engine became junk.


    ` A document processor

    ` A query processor

    ` A search and matching function

    ` A ranking capability

    ` Summarizing and presenting documents


    Consider the query ("cat at mat").

    Select a word from the query ("cat").

    Retrieve the list for the word "cat".

    Process the list and, for each document, add weights to the accumulator based on TF, IDF and document length.

    Find the best-ranked documents and look up the mapping table.

    Retrieve and summarize the documents.
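The steps above can be sketched with a small in-memory index. This is one possible reading, not the method of any particular engine: the per-word list is taken to be a postings list of term frequencies, the inverse-frequency weight is taken to be the usual inverse document frequency computed as log(1 + N/df), and document length is used to normalise term frequency. The names `build_index`, `rank` and `mapping` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Return (postings, doc_lengths); postings maps word -> {doc_id: term frequency}."""
    postings = defaultdict(dict)
    doc_lengths = {}
    for doc_id, text in docs.items():
        words = text.lower().split()
        doc_lengths[doc_id] = len(words)
        for word, tf in Counter(words).items():
            postings[word][doc_id] = tf
    return postings, doc_lengths


def rank(query, postings, doc_lengths):
    """Score documents word by word, summing weights into one accumulator per document."""
    n_docs = len(doc_lengths)
    accumulators = defaultdict(float)
    for word in query.lower().split():          # select a word from the query
        plist = postings.get(word, {})          # retrieve the list for that word
        if not plist:
            continue
        idf = math.log(1 + n_docs / len(plist))
        for doc_id, tf in plist.items():        # add weights to the accumulators
            accumulators[doc_id] += (tf / doc_lengths[doc_id]) * idf
    return sorted(accumulators.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical documents and mapping table (document id -> address).
docs = {1: "the cat sat on the mat", 2: "the dog sat on the log", 3: "cat cat cat"}
mapping = {1: "http://example.com/1", 2: "http://example.com/2", 3: "http://example.com/3"}

postings, doc_lengths = build_index(docs)
ranked = rank("cat at mat", postings, doc_lengths)
best_urls = [mapping[doc_id] for doc_id, _score in ranked]
```

Document 3 ranks first here because length normalisation rewards a short document made up entirely of the query word; production systems use more careful weighting to avoid exactly this kind of bias.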


    Variety: An Internet search can generate a variety of sources for information. This variety allows anyone searching for information to choose the types of sources they would like to use, or to use a variety of sources to gain a greater understanding of a subject.

    Precision: Search engines do have the ability to provide refined or more precise results. Putting quotation marks around a set of words will bring up results with the exact same words, excluding others.
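The effect of quotation marks can be illustrated with a minimal sketch. It assumes that a quoted query must appear as a contiguous sequence of words, while an unquoted query only requires each word to appear somewhere; actual engines may treat quotes differently, and the function name `matches` is hypothetical.

```python
def matches(query, text):
    """Return True if text satisfies the query; a quoted query must match as an exact phrase."""
    words = text.lower().split()
    q = query.lower().strip()
    if len(q) >= 2 and q[0] == '"' and q[-1] == '"':
        phrase = q[1:-1].split()
        n = len(phrase)
        # The phrase must occur as n consecutive words in the text.
        return n > 0 and any(words[i:i + n] == phrase
                             for i in range(len(words) - n + 1))
    # Unquoted: every word must occur, in any order and position.
    return all(term in words for term in q.split())


loose = matches('search engine', 'an engine for search')
exact = matches('"search engine"', 'an engine for search')
```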

    Organization: Search engines aid in organizing the vast amount of information that can sometimes be scattered in various places on the same web page into an organized list that can be used more easily.


    You get lazy and dependent on the net.

    You get even what you should not get.

    Easy access to so-called bad or restricted material.


    Search engines play an important role in accessing content over the internet; they fetch the pages requested by the user.

    They have made the internet, and access to information, just a click away.

    The need for better search engines only increases.

    Search engine sites are among the most popular websites.


    ` Voice command search.

    ` Graphical search results.

    The scope of this field keeps growing with each passing day.
