crawling the web

Upload: kent

Post on 23-Feb-2016

Page 1: CRAWLING THE WEB

CRAWLING THE WEB

Page 2: CRAWLING THE WEB

CRAWLING THE WEB

What do you do when you need information from the internet?

Page 3: CRAWLING THE WEB

SEARCH ENGINES

Page 4: CRAWLING THE WEB

DIRECTORIES

Open Directory Project (DMOZ)

Page 5: CRAWLING THE WEB

META-SEARCH ENGINES

Page 6: CRAWLING THE WEB

FINDING INFORMATION ON THE WEB

SEARCH ENGINES

DIRECTORIES

META-SEARCH ENGINES

Page 7: CRAWLING THE WEB

HOW DOES A SEARCH ENGINE WORK?

Search engines use a computer program called a SPIDER to roam the pages of the World Wide Web and follow their links.
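
As a rough illustration of how a spider roams pages by following links, here is a minimal sketch. It does not touch the real Web: `TOY_WEB`, the page names, and the `crawl` function are all made up for this example, and real spiders are far more elaborate.

```python
from collections import deque

# Hypothetical toy "web": each page name maps to the pages it links to.
TOY_WEB = {
    "home": ["news", "sports"],
    "news": ["home", "weather"],
    "sports": ["news"],
    "weather": [],
    "orphan": ["home"],  # no page links here, so the spider never reaches it
}


def crawl(web, start):
    """Visit pages breadth-first by following links; return them in visit order."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        visited.append(page)
        for link in web.get(page, []):
            if link not in seen:  # skip pages already found
                seen.add(link)
                queue.append(link)
    return visited


print(crawl(TOY_WEB, "home"))  # ['home', 'news', 'sports', 'weather']
```

Note that the page called "orphan" is never visited, because no other page links to it. A spider can only find pages it can reach through links.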

Page 8: CRAWLING THE WEB

HOW DOES A SEARCH ENGINE WORK?

The spider collects the information and then indexes all the information.

Page 9: CRAWLING THE WEB

HOW DOES A SEARCH ENGINE WORK?

Each search engine’s spider indexes and organizes the Web pages.

While indexing, the spider finds matches between keywords and Web pages.

The sites with the best matches are displayed first. Each search engine has a different way of identifying the best sites.
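
The indexing and matching steps can be sketched in a few lines. This is only one simple way of deciding which sites are "best" (counting how often a keyword appears); as the slide says, each search engine has its own method. The `PAGES` data and the `build_index` and `search` functions are hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical page contents that a spider might have collected.
PAGES = {
    "page_a": "web spiders crawl the web",
    "page_b": "directories are built by people",
    "page_c": "a spider indexes web pages",
}


def build_index(pages):
    """Map each keyword to the pages containing it, with a count per page."""
    index = defaultdict(dict)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word][name] = index[word].get(name, 0) + 1
    return index


def search(index, query):
    """Return matching pages, best match (highest keyword count) first."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for name, count in index.get(word, {}).items():
            scores[name] += count
    # Sort by descending score; ties are broken alphabetically by page name.
    return sorted(scores, key=lambda name: (-scores[name], name))


index = build_index(PAGES)
print(search(index, "web"))     # ['page_a', 'page_c']
print(search(index, "people"))  # ['page_b']
```

Here "page_a" is listed first for the keyword "web" because the word appears in it twice, while it appears in "page_c" only once.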

Page 10: CRAWLING THE WEB

HOW DOES A SEARCH ENGINE WORK?

Page 11: CRAWLING THE WEB

HOW DOES A SEARCH ENGINE WORK?

1. ROAMS and COLLECTS INFORMATION

2. INDEXES ALL THE INFORMATION

3. MATCHES THE INFORMATION

These 3 tasks are all done WITHOUT ANY HUMAN INVOLVEMENT, so a huge number of sites are indexed quickly.

Page 12: CRAWLING THE WEB

HOW DOES A DIRECTORY WORK?

In a DIRECTORY, PEOPLE, not computers, put the index together.

Page 13: CRAWLING THE WEB

HOW DOES A DIRECTORY WORK?

Editors evaluate Web sites and organize them into subject categories.

Because people have chosen them, the sites in directories may be of higher QUALITY.
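
One way to picture a directory is as a set of nested subject categories that editors fill in by hand. The sketch below assumes that picture; the category names, the site names, and the `browse` function are invented for illustration and do not come from any real directory.

```python
# Hypothetical hand-built directory: categories contain subcategories,
# and each subcategory holds the sites an editor chose for it.
DIRECTORY = {
    "Science": {
        "Astronomy": ["stars.example"],
        "Biology": ["cells.example"],
    },
    "Sports": {
        "Soccer": ["goals.example"],
    },
}


def browse(directory, *path):
    """Walk down the category path and return whatever is listed there."""
    node = directory
    for category in path:
        node = node[category]  # raises KeyError if the category does not exist
    return node


print(browse(DIRECTORY, "Science", "Astronomy"))  # ['stars.example']
print(sorted(browse(DIRECTORY, "Science")))       # ['Astronomy', 'Biology']
```

Unlike a spider's index, nothing gets into this structure unless a person puts it there.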

Page 14: CRAWLING THE WEB

HOW DOES A DIRECTORY WORK?

The number of sites in a DIRECTORY is usually much SMALLER than in a search engine’s index.

Many people use the term “SEARCH ENGINE” to describe either a search engine or a directory. That is because many search sites offer both services.

Page 15: CRAWLING THE WEB

HOW DOES A META-SEARCH ENGINE WORK?

A META-SEARCH ENGINE sends your keywords to several search engines at the same time.

The results from each search engine are organized and displayed on one page.
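
The idea can be sketched as follows. The two "engines" here are stand-in functions that return fixed results, not real search services, and merging by simply dropping duplicates is an assumption made for this example; real meta-search engines may organize results differently.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for real search engines; each returns a ranked list.
def engine_one(keywords):
    return ["a.example", "b.example"]


def engine_two(keywords):
    return ["b.example", "c.example"]


def meta_search(keywords, engines):
    """Send the keywords to every engine at once and merge the results."""
    with ThreadPoolExecutor() as pool:
        # map() runs the engines concurrently and keeps their original order.
        result_lists = list(pool.map(lambda engine: engine(keywords), engines))

    merged = []
    seen = set()
    for results in result_lists:
        for site in results:
            if site not in seen:  # show each site only once on the page
                seen.add(site)
                merged.append(site)
    return merged


print(meta_search("crawling the web", [engine_one, engine_two]))
# ['a.example', 'b.example', 'c.example']
```

The site "b.example" is returned by both engines but appears only once in the combined list.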

Page 16: CRAWLING THE WEB

HOW DOES A META-SEARCH ENGINE WORK?

This type of service is useful when your topic is very NARROW and you want to search as many Web sites as possible.

Page 17: CRAWLING THE WEB

REMEMBER …

No one search engine, directory or meta-search engine covers the entire Web. So, don’t get stuck in a rut by using only one. Try them all!