Hyper-Searching the Web
Search Engines
Basic Search (index)
Cluster Search (themes)
Meta-search (outsource)
“Smarter” meta-search (themes + outsource)
Basic search engine
• Examples: AltaVista, InfoSeek, HotBot, Lycos, Excite, Google, etc.
• Maintains an index of every word found
• Works by crawling, indexing, and returning results
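The "index for every word" idea can be sketched as a small inverted index. This is only an illustration: the page names, the texts, and the helper names `build_index` and `search` are made up here, and the crawling step is replaced by a hard-coded dictionary.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the IDs of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical "crawled" pages standing in for the web.
pages = {
    "a.html": "jaguar is a big cat",
    "b.html": "the jaguar car is fast",
    "c.html": "a cat is a small feline",
}
index = build_index(pages)
print(sorted(search(index, "jaguar")))      # ['a.html', 'b.html']
print(sorted(search(index, "jaguar cat")))  # ['a.html']
```

Looking a word up in the index avoids rescanning every page for each query, which is the point of indexing ahead of time.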
Basic search engine
• Different ranking systems are used
  - Most use heuristics (the easiest solution), e.g. counting the number of keywords that appear
  - Google uses PageRank
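The keyword-counting heuristic can be sketched as follows. The function names and sample pages are assumptions for illustration, and real engines combine many more signals than a raw count.

```python
def keyword_score(text, query):
    """Count how many times the query's keywords occur in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def rank(pages, query):
    """Return IDs of pages with a nonzero score, highest score first."""
    scored = [(keyword_score(text, query), page_id)
              for page_id, text in pages.items()]
    scored = [(score, page_id) for score, page_id in scored if score > 0]
    # Sort by score descending, then by page ID so ties are deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page_id for _, page_id in scored]


# Hypothetical pages.
pages = {
    "a.html": "cat food and cat toys",
    "b.html": "my cat",
    "c.html": "dog food",
}
print(rank(pages, "cat"))  # ['a.html', 'b.html']
```

A count like this is easy to compute but also easy to game by repeating keywords, which is one reason link-based schemes such as PageRank were developed.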
Basic search engine
• No idea of the searcher’s intent, so the “best” result is hard to achieve
• Problems with synonymy and polysemy
  - Synonymy ex.: car and automobile
  - Polysemy ex.: jaguar
• One solution: store semantic relations
  - Can only help with synonymy
• Can’t identify concepts or author intent
  - ex. the IBM site does not say “computer”
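A minimal sketch of "storing semantic relations", assuming a tiny hand-written synonym table (`SYNONYMS` and the helper names are hypothetical). It shows why this helps with synonymy but leaves polysemy untouched.

```python
# Hypothetical, hand-written synonym table; a real system would need a
# far larger semantic network.
SYNONYMS = {
    "car": {"automobile"},
    "automobile": {"car"},
}


def expand_query(query):
    """Return the query terms plus any stored synonyms."""
    terms = set(query.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def matches(text, query):
    """True if the text contains any query term or a stored synonym of one."""
    return bool(expand_query(query) & set(text.lower().split()))


print(matches("used automobile prices", "car"))  # True
print(sorted(expand_query("jaguar")))            # ['jaguar']
```

The query "car" now finds a page that only says "automobile", but "jaguar" is passed through unchanged: the table has no way to tell the animal from the car.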
Cluster search engine
• Example: Clusty
• Clusters results into categories/themes
• Can show results that would be ranked lower in another search engine
  - Because words have different meanings, it can surface the less searched-for ones
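Grouping results into themes can be illustrated with a deliberately simple stand-in: instead of a real clustering algorithm, the sketch below assigns each result to the first theme in a fixed, hypothetical vocabulary (`THEMES`) whose words appear in the snippet. The result list is invented.

```python
from collections import defaultdict

# Hypothetical theme vocabularies. A real cluster engine would derive
# the themes from the results themselves rather than from a fixed table.
THEMES = {
    "animal": {"feline", "wildlife", "habitat"},
    "car": {"engine", "dealer", "sedan"},
}


def group_by_theme(results):
    """Group (url, snippet) pairs under the first theme whose words appear."""
    groups = defaultdict(list)
    for url, snippet in results:
        words = set(snippet.lower().split())
        theme = next(
            (name for name, vocab in THEMES.items() if words & vocab),
            "other",
        )
        groups[theme].append(url)
    return dict(groups)


# Invented results for the query "jaguar", in ranked order.
results = [
    ("j1", "jaguar sedan dealer near you"),
    ("j2", "jaguar engine specs"),
    ("j3", "jaguar habitat and wildlife"),
    ("j4", "jaguar fan club"),
]
print(group_by_theme(results))
# {'car': ['j1', 'j2'], 'animal': ['j3'], 'other': ['j4']}
```

Even though the animal page is ranked third, it heads its own group, which is how a themed view can surface a less searched-for meaning.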
Meta-search engine
• Examples: Dogpile, Surfwax, Copernic, etc.
• Sends the searcher’s query to a database of search engines
• Claimed to be no better than its database; often the referenced search engines are small, free, commercial
• Users can create their own on Google, with up to 5,000 URLs as the “database”
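The outsourcing step can be sketched with stub engines. Everything here is hypothetical: `engine_a` and `engine_b` stand in for real search engines, the URLs are placeholders, and the reciprocal-rank merge is just one plausible way to combine lists, since the slides do not say how results are merged.

```python
def engine_a(query):
    # Stand-in for a real search engine: returns a ranked list of URLs.
    return ["x.example/1", "y.example/2", "z.example/3"]


def engine_b(query):
    # Second stand-in engine with a partly overlapping result list.
    return ["y.example/2", "w.example/4"]


def meta_search(query, engines):
    """Send the query to each engine and merge the ranked lists."""
    scores = {}
    for engine in engines:
        for position, url in enumerate(engine(query)):
            # Reciprocal-rank merge: earlier positions earn more credit.
            scores[url] = scores.get(url, 0.0) + 1.0 / (position + 1)
    # Highest combined score first; ties broken by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


print(meta_search("cat", [engine_a, engine_b]))
# ['y.example/2', 'x.example/1', 'w.example/4', 'z.example/3']
```

The merged list can only reorder what the underlying engines return, which is the sense in which a meta-search engine is no better than its database.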
“Smarter” meta-search engine
• Example: Clever project (not available online yet)
• Includes clustering and linguistic analysis
[Diagram: the query “cat” passes between Clever and AltaVista / Yahoo; results are grouped by theme: Cat – feline, Cat – power, Cat – equipment, Cat – scans, etc.]
The Clever Project
• Uses hyperlinks to locate hubs and authorities
“a respected authority is a page that is referred to by many good hubs; a useful hub is a location that points to many valuable authorities”
The Clever Project
• Obtains a list of webpages from a standard index & follows hyperlinks to increase its own database
  - Resulting collection = “root set”
  - Each page gets a numerical hub score & authority score
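Growing the starting list by following hyperlinks can be sketched as below. It assumes one step along links in both directions (pages a starting page points to, and pages that point to it); the slide only says that hyperlinks are followed, so that detail, the function name, and the toy graph are all assumptions.

```python
def expand_root_set(seed_pages, links):
    """Add every page a seed links to, and every page that links to a seed."""
    seeds = set(seed_pages)
    root_set = set(seeds)
    for source, targets in links.items():
        for target in targets:
            if source in seeds:
                root_set.add(target)  # a page that a seed points to
            if target in seeds:
                root_set.add(source)  # a page that points to a seed
    return root_set


# Hypothetical link graph: page -> pages it links to.
links = {
    "hub1": ["auth1", "auth2"],
    "auth1": ["other"],
    "far": ["elsewhere"],
}
print(sorted(expand_root_set(["auth1"], links)))
# ['auth1', 'hub1', 'other']
```

Pages with no link to or from a starting page ("far", "elsewhere") stay out, so the collection remains focused on the query's neighbourhood.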
The Clever Project
• Similar to PageRank in how scores are determined: initial guesses refined by repeated calculation
  - Useful by-product: clusters sites
• Adds to competition, because competitors don’t have to acknowledge each other through hyperlinks
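The "guesses and repeated calculation" can be sketched with the standard hubs-and-authorities iteration. This is a minimal illustration and not the Clever project's actual code: the function names, the toy link graph, the iteration count, and the choice of normalization are assumptions.

```python
import math


def normalize(scores):
    """Scale scores so their squares sum to 1 (no change if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hubs_and_authorities(links, iterations=50):
    """Repeatedly refine hub and authority scores for a link graph."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {page: 1.0 for page in pages}  # initial guess
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages pointing to it.
        new_auth = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                new_auth[target] += hub[source]
        # Hub score: sum of authority scores of the pages it points to.
        new_hub = {
            page: sum(new_auth[t] for t in links.get(page, []))
            for page in pages
        }
        auth, hub = normalize(new_auth), normalize(new_hub)
    return hub, auth


# Hypothetical link graph: page -> pages it links to.
links = {"hub1": ["auth1", "auth2"], "hub2": ["auth1"]}
hub, auth = hubs_and_authorities(links)
print(max(auth, key=auth.get))  # auth1
print(max(hub, key=hub.get))    # hub1
```

Neither authority links to the other, yet both are found because hubs point to them, which matches the observation that competitors need not link to each other.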
Clever vs. Google
GOOGLE
- gives initial rankings
- keeps pages independent of queries
- faster
- looks forward, “link to link”
CLEVER
- root sets per keyword
- page priority through query context
- looks forwards & backwards, “hub and authority”
- sometimes too broad, ex. Fallingwater
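For contrast with Clever's per-query scoring, a query-independent ranking in the spirit of PageRank can be sketched as below. This is a simplified textbook version, not Google's production system; the damping value of 0.85, the handling of pages without outgoing links, and the toy graph are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: one score per page, computed without any query."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # initial guess
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Pass this page's rank forward along its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # No outgoing links: spread the rank evenly over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
rank = pagerank(links)
print(max(rank, key=rank.get))  # c
```

Because the scores depend only on the link graph, they can be computed once ahead of time and reused for every query, which is consistent with the "initial rankings" and "faster" points above.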