Upload: saurabh-sahni

Post on 11-Nov-2014

DESCRIPTION

An introduction to BOSS API

TRANSCRIPT

HackU: IIT Delhi 31st Jan’ 2009

Chris Heilmann Saurabh Sahni

Build your Own Search Service

- 2 -

Outline

•  Search engines using BOSS
•  About BOSS API
   –  What?
   –  Why?
   –  Features
•  How to use it
   –  BOSS API
   –  BOSS Mashup framework

- 3 -

Search engines using BOSS

- 4 -

hakia: http://hakia.com/

- 7 -

Cluuz: http://cluuz.com

- 10 -

Keyword finder - http://keywordfinder.org/

- 11 -

askBOSS: http://ask-boss.appspot.com/

- 16 -

About BOSS API

- 17 -

What?

•  Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search

•  Unrestricted

http://developer.yahoo.com/search/boss

- 18 -

Usage

Opening the search technology stack

50B pages * 20 ms page download ≈ 31 years

[Diagram: stages of the search stack, as labelled on the slide: Crawl, Extract, Spam <-> Gold, Analyze, Index, Rank Assist, Index, Web Map, Retrieve]

- 19 -

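Usage

Opening the search technology stack

50B pages * 20 ms page download ≈ 31 years

[Diagram: stages of the search stack, as labelled on the slide: Crawl, Extract, Spam <-> Gold, Analyze, Index, Rank Assist, Index, Web Map, Retrieve; this build adds two labels: Web API, Your App here]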
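
The crawl-time figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes pages are downloaded strictly one after another and that a year is 365.25 days; neither assumption is stated on the slide.

```python
# Rough check of the slide's figure: 50B pages at 20 ms each, downloaded serially.
PAGES = 50_000_000_000                      # 50B pages (from the slide)
SECONDS_PER_PAGE = 0.020                    # 20 ms per page download (from the slide)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60    # assumed year length


def crawl_years(pages=PAGES, seconds_per_page=SECONDS_PER_PAGE):
    """Years needed to download `pages` pages one at a time."""
    return pages * seconds_per_page / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{crawl_years():.1f} years")
```

Under these assumptions the total is about 31.7 years, in line with the slide's "31 years".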

- 20 -

Why?

•  Barriers to entry are massive
   –  a massive capital investment
   –  access to top technical talent

•  Asset to Innovate
   –  Develop new relevance models
      •  Leverage user insights
      •  Use tags, bookmarks
   –  Change presentation style

•  Search anywhere
   –  Improve Vertical Quality w/ Web comprehensiveness
   –  Fragment the market, foster more players, choice, competition

- 21 -

BOSS API features

•  Unlimited queries per day
•  No branding or attribution
•  No restrictions on presentation
•  Ability to re-order results and blend in additional content
•  Access to multiple verticals (web search, image, news)
•  Spell checks, keyword suggestions
•  40+ supported language and region pairs
•  Ability to monetize

- 22 -

How to use it?

- 23 -

Get Started

•  Register for an application id
   http://developer.yahoo.com/wsregapp/

•  Documentation
   http://developer.yahoo.com/search/boss/boss_guide/

•  Code samples: JavaScript, PHP and Python
   http://www.saurabhsahni.com/boss-examples.zip

- 24 -

BOSS API

Searching Slumdog Millionaire

(Source: http://en.wikipedia.org/wiki/File:Slumdog_Millionaire_poster.jpg)

- 25 -

BOSS API

•  Search for slumdog millionaire:
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml

•  Exact search for “slumdog millionaire”:
   –  http://boss.yahooapis.com/ysearch/web/v1/%22slumdog+millionaire%22?appid=xyz&format=xml
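
The two URLs above differ only in how the query is encoded into the path. A minimal sketch of that encoding, using Python's standard library: the function name `web_search_url` is made up for illustration, and `xyz` is the slide's placeholder application id, not a working key.

```python
from urllib.parse import quote_plus

# Web search endpoint as shown on the slide.
BOSS_WEB = "http://boss.yahooapis.com/ysearch/web/v1/"


def web_search_url(query, appid="xyz", exact=False):
    """Build a BOSS web search URL (hypothetical helper, not part of any SDK)."""
    if exact:
        query = f'"{query}"'  # double quotes request an exact-phrase match
    # quote_plus turns spaces into "+" and the quotes into %22
    return f"{BOSS_WEB}{quote_plus(query)}?appid={appid}&format=xml"


if __name__ == "__main__":
    print(web_search_url("slumdog millionaire"))
    print(web_search_url("slumdog millionaire", exact=True))
```

This only builds the request URL; fetching it would additionally need a registered application id.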

- 26 -

BOSS API

•  Search for slumdog millionaire only on indiatimes.com:
   –  Add site:indiatimes.com to your query
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire+site%3Aindiatimes.com?appid=xyz&format=xml

•  Search for slumdog millionaire on selected movie sites:
   –  Add param sites=indiatimes.com,movies.yahoo.com,imdb.com
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
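
The two approaches above put the restriction in different places: `site:` goes into the query itself, while `sites` is a separate URL parameter holding a comma-separated list. A small sketch of both follows; the helper names are invented for this example and `xyz` is again the placeholder application id.

```python
from urllib.parse import quote_plus, urlencode

BASE = "http://boss.yahooapis.com/ysearch/web/v1/"  # endpoint from the slide


def single_site_url(query, site, appid="xyz"):
    """Restrict to one site by appending the site: operator to the query."""
    path = quote_plus(f"{query} site:{site}")  # ":" is encoded as %3A
    return f"{BASE}{path}?{urlencode({'appid': appid, 'format': 'xml'})}"


def multi_site_url(query, sites, appid="xyz"):
    """Restrict to several sites via the comma-separated sites parameter."""
    params = {"appid": appid, "sites": ",".join(sites), "format": "xml"}
    return f"{BASE}{quote_plus(query)}?{urlencode(params)}"  # "," becomes %2C


if __name__ == "__main__":
    print(single_site_url("slumdog millionaire", "indiatimes.com"))
    print(multi_site_url("slumdog millionaire",
                         ["indiatimes.com", "movies.yahoo.com"]))
```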

- 27 -

BOSS API

•  Find related keywords
   –  Add parameter view=keyterms
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml

•  Search images
   –  http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire?dimensions=small

•  Search news
   –  http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d

- 28 -

BOSS API

Spell check request

http://boss.yahooapis.com/ysearch/spelling/v1/milionare?format=xml

Response

<ysearchresponse responsecode="200">
  <resultset_spelling count="1" start="0" totalhits="1" deephits="1">
    <result>
      <suggestion>millionaire</suggestion>
    </result>
  </resultset_spelling>
</ysearchresponse>
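
Pulling the suggestion out of such a response takes only the standard library. The sketch below parses a cleaned-up copy of the response above (straight quotes, matching tags). It assumes the XML carries no namespace, as on the slide; if a live response declares one, the tag lookup would need to include it.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the sample spelling response from the slide.
SAMPLE = """<ysearchresponse responsecode="200">
  <resultset_spelling count="1" start="0" totalhits="1" deephits="1">
    <result>
      <suggestion>millionaire</suggestion>
    </result>
  </resultset_spelling>
</ysearchresponse>"""


def spelling_suggestions(xml_text):
    """Return all <suggestion> values, or [] if the response code is not 200."""
    root = ET.fromstring(xml_text)
    if root.get("responsecode") != "200":
        return []
    return [node.text for node in root.iter("suggestion")]


if __name__ == "__main__":
    print(spelling_suggestions(SAMPLE))
```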

- 29 -

BOSS API REST Interface

•  {query}: term to look for (url-encoded)
•  {vert} := {web, news, images, spelling}
•  @ required
   –  appid
•  @ optional
   –  start, count, lang, region, format, callback, sites

http://boss.yahooapis.com/ysearch/{vert}/v1/{query}
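
The URL template and parameter lists on this slide map naturally onto one small builder function. This is a sketch under assumptions: `boss_url` is a hypothetical name, and rejecting parameters outside the slide's list is a choice made here for illustration (other slides use extra, vertical-specific parameters such as view, dimensions and age).

```python
from urllib.parse import quote_plus, urlencode

# Values taken from the slide.
VERTICALS = {"web", "news", "images", "spelling"}
OPTIONAL = {"start", "count", "lang", "region", "format", "callback", "sites"}


def boss_url(vert, query, appid, **options):
    """Fill in http://boss.yahooapis.com/ysearch/{vert}/v1/{query} plus parameters."""
    if vert not in VERTICALS:
        raise ValueError(f"unknown vertical: {vert!r}")
    unknown = set(options) - OPTIONAL
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    params = {"appid": appid, **options}  # appid is the only required parameter
    return (f"http://boss.yahooapis.com/ysearch/{vert}/v1/"
            f"{quote_plus(query)}?{urlencode(params)}")


if __name__ == "__main__":
    print(boss_url("news", "slumdog millionaire", "xyz", count=5))
```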

- 30 -

BOSS Mashup Framework

•  Python (v2.5+) library

•  BOSS Search SDK plus …

•  SQL for remixing arbitrary XML/JSON sources

http://developer.yahoo.com/search/boss/mashup.html

- 31 -

BMF + Google App Engine

•  Enhanced version of BMF for the Google App Engine (GAE) platform

•  http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/

•  Enables quick deployment of BOSS applications online

- 32 -

One more thing…

- 33 -

BOSS in Academic Research

•  The biggest dataset available on the web
•  Very useful for web-mining research experiments
   –  Natural language processing
   –  Semantic extraction
   –  Related keywords
   –  Similarity detection
   –  Clustering algorithms
   –  Spelling corrections

- 34 -

Questions?

Thank You

More: http://developer.yahoo.com/search/boss/

- 35 -

Appendix

- 36 -

http://www.yahoo.com

Search UI Templates are Included in the BOSS Mashup Framework

BOSS Mashup Framework simplifies aggregating and presenting multiple data sources

- 37 -

BMF Features

•  select, group, sort, union, joins, udfs, where
•  Text normalization and duplicate removal
•  Auto-transformation of resource-oriented API results into tables w/o parsing
•  All-in-memory storage and retrieval operations
•  Ability to join lists of tables via an arbitrary predicate function (map-like)
•  Search UI template framework
•  Single search function provides total access to BOSS REST API
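
To make two of these features concrete (duplicate removal after text normalization, and joining tables via an arbitrary predicate), here is a plain-Python sketch of the ideas. It does not use the BOSS Mashup Framework and does not reflect its actual API; every function name and all sample rows are invented for illustration.

```python
def normalize(text):
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())


def dedupe(rows, key):
    """Keep the first row for each normalized value of row[key]."""
    seen, unique = set(), []
    for row in rows:
        marker = normalize(row[key])
        if marker not in seen:
            seen.add(marker)
            unique.append(row)
    return unique


def join(left, right, predicate):
    """Merge every left/right pair of rows for which predicate(l, r) is true."""
    return [{**l, **r} for l in left for r in right if predicate(l, r)]


if __name__ == "__main__":
    # Made-up rows standing in for two data sources held in memory.
    news = [
        {"title": "Slumdog Millionaire wins", "url": "http://example.com/a"},
        {"title": "slumdog  millionaire WINS", "url": "http://example.com/b"},
    ]
    bookmarks = [{"tag": "slumdog", "saves": 12}]
    rows = join(dedupe(news, "title"), bookmarks,
                lambda n, b: b["tag"] in normalize(n["title"]))
    print(rows)
```

The predicate is an ordinary function, so the join condition can be anything from tag containment, as here, to a similarity score.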