Hack the BOSS Ted DRAKE Yahoo! France

Page 1: Hack the BOSS Ted DRAKE Yahoo! France. 2 BOSS Basics “BOSS is a data API. It’s not a search API” -Vik Singh, BOSS Architect  Conference, Madrid

Hack the BOSS

Ted DRAKE

Yahoo! France

BOSS Basics

“BOSS is a data API. It’s not a search API”

-Vik Singh, BOSS Architect

www2009 Conference, Madrid

•Change ranking

•Create your own look and feel

•Use your favorite ads

•Mash with external APIs

BOSS = Freedom

Coming Soon…

•SLA

•Customer Support

•Fees:

  - Free for most uses

  - Costs based on usage

BOSS Details

• REST based API.

• XML or JSON output

• Web, News, Image, SiteSearch, and Spelling Suggestion services

• Time span filtering for News Search

• Delicious Tags and Popularity

• Keyterm extraction

• Microformat and RDF data

• Extended abstracts

• Recognizes most search filters from Yahoo! and Google (backdoor hacks)

What is the most important part of your application?

• The results display?

• The text ads?

• The rounded borders?

• The smooth animations?

• The perfect URL?

THE QUERY STRING!!!

The Query

• Tells you what the user is looking for

• Generates related topics

• Powers secondary APIs

• Can be generated by a search box, URL, tags, or keyword extraction from the page.

• The Query is your BFF!

Let’s Start Hacking!

•Get an API key

•http://developer.yahoo.com

•You don’t need a URL for now.

•Update it later for better tracking and promotion.

Site Specific Results

Search only one site:

/ysearch/web/v1/golf+site:vw.com?

Search from a select group of sites:

/ysearch/web/v1/golf?sites=vw.com,vwtrendsweb.com,performancevwmag.com,caranddriver.com
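A minimal sketch of how the two request paths above might be assembled in Python. The host `boss.yahooapis.com` and the `appid` parameter come from the sample request shown later in this deck; the helper names `site_query_url` and `multi_site_url` are hypothetical, and the code only builds strings without contacting the service.

```python
from urllib.parse import quote, urlencode

# Host as it appears in the deck's sample request.
BOSS_WEB = "http://boss.yahooapis.com/ysearch/web/v1/"


def site_query_url(query, site, appid):
    """Restrict a query to one site with the site: operator in the path."""
    # Keep '+' and ':' unescaped so the path reads like the slide example.
    path = quote(f"{query}+site:{site}", safe="+:")
    return f"{BOSS_WEB}{path}?{urlencode({'appid': appid})}"


def multi_site_url(query, sites, appid):
    """Restrict a query to several sites with the sites= parameter."""
    # safe="," keeps the comma-separated site list readable.
    params = urlencode({"sites": ",".join(sites), "appid": appid}, safe=",")
    return f"{BOSS_WEB}{quote(query)}?{params}"


print(site_query_url("golf", "vw.com", "YourAppId"))
print(multi_site_url("golf", ["vw.com", "caranddriver.com"], "YourAppId"))
```

The first form suits a single-site search box; the second keeps the query itself clean when the site list is long or configurable.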

Tag or Title Filters

Use the inurl: filter to simulate tag search:

/ysearch/web/v1/inurl:golf?

Use intitle: to filter results with the query in the title:

/ysearch/web/v1/intitle:golf?

Get Related Sites

Use related:foo.html to find related sites

/ysearch/web/v1/related:http://www.caranddriver.com/car/2006-models/2006-golf.html?

BOSS Keyterms

• Keyterms are words used to find a site while searching on Yahoo!

• Listed in order of relevance.

• /web/v1/{query}?view=keyterms

Delicious Tags and Popularity

• How many times has a page been saved in Delicious?

• What tags have been associated with the page? How many times?

•view=delicious_saves,delicious_toptags

KeyTerms + Delicious Tags: What are they good for?

• Relevancy

• Related Searches

• Search Suggest

• Tag Clouds

• Trigger secondary APIs

• Highlight Popular Results

What it looks like

<keyterms>
  <terms>
    <term>Bucharest</term>
    <term>city</term>
    <term>Romanian</term>
    <term>population</term>
    <term>Romania</term>
    <term>architecture</term>
    <term>city centre</term>
    <term>clubs</term>
  </terms>
</keyterms>
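As a rough illustration of putting this fragment to work, the sketch below reads the terms with Python's standard `xml.etree.ElementTree` and turns the top ones into related-search suggestions. The function names are hypothetical, and a full BOSS response may wrap the fragment in other elements or namespaces that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# The keyterms fragment from the slide, joined into one string.
SAMPLE = (
    "<keyterms><terms>"
    "<term>Bucharest</term><term>city</term><term>Romanian</term>"
    "<term>population</term><term>Romania</term><term>architecture</term>"
    "<term>city centre</term><term>clubs</term>"
    "</terms></keyterms>"
)


def extract_keyterms(xml_fragment):
    """Return the <term> values in document order (most relevant first)."""
    root = ET.fromstring(xml_fragment)
    return [term.text for term in root.iter("term")]


def related_searches(query, keyterms, limit=3):
    """Pair the query with its top keyterms, skipping terms equal to the query."""
    candidates = [t for t in keyterms if t.lower() != query.lower()]
    return [f"{query} {term}" for term in candidates[:limit]]


terms = extract_keyterms(SAMPLE)
print(terms)
print(related_searches("bucharest", terms))
```

The same list of terms could feed a tag cloud or trigger a secondary API call instead of search suggestions.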

BOSS Mashup Framework

• Python based framework to mash BOSS API with secondary web services and proprietary data

• Easy integration with Google App Engine

• Powers the infamous YUIL (4 hour search) project.

• Fast prototyping with minimal code

BOSSY Code on BOSS Mashup Platform

__author__ = "Vik Singh ([email protected])"

from yos.util import text, typechecks
from yos.yql import db
from yos.boss import ysearch

def month_lookup(s):
    # Return the first month prefix that token s starts with, or None.
    for m in ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sept", "oct", "nov", "dec"]:
        if s.startswith(m):
            return m

def parse_month(s):
    # Build a real list (not a lazy filter object) so len() also works on Python 3.
    months = [m for m in map(month_lookup, text.uniques(s)) if m is not None]
    if len(months) > 0:
        return text.norm(months[0]).capitalize()

def parse_year(s):
    # A year is any unique token that is exactly four characters long and an integer.
    years = [t for t in text.uniques(s) if len(t) == 4 and typechecks.is_int(t)]
    if len(years) > 0:
        return text.norm(years[0])
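To try the parsing logic without the mashup framework installed, here is a standard-library approximation. It assumes that `text.uniques` yields the distinct tokens of a string; the `uniques` stand-in below (lowercasing, whitespace split, punctuation stripped) is a guess at that behaviour, not the framework's code, and `text.norm` is simply left out.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sept", "oct", "nov", "dec"]


def uniques(s):
    """Hypothetical stand-in for text.uniques: distinct lowercase tokens, first-seen order."""
    seen = []
    for token in s.lower().split():
        token = token.strip(".,;:!?()")
        if token and token not in seen:
            seen.append(token)
    return seen


def parse_month(s):
    # The first token that starts with a month prefix wins, as in the slide code.
    for token in uniques(s):
        for m in MONTHS:
            if token.startswith(m):
                return m.capitalize()
    return None


def parse_year(s):
    # The first four-digit token is treated as the year.
    for token in uniques(s):
        if len(token) == 4 and token.isdigit():
            return token
    return None


sentence = "Born in September 1985"
print(parse_month(sentence), parse_year(sentence))
```

Prefix matching is deliberately loose: "maybe" would match "may", and the three-letter abbreviation "sep" would not match the "sept" prefix used in the list.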

Relevancy Hacking

Location Based Relevancy

•Where am I?

•Where am I going?

•What can I find?

Map generated by FirePin application on iPhone

Location Based Relevancy

• Fire Eagle: Standardized location and sharing platform

• Live location tracking

• Find upcoming traffic cameras, landmarks, restaurants, headlines, photos, Twitter buzz, etc…

• Shared locations with friends

• Mining Interesting Locations and Travel Sequences from GPS Trajectories for Mobile Users by Yu Zheng, Lizhu Zhang, Xing Xie and Wei-Ying Ma

Secondary Sources

Wikipedia, Craigslist, Government Data…

1. Blah 2. Foo 3. Blah Blah

1. Baz 2. Bar 3. Foo

1. Foo

• Multiple sources to increase relevance

• DuckDuckGo.com = BOSS + Wikipedia (and other services)

• Understanding User's Query Intent with Wikipedia by Jian Hu, Gang Wang, Fred Lochovsky and Zheng Chen - www2009 conference

•OpenData: DataMob.org, TheInfo.org, InfoChimps.org
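One simple way to use a second source for relevance is to promote results that both sources return, which is what the Blah/Foo lists on this slide appear to illustrate. The sketch below is only one possible scoring rule, assumed for illustration: items are ordered by how many sources returned them, then by their best rank. The function name `merge_ranked` and the variable names are hypothetical.

```python
def merge_ranked(*result_lists):
    """Order items by how many sources returned them, then by best rank."""
    stats = {}  # item -> [number of sources, best (lowest) rank]
    for results in result_lists:
        seen = set()  # count each item at most once per source
        for rank, item in enumerate(results, start=1):
            if item in seen:
                continue
            seen.add(item)
            entry = stats.setdefault(item, [0, rank])
            entry[0] += 1
            entry[1] = min(entry[1], rank)
    # sorted() is stable, so ties keep first-seen order.
    return sorted(stats, key=lambda item: (-stats[item][0], stats[item][1]))


first_source = ["Blah", "Foo", "Blah Blah"]
second_source = ["Baz", "Bar", "Foo"]
print(merge_ranked(first_source, second_source))
```

"Foo" rises to the top because it is the only item both lists agree on; everything else keeps its per-source rank order.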

Real Time Events

• Tweet News: Twitter + News Search

• Twitter users share most timely articles

• Relevancy highlights tweeted stories

Internal + External Data Sources

BOSS

• TechCrunch Search: BOSS + Access to proprietary data

• Create custom tables in YQL

•BOSS “Vertical Lens” defines what internal data BOSS should index as well as your preferred external sources.

Offline Analysis

Coloralo:

• requests extra images

• caches them

• analyzes them for relevancy

Coloralo finds coloring book images.

Quick and Easy Semantic Search

• Limit your results to sites with microformats or RDF data:

searchmonkeyid:com.yahoo.page.uf.hreview

• Request structured data, keyterms, and Delicious data from BOSS:

view=keyterms,searchmonkey_feed,searchmonkey_rdf,delicious_toptags,delicious_saves

• Sample request:

http://boss.yahooapis.com/ysearch/web/v1/cocorosie+searchmonkeyid:com.yahoo.page.uf.hreview?appid=YourAppId&format=xml&start=0&count=15&view=keyterms%2Csearchmonkey_feed%2Csearchmonkey_rdf%2Cdelicious_toptags
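The sample request can be assembled from its parts rather than typed by hand. This is a small sketch using only Python's `urllib.parse`; the function name `semantic_search_url` and its argument names are hypothetical, while the host, path, and parameter names are copied from the sample request on this slide.

```python
from urllib.parse import quote, urlencode

BASE = "http://boss.yahooapis.com/ysearch/web/v1/"


def semantic_search_url(query, searchmonkey_id, views, appid, start=0, count=15):
    """Assemble a web search request limited to pages with structured data."""
    # The filter travels in the path, joined to the query with '+'.
    path = quote(f"{query}+searchmonkeyid:{searchmonkey_id}", safe="+:")
    params = urlencode({
        "appid": appid,
        "format": "xml",
        "start": start,
        "count": count,
        "view": ",".join(views),  # urlencode escapes each comma as %2C
    })
    return f"{BASE}{path}?{params}"


url = semantic_search_url(
    "cocorosie",
    "com.yahoo.page.uf.hreview",
    ["keyterms", "searchmonkey_feed", "searchmonkey_rdf", "delicious_toptags"],
    appid="YourAppId",
)
print(url)
```

Keeping the view names in a list makes it easy to add or drop `delicious_saves` without editing an escaped string.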

Inurl and Intitle Hacks

• Use your favorite search engine hacks with BOSS.

• Most of the SERP advanced search tricks will work with your BOSS requests.

• This does not include shortcuts specific to Google, Yahoo!, or other engines, such as !sports

Website Description

• Get a more complete picture of a target web site by combining multiple requests

• Find the number of external sites linking to the site:

/ysearch/se_inlink/v1/{site}?omit_inlinks=domain

• Find the pages within the site:

/ysearch/se_pagedata/v1/{site}?

• Find related web pages:

/ysearch/web/v1/related:{site}?view=delicious_saves,delicious_toptags

Filter News by Time

• Older, less timely articles may have more natural relevancy. Control this by selecting the age range for news articles.

• Use orderby=date to show latest instead of most relevant.

• What happened while you were asleep:

/ysearch/news/v1/{query}?age=9h&orderby=date

• Limit news articles to 1-7 days old:

/ysearch/news/v1/{query}?age=1d-7d
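The two news requests above differ only in their `age` and `orderby` values, so a small helper can build both. In this sketch the function name `news_path` is hypothetical, and the validation pattern covers only the `h` and `d` units that appear on this slide; whether the service accepts other units is not assumed here.

```python
import re
from urllib.parse import quote, urlencode

# Accepts a single age such as "9h" or a range such as "1d-7d".
# Only the h and d units shown on the slide are recognised.
AGE_PATTERN = re.compile(r"^\d+[hd](-\d+[hd])?$")


def news_path(query, age=None, orderby=None):
    """Build a /ysearch/news/v1/ path with optional age and orderby filters."""
    params = {}
    if age is not None:
        if not AGE_PATTERN.match(age):
            raise ValueError(f"unexpected age value: {age!r}")
        params["age"] = age
    if orderby is not None:
        params["orderby"] = orderby
    return f"/ysearch/news/v1/{quote(query)}?{urlencode(params)}"


print(news_path("elections", age="9h", orderby="date"))
print(news_path("elections", age="1d-7d"))
```

Rejecting malformed age values early avoids sending a request that silently falls back to unfiltered results.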

Vertical Focus

•Vertical Search Engines already have a niche audience.

•Limit searches to appropriate sites: InsiderFood

•Truevert creates a model of word relations in context to its niche: environmental.

Go Beyond the Web Site

•Desktop: Xobni for Outlook

•Tools: Zemanta finds related information for blogs and emails

•Modular: Create an application for Facebook, Yahoo, MySpace and more with the Open Social standard.

Go from Search to Action

•Keyword Finder uses BOSS keyterms to return the top 10 keywords used by successful sites for a query

•Bossy returns a single answer to questions. Where is Big Ben? London.

Resources

• Yahoo! BOSS: http://developer.yahoo.com/boss

• BOSS Mashup Framework: http://developer.yahoo.com/search/boss/mashup.html

• YQL: http://developer.yahoo.com/yql

• Fire Eagle: http://developer.yahoo.com/fireeagle/

• Google App Engine: http://appengine.google.com

• Amazon Web Services: http://aws.amazon.com

• oAuth: http://oauth.net/

• Open Social: http://www.opensocial.org/

• Open Data: http://theinfo.org

• Alt Search Engines: http://www.altsearchengines.com/

• BOSS Hacks: http://bosshacks.com

  - Add your hack to http://www.bosshacks.com/hacks/open-hack-day-london-2009