Yahoo! BOSS Presentation: London Open Hack Day Talk


Post on 11-Nov-2014







BOSS presentation from the Yahoo! Open Hack Day in London


  • 1. Hack the BOSS: Ted Drake, Yahoo! France
  • 2. BOSS Basics: "BOSS is a data API. It's not a search API." - Vik Singh, BOSS Architect, WWW2009 Conference, Madrid
  • 3. BOSS = Freedom
    • Change ranking
    • Create your own look and feel
    • Use your favorite ads
    • Mash with external APIs
  • 4. Coming Soon
    • SLA
    • Customer Support
    • Fees:
      • Free for most uses
      • Costs based on usage
  • 5. BOSS Details
    • REST based API.
    • XML or JSON output
    • Web, News, Image, SiteSearch, and Spelling Suggestion services
    • Time span filtering for News Search
    • Delicious Tags and Popularity
    • Keyterm extraction
    • Microformat and RDF data
    • Extended abstracts
    • Recognizes most search filters from Yahoo! and Google (backdoor hacks)
  • 6. What is the most important part of your application?
    • The results display?
    • The text ads?
    • The rounded borders?
    • The smooth animations?
    • The perfect URL?
  • 7. The Query
    • Tells you what the user is looking for
    • Generates related topics
    • Powers secondary APIs
    • Can be generated from a search box, URL, tags, or keyword extraction from the page.
    • The Query is your BFF!
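As one illustration of generating a query from a URL, here is a minimal sketch that turns the last path segment of a page address into search terms. The function name `query_from_url` and the example address are hypothetical and not part of BOSS; real pages may need tags or keyword extraction instead.

```python
import re
from urllib.parse import urlparse, unquote


def query_from_url(url):
    """Derive a search query from the last path segment of a URL (hypothetical helper)."""
    path = unquote(urlparse(url).path)
    segments = [s for s in path.split("/") if s]
    if not segments:
        return ""
    # Drop a trailing file extension such as ".html".
    last = re.sub(r"\.\w+$", "", segments[-1])
    # Treat hyphens, underscores, plus signs, and whitespace as word separators.
    return " ".join(re.split(r"[-_+\s]+", last)).strip().lower()


print(query_from_url("http://example.com/travel/bucharest-city-guide.html"))
# prints: bucharest city guide
```

The resulting string can then be sent to BOSS or to a secondary API in place of a typed query.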
  • 8. Let's Start Hacking!
    • Get an API key
    • You don't need a URL for now.
    • Update it later for better tracking and promotion.
  • 9. Site Specific Results
    • Search only one site: /ysearch/web/v1/golf ?
    • Search from a select group of sites: /ysearch/web/v1/golf?,,,
  • 10. Tag or Title Filters
    • Use the inurl: filter to simulate tag search: /ysearch/web/v1/inurl:golf?
    • Use intitle: to keep only results with the query in the title: /ysearch/web/v1/intitle:golf?
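The filter prefixes above can be assembled programmatically. The following is a minimal sketch, assuming the `/ysearch/web/v1/` path shown on the slides; the helper name `filtered_query_path` is hypothetical, and the query is percent-encoded so that colons and spaces are safe in a URL path. The host name and the API key parameter are left out because the slides do not show them.

```python
from urllib.parse import quote

BOSS_WEB_PATH = "/ysearch/web/v1/"  # path prefix as shown on the slides


def filtered_query_path(term, search_filter=None):
    """Build a request path for `term`, optionally prefixed by a filter
    such as 'inurl' or 'intitle' (hypothetical helper)."""
    query = f"{search_filter}:{term}" if search_filter else term
    # Percent-encode everything, including ':' and spaces.
    return BOSS_WEB_PATH + quote(query, safe="")


print(filtered_query_path("golf", "inurl"))
print(filtered_query_path("golf clubs", "intitle"))
```

Keeping the filter as a separate argument makes it easy to try the same term with several filters and compare the results.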
  • 11. Get Related Sites
    • Use related:foo.html to find related sites /ysearch/web/v1/ related:
  • 12. BOSS Keyterms
    • Keyterms are words used to find a site while searching on Yahoo!
    • Listed in order of relevance.
    • /web/v1/{query}?view=keyterms
  • 13. Delicious Tags and Popularity
    • How many times has a page been saved in Delicious?
    • What tags have been associated with the page? How many times?
    • view=delicious_saves,delicious_toptags
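The `view` values are comma-separated, so several can be requested at once. Below is a minimal sketch of building such a query string. The `view` values come from the slides; the parameter names `appid` and `format` are assumptions used for illustration, and `boss_query_string` is a hypothetical helper.

```python
from urllib.parse import urlencode


def boss_query_string(appid, views, fmt="json"):
    """Build a query string requesting several views at once (hypothetical helper).

    'appid' and 'format' are assumed parameter names; 'view' is from the slides.
    """
    params = {"appid": appid, "format": fmt, "view": ",".join(views)}
    # Keep commas unescaped so the view list stays readable.
    return urlencode(params, safe=",")


print(boss_query_string("YOUR_APP_ID", ["delicious_saves", "delicious_toptags"]))
```

Appending the result after the `?` of a request path gives one request that returns both the save count and the top tags for each result.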
  • 14. KeyTerms + Delicious Tags: What are they good for?
    • Relevancy
    • Related Searches
    • Search Suggest
    • Tag Clouds
    • Trigger secondary APIs
    • Highlight Popular Results
  • 15. What it looks like
    • Tag cloud: Bucharest, city, Romanian, population, Romania, architecture, city centre, clubs
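A tag cloud like the one above can be produced by scaling each tag's font size by how often it appears. This is a minimal sketch with linear scaling; the function `tag_cloud_sizes`, the pixel range, and the sample counts are illustrative assumptions, not values returned by BOSS or Delicious.

```python
def tag_cloud_sizes(tag_counts, min_px=12, max_px=32):
    """Map each tag to a font size scaled linearly between min_px and max_px."""
    if not tag_counts:
        return {}
    lo, hi = min(tag_counts.values()), max(tag_counts.values())
    span = hi - lo
    sizes = {}
    for tag, count in tag_counts.items():
        # If every tag has the same count, all tags get the smallest size.
        weight = (count - lo) / span if span else 0.0
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes


# Made-up counts, for illustration only.
sample = {"Bucharest": 40, "Romania": 25, "architecture": 10}
print(tag_cloud_sizes(sample))
# prints: {'Bucharest': 32, 'Romania': 22, 'architecture': 12}
```

A logarithmic scale is a common alternative when a few tags dominate the counts.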
  • 16. BOSS Mashup Framework
    • Python based framework to mash BOSS API with secondary web services and proprietary data
    • Easy integration with Google App Engine
    • Powers the infamous YUIL (4-hour search) project.
    • Fast prototyping with minimal code
  • 17. BOSSY Code on BOSS Mashup Platform
    __author__ = "Vik Singh ("
    from yos.util import text, typechecks
    from yos.yql import db
    from yos.boss import ysearch

    def month_lookup(s):
        # Return the month abbreviation that s starts with, or None.
        for m in ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sept", "oct", "nov", "dec"]:
            if s.startswith(m): return m

    def parse_month(s):
        # Build a list so that len() and indexing work on any Python version.
        months = [m for m in map(month_lookup, text.uniques(s)) if m is not None]
        if len(months) > 0:
            return text.norm(months[0]).capitalize()

    def parse_year(s):
        # A year is any unique four-character token that parses as an integer.
        years = [t for t in text.uniques(s) if len(t) == 4 and typechecks.is_int(t)]
        if len(years) > 0: return text.norm(years[0])
  • 18. Relevancy Hacking
  • 19. Location Based Relevancy
    • Where am I?
    • Where am I going?
    • What can I find?
    Map generated by FirePin application on iPhone
  • 20. Location Based Relevancy
    • Fire Eagle: Standardized location and sharing platform
    • Live location tracking
    • Find upcoming traffic cameras, landmarks, restaurants, headlines, photos, Twitter buzz, etc.
    • Shared locations with friends
    • Mining Interesting Locations and Travel Sequences from GPS Trajectories for Mobile Users by Yu Zheng, Lizhu Zhang, Xing Xie and Wei-Ying Ma
  • 21. Secondary Sources: Wikipedia, Craigslist, Government Data
    • Multiple sources to increase relevance
    • DuckDuckGo.com = BOSS + Wikipedia (and other services)
    • Understanding User's Query Intent with Wikipedia by Jian Hu, Gang Wang, Fred Lochovsky and Zheng Chen - WWW2009 conference
    • OpenData: DataMob.org, TheInfo.org, InfoChimps.org
  • 22. Real Time Events
    • Tweet News: Twitter + News Search
    • Twitter users share most timely articles
    • Relevancy highlights tweeted stories
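The idea of letting tweets influence relevancy can be sketched as a simple re-ranking step. This is only an illustration under stated assumptions, not the Tweet News implementation: results are assumed to be dictionaries with a `url` key, `tweeted_urls` is a list of links already extracted from tweets, and `boost_tweeted` is a hypothetical helper.

```python
from collections import Counter


def boost_tweeted(results, tweeted_urls, boost=1.0):
    """Re-rank results so that stories shared on Twitter move up (hypothetical helper)."""
    tweet_counts = Counter(tweeted_urls)
    total = len(results)
    scored = []
    for position, result in enumerate(results):
        # Earlier results start with a higher base score.
        base = total - position
        score = base + boost * tweet_counts[result["url"]]
        scored.append((score, -position, result))
    # Highest score first; ties keep the original order.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [result for _, _, result in scored]


news = [{"url": "http://example.com/a"},
        {"url": "http://example.com/b"},
        {"url": "http://example.com/c"}]
tweets = ["http://example.com/c"] * 3
print([r["url"] for r in boost_tweeted(news, tweets)])
```

Raising `boost` gives tweet counts more influence relative to the original ranking; setting it to 0 leaves the order unchanged.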
  • 23. Internal + External Data Sources BOSS
    • TechCrunch Search: BOSS + Access to proprietary data
    • Create custom tables in YQL
    • BOSS Vertical Lens defines what internal data BOSS should index as well as your preferred external sources.
  • 24. Offline Analysis
    • Coloralo
    • requests extra images
    • caches them
    • analyzes them for relevancy
    • Coloralo finds coloring book images.
  • 25. Quick and Easy semantic Search
    • Limit your results to sites with microformats or rdf data:
    • Request structured data, keyterms, and Delicious data from BOSS: view=keyterms,searchmonkey_feed,searchmonkey_rdf,delicious_toptags,delicious_saves
    • Sample request:
  • 26. Inurl and Intitle Hacks
    • Use your favorite search engine hacks with BOSS.
    • Most of the SERP advanced search tricks will work with your BOSS requests.
    • This does not include patterns specific to Google, Yahoo!, or other engines, such as !sports
  • 27. Website Description
    • Get a more complete picture of a target web site by combining multiple requests
    • Find the number of external sites linking to the site: /ysearch/se_inlink/v1/{site}?omit_inlinks=domain
    • Find the pages within the site: /ysearch/se_pagedata/v1/{site}?
    • Find related web pages: /ysearch/web/v1/related:{site}?view=delicious_saves,delicious_toptags
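The three requests above can be generated together for any target site. A minimal sketch, using only the paths and parameters shown on this slide; `site_profile_paths` is a hypothetical helper, and the host name and API key parameter are omitted because the slide does not show them.

```python
from urllib.parse import quote


def site_profile_paths(site):
    """Return the three request paths used to describe a site (hypothetical helper)."""
    encoded = quote(site, safe="")
    return {
        # External sites linking to the target.
        "inlinks": f"/ysearch/se_inlink/v1/{encoded}?omit_inlinks=domain",
        # Pages within the target site.
        "pages": f"/ysearch/se_pagedata/v1/{encoded}",
        # Related pages, with Delicious save counts and top tags.
        "related": f"/ysearch/web/v1/related:{encoded}?view=delicious_saves,delicious_toptags",
    }


for name, path in site_profile_paths("example.com").items():
    print(name, path)
```

Merging the three responses gives link popularity, site structure, and neighbouring sites in one profile.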
  • 28. Filter News by Time
    • Older, less timely articles may have more natural relevancy. Control this by selecting the age range for news articles.
    • Use orderby=date to show latest instead of most relevant.
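A news request with these options might be assembled as follows. This is a sketch under assumptions: `orderby=date` is taken from the slide, but the `/ysearch/news/v1/` path and the `age` parameter name are assumptions made for illustration, and `news_request` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def news_request(query, newest_first=False, age=None):
    """Build a news search path with optional time controls (hypothetical helper)."""
    params = {}
    if age is not None:
        params["age"] = age  # assumed parameter name for the age range
    if newest_first:
        params["orderby"] = "date"  # from the slide: latest instead of most relevant
    # The news path is an assumption modelled on the web search path.
    path = "/ysearch/news/v1/" + quote(query, safe="")
    return path + ("?" + urlencode(params) if params else "")


print(news_request("golf", newest_first=True, age="7d"))
```

Leaving `newest_first` off keeps the default relevance ordering while still limiting how old the articles may be.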