Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares

Building SaaS solutions with Apache Solr

Alberto Mijares, Canoo Engineering AG, [email protected], 26/05/2011

Twitter: @lemaiol

Bullet point time!

What I Will Cover

Practical applications of Apache Solr and Apache Lucene: how to increase the time a user spends on a website and how to do website “cross-selling”.

Use case: how Canoo helped Axel Springer Switzerland increase page impressions, time spent on site and traffic in their online financial newspapers.

Key concepts:

• How to achieve this using Lucene & Solr

• How to profit from a SaaS business model

Who I am

Alberto Mijares, Canoo Engineering AG

Background in web applications and standards:

• Participated in the W3C Semantic Web interest group (SWEO)

• Led the development of web standards compliance tools (Web Accessibility and Mobile Web)

• Recently led enterprise information retrieval projects

• Currently coaching the development of Google Web Toolkit projects

Who is Canoo

People:

• Dierk König: Groovy committer

• Andres Almiray: Griffon project lead and Java Champion

• Hamlet D’Arcy: Groovy committer and enthusiast

• … almost 40 more top software engineers

Products:

• WebTest: framework for functional web testing

• RIA Suite (aka ULC): Java-based RIA framework

• FindIT: information retrieval and search tools

• WMTrans: language analysis tools

Canoo FindIT

http://www.canoo.com/videos/FindIT.html

Stop “bullet-pointing”!

The facts

Axel Springer group is a market leader

Bilanz, Handelszeitung and Stocks

In Switzerland financials are important!

Financial language is German

Online media is the future

The gap

Make the online versions more profitable

Make all newspapers “market leaders”

The how

Workshop

“Related articles”

“Cross-selling”

The analysis

Find a funding model

Use Lucene’s “More like this”

Integrate the suggestions back into the websites

Implement a selection mechanism
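
As a rough illustration of the “More like this” step, the sketch below builds the request a client might send to a Solr MoreLikeThis handler to fetch articles related to the one being read. It is not taken from the talk: the `/mlt` path assumes the handler is registered under that name in `solrconfig.xml`, and the base URL and the field names `id`, `title`, `body` and `url` are hypothetical.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, article_id, fields=("title", "body"), count=5):
    """Build a request URL for a Solr MoreLikeThis handler.

    Assumes the handler is mounted at <base_url>/mlt and that the schema
    has the (hypothetical) fields id, title, body and url.
    """
    params = {
        "q": f"id:{article_id}",     # the article the reader is viewing
        "mlt.fl": ",".join(fields),  # fields used to measure similarity
        "mlt.mintf": 2,              # skip terms occurring less often in the source article
        "mlt.mindf": 2,              # skip terms occurring in fewer documents than this
        "rows": count,               # how many related articles to return
        "fl": "id,title,url",        # stored fields to include in the response
        "wt": "json",                # response format
    }
    return f"{base_url}/mlt?{urlencode(params)}"


# Hypothetical local Solr instance and article id.
print(build_mlt_url("http://localhost:8983/solr", "42"))
```

A selection mechanism could then filter or re-rank the returned list before the suggestions are shown next to the article.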

The issues

“More like this” was “experimental”

Works out of the box only in English

Without “semantics”, the suggestions do not always make sense

Indexing full pages produces noise

The key

The functional requirements

Discover and index articles

Extract only content

Simple and flexible query service
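
To make “extract only content” concrete, here is a minimal sketch that keeps only the text inside one container element of a page and drops navigation, footers and scripts, so that boilerplate does not end up in the index. The talk does not say how extraction was done; the use of Python's standard `html.parser` and the container class name `article-body` are assumptions for illustration only.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect text found inside one <div>, ignoring the rest of the page."""

    def __init__(self, container_class="article-body"):  # hypothetical class name
        super().__init__()
        self.container_class = container_class
        self.depth = 0         # > 0 while inside the container <div>
        self.skipping = False  # True while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "div":
                self.depth += 1  # nested <div>: still part of the article
            elif tag in ("script", "style"):
                self.skipping = True
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.container_class in classes:
                self.depth = 1   # entering the article container

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skipping = False
        elif tag == "div" and self.depth:
            self.depth -= 1      # reaches 0 when the container closes

    def handle_data(self, data):
        if self.depth and not self.skipping and data.strip():
            self.chunks.append(data.strip())


def extract_article_text(html):
    """Return the article text of an HTML page as a single string."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Only the returned string would be sent to the index; the markup that identifies the article container differs per site and would need to be configured for each newspaper.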

The funding model

The business model

SaaS

The “other” requirements

Lucene-based analysis pipeline

Web-oriented platform

Multi-application platform

Reliable, fast and scalable

Plan B?

The search

Solr wraps Lucene in a nice way

It is mature and open source

Supports scheduling, REST API, DIH (DataImportHandler), …

Scalability out-of-the-box

Well documented and has professional support

The plan

From POC to PROD in “80 days”

The results

Google Analytics

The conclusions

The Q&A

Thanks!

Sources

Links:

• http://people.canoo.com/share

• http://www.canoo.com

• http://www.canoo.net

• http://www.leo.org

• http://www.bilanz.ch

• http://www.handelszeitung.ch

• http://www.stocks.ch

Contact

Alberto Mijares

• [email protected]

• Twitter: @lemaiol

Architecture

Platform: Apache Solr 1.4.1

Architecture:

Solr container     Web container
Springer Solr      Springer WebApp
Customer 2 Solr    Customer 2 WebApp
Customer 3 Solr    Customer 3 WebApp

External access

Internal access

Requests
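
One way to read the diagram is that each customer gets its own web application and its own Solr index, and that an incoming request is routed to the index of the customer it belongs to. The sketch below illustrates that reading only. The core names, path prefixes and base URL are hypothetical, and it assumes a multicore Solr setup in which each core is queried at `/solr/<core>/select` and is reachable only from inside the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    solr_core: str    # hypothetical core name: one Solr index per customer
    webapp_path: str  # hypothetical public path prefix of the customer's web app


TENANTS = {
    "springer": Tenant("springer", "/springer"),
    "customer2": Tenant("customer2", "/customer2"),
    "customer3": Tenant("customer3", "/customer3"),
}

SOLR_BASE = "http://localhost:8983/solr"  # assumed internal-only address


def tenant_for_path(request_path):
    """Return the tenant whose web app owns the request path, or None."""
    for tenant in TENANTS.values():
        prefix = tenant.webapp_path
        if request_path == prefix or request_path.startswith(prefix + "/"):
            return tenant
    return None


def solr_select_url(request_path):
    """Map an external request to the query URL of that customer's own core."""
    tenant = tenant_for_path(request_path)
    if tenant is None:
        raise LookupError(f"no customer registered for {request_path!r}")
    return f"{SOLR_BASE}/{tenant.solr_core}/select"


print(solr_select_url("/customer2/related"))
```

Keeping the mapping in one place means a request for one customer cannot be answered from another customer's index, and adding a customer is a matter of registering one more entry.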