solr the intelligent search engine


Upload: cs2-ag

Post on 12-Jan-2015


DESCRIPTION

Searching for products is a key operation for eCommerce sites, where both speed and flexibility are needed. Experience how Solr’s error-tolerant search helps the customers of House of Sound find their products.

TRANSCRIPT

Page 1: Solr the intelligent search engine

SOLR, THE INTELLIGENT SEARCH ENGINE Benoît Largeau

AGENDA:

Stakes | Introduction | Indexing | Scalability | Searching | Admin tools | Conclusion

Page 2: Solr the intelligent search engine

WHAT ARE THE STAKES?

Considering:

- One user in two is a searcher: one in two will use the internal search engine

- Searchers convert more often than other visitors

- Searchers are less patient when browsing: they need to find products quickly, otherwise they leave for another shop

INTERNAL SEARCH ENGINE IS ESSENTIAL.

SEARCH FIND ADD TO CART PAY

Page 3: Solr the intelligent search engine

• Open source enterprise search server, initiated by CNET in 2004

Source code openly published in 2006

• Apache Lucene is the underlying engine

• Independent server that communicates over standards such as HTTP / XML / JSON

usable in any web project, such as those based on Magento

SOLR PROJECT.

INTRODUCTION TO SOLR.

Page 4: Solr the intelligent search engine

INTRODUCTION TO SOLR.

SOME REFERENCES.

More references here: http://wiki.apache.org/solr/PublicServers

Page 5: Solr the intelligent search engine

Indexing data - Index the whole site (including files, …)

- Tolerance (stemming, synonyms, …)

Searching data - Layered navigation

- Customizable relevance calculation

- Predictive search (different kinds)

- Stemming, Plurals, Synonyms,

Stop words, …

FEATURES OFFERED BY SOLR.

INTRODUCTION TO SOLR.

Admin tools - Display more statistics

(most frequent requests

or searches with no results)

Scalability

Page 6: Solr the intelligent search engine

FEATURES OFFERED BY SOLR.

INDEXING DATA.

Indexing data - Index the whole site (including files, …)

- Tolerance (stemming, synonyms, …)

Page 7: Solr the intelligent search engine

Schema

Define how to handle structured data

sent by Magento (no crawler such as Nutch)

Typing data

price & weight are floats, product name is a string, …

- Structured data in Solr allows faceted search

to filter by price range for example

Determined by the intended search behavior

if we need to filter per price range

-> prices have to be stored as floats and not strings to stay comparable
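The point about floats versus strings can be illustrated with a minimal Python sketch. The product records and the two helper functions are hypothetical and only mimic the comparison behavior; they are not Solr or Magento code.

```python
# Hypothetical product records; names and prices are made up for illustration.
products = [
    {"name": "Headphones", "price": "9.99"},
    {"name": "Speaker", "price": "100.00"},
    {"name": "Amplifier", "price": "25.50"},
]

def in_range_as_strings(items, low, high):
    # Strings compare character by character, so "100.00" sorts before "20".
    return [p["name"] for p in items if low <= p["price"] <= high]

def in_range_as_floats(items, low, high):
    # Converting to float gives the numeric comparison a typed field provides.
    return [p["name"] for p in items if low <= float(p["price"]) <= high]

print(in_range_as_strings(products, "20", "150"))
print(in_range_as_floats(products, 20.0, 150.0))
```

With string comparison the range 20 to 150 matches nothing, while the numeric comparison returns the speaker and the amplifier, which is why the price field has to be typed as a float in the schema.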

Text analysis

Text is split into terms, which are then processed to apply stemming, synonyms, …
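As a rough illustration of such an analysis chain, the Python sketch below tokenizes text, drops stop words, maps synonyms and applies a very naive stemmer. The word lists and the `stem` rule are assumptions made up for this example; Solr's real analyzers are configured in the schema and are considerably more sophisticated.

```python
import re

# Illustrative word lists (assumptions, not taken from any real configuration).
STOP_WORDS = {"the", "a", "an", "for", "with", "and"}
SYNONYMS = {"earphones": "headphones", "tv": "television"}

def stem(term):
    # Naive plural stripping; real stemmers handle far more cases.
    if len(term) > 3 and term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term

def analyze(text):
    # 1. Split the lowercased text into alphanumeric terms.
    terms = re.findall(r"[a-z0-9]+", text.lower())
    # 2. Drop stop words, 3. map synonyms, 4. stem what is left.
    terms = [t for t in terms if t not in STOP_WORDS]
    terms = [SYNONYMS.get(t, t) for t in terms]
    return [stem(t) for t in terms]

print(analyze("Wireless Earphones for the TV"))
print(analyze("headphones"))
```

Because the same chain runs on indexed text and on the query, a search for "headphones" can match a product described with "earphones".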

SCHEMA & TEXT ANALYSIS.

INDEXING DATA.

Page 8: Solr the intelligent search engine

INDEXING DATA.

Page 9: Solr the intelligent search engine

Solr generally indexes structured data, e.g. products

It can also index binary formats

such as PDF, MS Office, images or music files

This is done through the Solr Cell interface,

which is an adapter to Apache Tika

Apache Tika is a toolkit to detect and

extract metadata and text content from various documents

INDEXING FILES.

INDEXING DATA.
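A minimal sketch of how a file could be submitted to Solr Cell: the Python below only builds the request URL. It assumes the extracting request handler is mapped to `/update/extract`, as in Solr's example configuration; the host, the core name `products` and the document id `manual-42` are hypothetical.

```python
from urllib.parse import urlencode

def build_extract_url(base_url, core, doc_id, commit=True):
    # literal.id sets the unique id of the document created from the file;
    # commit=true makes it searchable right after the upload.
    params = {"literal.id": doc_id, "commit": "true" if commit else "false"}
    return f"{base_url}/{core}/update/extract?{urlencode(params)}"

# Hypothetical host, core and document id.
url = build_extract_url("http://localhost:8983/solr", "products", "manual-42")
print(url)
```

The file itself (a PDF, for example) would then be sent as the body of an HTTP POST to this URL, and Tika would extract its text and metadata on the server side.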

Page 10: Solr the intelligent search engine

FEATURES OFFERED BY SOLR.

Scalability

SCALABILITY.

Page 11: Solr the intelligent search engine

A scalable system stays suitably efficient and practical

when applied to large workloads

With a bigger data index or more visitors,

searches are slower!

Testing Solr performance with SolrMeter

Solutions to keep good performance with more data:

1. Scale up: Optimizing a single Solr server

2. Scale horizontally: Moving to multiple Solr servers with replication

3. Scale deep: Combining replication and sharding (for distributed search)

DURABLE SOLUTION.

SCALABILITY.
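To give an idea of what distributed search over shards involves, here is a conceptual Python sketch: each shard returns its own best hits and the results are merged into one global ranking. The shard contents and the function name are made up, and this is not how Solr implements the merge internally.

```python
import heapq
from itertools import islice

def merge_shard_results(shard_results, k):
    # Each shard's list is assumed to be sorted by descending score already.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    # Keep only the k best hits of the global ranking.
    return list(islice(merged, k))

# Hypothetical (score, document id) hits returned by two shards.
shard_a = [(0.92, "doc-3"), (0.40, "doc-1")]
shard_b = [(0.87, "doc-7"), (0.65, "doc-9")]

print(merge_shard_results([shard_a, shard_b], 3))
```

Sharding splits the index so that each server searches less data, while replication copies the same index to several servers so that more visitors can be served in parallel.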

Page 12: Solr the intelligent search engine

FEATURES OFFERED BY SOLR.

SEARCHING DATA.

Searching data - Layered navigation

- Customizable relevance calculation

- Predictive search (different kinds)

- Stemming, Plurals, Synonyms,

Stop words, …

Page 13: Solr the intelligent search engine

SEARCHING DATA.

Page 14: Solr the intelligent search engine

Factors influencing score:

1. Term frequency

the more often a term occurs in a document, the higher that document's score is.

2. Inverse document frequency

the rarer a term is in the whole index, the higher its score is.

3. Co-ordination factor

the more query clauses a document matches, the higher its score is.

4. Field length

the shorter the matching field is, the greater the matching document's score is.

5. Boosting

customized mathematical rules to increase the score.

In Magento, based on attribute weights

E.g. name 5 -> manufacturer 4 -> sku 3 -> price 2 -> meta_keywords 1
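The factors above can be combined in a simplified scoring function. The Python sketch below is loosely modeled on the classic Lucene TF-IDF similarity, but it is a deliberate simplification (no query normalization, a single term), and the function name and arguments are assumptions for illustration.

```python
import math

def classic_score(term_freq, doc_freq, num_docs, field_length,
                  matched_clauses, total_clauses, boost=1.0):
    tf = math.sqrt(term_freq)                         # 1. term frequency
    idf = 1.0 + math.log(num_docs / (doc_freq + 1))   # 2. inverse document frequency
    coord = matched_clauses / total_clauses           # 3. co-ordination factor
    length_norm = 1.0 / math.sqrt(field_length)       # 4. field length
    return tf * idf * coord * length_norm * boost     # 5. boost, e.g. an attribute weight

# A term found in 2 of 1000 documents versus one found in 500 of them.
rare = classic_score(1, 2, 1000, 4, 1, 1)
common = classic_score(1, 500, 1000, 4, 1, 1)
print(rare > common)
```

With the Magento weights from the slide, a match in the product name (boost 5) would count five times as much as the same match in meta_keywords (boost 1).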

SEARCH RELEVANCY.

SEARCHING DATA.

Page 15: Solr the intelligent search engine

FEATURES OFFERED BY SOLR.

ADMIN TOOLS.

Admin tools - Display more statistics

(most frequent requests

or searches with no results)

Page 16: Solr the intelligent search engine

ADMIN TOOLS.

1) An admin tool is available in Solr, but it is developer-oriented

It is used to check the schema, the index, the general configuration and the Solr server availability, and to view

technical statistics…

2) Prefer the Magento backend

to check frequent requests or requests with no results

Very helpful to analyse user expectations and then improve the catalog

ADMIN FEATURES.
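The two statistics mentioned above can be sketched in a few lines of Python. The search log below is invented, and the helpers only illustrate the idea; they do not reflect how the Magento backend stores or computes its search terms report.

```python
from collections import Counter

# Hypothetical search log: (query, number of results returned).
search_log = [
    ("headphones", 12),
    ("turntable belt", 0),
    ("headphones", 12),
    ("vinyl", 3),
    ("turntable belt", 0),
    ("headphones", 12),
]

def most_frequent(log, n):
    # Count how often each query was issued and keep the n most common.
    return Counter(query for query, _ in log).most_common(n)

def no_result_queries(log):
    # Queries that returned nothing hint at gaps in the catalog or the synonyms.
    return Counter(query for query, hits in log if hits == 0).most_common()

print(most_frequent(search_log, 2))
print(no_result_queries(search_log))
```

A query that appears often in the second list is a candidate for a new synonym, a redirect, or a new product in the catalog.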

Page 17: Solr the intelligent search engine

Steps:

1. Install and configure Solr

single or multiple servers

single or multiple languages, …

2. Adapt the standard Magento product schema

to your project context

3. Define additional customized data to index

such as other tables, files, …

4. Influence search relevance

defining attribute weights

5. Integrate in Magento frontend

CONCLUSION.

INTEGRATE SOLR IN YOUR PROJECT.
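Step 4 can be sketched as follows, assuming the attribute weights are passed to Solr as query-field boosts through the `qf` parameter of the eDisMax query parser. The weights reuse the example given earlier in the deck; whether the Magento integration maps them exactly this way is an assumption, and whether each attribute is suitable as a query field depends on the schema.

```python
from urllib.parse import urlencode

# Attribute weights from the relevancy example; the mapping to qf is an assumption.
ATTRIBUTE_WEIGHTS = {
    "name": 5,
    "manufacturer": 4,
    "sku": 3,
    "price": 2,
    "meta_keywords": 1,
}

def build_query_params(user_query, weights):
    # field^weight is Solr's boost syntax inside the qf parameter.
    qf = " ".join(f"{field}^{weight}" for field, weight in weights.items())
    return {"q": user_query, "defType": "edismax", "qf": qf}

params = build_query_params("wireless headphones", ATTRIBUTE_WEIGHTS)
print(params["qf"])
print(urlencode(params))
```

The encoded string would be appended to the select URL of the Solr core, so that a match in a heavily weighted attribute ranks the product higher.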

Page 18: Solr the intelligent search engine

CONCLUSION.

COMPARISONS.

Features | Magento basic search engine | Magento with Solr
Product indexing | ▲ | ▲
Document indexing | - | ▲
Synonyms | ▲ | ▲
Stemming | - | ▲
Stop words | - | ▲
Faceted search | ▲ | ▲
Relevance calculation | ▲ | ▲
Customizable relevance calculation | - | ▲
Scalability | - | ▲
Predictive search | - | ▲
Admin tools (frequent requests, no answer…) | ▲ | ▲
No extra time needed to integrate | ▲ | -

Page 19: Solr the intelligent search engine

Solr clearly improves the user experience, which increases your conversion rate.

CONCLUSION.

Remember: 1 user in 2 is a searcher!

Page 20: Solr the intelligent search engine

CS2 AG

PLATINUM MEMBER TYPO3 ASSOCIATION

MAGENTO GOLD PARTNER

SUGAR SILVER PARTNER

CUSTOMER RELATIONSHIP MANAGEMENT

ELECTRONIC COMMERCE

ONLINE MARKETING

Gerbegässlein 1 | CH-4450 Sissach

Feldeggstrasse 55 | CH-8008 Zürich

Telefon: +41 61 333 22 22

Twitter: @CS2switzerland

www.CS2.ch