Enterprise Search in Plone using Solr

Calvin Hendryx-Parker

Plone Conference 2010

Wednesday, October 27, 2010

PLONE CONFERENCE 2010

What is Solr?

• Java Based

• Full-Text Search

• Web Services API

• Standards Based Interfaces

• Scalable

• XML Configuration

• Extensible

Playing with Solr

• Indexing

• Query
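
Both operations can be sketched with the Python standard library alone. This is a minimal sketch, not code from the talk: it assumes a Solr instance at the default local URL, uses Solr's XML update message for indexing and the select handler for querying, and the field names id and title are hypothetical. Nothing is sent over the network; the functions only build the request payloads.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

SOLR_BASE = "http://localhost:8983/solr"  # assumption: default local Solr URL
UPDATE_URL = SOLR_BASE + "/update"        # the XML message below is POSTed here


def build_add_message(doc):
    """Return a Solr XML <add> message for one document (a dict of fields)."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, rows=10):
    """Return the URL for a query against the select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return "%s/select?%s" % (SOLR_BASE, params)


# Hypothetical document and query, for illustration only.
add_xml = build_add_message({"id": "doc-1", "title": "Enterprise Search"})
select_url = build_select_url("title:search")
print(add_xml)
print(select_url)
```

Added documents only become searchable after a commit, which is sent to the same update URL as a separate message.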

Solr Features

• Data Schema

• Faceted Search

• Administrative Interface

• Incremental Updates

• Supports Sharding

• Index Databases, Local Files and Web Pages

• Supports Multiple Indexes
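
Faceted search is requested with ordinary query parameters. The sketch below assumes the standard facet and facet.field parameters and Solr's default JSON facet layout, a flat list that alternates value and count. The field names portal_type and Subject and the sample counts are hypothetical.

```python
from urllib.parse import urlencode


def facet_params(query, facet_fields):
    """Build a query string that asks Solr for facet counts on some fields."""
    params = [("q", query), ("facet", "true"), ("wt", "json")]
    params += [("facet.field", name) for name in facet_fields]
    return urlencode(params)


def pair_facet_counts(flat):
    """Turn a flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[::2], flat[1::2]))


query_string = facet_params("plone", ["portal_type", "Subject"])
# Made-up facet data in the flat layout described above.
counts = pair_facet_counts(["Document", 12, "News Item", 3])
print(query_string)
print(counts)
```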

Solr Features

• Stopwords

• Synonyms

• Highlighted Context Snippets

• Spelling Suggestions

• More Like This Suggestions

• Supports Rich Documents
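
Stopwords and synonyms are applied while text is analyzed for indexing and querying. The toy below only illustrates the effect on a token stream. It is not Solr's implementation, and the word lists are made up for the example.

```python
# Toy word lists; a real deployment configures these in Solr itself.
STOPWORDS = {"the", "a", "of", "in"}
SYNONYMS = {"tv": ["television"]}


def analyze(text):
    """Lowercase, drop stopwords, and expand synonyms in place."""
    tokens = []
    for word in text.lower().split():
        if word in STOPWORDS:
            continue  # stopwords never reach the index
        tokens.append(word)
        tokens.extend(SYNONYMS.get(word, []))  # add synonyms next to the word
    return tokens


tokens = analyze("The history of TV in Plone")
print(tokens)
```

With synonyms expanded this way, a search for "television" also finds text that only mentions "TV".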

Solr Performance

• Wiktionary Dataset

• 49.5 Million lines of XML

• 1.3 GB of data

• 1.7 Million Pages Indexed in 5.5 hours

• ZODB Size after import 1.1 GB
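
For a sense of scale, the figures above can be turned into a throughput with a few lines of arithmetic. This assumes the quoted numbers are exact, that the rate was constant over the run, and that all XML lines belong to the indexed pages.

```python
pages = 1_700_000        # pages indexed
hours = 5.5              # total run time
xml_lines = 49_500_000   # lines of XML in the dataset

pages_per_hour = pages / hours
pages_per_second = pages / (hours * 3600)
lines_per_page = xml_lines / pages  # average, under the assumption above

print(round(pages_per_hour))       # 309091
print(round(pages_per_second, 1))  # 85.9
print(round(lines_per_page, 1))    # 29.1
```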

Integration Options with Plone

collective.solr

collective.solr Issues

• Monkey Patching

• Relies on collective.indexing

• Duplicates all indexes

• Sub-Optimal Integration with Zope Transactions

• Relies on Thread Locals

What to do?

Reevaluate

Solr Integration as a Catalog Index

• No Monkey Patching

• Simpler Code

Enter alm.solrindex

• ZCatalog Index

• Doesn't depend on Plone

• Utilizes new foreign_connections Connection Method

• Pass-through Solr Queries

• Direct access to the Solr Response
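
The idea of plugging Solr in as one more catalog index can be illustrated with a toy. Each index answers a query with a set of record ids, and the catalog intersects the answers. The classes and data below are hypothetical stand-ins written for this sketch; they are neither the ZCatalog nor the alm.solrindex API, and the Solr round trip is faked with an in-memory search.

```python
class ToyFieldIndex:
    """Maps exact values to record ids, like a simple field index."""

    def __init__(self, name, data):
        self.name = name
        self.data = data  # value -> set of record ids

    def apply(self, query):
        if self.name not in query:
            return None  # this index is not involved in the query
        return set(self.data.get(query[self.name], ()))


class ToySolrIndex:
    """Stands in for a Solr-backed index; `search` fakes the Solr call."""

    def __init__(self, name, search):
        self.name = name
        self.search = search

    def apply(self, query):
        if self.name not in query:
            return None
        return set(self.search(query[self.name]))


def toy_catalog_search(indexes, query):
    """Intersect the record ids returned by every index that took part."""
    result = None
    for index in indexes:
        rids = index.apply(query)
        if rids is None:
            continue
        result = rids if result is None else result & rids
    return result if result is not None else set()


def fake_solr(text):
    # Made-up corpus; a real index would query the Solr server here.
    corpus = {1: "plone and solr", 2: "solr only", 3: "plone only"}
    return [rid for rid, body in corpus.items() if text in body]


indexes = [
    ToyFieldIndex("portal_type", {"Document": {1, 3}, "File": {2}}),
    ToySolrIndex("SearchableText", fake_solr),
]
hits = toy_catalog_search(
    indexes, {"portal_type": "Document", "SearchableText": "solr"}
)
print(hits)
```

Because the Solr-backed index obeys the same contract as any other index, the catalog needs no patching to use it.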

Sorting

• Still handled by the ZCatalog

• Could change in the future

alm.solrindex Field Handlers

• Handle Parsing Attributes for Indexing

• Translate field-specific queries to Solr

• Registered as Zope Utilities

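Example Handler

class TextFieldHandler(DefaultFieldHandler):

    def parse_query(self, field, field_query):
        name = field.name
        # Normalize the catalog query for this field into a record of keys.
        request = {name: field_query}
        record = parseIndexRequest(request, name, ('query',))
        if not record.keys:
            # No search terms: contribute nothing to the Solr query.
            return None

        query_str = ' '.join(record.keys)
        if not query_str:
            return None

        # '+' marks the clause as required in the Solr query syntax.
        return {'q': u'+%s:%s' % (name, quote_query(query_str))}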
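
To see what such a handler produces without a Zope environment, here is a self-contained sketch of the same translation step. Field, quote_query_stub and SketchTextFieldHandler are hypothetical stand-ins written for this example. In particular, the stub quotes the terms as a single phrase, which may differ from what the real quote_query helper does.

```python
from collections import namedtuple

Field = namedtuple("Field", "name")  # stand-in for a Solr schema field


def quote_query_stub(text):
    """Hypothetical quoting: escape backslashes and quotes, wrap as a phrase."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


class SketchTextFieldHandler:
    """Translates a catalog-style text query into Solr parameters."""

    def parse_query(self, field, field_query):
        # Accept a bare string, a list of strings, or {'query': ...}.
        if isinstance(field_query, dict):
            field_query = field_query.get("query")
        if not field_query:
            return None
        keys = [field_query] if isinstance(field_query, str) else list(field_query)
        query_str = " ".join(keys)
        if not query_str:
            return None
        # '+' makes the clause required, as in the handler above.
        return {"q": "+%s:%s" % (field.name, quote_query_stub(query_str))}


handler = SketchTextFieldHandler()
params = handler.parse_query(Field("SearchableText"), "plone solr")
print(params)
```

An empty query returns None, so the field simply drops out of the request that is sent to Solr.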

Other alm.solrindex Features

• GenericSetup Profile

• Tests

• Uses solrpy instead of the unsupported solr.py

Tips

• Can replace several ZCatalog indexes

• Remove any indexes you have replaced

• Use it for all Text Indexes

• Still Utilize the ZCatalog Indexes for Everything Else

Demo

Project Gutenberg Data

Questions?

Check out

sixfeetup.com/demos
