Apache Solr
Post on 11-Apr-2017
Apache Solr
Open Source Full Text Search Server
• What is Solr?
• Solr Architecture
• Install & Configure
• Search, Index, Update & Delete
What is Solr?
• Solr is a full-text search server with a REST-like API
• Index documents as JSON, XML, CSV or binary over HTTP
• Query documents with HTTP GET
• Receive results as JSON, XML, CSV or binary
Solr History
• 2004 - Solr was created by Yonik Seeley
• 2006 - Solr joined the Apache Software Foundation
• 2006 - Solr version 1.1.0 was released
• 2010 - The Solr and Lucene projects merged
• 2012 - Solr version 4.0 was released
• 2015 - Solr version 5.0 was released
• 2016 - Solr version 6.0 was released
Solr Features
• Fuzzy & Proximity Search
• Filter Query
• Faceting
• Highlighting
• Stats
• Spellcheck
• Grouping
• Admin Panel
Who uses Solr?
Solr Architecture
Solr Terminology
• Core
• Document
• Field
• FieldType
• Analyzer
• Filter
• Tokenizer
Common Field Attributes
• name
• type
• indexed
• stored
• multiValued
• required
• compressed
Install & Configure
• brew install solr
• schema.xml
• solrconfig.xml
• solr start -p port
schema.xml
• uniqueKey
• fieldType
• analyzer
• filter
• tokenizer
• field
• dynamicField
• copyField
solrconfig.xml
• Data directory
• Query cache parameters
• Request handlers
• Update handler (update log, autocommit)
• Lucene version
Custom Field Type
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
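To make the two analyzer chains concrete, here is a toy Python imitation of them. It is only a sketch: the tokenizer is a crude regex split rather than Solr's StandardTokenizer, and the stop word and synonym lists are hypothetical stand-ins for stopwords.txt and synonyms.txt.

```python
import re

# Hypothetical word lists standing in for stopwords.txt and synonyms.txt.
STOPWORDS = {"the", "a", "of"}
SYNONYMS = {"tv": ["tv", "television"]}

def tokenize(text):
    """Rough stand-in for a tokenizer: keep runs of word characters."""
    return re.findall(r"\w+", text)

def stop_filter(tokens):
    """Drop stop words, ignoring case (like ignoreCase="true")."""
    return [t for t in tokens if t.lower() not in STOPWORDS]

def synonym_filter(tokens):
    """Replace each token by all of its synonyms (like expand="true")."""
    expanded = []
    for token in tokens:
        expanded.extend(SYNONYMS.get(token.lower(), [token]))
    return expanded

def analyze_index(text):
    # Index-time chain: tokenizer, then stop filter.
    return stop_filter(tokenize(text))

def analyze_query(text):
    # Query-time chain: tokenizer, synonym filter, then stop filter.
    return stop_filter(synonym_filter(tokenize(text)))

print(analyze_index("The Story of a Toy"))
print(analyze_query("the TV guide"))
```

The point of the sketch is the ordering: each filter receives the token stream produced by the previous step, and the index and query chains can differ.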
Logging
Default log folder: solr/logs/*
log4j.properties: solr/resources/*
Search Document
• q
• fq
• start
• rows
• sort
• fl
• wt
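These parameters map directly onto the query string of a select request. A minimal sketch of assembling such a URL in Python follows; the host, the core name `films`, and the field names `name` and `year` are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_select_url(base_url, core, **params):
    """Build a /select URL from common query parameters."""
    # doseq=True lets a list value (e.g. several fq filters) repeat the key.
    return f"{base_url}/{core}/select?{urlencode(params, doseq=True)}"

url = build_select_url(
    "http://localhost:8983/solr",  # assumed host and port
    "films",                       # hypothetical core name
    q="name:toy",                  # main query
    fq=["year:[1990 TO 2000]"],    # filter query
    start=0,                       # offset of the first result
    rows=10,                       # page size
    sort="year desc",
    fl="id,name,year",             # fields to return
    wt="json",                     # response format
)
print(url)
```

The resulting URL can be opened with any HTTP client; nothing is sent by the sketch itself.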
Query Syntax
Keyword Matching
name:foo bar
name:"foo bar"
name:foo -name:bar
Wildcard Matching
title:foo*
title:foo*bar
Range Search ( field:[0 TO 1] )
Boosts ( title:foo^1.5 OR body:foo )
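Because characters such as `:`, `*`, `^` and `"` carry meaning in this syntax, user input usually needs escaping before it is placed into a query. The helpers below are a hypothetical sketch (they are not part of Solr or of any client library) that backslash-escapes the characters commonly listed as special for the standard query parser.

```python
import re

# Characters treated as query syntax; the exact set is an assumption
# based on the commonly documented list for the standard query parser.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')

def escape_term(text):
    """Backslash-escape query-syntax characters in user input."""
    return _SPECIAL.sub(r"\\\1", text)

def phrase(field, text):
    """Build a phrase query such as name:"foo bar"."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'

print(escape_term("foo:bar*"))
print(phrase("name", "foo bar"))
```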
Fuzzy & Proximity Search
• Fuzzy Search
• title:iphone~0.5
• Proximity Search
• title:"foo bar"~2
• foo abc def bar
• bar abc foo
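Fuzzy matching is based on the edit distance between the query term and indexed terms. As an illustration of that idea only, here is plain Levenshtein distance in Python; it is not Lucene's implementation, which uses automata and may count transpositions differently.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]

print(edit_distance("iphone", "ipone"))
```

A term one or two edits away from the query term is the kind of candidate a fuzzy query is meant to catch.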
Filter Query
Restricts the results without influencing the document score
Usually faster than the main query, because filter results are cached separately
Faceting
• facet.query
• facet.field
• facet.mincount -> f.<field.name>.facet.mincount
• facet.limit -> f.<field.name>.facet.limit
• facet.offset -> f.<field.name>.facet.offset
• facet.sort=count, facet.sort=index
• Tagging & excluding filters
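The per-field form `f.<field.name>.facet.<option>` overrides a global facet option for one field. A small sketch of building such a parameter list in Python, assuming hypothetical fields `genre` and `year`:

```python
from urllib.parse import urlencode

def facet_params(fields, mincount=1, per_field_limit=None):
    """Return (name, value) pairs for field faceting."""
    params = [("facet", "true"), ("facet.mincount", str(mincount))]
    for field in fields:
        params.append(("facet.field", field))
    # Per-field overrides use the f.<field.name>.facet.<option> form.
    for field, limit in (per_field_limit or {}).items():
        params.append((f"f.{field}.facet.limit", str(limit)))
    return params

params = facet_params(["genre", "year"], per_field_limit={"genre": 5})
print(urlencode(params))
```

A list of pairs is used instead of a dict so that `facet.field` can appear once per field.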
Faceting
• facet.range
• facet.range.start
• facet.range.end
• facet.range.gap
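Range faceting cuts the interval between the start and end values (`facet.range.end` in Solr's documentation) into buckets of width gap. The sketch below computes the bucket boundaries for numeric values; it assumes the last bucket keeps its full width even when it runs past the end value, which should be checked against the Solr version in use.

```python
def range_buckets(start, end, gap):
    """Return (lower, upper) bounds of each range-facet bucket."""
    if gap <= 0:
        raise ValueError("gap must be positive")
    buckets = []
    lower = start
    while lower < end:
        # Assumption: buckets are not clipped at the end value.
        buckets.append((lower, lower + gap))
        lower += gap
    return buckets

print(range_buckets(1990, 2000, 5))
```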
Highlighting
hl=true
hl.fl
hl.simple.pre
hl.simple.post
"highlighting": {
  "37477": {
    "name": ["Apple <em>IPhone</em> 6S"]
  }
}
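The highlighting section is keyed by document id, separately from the documents themselves, so a client usually has to join the two. A minimal Python sketch, assuming the documents arrive under `response.docs` and use `id` as their unique key:

```python
def attach_highlights(result, id_field="id"):
    """Return docs with a 'highlights' entry from the highlighting section."""
    highlighting = result.get("highlighting", {})
    merged = []
    for doc in result["response"]["docs"]:
        # Highlighting keys are strings, so normalise the id first.
        snippets = highlighting.get(str(doc[id_field]), {})
        merged.append({**doc, "highlights": snippets})
    return merged

# Sample response; the document entry is made up to match the snippet above.
result = {
    "response": {"docs": [{"id": "37477", "name": "Apple IPhone 6S"}]},
    "highlighting": {"37477": {"name": ["Apple <em>IPhone</em> 6S"]}},
}
docs = attach_highlights(result)
print(docs[0]["highlights"]["name"][0])
```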
Stats
stats=true&stats.field=field.name
• min
• max
• count
• sum
• sumOfSquares
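For reference, the five statistics listed above can be reproduced over a plain list of numbers. This sketch only shows what each value means; it does not call Solr.

```python
def field_stats(values):
    """Compute the statistics listed above for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "count": len(values),
        "sum": sum(values),
        "sumOfSquares": sum(v * v for v in values),
    }

print(field_stats([1, 2, 3]))
```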
Spelling
spellcheck.q=Keyword&spellcheck=on
"spellcheck": {
  "suggestions": [
    "father", {
      "numFound": 3,
      "startOffset": 0,
      "endOffset": 6,
      "origFreq": 20,
      "suggestion": [
        {"word": "feather", "freq": 3},
        {"word": "farmer", "freq": 4},
        {"word": "fisher", "freq": 3}
      ]
    }
  ],
  "correctlySpelled": false
}
Grouping
group=true&group.field=year
"grouped": {
  "year": {
    "matches": 10683,
    "groups": [
      {
        "groupValue": 1995,
        "doclist": {"numFound": 361, "start": 0, "docs": [
          {
            "movie_id": "movie_32",
            "id": "32",
            "name": "12 Monkeys (Twelve Monkeys)",
            "year": 1995,
            "genre": ["Sci-Fi", "Thriller"],
            "_version_": 1545364353246560258
          }]
        }
      },
      {
        "groupValue": 1994,
        "doclist": {"numFound": 307, "start": 0, "docs": [
          {
            "movie_id": "movie_889",
            "id": "889",
            "name": "1-900 (06)",
            "year": 1994,
            "genre": ["Drama", "Romance"],
            "_version_": 1545364353356660743
          }]
        }
      }
    ]
  }
}
Index Data
• post command (bin/post -c coreName -p port)
• REST API
• SolrJ, Spring Data Solr or other libraries
• DataImportHandler
REST API Sample (XML)
curl -X POST "http://localhost:8080/solr/films/update?commit=true" \
  -H "Content-Type: text/xml" \
  -d '<add>
  <doc>
    <field name="id">100000</field>
    <field name="name">Toy2 Story</field>
  </doc>
</add>'
REST API Sample (JSON)
curl -X POST 'http://localhost:8983/solr/new_core/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[
  {"id": "1", "name": "movie name 1"},
  {"id": "2", "name": "movie name 2"}
]'
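The same JSON request can be prepared from Python's standard library. The sketch below only builds the request object and does not send it; the host and the core name `new_core` are taken from the curl sample, and the helper name is hypothetical.

```python
import json
from urllib.request import Request

def build_update_request(base_url, core, docs):
    """Prepare a POST that adds the given documents and commits."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        f"{base_url}/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

docs = [
    {"id": "1", "name": "movie name 1"},
    {"id": "2", "name": "movie name 2"},
]
req = build_update_request("http://localhost:8983/solr", "new_core", docs)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server.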
DataImportHandler (MySQL)
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/db"
              user="username"
              password="password"/>
  <document>
    <entity name="film"
            query="select id,name from film"
            deltaQuery="select id from film where last_modified > '${dataimporter.last_index_time}'">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>
Update Data
curl -X POST "http://localhost:8983/solr/new_core/update?commit=true" -H "Content-Type: application/json" -d '[{"id":"1","movie_id":{"set":"new_movie_id"}}]'
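The `set` modifier replaces a single field value and leaves the rest of the document untouched. A sketch of generating such a payload in Python; the helper name `atomic_set` is hypothetical.

```python
import json

def atomic_set(doc_id, **fields):
    """Build an atomic-update payload that sets the given fields."""
    doc = {"id": doc_id}
    for name, value in fields.items():
        # Wrapping the value in {"set": ...} marks it as a field update.
        doc[name] = {"set": value}
    return [doc]

payload = atomic_set("1", movie_id="new_movie_id")
print(json.dumps(payload))
```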
Delete Data
http://localhost:8983/solr/new_core/update?commit=true&stream.body=<delete><query>*:*</query></delete>
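A delete can also be expressed as a JSON body posted to the update handler instead of an XML body in the URL. The sketch below only builds the bodies; the helper names are hypothetical.

```python
import json

def delete_by_query(query):
    """JSON body that deletes every document matching the query."""
    return json.dumps({"delete": {"query": query}})

def delete_by_id(doc_id):
    """JSON body that deletes a single document by its unique key."""
    return json.dumps({"delete": {"id": doc_id}})

print(delete_by_query("*:*"))
print(delete_by_id("1"))
```

Note that `*:*` matches every document, so the first body empties the core.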
Admin Panel & Demo