Apache Solr

Post on 11-Apr-2017


Apache Solr

Open Source Full Text Search Server

• What is Solr?

• Solr Architecture

• Install & Configure

• Search, Index, Update & Delete

What is Solr?

• Solr is a full-text search server with a REST-like API

• Index documents as JSON, XML, CSV or binary over HTTP

• Query documents with HTTP GET

• Receive results as JSON, XML, CSV or binary

Solr History

• 2004 - Solr was created by Yonik Seeley

• 2006 - Solr was donated to the Apache Software Foundation

• 2006 - Solr version 1.1.0 was released

• 2010 - Solr and Lucene merged

• 2012 - Solr version 4.0 was released

• 2015 - Solr version 5.0 was released

• 2016 - Solr version 6.0 was released

Solr Features

• Fuzzy & Proximity Search

• Filter Query

• Faceting

• Highlighting

• Stats

• Spellcheck

• Grouping

• Admin Panel

Who uses Solr?

Solr Architecture

Solr Terminology

• Core

• Document

• Field

• FieldType

• Analyzer

• Filter

• Tokenizer

Common Field Attributes

• name

• type

• indexed

• stored

• multiValued

• required

• compressed

Install & Configure

• brew install solr

• schema.xml

• solrconfig.xml

• solr start -p port

schema

• uniqueKey

• fieldType

• analyzer

• filter

• tokenizer

• field

• dynamicField

• copyField

solrconfig

• Data directory

• Query Cache parameters

• Request Handlers

• Update Handler (update log, autocommit)

• Lucene version

Custom Field Type

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: tokenize, then drop stop words -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <!-- Query time: tokenize, expand synonyms, then drop stop words -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" expand="true" ignoreCase="true"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
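To make the analyzer chain concrete, here is a toy Python sketch of the same order of steps (tokenizer, then filters). It is only an illustration, not how Solr implements analysis: the word lists stand in for stopwords.txt and synonyms.txt, all function names are made up, and real synonym expansion keeps synonyms at the same token position rather than appending them.

```python
import re

STOPWORDS = {"the", "a", "of"}             # stand-in for stopwords.txt
SYNONYMS = {"tv": ["tv", "television"]}    # stand-in for synonyms.txt


def tokenize(text):
    """Rough stand-in for a tokenizer: split on non-word characters."""
    return re.findall(r"\w+", text)


def stop_filter(tokens):
    """Drop stop words, ignoring case (like ignoreCase="true")."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


def synonym_filter(tokens):
    """Replace each token by its synonym list (like expand="true")."""
    out = []
    for t in tokens:
        out.extend(SYNONYMS.get(t.lower(), [t]))
    return out


def analyze_index(text):
    # Index-time chain: tokenizer -> stop filter
    return stop_filter(tokenize(text))


def analyze_query(text):
    # Query-time chain: tokenizer -> synonym filter -> stop filter
    return stop_filter(synonym_filter(tokenize(text)))


print(analyze_index("The History of TV"))  # ['History', 'TV']
print(analyze_query("the tv"))             # ['tv', 'television']
```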

Logging

Default log folder: solr/logs/*

log4j.properties: solr/resources/*

Search Document

• q

• fq

• start

• rows

• sort

• fl

• wt
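The parameters above combine into a single /select URL. The sketch below only builds and decodes the URL, it sends nothing; the host, the "films" core, and the field names are assumptions for illustration, and build_select_url is a hypothetical helper.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, core, q, fq=(), start=0, rows=10,
                     sort=None, fl=None, wt="json"):
    """Assemble a /select URL from the common search parameters."""
    params = [("q", q)]
    params += [("fq", clause) for clause in fq]   # fq may be repeated
    params += [("start", start), ("rows", rows)]
    if sort:
        params.append(("sort", sort))
    if fl:
        params.append(("fl", ",".join(fl)))
    params.append(("wt", wt))
    return f"{base_url}/{core}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr", "films", "name:toy",
    fq=["year:[1990 TO 2000]"], rows=5, sort="year desc", fl=["id", "name"],
)
print(url)

# Decode the query string again to check what the server would receive.
decoded = parse_qs(urlsplit(url).query)
print(decoded["fq"], decoded["rows"])
```

Building the query string with urlencode takes care of escaping characters such as spaces, colons and brackets.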

Query Syntax

Keyword Matching

name:foo bar

name:"foo bar"

name:foo -name:bar

Wildcard Matching

title:foo*

title:foo*bar

Range Search ( field:[0 TO 1] )

Boosts ( title:foo^1.5 OR body:foo )

Fuzzy & Proximity Search

• Fuzzy Search

• title:iphone~0.5

• Proximity Search

• title:"foo bar"~2

• foo abc def bar

• bar abc foo
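When query strings like these are built from user input, the special characters of the query syntax need escaping. A minimal sketch, assuming the standard query parser; the helper names are hypothetical, and the number after ~ is passed through unchanged rather than interpreted.

```python
import re

# Characters that carry meaning in the standard query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape(text):
    """Backslash-escape query-syntax special characters."""
    return _SPECIAL.sub(r"\\\1", text)


def term(field, value):
    """field:value with the value escaped."""
    return f"{field}:{escape(value)}"


def fuzzy(field, value, amount):
    """Fuzzy match on a single term, e.g. title:iphone~0.5."""
    return f"{term(field, value)}~{amount}"


def phrase(field, text, slop=None):
    """Exact phrase, or a proximity search when slop is given."""
    clause = f'{field}:"{escape(text)}"'
    return clause if slop is None else f"{clause}~{slop}"


def boost(clause, factor):
    """Raise the weight of a clause, e.g. title:foo^1.5."""
    return f"{clause}^{factor}"


print(fuzzy("title", "iphone", 0.5))     # title:iphone~0.5
print(phrase("title", "foo bar", 2))     # title:"foo bar"~2
print(boost(term("title", "foo"), 1.5))  # title:foo^1.5
print(term("name", "a:b"))               # name:a\:b
```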

Filter Query

• Restricts the result set without influencing document scores

• Usually faster than the same clause in q, since filter results are cached separately

Faceting

• facet.query

• facet.field

• facet.mincount (per field: f.<field.name>.facet.mincount)

• facet.limit (per field: f.<field.name>.facet.limit)

• facet.offset (per field: f.<field.name>.facet.offset)

• facet.sort=count, facet.sort=index

• Tagging & excluding filters

Faceting

• facet.range

• facet.range.start

• facet.range.end

• facet.range.gap
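The facet parameters from the last two slides can be collected as request parameters like this. A sketch only: facet_params is a hypothetical helper, and the "genre" and "year" fields are assumed for illustration.

```python
from urllib.parse import urlencode


def facet_params(fields=(), queries=(), per_field=None,
                 range_field=None, start=None, end=None, gap=None):
    """Collect facet parameters as a list of (name, value) pairs."""
    params = [("facet", "true")]
    params += [("facet.field", f) for f in fields]
    params += [("facet.query", q) for q in queries]
    # Per-field overrides use the f.<field.name>.facet.<option> form.
    for field, overrides in (per_field or {}).items():
        for option, value in overrides.items():
            params.append((f"f.{field}.facet.{option}", value))
    if range_field is not None:
        params += [
            ("facet.range", range_field),
            ("facet.range.start", start),
            ("facet.range.end", end),
            ("facet.range.gap", gap),
        ]
    return params


params = facet_params(
    fields=["genre", "year"],
    per_field={"genre": {"limit": 5, "mincount": 1}},
    range_field="year", start=1990, end=2000, gap=5,
)
print(urlencode(params))
```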


Highlighting

• hl=true

• hl.fl

• hl.simple.pre

• hl.simple.post

"highlighting": {
  "37477": {
    "name": ["Apple <em>IPhone</em> 6S"]
  }
}
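A short sketch of reading the highlighting section on the client side, using the sample response from the slide. The snippets helper is hypothetical, and the request parameter values are assumed examples.

```python
import json

# Request parameters that would produce a section like the sample.
hl_params = {
    "hl": "true",
    "hl.fl": "name",
    "hl.simple.pre": "<em>",
    "hl.simple.post": "</em>",
}

# The sample response from the slide, wrapped into a complete JSON object.
response = json.loads("""
{"highlighting": {"37477": {"name": ["Apple <em>IPhone</em> 6S"]}}}
""")


def snippets(response, field):
    """Map each document id to its highlighted snippets for one field."""
    section = response.get("highlighting", {})
    return {doc_id: fields.get(field, []) for doc_id, fields in section.items()}


print(snippets(response, "name"))
# {'37477': ['Apple <em>IPhone</em> 6S']}
```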

Stats

stats=true&stats.field=field.name

• min

• max

• count

• sum

• sumOfSquares

Spelling

spellcheck.q=Keyword&spellcheck=on

"spellcheck": {
  "suggestions": [
    "father", {
      "numFound": 3,
      "startOffset": 0,
      "endOffset": 6,
      "origFreq": 20,
      "suggestion": [
        {"word": "feather", "freq": 3},
        {"word": "farmer", "freq": 4},
        {"word": "fisher", "freq": 3}
      ]
    }
  ],
  "correctlySpelled": false
}

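In the sample, "suggestions" is a flat list that alternates between the misspelled term and its details. A minimal sketch of picking the most frequent suggestion per term, assuming exactly the response shape shown in the sample; best_suggestions is a hypothetical helper.

```python
import json

sample = json.loads("""
{"spellcheck": {
  "suggestions": ["father", {
    "numFound": 3, "startOffset": 0, "endOffset": 6, "origFreq": 20,
    "suggestion": [
      {"word": "feather", "freq": 3},
      {"word": "farmer", "freq": 4},
      {"word": "fisher", "freq": 3}
    ]}],
  "correctlySpelled": false}}
""")


def best_suggestions(spellcheck):
    """Return {term: most frequent suggested word} from the flat list."""
    flat = spellcheck["suggestions"]
    result = {}
    # Even positions hold the terms, odd positions hold their details.
    for term, info in zip(flat[0::2], flat[1::2]):
        words = info.get("suggestion", [])
        if words:
            result[term] = max(words, key=lambda w: w["freq"])["word"]
    return result


print(best_suggestions(sample["spellcheck"]))
# {'father': 'farmer'}
```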
Grouping

group=true&group.field=year

"grouped": {
  "year": {
    "matches": 10683,
    "groups": [
      {
        "groupValue": 1995,
        "doclist": {"numFound": 361, "start": 0, "docs": [
          {
            "movie_id": "movie_32",
            "id": "32",
            "name": "12 Monkeys (Twelve Monkeys)",
            "year": 1995,
            "genre": ["Sci-Fi", "Thriller"],
            "_version_": 1545364353246560258
          }
        ]}
      },
      {
        "groupValue": 1994,
        "doclist": {"numFound": 307, "start": 0, "docs": [
          {
            "movie_id": "movie_889",
            "id": "889",
            "name": "1-900 (06)",
            "year": 1994,
            "genre": ["Drama", "Romance"],
            "_version_": 1545364353356660743
          }
        ]}
      }
    ]
  }
}

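A small sketch of walking a grouped response on the client side. The sample is abbreviated from the slide (the documents inside each group are left out), and group_counts is a hypothetical helper.

```python
import json

response = json.loads("""
{"grouped": {"year": {"matches": 10683, "groups": [
  {"groupValue": 1995, "doclist": {"numFound": 361, "start": 0, "docs": []}},
  {"groupValue": 1994, "doclist": {"numFound": 307, "start": 0, "docs": []}}
]}}}
""")


def group_counts(response, field):
    """Return {group value: number of matching documents} for one field."""
    groups = response["grouped"][field]["groups"]
    return {g["groupValue"]: g["doclist"]["numFound"] for g in groups}


print(group_counts(response, "year"))
# {1995: 361, 1994: 307}
```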
Index Data

• post command (-c coreName -p port)

• REST API

• SolrJ, Spring Data Solr or Other libraries

• DataImportHandler

REST API Sample (XML)

curl -X POST "http://localhost:8080/solr/films/update?commit=true" \
  -H "Content-Type: text/xml" \
  -d '<add>
  <doc>
    <field name="id">100000</field>
    <field name="name">Toy2 Story</field>
  </doc>
</add>'

REST API Sample (JSON)

curl -X POST 'http://localhost:8983/solr/new_core/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[
  {"id": "1", "name": "movie name 1"},
  {"id": "2", "name": "movie name 2"}
]'
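The same JSON update can be prepared from Python with the standard library. This sketch only builds the request object and does not send it; build_index_request is a hypothetical helper, and the host and core name are taken from the curl sample.

```python
import json
from urllib.request import Request


def build_index_request(base_url, core, docs, commit=True):
    """Prepare a POST to the core's /update handler with a JSON body."""
    url = f"{base_url}/{core}/update"
    if commit:
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


docs = [
    {"id": "1", "name": "movie name 1"},
    {"id": "2", "name": "movie name 2"},
]
req = build_index_request("http://localhost:8983/solr", "new_core", docs)
print(req.get_method(), req.full_url)
# To send it to a running server: urllib.request.urlopen(req)
```

Each document needs its own unique key value; a document posted with an id that already exists replaces the earlier one.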

DataImportHandler (MySQL)

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/db"
              user="username"
              password="password" />
  <document>
    <entity name="film"
            query="select id,name from film"
            deltaQuery="select id from film where last_modified > '${dataimporter.last_index_time}'">
      <field column="id" name="id" />
      <field column="name" name="name" />
    </entity>
  </document>
</dataConfig>

Update Data

curl -X POST "http://localhost:8983/solr/new_core/update?commit=true" -H "Content-Type: application/json" -d '[{"id":"1","movie_id":{"set":"new_movie_id"}}]'
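The body of the update request is a list of partial documents in which each changed field is wrapped in a "set" operation. A minimal sketch of producing that body; atomic_set is a hypothetical helper.

```python
import json


def atomic_set(doc_id, **fields):
    """Build a partial document that sets the given fields on doc_id."""
    doc = {"id": doc_id}
    for name, value in fields.items():
        doc[name] = {"set": value}
    return doc


payload = json.dumps([atomic_set("1", movie_id="new_movie_id")])
print(payload)
# [{"id": "1", "movie_id": {"set": "new_movie_id"}}]
```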

Delete Data

http://localhost:8983/solr/new_core/update?commit=true&stream.body=<delete><query>*:*</query></delete>

Admin Panel & Demo
