Faceting with Lucene Block Join Query: Presented by Oleg Savrasov, Grid Dynamics


Upload: lucidworks

Post on 17-Jul-2015


Faceting with Lucene Block Join Query

Agenda

1. Why do we need special faceting for Block Join queries?

2. Proposed Block Join facet component.

Introducing myself

Oleg Savrasov, PhD

A programmer

Working for Grid Dynamics (griddynamics.com)

Lives and works in Saint Petersburg, Russia

Online shopping

Jerrica is looking for a dress

A huge number of dresses

Facet filters help

Facet filters

Reduced amount

Tasks to be solved

● Performant Search

● Facet calculation/filtering

FacetComponent ?

A product has many SKUs

Aggregated facet counts

Facets should count products, not SKUs.

Expected facets:

COLOR
  Blue : 1
  Red : 1
SIZE
  S : 1
  M : 1

Flat documents don’t help

False positive match for

+COLOR:Blue +SIZE:M
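
The false positive can be reproduced with a minimal Python sketch. The three SKUs below (Blue/S, Red/S, Red/M) are an assumption chosen to be consistent with the counts shown on these slides; they are not taken from the talk. Flattening merges every SKU value into multi-valued fields on one document, so the link between a color and its size is lost.

```python
# Hypothetical sample data: one product with three SKUs (assumed, not from the talk).
skus = [
    {"COLOR": "Blue", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "M"},
]

# Flattening merges all SKU values into multi-valued fields on a single document.
flat_product = {
    "COLOR": {sku["COLOR"] for sku in skus},
    "SIZE": {sku["SIZE"] for sku in skus},
}


def matches_flat(doc, color, size):
    """Emulate +COLOR:<color> +SIZE:<size> against the flattened document."""
    return color in doc["COLOR"] and size in doc["SIZE"]


def matches_any_sku(sku_list, color, size):
    """The intended check: a single SKU must carry both values."""
    return any(s["COLOR"] == color and s["SIZE"] == size for s in sku_list)


flat_hit = matches_flat(flat_product, "Blue", "M")  # True: false positive
real_hit = matches_any_sku(skus, "Blue", "M")       # False: no Blue/M SKU exists
print(flat_hit, real_hit)
```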

Separate SKU documents

q = *:*
facet.field = COLOR
facet.field = SIZE

COLOR
  Blue : 1
  Red : 2
SIZE
  S : 2
  M : 1

Wrong numbers!

There is only one product
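
A small Python sketch shows where the wrong numbers come from. The sample SKUs and the `product_id` field are hypothetical, picked only so that the SKU-level counts match the slide; plain faceting counts every SKU document, while the desired result counts each value once per product.

```python
from collections import Counter

# Hypothetical index: one product with three SKU documents (assumed sample data).
sku_docs = [
    {"product_id": 1, "COLOR": "Blue", "SIZE": "S"},
    {"product_id": 1, "COLOR": "Red", "SIZE": "S"},
    {"product_id": 1, "COLOR": "Red", "SIZE": "M"},
]


def sku_level_counts(docs, field):
    """What plain faceting over SKU documents yields: one count per SKU."""
    return Counter(doc[field] for doc in docs)


def product_level_counts(docs, field):
    """What we want: each value counted once per distinct product."""
    pairs = {(doc["product_id"], doc[field]) for doc in docs}
    return Counter(value for _, value in pairs)


print(dict(sku_level_counts(sku_docs, "COLOR")))
print(dict(product_level_counts(sku_docs, "COLOR")))
```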

Search products only

q = *:*
fq = scope:product
facet.field = COLOR
facet.field = SIZE

COLOR : 0

SIZE : 0

No such fields in product documents

Aggregated facet counts

Facets should count products, not SKUs.

Expected facets:

COLOR
  Blue : 1
  Red : 1
SIZE
  S : 1
  M : 1

Solr Block Join Support (since Lucene 3.4.0)

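[Figure: index laid out in docId order. Each block holds the child (SKU) documents followed by their parent (product) document.]

Block 1: Green, Blue, Yellow, Yellow, Blue, Green, Product
Block 2: Green, Yellow, Product
Block 3: Green, Blue, Yellow, Yellow, Product

Query: {!parent which="scope:product"}COLOR:Blue

[Figure: match bits per docId for scope:product (the parent documents), COLOR:Blue (child documents), and the resulting ToParentQuery (parents of the blocks that contain a matching child).]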
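
The child-to-parent join can be sketched in a few lines of Python. This is a simplified illustration, not Lucene's implementation: it assumes only what the figure shows, namely that documents are stored in docId order and that each block ends with its parent document. The helper name `to_parent` is hypothetical.

```python
# Documents in index (docId) order; each block ends with its parent document.
# The layout mirrors the figure; field names are the ones used on the slides.
index = [
    {"COLOR": "Green"}, {"COLOR": "Blue"}, {"COLOR": "Yellow"},
    {"COLOR": "Yellow"}, {"COLOR": "Blue"}, {"COLOR": "Green"},
    {"scope": "product"},
    {"COLOR": "Green"}, {"COLOR": "Yellow"},
    {"scope": "product"},
    {"COLOR": "Green"}, {"COLOR": "Blue"}, {"COLOR": "Yellow"}, {"COLOR": "Yellow"},
    {"scope": "product"},
]


def to_parent(docs, is_parent, child_matches):
    """Return docIds of parents whose block contains at least one matching child."""
    hits = []
    block_has_match = False
    for doc_id, doc in enumerate(docs):
        if is_parent(doc):
            if block_has_match:
                hits.append(doc_id)
            block_has_match = False  # the next block starts after this parent
        elif child_matches(doc):
            block_has_match = True
    return hits


parents = to_parent(
    index,
    is_parent=lambda d: d.get("scope") == "product",
    child_matches=lambda d: d.get("COLOR") == "Blue",
)
print(parents)  # docIds of the matching product documents
```

With this layout the first and third blocks contain a Blue child, so two of the three products match.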

SOLR-5743 Faceting with Block Join support

● Create BlockJoinFacetComponent

● Only DocValues fields are supported

● Facet counts should correspond to the number of parent documents

● ToParentQuery is expected

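Faceting over DocSet slices

[Figure: the same index in docId order (Block 1: Green, Blue, Yellow, Yellow, Blue, Green, Product; Block 2: Green, Yellow, Product; Block 3: Green, Blue, Yellow, Yellow, Product). DocSet slice for the child documents of the first block: 0 1 0 0 1 0.]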

DocSet Slice counts

COLOR Blue : 2

Aggregated counts

COLOR Blue : +1
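
The aggregation step can be illustrated with a short Python sketch. It is not the code of the proposed component; it only assumes what the slides describe: facet values are counted within the slice of matching child documents of one block, and every value with a non-zero slice count adds exactly +1 to the aggregated (per-parent) count. The function name `block_join_facets` is hypothetical.

```python
from collections import Counter

# Child COLOR values per block, in docId order, mirroring the figure (parents omitted).
blocks = [
    ["Green", "Blue", "Yellow", "Yellow", "Blue", "Green"],
    ["Green", "Yellow"],
    ["Green", "Blue", "Yellow", "Yellow"],
]


def block_join_facets(child_blocks, child_matches):
    """Count, per facet value, how many parents have at least one matching child."""
    aggregated = Counter()
    for children in child_blocks:
        # DocSet slice: only the matching child documents of this block.
        docset_slice = [value for value in children if child_matches(value)]
        slice_counts = Counter(docset_slice)  # e.g. Blue : 2 for the first block
        for value in slice_counts:            # each non-zero value adds +1
            aggregated[value] += 1
    return aggregated


counts = block_join_facets(blocks, child_matches=lambda color: color == "Blue")
print(dict(counts))
```

The first block contributes a slice count of Blue : 2 but only +1 to the aggregated count, which is what keeps the facet numbers aligned with products rather than SKUs.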

Block Join Facet Component

BlockJoinFacetCollector

Facets counting

It works!

q = {!parent which="scope:product"}COLOR:Blue

child.facet.field = SIZE

<response>
  ...
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>
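
On the client side, a response of this shape can be turned into a plain dictionary. The Python sketch below parses only the fragment shown on the slide, with the elided part left out; the helper name `parse_facet_fields` is hypothetical, and a real response would contain additional sections.

```python
import xml.etree.ElementTree as ET

# Response fragment from the slide, with the elided part ("...") left out.
response_xml = """
<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>
"""


def parse_facet_fields(xml_text):
    """Return {field: {value: count}} from the facet_fields section."""
    root = ET.fromstring(xml_text)
    path = "./lst[@name='facet_counts']/lst[@name='facet_fields']/lst"
    result = {}
    for field in root.findall(path):
        result[field.get("name")] = {
            entry.get("name"): int(entry.text) for entry in field.findall("int")
        }
    return result


facets = parse_facet_fields(response_xml)
print(facets)
```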

The dress is found

Further improvements

● Thorough profiling

● Performance improvements

● Algorithmic improvements

Big thanks!

Do you have any questions?

Please vote for SOLR-5743.