spcua 2013 alexey kozhemiakin enterprise search

May 22nd 2013, Kiev. Enterprise search portals SharePoint 2013. Alexey Kozhemiakin

Upload: alex-kozhemiakin

Post on 28-Jan-2015

DESCRIPTION

English version of my slides from SPCUA 2013

TRANSCRIPT

Page 1: Spcua 2013 Alexey Kozhemiakin Enterprise Search

May 22nd 2013, Kiev

Enterprise search portals SharePoint 2013

Alexey Kozhemiakin

Page 2: Spcua 2013 Alexey Kozhemiakin Enterprise Search

May 22nd 2013, Kiev

or “How to make a cool search”

Alexey Kozhemiakin

Page 3: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Who’s speaking to you?

• Solution Architect @epam

• Focusing on search
  • SharePoint Search FAST/2010/2013
  • Apache Lucene, Solr, Elasticsearch, Oracle Endeca…

• http://powersearching.wordpress.com

Page 4: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Agenda

• Enterprise Search Portal
• Insight into SP2013 Search
• Key changes from SP2010
• A bit of magic – relevancy calculation
• Search governance, useful hints & tips

Page 5: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Key search patterns

• I know what I’m searching for and where to find it

• I know what I’m searching for but don’t know where to find it

• I don’t know what I’m searching for

http://aghy.hu/AghyBlog_EN/Lists/Posts/Post.aspx?ID=199

Page 6: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Enterprise Search Portal

• Demand:
  • Fast-growing enterprises
  • Zoo of internal systems

• Solution:
  • “Google” inside the enterprise

• Quick wins for business:
  • Single point of smart search and information retrieval
  • Reduced search time per employee
  • Better internal communication and simplified reuse of content

Page 7: Spcua 2013 Alexey Kozhemiakin Enterprise Search

But after deployment…

• «… Search sucks»
  • Out-of-the-box search knows nothing about you
• Typical “but…” responses:
  • «… Microsoft takes care of a decent search algorithm»
  • «… we’re not sure we can do better»
  • «… we don’t need search, everybody knows where the content is»
  • «… make our search like Facebook/Google/Bing» (instead of requirements)

Page 8: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Why it’s hard

• Ambiguous short queries
• Unstructured, non-optimized content
• Different active vocabularies of content users and creators
• Limited resources ($), while internet search has:
  • Automated and manual testing of search quality (assessors)
  • Continuous improvement

Page 9: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Search architecture in SP2013

Page 10: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Search is a two-phase process

• Matching – all docs with keywords
  • Linguistics: stemming, phonetics
  • Synonyms

• Ranking
  • “Features”
    • TF-IDF, BM25
    • Field weights
    • File type
    • Modification date
    • Popularity
    • …
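
The two phases above can be sketched in a few lines. This is a minimal illustration only, assuming a toy in-memory corpus and the textbook BM25 formula with commonly used defaults (k1 = 1.2, b = 0.75). The document ids, texts and function names are made up for the example, and linguistics (stemming, synonyms) is left out.

```python
import math
from collections import Counter

# Toy corpus; ids and texts are invented for illustration.
DOCS = {
    "d1": "fast search server configuration guide",
    "d2": "sharepoint search ranking model overview",
    "d3": "holiday schedule for the kiev office",
}

def tokenize(text):
    return text.lower().split()

def match(query, docs):
    """Phase 1 (matching): keep documents containing at least one query term."""
    terms = set(tokenize(query))
    return [doc_id for doc_id, text in docs.items() if terms & set(tokenize(text))]

def bm25(query, doc_id, docs, k1=1.2, b=0.75):
    """Phase 2 (ranking): score one matched document with textbook BM25."""
    tokenized = {d: tokenize(t) for d, t in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / n_docs
    tf = Counter(tokenized[doc_id])
    doc_len = len(tokenized[doc_id])
    score = 0.0
    for term in tokenize(query):
        df = sum(1 for t in tokenized.values() if term in t)
        if df == 0 or tf[term] == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * doc_len / avg_len))
        score += idf * norm
    return score

def search(query, docs=DOCS):
    """Match first, then order the candidates by descending BM25 score."""
    candidates = match(query, docs)
    return sorted(candidates, key=lambda d: bm25(query, d, docs), reverse=True)

print(search("search ranking"))
```

Both d1 and d2 match the term "search", but only d2 also contains "ranking", so ranking places it first.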

Page 11: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in FAST

• Linear combination of features
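
As a rough sketch of what a linear combination of features means: the final rank is a weighted sum, and each term of the sum is that component's contribution. The feature names, weights and values below are hypothetical and are not taken from any actual FAST rank profile.

```python
# Hypothetical weights; real values would come from the rank profile.
WEIGHTS = {"term_match": 50.0, "freshness": 20.0, "static_rank": 10.0, "proximity": 20.0}

def contributions(features, weights=WEIGHTS):
    """Per-component contribution: weight * feature value."""
    return {name: weights[name] * features.get(name, 0.0) for name in weights}

def linear_rank(features, weights=WEIGHTS):
    """Final rank is the sum of all component contributions."""
    return sum(contributions(features, weights).values())

# Hypothetical normalized feature values for one document.
doc = {"term_match": 0.8, "freshness": 0.5, "static_rank": 1.0, "proximity": 0.25}

print(contributions(doc))
print(linear_rank(doc))
```

Printing the contributions side by side for the top results is one way to see which component dominates the final rank.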

Page 12: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in FAST

• Impact of each component on the final rank

[Chart: contribution of each component (term:fast, term:search, freshness, static rank, proximity) to the rank of the 1st to 4th results; axis scale 0 to 8000]

Page 13: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Migration FAST->SP2013

Page 14: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in SP2013

Page 15: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in SP2013

• Default Relevancy Model
• Two neural networks
• Freshness is not included in ranking
• Features:

Type             Instance
BM25             BM25
Static           UrlDepth
BucketedStatic   InternalFileType
BucketedStatic   Language
Static           ClickDistance
Static           QueryLogClicks
Static           QueryLogSkips
Static           LastClicks
Static           EventRate
MinSpan - soft   Title
MinSpan - soft   Title
MinSpan - soft   Title
MinSpan - soft   Content

Page 16: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in SP2013

• Default relevancy model

Page 17: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Explain rank

• /_layouts/15/explainrank.aspx
• rankdetail property

Page 18: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Explain rank

• Manual validation in Excel

Page 19: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Page 20: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Search Governance

1. Search analytics
2. Fine tuning and adaptation
3. Regular testing
4. Security assessment
5. Promotion within the company
6. Content optimization and basic SEO

Page 21: Spcua 2013 Alexey Kozhemiakin Enterprise Search

1. Search analytics

• Search analytics
• Search analytics
• Search analytics

• Obey! Use search analytics

Page 22: Spcua 2013 Alexey Kozhemiakin Enterprise Search

1. Search analytics

• OOTB in SP2013
  • Most popular queries
  • «No results/abandoned» queries

• 3rd-party tools (Google Analytics, Omniture, WebTrends)
• Measure search quality (!)
  • % of clicks on results
  • Which results are clicked
  • Returns after clicks

• Session analysis
• Query segmentation
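
The basic reports listed above can be approximated from a raw query log. The sketch below assumes a hypothetical log format of (query, number of results, clicked position or None); real SP2013 usage data or a 3rd-party tool will expose different fields.

```python
from collections import Counter

# Hypothetical log records: (query, number_of_results, clicked_position or None).
LOG = [
    ("vacation policy", 12, 1),
    ("vacation policy", 12, None),
    ("vpn setup", 0, None),
    ("expense report", 30, 3),
    ("vpn setup", 0, None),
    ("vacation policy", 12, 2),
]

def popular_queries(log, top_n=3):
    """Most frequent queries with their counts."""
    return Counter(q for q, _, _ in log).most_common(top_n)

def no_result_queries(log):
    """Queries that returned nothing at least once."""
    return sorted({q for q, n, _ in log if n == 0})

def abandoned_queries(log):
    """Queries that returned results but got no click at least once."""
    return sorted({q for q, n, click in log if n > 0 and click is None})

def click_through_rate(log):
    """Share of searches with results that led to a click."""
    with_results = [r for r in log if r[1] > 0]
    clicked = [r for r in with_results if r[2] is not None]
    return len(clicked) / len(with_results) if with_results else 0.0

print(popular_queries(LOG))
print(no_result_queries(LOG), abandoned_queries(LOG), click_through_rate(LOG))
```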

Page 23: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Query segmentation

• Analyze and improve not only top N queries, but classes of queries
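
One simple way to work with classes of queries rather than individual queries is to assign each query to a class by trigger words and then aggregate a quality metric per class. The classes, trigger words and log format below are assumptions for illustration; each portal would need its own.

```python
from collections import defaultdict

# Hypothetical classes and trigger words.
CLASSES = {
    "how-to": {"how", "howto", "tutorial", "guide"},
    "software": {"download", "install", "software"},
    "people": {"manager", "phone", "contact"},
}

def classify(query):
    """Return the first class whose trigger words appear in the query."""
    words = set(query.lower().split())
    for name, triggers in CLASSES.items():
        if words & triggers:
            return name
    return "other"

def ctr_by_class(log):
    """log: iterable of (query, clicked). Returns class -> click-through rate."""
    totals, clicks = defaultdict(int), defaultdict(int)
    for query, clicked in log:
        cls = classify(query)
        totals[cls] += 1
        clicks[cls] += int(clicked)
    return {cls: clicks[cls] / totals[cls] for cls in totals}

sample = [("download office", True), ("install vpn", False), ("kiev office", True)]
print(ctr_by_class(sample))
```

A class with a low click-through rate points at a whole group of queries to improve, which a top-N list of individual queries would not show.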

Page 24: Spcua 2013 Alexey Kozhemiakin Enterprise Search

2. Fine tuning

• Authoritative Pages
  • Quick win – content source priority

• Query Rules
  • Smart search for users

• Synonyms
  • Separate mapping file
  • Expansion only
  • Term set synonyms are NOT working

• Relevancy models

Page 25: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Authoritative Pages

• Impacts ClickDistance
• ClickDistance and UrlDepth have a high impact on the total score (see explain rank)
• Configured in CA, CSOM

Page 26: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Query Rules (Rule + Action)

• The tool to make search smarter
• Interactive feedback to user queries
• Post-processing of queries
• Leverage navigational queries
• …

Page 27: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Conditions for Query Rules

• Query Matches Keyword Exactly
• Advanced Query Text Match
• Query Matches Dictionary Exactly

• Query Contains Action Term

• Query More Common in Source
• Result Type Commonly Clicked

Page 28: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Actions for Query Rules

• Create and display a result block
• Change ranked search results
• Best Bets
• XRANK
  • Adds to the total rank
  • Not explained in rankdetail
  • How to choose the correct value?
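
In KQL the constant boost is written as `<match expression> XRANK(cb=<value>) <rank expression>`. Because the boost is added to the total rank, its effect depends on how far apart the base scores are. The sketch below only simulates that additive behaviour. The scores, document ids and helper names are hypothetical, and measuring the gap to the top result is just one possible heuristic for picking a value.

```python
def apply_xrank(results, should_boost, cb):
    """results: list of (doc_id, base_score). Adds constant cb to matching docs, then re-sorts."""
    boosted = [(d, s + cb if should_boost(d) else s) for d, s in results]
    return sorted(boosted, key=lambda item: item[1], reverse=True)

def gap_to_top(results, doc_id):
    """How much score doc_id lacks to reach the current top result."""
    scores = dict(results)
    return max(scores.values()) - scores[doc_id]

# Hypothetical base scores for the query "vpn".
RESULTS = [("wiki/vpn", 14.2), ("policy/vpn.docx", 11.7), ("blog/vpn-tips", 9.3)]

def is_policy(doc_id):
    return doc_id.startswith("policy/")

print(apply_xrank(RESULTS, is_policy, 1.0))   # boost too small, order unchanged
print(apply_xrank(RESULTS, is_policy, 5.0))   # boost exceeds the gap, policy doc moves up
print(gap_to_top(RESULTS, "policy/vpn.docx"))
```

Comparing the boost with typical score gaps taken from explain rank output helps avoid values that either do nothing or override every other ranking signal.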

Page 29: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Templates for Query Rules

• Typical navigational keywords from our portal
  • Software, soft, download, install
  • How to
  • Policy, Blog
  • Portal
  • Music, Video
  • Presentation, Documents, Report
  • Training, tutorial
  • Book, ebook

• You will have different ones!

Page 30: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Custom Rank Models

• Collect query judgments
• Tune neural network coefficients using machine learning
  • Gradient Descent, LambdaRank

• Microsoft.Office.Server.Search.RankerTuning

Page 31: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Custom Rank Models

• Manually modify a new or very simple model (not the default one!)
• A/B testing of weights
• Measure, measure: Precision, NDCG
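
Both metrics named above are easy to compute once graded judgments exist for the results of a query. The sketch below uses the common definitions of precision at k and NDCG with a log2 position discount. The ideal ordering is derived from the judged list itself, which is a simplification when relevant documents are missing from the results.

```python
import math

def precision_at_k(relevances, k):
    """relevances: graded judgments in ranked order (0 = irrelevant)."""
    return sum(1 for r in relevances[:k] if r > 0) / k

def dcg(relevances, k):
    """Discounted cumulative gain: gain divided by log2(position + 1), positions from 1."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

def ndcg(relevances, k):
    """DCG normalized by the DCG of the best possible ordering of the same judgments."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical judgments for the same query under two weight settings (A/B).
variant_a = [3, 0, 2, 0]
variant_b = [0, 0, 3, 2]
print(precision_at_k(variant_a, 4), ndcg(variant_a, 4))
print(precision_at_k(variant_b, 4), ndcg(variant_b, 4))
```

Both variants have the same precision at 4, but NDCG is higher for variant A because its relevant results sit nearer the top.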

Page 32: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Custom Rank Models

• Example of simple model – people search

Page 33: Spcua 2013 Alexey Kozhemiakin Enterprise Search

3. Search quality testing

• Why is it needed? It’s your compass.
• «Unit testing»
• Periodic manual testing
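
A «unit test» for search can be as simple as a golden set of queries with the document each one must return within a given position. In the sketch below `run_search` is a hypothetical stand-in for a call to the real search API, and the queries and document ids are invented.

```python
# Hypothetical stand-in for the real search call; returns ranked document ids.
def run_search(query):
    fake_index = {
        "vacation policy": ["hr/vacation-policy", "hr/holidays", "blog/summer"],
        "vpn": ["it/vpn-setup", "it/remote-access"],
    }
    return fake_index.get(query, [])

# Golden set: query -> (expected document, worst acceptable position).
GOLDEN = {
    "vacation policy": ("hr/vacation-policy", 1),
    "vpn": ("it/remote-access", 3),
    "expense report": ("finance/expenses", 5),
}

def check_golden(golden, search=run_search):
    """Return the queries whose expected document is missing from the top positions."""
    failures = []
    for query, (expected, max_pos) in golden.items():
        if expected not in search(query)[:max_pos]:
            failures.append(query)
    return failures

print(check_golden(GOLDEN))
```

Running such a check after every change to query rules, synonyms or rank models shows quickly whether a tweak broke queries that used to work.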

Page 34: Spcua 2013 Alexey Kozhemiakin Enterprise Search

4. Security «audit»

• Search reveals breaches in security
  • Security by obscurity

• Examples of queries:
  • «confidential»
  • Salaries, performance reviews

• Solution – automatic monitoring of sensitive queries
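
Automatic monitoring could start with a scan of the query log for sensitive terms that actually returned results. The term list and the (user, query, result count) log format below are assumptions for illustration.

```python
# Hypothetical list of sensitive terms; extend it per company policy.
SENSITIVE_TERMS = {"confidential", "salary", "salaries", "performance review"}

def is_sensitive(query, terms=SENSITIVE_TERMS):
    """Case-insensitive substring match against the sensitive term list."""
    q = query.lower()
    return any(term in q for term in terms)

def flag_exposures(log):
    """log: iterable of (user, query, result_count). Flags sensitive queries that returned results."""
    return [(user, query) for user, query, count in log if count > 0 and is_sensitive(query)]

sample_log = [
    ("u1", "Confidential budget", 4),
    ("u2", "salaries 2013", 0),
    ("u3", "lunch menu", 9),
]
print(flag_exposures(sample_log))
```

Each flagged entry is a prompt to check whether that user should really be able to see the returned documents.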

Page 35: Spcua 2013 Alexey Kozhemiakin Enterprise Search

5. Adoption of content

• Work with departments
  • Get help with search by monitoring their queries

• Guidelines for formatting content
  • Basic SEO
    • Titles
    • Friendly URLs
    • Custom meta tags <meta name=…
      • Title, description
      • Custom tags automatically appear in crawled properties

Page 36: Spcua 2013 Alexey Kozhemiakin Enterprise Search

6. Promotion within company

• Image – «you will find everything here»
• Integrate with other portals
  • Offer search as a service
  • «Global search» widget

• Badges, gamification

Page 37: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Promotion

• Social Best-bets

Page 38: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Semantic search

• Cannot be solved in general
  • Analytics + fine tuning
  • See practices above

• NLP – question answering
  • Rocket science
  • English only
  • Part-of-speech tagging, dependency parsing
  • Stanford NLP, OpenNLP, IR

Page 39: Spcua 2013 Alexey Kozhemiakin Enterprise Search

«References»

• Patents - http://goo.gl/20sbR

• Explain Rank page - http://goo.gl/o3ZmN

• How SP2013 relevancy models work - http://goo.gl/arf0P

• MS Enterprise Search approach - http://goo.gl/x8SDO

• Customizing ranking models in SP 2013 - http://goo.gl/lBJAp

Page 40: Spcua 2013 Alexey Kozhemiakin Enterprise Search

May 22nd 2013, Kiev

Thanks

Skype: Alexey_Kozhemiakin
Email: [email protected]
http://powersearching.wordpress.com