More Powerful Solr Search with Semaphore - Jeremy Bentley

Smartlogic TM Apache Lucene Eurocon Jeremy Bentley, CEO

Upload: lucenerevolution

Post on 15-Apr-2017


TRANSCRIPT

Page 1: More Powerful Solr Search with Semaphore - Jeremy Bentley

Smartlogic TM

Apache Lucene Eurocon

Jeremy Bentley, CEO

Page 2: More Powerful Solr Search with Semaphore - Jeremy Bentley

1st degree of order

Filing management
• 80% of enterprise information is unstructured
• Doubling every 19 months and accelerating [Gartner]
• Increasing burden of compliance
• Enterprise 2.0 additions

Page 3: More Powerful Solr Search with Semaphore - Jeremy Bentley

2nd degree of order

Index management
• File plans and metadata schema
• Mono-hierarchical standardised taxonomies
• Manually applied classification
• Low level of consistency and quality

Page 4: More Powerful Solr Search with Semaphore - Jeremy Bentley

3rd degree of order

Computerised 1st and 2nd degrees

Page 5: More Powerful Solr Search with Semaphore - Jeremy Bentley

A 10 year Flatline Expectation Gap

• 2001, IDC, “Quantifying Enterprise Search”: Searchers are successful in finding what they seek 50% of the time or less

• 2011, MindMetre/Smartlogic: More than half (52%) cannot find the information they need using their enterprise search system

Page 6: More Powerful Solr Search with Semaphore - Jeremy Bentley

The explosion of information

[Chart: terabytes of data. 1993-2001: 4Tb; 2001-2009: 80Tb; later period: ?]

20 times increase in information volume

Source: the National Archives

Page 7: More Powerful Solr Search with Semaphore - Jeremy Bentley

Search Gets Harder as Data sets Grow

Circa  1996  

Page 8: More Powerful Solr Search with Semaphore - Jeremy Bentley

Different vocabulary and ambiguity

Missing results
You Say | I Say
Moon Buggy | Lunar Roving Vehicle; Manned Lunar Surface Vehicle
Swine Flu | Swine Influenza Virus; H1N1
Touchscreen | Touch screen; Multi-touch

Too many results
You Say | What do you mean?
Apple | A fruit? Fiona - A singer / songwriter? An electronics company?
Rights | Employment rights? Equal rights? Right of way?
Ford | Ford Motor? Forward Industrials (ticker=FORD)? A shallow river crossing?
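The "You Say / I Say" mismatch above is commonly handled by expanding a query with its known synonyms before it reaches the search engine. The following is a minimal sketch of that idea, not Semaphore's implementation: the `SYNONYM_RINGS` list and the `expand_query` function are hypothetical names, and the vocabulary is copied from the slide's examples.

```python
# Hypothetical synonym rings built from the slide's examples; a real
# deployment would load them from a managed business vocabulary.
SYNONYM_RINGS = [
    {"moon buggy", "lunar roving vehicle", "manned lunar surface vehicle"},
    {"swine flu", "swine influenza virus", "h1n1"},
    {"touchscreen", "touch screen", "multi-touch"},
]


def expand_query(term: str) -> str:
    """Return a Lucene-style OR query covering every known synonym of term."""
    key = term.strip().lower()
    for ring in SYNONYM_RINGS:
        if key in ring:
            # Quote each entry so multi-word synonyms match as phrases.
            return " OR ".join(f'"{label}"' for label in sorted(ring))
    # Unknown terms pass through unchanged, as a quoted phrase.
    return f'"{key}"'


if __name__ == "__main__":
    print(expand_query("Swine Flu"))
```

A searcher who types "Swine Flu" then also retrieves documents that only mention H1N1, which addresses the "missing results" case. The ambiguity case (Apple, Rights, Ford) needs context rather than expansion and is not covered by this sketch.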

Page 9: More Powerful Solr Search with Semaphore - Jeremy Bentley

Drawbacks

Apparent

1 Needle in the Haystack

2 Multiple search terms

3 Irrelevant results

4 Out of date results

5 Multiple media forms

6 Unrestricted geography

7 Inappropriate ads

Not So Apparent

8 Can’t filter, select subset

9 No related topics

10 Missing results

11 No context or guidance

12 Best resource not clear

✓ Time consuming
✓ Inefficient
✓ Ineffective

Conventional Search - Ineffective, Frustrating, and Inadequate

Page 10: More Powerful Solr Search with Semaphore - Jeremy Bentley

Knowing what you have

Page 11: More Powerful Solr Search with Semaphore - Jeremy Bentley

Paradox of Effort

                             Web     Enterprise
Metadata effort              High    Low
Result quality requirement   Low     High

Metadata is to search what pistons are to a petrol engine.

Page 12: More Powerful Solr Search with Semaphore - Jeremy Bentley

How do I structure it?

Creation Date

Modified Date

Author

Format (PDF, DOC, XLS)

Subject

Location

Project

Function (IT, HR, Finance)

Expert

Protective Marker

Retention

Expiry

Publisher

Site

Structural Process

Information

Page 13: More Powerful Solr Search with Semaphore - Jeremy Bentley

3rd degree content universe

Digital Asset Management

Publishing Systems

Social collaboration

eDiscovery

Document Management

Content Management

Enterprise Search

Records Management

Portal Infrastructure

Process Management & Workflow

Page 14: More Powerful Solr Search with Semaphore - Jeremy Bentley

4th degree of order

Digital Asset Management

Publishing Systems

Social collaboration

eDiscovery

Document Management

Content Management

Enterprise Search

Records Management

Portal Infrastructure

Process Management & Workflow

Content Intelligence

Page 15: More Powerful Solr Search with Semaphore - Jeremy Bentley

4th degree of order

Content Intelligence

Content Intelligence Platform

Solr

Page 16: More Powerful Solr Search with Semaphore - Jeremy Bentley

Semaphore

Copyright © 2011 Smartlogic Semaphore Limited

Business Vocabulary

Classification Decision

User Action

Apply

Inform

Expose

Page 17: More Powerful Solr Search with Semaphore - Jeremy Bentley

Semaphore


Business Vocabulary

Classification Decision

Apply

Inform

Expose

Metadata

Contextual User Experience

Semantic Models

Semantic Software

User Action

Page 18: More Powerful Solr Search with Semaphore - Jeremy Bentley

Components
• Metadata
• Semantic Models
• Contextual User Experience
• Semantic Software


Page 19: More Powerful Solr Search with Semaphore - Jeremy Bentley

Metadata


Today
• Low quality tags
• High cost to apply
• Manual process
• Single unified ‘one size fits all’ approach
• Long time to craft & build, manually applied

With Content Intelligence
• High quality tags
• Low cost to apply
• Automatic process
• Multiple approaches for various domains/audiences
• Short time to build & deploy, automatically

Page 20: More Powerful Solr Search with Semaphore - Jeremy Bentley

Content-types available
– Flashnotes
– Research reports
– Trade ideas

Analytics available
– Current bond price
– Relative bond spreads

Influenced by
– Credit ratings on Ford Motor Credit Company
– European and US economies
– Changes in consumer demand

Automate compliance and distribution tasks
– ‘Watch list’ lookup
– Distribution according to preset rules
– Automated mapping to create aggregator metadata

Harnessing

User Experience
– Conceptual relevance
– Related topics
– Links to analytics

Search engine enhancement
– Search results
– Email alerts

Contextualising

Key competitors
– BMW
– Daimler Chrysler
– General Motors
– Toyota
– Volkswagen

Products
– Focus
– Ka
– MX5

Preferred term (Agreed Label)
– Ford Motor Company

Subsidiaries
– Ford Motor Credit Company
– Mazda

Parent topics
– Automotive sector
– Bond issuers

Also known as
– Ford
– Ford Motor
– F (Bloomberg)
– FoMoCo
– blue oval

Covered by
– Bob Smith

Location of fundamental data
– Earnings estimates
– Historic sales and profits

Organising

Unstructured content integration
– Published reports
– Related topics
– Links to analytics
– Search results
– Email alerts

Semantic Models
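The Ford Motor Company example above can be read as a concept record: one preferred term plus alternative labels and typed relationships. The sketch below shows one possible in-memory shape for such a record, assuming nothing about how Semaphore actually stores its models. The `Concept` class, its field names, and the `resolve` helper are hypothetical; the values come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concept:
    """Hypothetical record for one concept in a semantic model."""
    preferred_term: str
    also_known_as: List[str] = field(default_factory=list)
    parent_topics: List[str] = field(default_factory=list)
    subsidiaries: List[str] = field(default_factory=list)
    key_competitors: List[str] = field(default_factory=list)

    def matches(self, label: str) -> bool:
        """True if label is the preferred term or any alternative label."""
        known = {self.preferred_term.lower()}
        known.update(alt.lower() for alt in self.also_known_as)
        return label.strip().lower() in known


# Values copied from the slide's Ford example.
FORD = Concept(
    preferred_term="Ford Motor Company",
    also_known_as=["Ford", "Ford Motor", "F (Bloomberg)", "FoMoCo", "blue oval"],
    parent_topics=["Automotive sector", "Bond issuers"],
    subsidiaries=["Ford Motor Credit Company", "Mazda"],
    key_competitors=["BMW", "Daimler Chrysler", "General Motors", "Toyota", "Volkswagen"],
)


def resolve(label: str, concepts: List[Concept]) -> Optional[str]:
    """Map any known label to its preferred term, or None if unknown."""
    for concept in concepts:
        if concept.matches(label):
            return concept.preferred_term
    return None


if __name__ == "__main__":
    print(resolve("FoMoCo", [FORD]))
```

Resolving every variant to a single agreed label is what lets tags, related topics, and analytics links all attach to the same concept regardless of which name a document or searcher uses.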

Page 21: More Powerful Solr Search with Semaphore - Jeremy Bentley

Key Features

1 Taxonomy enables discovery, related searches

2 Related topics and content

3 Facets enable filtering results by:

4 - Source

5 - Numerous topics

6 - Date

7 Best Bets

8 Automated doc. tagging

9 A-Z

✓ More relevant results
✓ Fewer “bad hits”
✓ Powerful navigation

Contextual User Experience
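Facet filtering of the kind listed above maps naturally onto Solr's standard request parameters `q`, `facet`, `facet.field`, and `fq`. The sketch below only builds the query string. The field names `source` and `topic` and the function `build_faceted_query` are assumptions for illustration; the slides do not give the actual schema.

```python
from urllib.parse import urlencode


def build_faceted_query(text, filters=None):
    """Build a Solr select query string with facet counts and optional filters.

    The field names used here (source, topic) are hypothetical; substitute
    the fields defined in your own schema.
    """
    params = [
        ("q", text),
        ("facet", "true"),
        ("facet.field", "source"),
        ("facet.field", "topic"),
    ]
    # Each selected facet value becomes a filter query that narrows the results.
    for field_name, value in (filters or {}).items():
        params.append(("fq", f'{field_name}:"{value}"'))
    return urlencode(params)


if __name__ == "__main__":
    print(build_faceted_query("ford", {"topic": "Automotive sector"}))
```

The resulting string would be appended to a collection's select URL. Date facets are usually requested through Solr's range faceting parameters and are left out here to keep the sketch short.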

Page 22: More Powerful Solr Search with Semaphore - Jeremy Bentley

Content Exploration

Highlighting relationships in a result set greatly improves the user experience.

Page 23: More Powerful Solr Search with Semaphore - Jeremy Bentley

Semantic Software

Semaphore

Ontology & Metadata Management

Text Analysis & Extraction

Automatic and assisted content classification

Contextual Navigation Services

Semantic Reasoning & Processing

Page 24: More Powerful Solr Search with Semaphore - Jeremy Bentley

Semaphore Search Integration

[Architecture diagram; labels:]

Search Engine

Query

Index

Corpus

Web Services API

Search Enhancement Server

XML API

Classification Server

Collector/Normalizer

Extracted Text

Document “Tags”

Ontology Information

Text Miner

Ontology Manager

User Requests

Portal

Search Application Framework

Sample Interface Code

Local Term Index

Classification Rules

Legend: Semaphore core module; Semaphore optional module
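The diagram suggests a flow in which extracted text is matched against classification rules and the resulting document "tags" are indexed alongside the content. The following is a rough sketch of that flow under stated assumptions, not the Classification Server's actual rule language. `CLASSIFICATION_RULES`, `classify`, and `to_index_document` are hypothetical, and the rules are simple phrase matches.

```python
import re

# Hypothetical rules: concept label -> evidence phrases that trigger the tag.
CLASSIFICATION_RULES = {
    "Ford Motor Company": ["ford motor", "fomoco", "blue oval"],
    "Swine Influenza Virus": ["swine flu", "h1n1"],
}


def classify(extracted_text):
    """Return the concept labels whose evidence phrases occur in the text."""
    text = extracted_text.lower()
    tags = []
    for concept, phrases in CLASSIFICATION_RULES.items():
        # Word boundaries avoid matching a phrase inside a longer word.
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases):
            tags.append(concept)
    return tags


def to_index_document(doc_id, extracted_text):
    """Combine the text and its tags into a dict ready to send to an indexer."""
    return {"id": doc_id, "text": extracted_text, "tags": classify(extracted_text)}


if __name__ == "__main__":
    print(to_index_document("d1", "FoMoCo bond spreads widened this week."))
```

Storing the tags as their own field is what makes them usable later as facets and filters at query time.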

Page 25: More Powerful Solr Search with Semaphore - Jeremy Bentley

4th degree of order

Digital Asset Management

Publishing Systems

Social collaboration

eDiscovery

Document Management

Content Management

Enterprise Search

Records Management

Portal Infrastructure

Process Management & Workflow

Content Intelligence

Page 26: More Powerful Solr Search with Semaphore - Jeremy Bentley

Content Intelligence

Information Manufacturing

Knowledge Recovery

Content Analytics

Data Loss Prevention

Risk & Compliance

Monetisation

Metadata

Page 27: More Powerful Solr Search with Semaphore - Jeremy Bentley

Content Intelligent Solutions

Web Self Service

Knowledge Acquisition & Recovery

Governance Risk Compliance

Cross Platform Content Integration

Micro-Targeting & Distribution

Page 28: More Powerful Solr Search with Semaphore - Jeremy Bentley

www.smartlogic.com

Page 29: More Powerful Solr Search with Semaphore - Jeremy Bentley

Smartlogic TM

[email protected]

www.smartlogic.com