Introduction to Enterprise Search
![Page 1: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/1.jpg)
INTRODUCTION TO ENTERPRISE SEARCH
Kristian Norling
![Page 2: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/2.jpg)
• Who is here?
• Your expectations?
• Kristian?
• 2 hours, one break
• Lifetime answer Guarantee on this class
Introduction
![Page 4: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/4.jpg)
• Problem
• History of (web) search
• How we search and find?
• Current state of Enterprise Search + stats
• Technical concept
• Information quality
• Feedback cycle
• Five dimensions of Findability
Agenda
![Page 5: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/5.jpg)
mrflip
![Page 6: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/6.jpg)
nathansnider
![Page 8: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/8.jpg)
• Growing amounts of Information
• Changing patterns of information consumption
• Information silos
• Web-like behaviour > Information filters
• Internal information use is still in the Digital Stone Age
The Problems
![Page 9: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/9.jpg)
In academia, search is called Information Retrieval.
It is an old discipline, dating back thousands of years...
Basic concepts in Information Retrieval:
Recall and Precision, more later...
History of Search
![Page 10: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/10.jpg)
• Directories are manually compiled taxonomies of websites
• Directories are far more costly and time intensive to maintain
• Directories lack coverage, although they provide an important alternative, especially for novice surfers
• Search engines rely mainly on automated search algorithms
• Search engines rank pages by popularity on the web, the more referrals (links) the more relevant
Directories vs. Search Engines
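
The idea that more referring links mean more relevance can be sketched in a few lines. The block below is a simplified PageRank-style power iteration, not the algorithm of any particular search engine. The function name `pagerank`, the damping factor 0.85, and the four-page link graph are all illustrative assumptions and do not come from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style power iteration (illustrative only).

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "home".
links = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home"],
    "blog": ["home", "products"],
}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
print(best, round(ranks[best], 3))
```

In this toy graph the page with the most incoming links ends up with the highest score, while "blog", which nobody links to, keeps only the baseline.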
![Page 11: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/11.jpg)
Yahoo – searchable directory (1994, ~10000 websites)
• Integrates search over its directory. Organized by subject matter. Sites can be suggested, but human editors control quality of directory (~100 dedicated editors)
Ask – natural language search engine (1998)
• Used human editors to match popular queries. Tried different algorithms to rank pages by popularity
Google – searchable index (1998)
• Developed PageRank, a popularity algorithm that hides bad content. Set standards (spellchecking, query suggestion, search results page design)
Early days of Web Search
![Page 12: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/12.jpg)
First generation (1995-97) – AltaVista, Excite, WebCrawler
Uses mostly on-page data (text and formatting).
Informational queries.
Second generation (1998-2010) – Google, Yahoo
Uses off-page, web-specific data: link analysis, anchor text, click-through data. Informational and navigational queries.
Third generation (2010-present) – Google, Wolfram-Alpha, Bing
Blends data from many sources and tries to answer “the need behind the query”: semantic analysis, context determination, dynamic database selection, etc. Informational, navigational, and transactional queries.
Web Search - evolution
![Page 13: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/13.jpg)
Find information assumed to be available on the web in a static form.
Seeking information modes:
Informational
![Page 14: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/14.jpg)
Reach a particular site that the user has in mind, either because they visited it in the past or because they assume that such a site exists. These queries usually have only one "right" result.
Seeking information modes:
Navigational
![Page 15: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/15.jpg)
Reach a site where further interaction will happen. This interaction constitutes the transaction defining these queries. The main categories for such queries are shopping, finding various web-mediated services, downloading various types of files (images, songs, etc.), accessing certain databases (e.g. Yellow Pages type data), finding servers (e.g. for gaming), etc.
Seeking information modes:
Transactional
![Page 16: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/16.jpg)
Finding something when I know what I want and have words to describe it.
Four modes of seeking information
![Page 17: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/17.jpg)
Exploring when I only have some idea of what I want and may lack the words to articulate it.
Four modes of seeking information
![Page 18: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/18.jpg)
Finding relevant items when I don’t know what I need.
Four modes of seeking information
![Page 19: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/19.jpg)
Finding something I have seen before, but can’t remember where.
Four modes of seeking information
![Page 20: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/20.jpg)
• Amount of information is growing every day
• What to Search for?
• Where to Search?
• How to Search?
• Search is simple, complex and powerful
• Findability Dimensions
The State of Enterprise Search
![Page 21: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/21.jpg)
STATS FROM THE
“ENTERPRISE SEARCH AND FINDABILITY SURVEY 2012”
SIGN-UP
![Page 22: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/22.jpg)
HOW CRITICAL IS FINDING THE RIGHT INFORMATION TO BUSINESS GOALS AND
SUCCESS?
![Page 23: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/23.jpg)
EUROPE 76.5%
IMPERATIVE/SIGNIFICANT
![Page 24: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/24.jpg)
Zoom Zoom
![Page 25: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/25.jpg)
IS IT EASY TO FIND THE RIGHT INFORMATION
WITHIN YOUR ORGANISATION TODAY?
![Page 26: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/26.jpg)
EUROPE 77%
MODERATELY/VERY HARD
![Page 27: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/27.jpg)
LEVEL OF SATISFACTION?
![Page 29: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/29.jpg)
EUROPE 18.5%
MOSTLY/VERY SATISFIED
![Page 30: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/30.jpg)
WHAT ARE THE OBSTACLES TO FINDING THE RIGHT
INFORMATION?
![Page 31: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/31.jpg)
63.4% POOR SEARCH FUNCTIONALITY
52.1% DON'T KNOW WHERE TO LOOK
51.4% INCONSISTENCY IN HOW WE TAG CONTENT
50.0% LACK OF ADEQUATE TAGS
33.1% DON’T KNOW WHAT TO LOOK FOR
Globally
![Page 32: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/32.jpg)
“Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.”
http://en.wikipedia.org/wiki/Enterprise_search
Wikipedia Definition
![Page 33: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/33.jpg)
In the field of information retrieval, precision is the fraction of retrieved documents that are relevant to the search.
Precision takes all retrieved documents into account, but it can also be evaluated at a given cut-off rank, considering only the topmost results returned by the system. This measure is called precision at n or P@n.
Source: Wikipedia
The Concept of Enterprise Search: Precision
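
Precision at n can be made concrete with a small sketch. The function name `precision_at_n`, the document ids, and the relevance judgements below are hypothetical and are not taken from the slides. The sketch divides by n, which is the usual convention for P@n.

```python
def precision_at_n(ranked_results, relevant, n):
    """Fraction of the top-n ranked results that are relevant (P@n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_results[:n]
    hits = sum(1 for doc in top if doc in relevant)
    return hits / n


# Hypothetical ranked result list and set of relevant documents.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d5"}

p_at_1 = precision_at_n(ranked, relevant, 1)
p_at_5 = precision_at_n(ranked, relevant, 5)
print(p_at_1, p_at_5)
```

Because users rarely look beyond the first results, P@n with a small n often says more about perceived quality than precision over the full result list.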
![Page 34: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/34.jpg)
Recall in information retrieval is the fraction of the documents that are relevant to the query that are successfully retrieved.
For example, for text search on a set of documents, recall is the number of correct results divided by the number of results that should have been returned.
Source: Wikipedia
The Concept of Enterprise Search: Recall
![Page 35: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/35.jpg)
M = number of relevant documents
N = number of retrieved documents
R = number of retrieved documents that are also relevant
Precision and Recall
![Page 36: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/36.jpg)
Recall = R / M =
Number of retrieved documents that are also relevant / Total number of relevant documents.
Precision = R / N =
Number of retrieved documents that are also relevant / Total number of retrieved documents.
Precision and Recall
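
The two ratios can be computed directly from sets of document ids. This is a minimal sketch: the variable names `r`, `n`, and `m` follow the slide's R, N, and M, while the function name and the example documents are assumptions made for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    r = len(retrieved & relevant)  # R: retrieved and also relevant
    n = len(retrieved)             # N: retrieved
    m = len(relevant)              # M: relevant
    precision = r / n if n else 0.0
    recall = r / m if m else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 documents actually relevant.
retrieved = {"a", "b", "c", "d"}
relevant = {"a", "c", "e", "f", "g"}

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Here R = 2, N = 4 and M = 5, so precision is 2 / 4 and recall is 2 / 5. Returning more documents tends to raise recall and lower precision, which is why the two are reported together.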
![Page 37: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/37.jpg)
...enterprises typically have to use other query-independent factors, such as a document's recency or popularity, along with query-dependent factors traditionally associated with information retrieval algorithms. Also, the rich functionality of enterprise search UIs, such as clustering and faceting, diminish reliance on ranking as the means to direct the user's attention.
Relevance
Source: Wikipedia
![Page 38: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/38.jpg)
PageRank
![Page 39: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/39.jpg)
We do not have PageRank...
...but we have social!
Social Reconnects Enterprise Search
Emails, People Catalogues, Connections, Tagging, Sharing etc.
Relevance
![Page 40: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/40.jpg)
The Concept of Enterprise Search
![Page 41: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/41.jpg)
Examples of implementations:
- People Search
- Product Search
- Document Search
- Intranet and Website Search
- E-commerce
- Dashboard / Search as a Service
Search based Solutions
![Page 42: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/42.jpg)
• Good Data/Information hygiene
• Crap in = Crap out
• Metadata is very important!
• Taxonomy and Metadata demystified
• TetraPak example (video)
• SimCorp example
• VGR example (video)
Information / Content
![Page 43: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/43.jpg)
yeraze
![Page 44: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/44.jpg)
svenwerk
![Page 45: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/45.jpg)
HCE (SWEDEN)
DEWEY DECIMAL CLASSIFICATION
![Page 47: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/47.jpg)
Author: Douglas Coupland
Title: Hej Nostradamus!
Publisher: Norstedts
Printed by: Smedjebacken
Year: 2003
Printed: 2004
KristianNorling
![Page 48: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/48.jpg)
Metadata
Semantic
KristianNorling
![Page 49: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/49.jpg)
Example: Ernst & Young
• Metadata
• Titles
• Content Quality
• Information Life Cycle Management
ESEO: Actionable activities
![Page 50: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/50.jpg)
But, an average Search budget is 100K Euro
• TCO
• ROI
• KPI
Search Analytics is key
Show me the Money
![Page 51: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/51.jpg)
Important, delivers actionable to-dos quickly
• 0-results
• Top Terms Searched for
Video: Search Analytics in Practice
Search Analytics
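
Both reports named on this slide can be derived from a plain query log. The sketch below assumes a hypothetical log format of `(query, result_count)` pairs; real search platforms log in their own formats, and the function name `summarize_query_log` is made up for illustration.

```python
from collections import Counter


def summarize_query_log(log, top_k=3):
    """Return (top searched terms, zero-result queries) with their counts."""
    term_counts = Counter()
    zero_counts = Counter()
    for query, result_count in log:
        # Normalize case and whitespace so variants are counted together.
        normalized = " ".join(query.lower().split())
        term_counts[normalized] += 1
        if result_count == 0:
            zero_counts[normalized] += 1
    return term_counts.most_common(top_k), zero_counts.most_common()


# Hypothetical log entries: (query text, number of results returned).
log = [
    ("Travel expenses", 12),
    ("travel expenses", 9),
    ("vacation form", 0),
    ("Vacation  form", 0),
    ("org chart", 4),
    ("canteen menu", 0),
]

top_terms, zero_results = summarize_query_log(log)
print(top_terms)
print(zero_results)
```

Queries that are frequent and return nothing are the quickest to-dos: they usually point to missing content, missing synonyms, or poor metadata.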
![Page 52: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/52.jpg)
• Feedback form
• KPI from Search Analytics
• Session time × number of sessions = time spent on search; time spent on search × hourly price = cost per “answer”
• Add search refinements + exit page (= is the right answer)
User Satisfaction
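
The cost calculation can be worked through with made-up numbers. Everything below is hypothetical: the 3-minute average session, the 10,000 sessions, the hourly price of 50, and the simplifying assumption that every session ends with exactly one answer.

```python
def search_cost(avg_session_minutes, sessions, hourly_price):
    """Return (hours spent on search, total cost of that time)."""
    hours = avg_session_minutes * sessions / 60.0
    return hours, hours * hourly_price


# Hypothetical figures, e.g. taken from one month of search analytics.
sessions = 10_000
hours, total_cost = search_cost(avg_session_minutes=3, sessions=sessions, hourly_price=50)

# Assumes one answer per session; refine with exit-page data where available.
cost_per_answer = total_cost / sessions
print(hours, total_cost, cost_per_answer)
```

Tracking this figure over time gives a simple KPI: if search refinements shorten the average session, the cost per answer falls.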
![Page 53: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/53.jpg)
Findability by Findwise
1. BUSINESS
Build solutions to support your business processes and goals
2. INFORMATION
Prepare information to make it findable
3. USERS
Build usable solutions based on user needs
4. ORGANISATION
Govern and improve your solution over time
5. SEARCH TECHNOLOGY
Build solutions based on state-of-the-art search technology
![Page 54: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/54.jpg)
• Analyze how your business goals and strategies can be met by improved information access
• Set Findability goals. Examples: increase the revenue on sales, raise productivity, improve knowledge sharing, better collaboration
• Specify your requirements
• Define KPIs and measure the success of your investments
Business
![Page 55: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/55.jpg)
• Clean up and archive or delete outdated/irrelevant information
• Ensure good quality of information by adding structured and suitable metadata
• Create and use information models and taxonomies
• Tagging?
Information
![Page 56: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/56.jpg)
• Get to know your users and their needs
• Make sure your solution is easy to use
• Perform continuous usability evaluations, like usage tests and expert evaluations
• Make sure users find what they are looking for
• Enable feedback loops for complaints, feedback and praise
Users
![Page 57: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/57.jpg)
• Resources!
• Define processes, roles and routines to govern the solution
• Perform Search Analytics
• Create easy to use administration interfaces
• Perform training, technical and editorial
• Help publishers get started with processes for better findability
Organisation
![Page 58: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/58.jpg)
• Select a suitable search platform or make the most of your current solution
• Design your architecture with search-as-a-service in mind
• Utilise the full potential of the selected technology
Search Technology
![Page 59: Introduction to Enterprise Search](https://reader034.vdocuments.us/reader034/viewer/2022042602/55d4fbcbbb61ebaa528b45c7/html5/thumbnails/59.jpg)
Kristian Norling
@kristiannorling
@findwise
findwise.com
Findability Blog
Slideshare
Vimeo
Newsroom