Language Recognition… Searching with Precision
Santa Clara, CA, October 31, 2001
Julian Henkin, Vice President, Worldwide Customer Services, LexiQuest, Inc.
Booth # 523
Topics for Discussion
Critical Nature of Search
Importance of Linguistics
Language Recognition
Case Studies
Critical Nature of Search
“At least one-third of your visitors are going to use the search function as soon as they enter your site.”
- Improving Your Site’s Search, The Information Standard, August 11, 2000
“On average, professional users spend 11 hours per week looking for information. 71% said they could not find what they were looking for.”
- “Information Management Software,” Lazard Freres & Co. LLC, February 2001
“Ultimately, the return on investment (ROI) of corporate information systems cannot be solely derived from the cost of building populating and maintaining these systems. True ROI also reflects the ability of all classes of users to effectively use the information.”
- “Looking for a Lifesaver?”, KM Magazine, August 1999
Challenges with Today’s Search
Traditional and advanced methods (key word, Boolean searches, statistical and probability algorithms, concept agents, neural networks and pattern recognition) are limited in their ability to retrieve accurate results:
Not intuitive for typical user so full breadth of capability is rarely utilized
Do not provide any level of “understanding” of the text or of the concepts represented by the queries.
Search is based solely or largely on the comparison of the character strings in both queries and text.
Results often include a lot of “noise” (irrelevant results) and “silence” (accurate results are not found).
What if you don’t know what you are looking for?
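The "noise" and "silence" named above can be put into numbers. The sketch below is a minimal illustration, assuming that noise means the share of retrieved documents that are irrelevant and silence means the share of relevant documents that were missed; the slides do not give formal definitions, and the function name and document ids are hypothetical.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise, silence) for one query.

    noise   = share of retrieved documents that are irrelevant (1 - precision)
    silence = share of relevant documents that were not retrieved (1 - recall)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    noise = 1 - len(hits) / len(retrieved) if retrieved else 0.0
    silence = 1 - len(hits) / len(relevant) if relevant else 0.0
    return noise, silence


# Hypothetical document ids, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
noise, silence = noise_and_silence(retrieved, relevant)
```

In this toy case half of the retrieved documents are noise, and one of the three relevant documents stays silent.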
Importance of Linguistics
Linguistic-based systems are knowledge-sensitive: the more information there is in their “dictionaries”, the better the quality:
Natural Language interface is very intuitive for users, lets the system do the work
Up to a 400% improvement in performance over traditional search engines (greater relevance and precision)
Can deliver multilingual and cross-lingual access
How Does Language Recognition Work?
CONCEPTS: Organizes concepts regardless of their language, e.g., Table (Fr), Table (Eng), Mesa (Sp), Tavola (It)
SEMANTIC: Understands the meanings of words, e.g., book = to register for a future activity vs. book = set of bound sheets of paper
SYNTAX: Understands a sentence’s or phrase’s structure and the “roles” of words, e.g., subjects, verbs, objects; “to book” vs. “a book”
MORPHOLOGY: Word structure. Recognizes words (simple and compound), e.g., “to buy”, “bought”
The Ladder of Language
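The rungs of the ladder can be illustrated with a toy sketch built from the slide's own examples. This is not LexiQuest's implementation: the dictionaries, function names and the one-word context rule below are all hypothetical simplifications, and a real system would rely on far larger lexicons and a proper parser.

```python
# Toy dictionaries; every entry is a hypothetical stand-in for a real lexicon.
LEMMAS = {"bought": "buy", "buys": "buy", "books": "book", "booked": "book"}
CONCEPTS = {  # surface word (lowercase) -> language-independent concept id
    "table": "CONCEPT_TABLE",   # French and English
    "mesa": "CONCEPT_TABLE",    # Spanish
    "tavola": "CONCEPT_TABLE",  # Italian
}


def lemma(word):
    """Morphology rung: reduce an inflected form to its dictionary form."""
    w = word.lower()
    return LEMMAS.get(w, w)


def sense_of_book(previous_word):
    """Syntax and semantics rungs: pick a sense of 'book' from the word before it."""
    if previous_word.lower() == "to":
        return "register for a future activity"  # verb use: "to book"
    return "set of bound sheets of paper"         # noun use: "a book"


def concept(word):
    """Concept rung: map a word in any covered language to one concept id."""
    return CONCEPTS.get(lemma(word))
```

With these tables, `lemma("bought")` gives `"buy"`, and `concept("Mesa")` and `concept("tavola")` resolve to the same concept id, which is what lets a query in one language reach documents in another.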
1. Personalization (Sharing)
2. Codification (Capture, Structured Storage)
3. Discovery (Search, Retrieval)
4. Creation, Innovation
5. Capture, Monitor
LexiQuest products shown on the diagram: LexiQuest Mine, LexiQuest Categorize, LexiQuest Guide, LexiQuest Respond
“Knowledge Management is the collection of processes that govern the creation, dissemination, and utilization of knowledge.”
“Knowledge is one, if not THE, principal factor that makes personal, organizational, and societal intelligent behavior possible.”
“Organizations that have adopted this position (Chief Knowledge Officer) include Hoffman-LaRoche, GE Lighting, Xerox PARC, and several consultancies, including Ernst & Young, Gemini, and McKinsey.”
Five KM Activities
Enterprise Document Databases, Web sites or Repositories
Domain 1: Limited amount of content
Domain 2: Limited amount of content
Domain 3: Significant amount and depth of content
Users who browse via a directory structure/taxonomy. Many search engines now leverage a taxonomy: improved accuracy
Users who know what they are looking for and prefer using a search engine.
Users who collectively ask the same narrow set of questions over and over again
Users who don’t know what they are looking for and need concepts illuminated. (Research)
LexiQuest Mine
LexiQuest Respond
LexiQuest Categorize
LexiQuest Guide
Suite of Capabilities
User Experience
“Who are the main ISP’s in the Far East?”
Linguistic Analysis
Accurate Results: Taiwanese Access Service Provider
Mine: A Research Tool
Electronic Commerce NEAR Fraud
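The slide does not define NEAR. Read as a conventional proximity operator, which is an assumption here, it could be sketched as follows. The function name, the default window of 5 words and the choice to measure distance between the first words of each phrase are all hypothetical.

```python
import re


def near(text, term_a, term_b, window=5):
    """True if term_a and term_b start within `window` words of each other."""
    words = re.findall(r"\w+", text.lower())
    a = term_a.lower().split()
    b = term_b.lower().split()

    def positions(phrase):
        # Index of every place where the whole phrase occurs in the text.
        n = len(phrase)
        return [i for i in range(len(words) - n + 1) if words[i:i + n] == phrase]

    return any(abs(i - j) <= window for i in positions(a) for j in positions(b))


# Hypothetical sentence, for illustration only.
sample = "Electronic commerce sites report rising fraud this year"
found = near(sample, "electronic commerce", "fraud")
```

Narrowing the window tightens the match: the same sentence no longer qualifies with `window=2`.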
Guide’s Linguistic Expansion
“consumers’ fraud protection online” – 21 documents
“Swindle” returns this relevant document
Pharmaceutical Example
“What antibiotic treats anthrax”
Antibiotic expansions include ciprofloxacin, Cipro, ciprofloxacin hydrochloride
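Dictionary-driven expansion of this kind can be sketched in a few lines. The expansion table below holds only the three terms named on the slide. The function name and the simple whitespace tokenization are hypothetical, and the slides do not describe how LexiQuest Guide actually performs expansion.

```python
# Expansion table taken from the slide's example; a real system would draw on
# a much larger linguistic dictionary.
EXPANSIONS = {
    "antibiotic": ["ciprofloxacin", "Cipro", "ciprofloxacin hydrochloride"],
}


def expand_query(query):
    """Return the query terms plus any dictionary expansions, without duplicates."""
    terms = []
    for word in query.lower().split():
        for term in [word] + EXPANSIONS.get(word, []):
            if term not in terms:
                terms.append(term)
    return terms


expanded = expand_query("What antibiotic treats anthrax")
```

A document that mentions only "Cipro" can then match a query that asked about an "antibiotic", which plain string comparison would miss.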
Quantitative Results
400% more Accurate than Current Solutions
[Chart: % of correct answers of all answers retrieved (0% to 60%) versus number of answers retrieved (5, 10, 15, 20, 30), comparing a Custom Search Engine with LexiQuest Guide]
Ensures all relevant information is retrieved
Reduces “noise” from irrelevant results
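The chart's measure, the percentage of correct answers among all answers retrieved, is what is usually called precision at a cutoff. The sketch below shows how such a figure could be computed. The judgments are hypothetical and do not reproduce the data behind the slide.

```python
def precision_at_k(is_correct, k):
    """Share of the first k retrieved answers that are correct."""
    top = is_correct[:k]
    return sum(top) / len(top) if top else 0.0


# Hypothetical relevance judgments for one ranked result list (True = correct).
judged = [True, True, False, True, False, False, True, False, False, False]
curve = {k: precision_at_k(judged, k) for k in (5, 10)}
```

Computing this at each cutoff on the chart's horizontal axis, once per engine, would give the two curves being compared.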