There's No AI Without IA with Seth Earley
TRANSCRIPT
Copyright © 2017 Earley Information Science, Inc. All Rights Reserved.
Agenda
Perspective on Intelligent Technologies
Our Approach
Conversational Commerce
Search & Retrieval
Knowledge Portal
Virtual Agent
Human – Bot Collaborations
#idw2017
SETH EARLEY - BIOGRAPHY
CEO and Founder, Earley Information Science
www.linkedin.com/in/sethearley
Over 20 years experience: Data science and technology, content and knowledge management systems, background in sciences (chemistry)
Current work: Cognitive computing, knowledge and data management systems, taxonomy, ontology and metadata governance strategies
Co-author: Practical Knowledge Management from IBM Press
Editor: Data Analytics Department IEEE IT Professional Magazine
Member: Editorial Journal of Applied Marketing Analytics
Former Co-Chair: Academy of Motion Picture Arts and Sciences, Science and Technology Council Metadata Project Committee
Founder: Boston Knowledge Management Forum
Former adjunct professor: Northeastern University
Speaker: Industry conferences on knowledge and information management
AIIM Master Trainer: Information Organization and Access
Course Developer & Master Instructor: Enterprise IA and Semantic Search
AI and The Hype Curve
“What seems to be AI, is really vast knowledge,
combined with a sophisticated UX”
https://www.theregister.co.uk/2017/01/02/ai_was_the_fake_news_of_2016/
http://www.rogerschank.com/fraudulent-claims-made-by-IBM-about-Watson-and-AI
“The definition of “AI” has been stretched so
that it generously encompasses pretty much
anything with an algorithm”
When it works, we don’t call it AI
“When [AI] finally works, it gets
co-opted by some other part of
the field. So, by definition, no AI
ever works; if it works, it’s not AI”
MIT AI Course*
*https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-825-techniques-in-artificial-intelligence-sma-5504-fall-2002/lecture-notes/Lecture1Final.pdf
AI is embedded in almost every technology we touch.
In fact, an early “AI application” was word processing: the software we take for granted applied
the judgment that a skilled typesetter would use when laying out a document.
Spell check
Self-driving cars
Speech recognition
Turning your friends into creepy pictures of dogs and cats
How is AI used in the workplace?
Most AI is behind the scenes, embedded in application functionality rather
than used as standalone tools.
There are few “pure” AI applications that the typical knowledge worker
leverages.
Most are used to identify patterns in large data sets (for example, anomaly
detection, risk analysis, customer purchase patterns, market
segmentation, next best action, demand predictions).
Unless you are a data scientist, many of these applications are not readily
usable.
Text Analytics, a longtime staple of KM and
content management, is now called “AI”.
What was once called Text Analytics is now called AI
Content/Text Analytics allows derivation of structure and identification of patterns within unstructured content and text.
• Knowledge extraction
• Mitigation of compliance risks
• Removal of Personally Identifiable Information (PII)
• Removal of Redundant, Outdated and Trivial (ROT) content
• Protection of intellectual property
• Identification of patterns of fraud
• Development of Question Answering systems
• Training of Intelligent Virtual Assistants and Chat bots
• Detection of customer sentiment
• Prediction of credit risks
• Feature extraction from product data
These approaches have always
leveraged some form of machine
learning algorithm
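As a rough illustration of pattern identification in unstructured text, the sketch below redacts two kinds of PII with regular expressions. It is a rule-based stand-in, not a machine learning approach, and the patterns (a simple email shape and a US-style phone number) and the function name `redact_pii` are assumptions made for this example; production PII removal needs far broader coverage.

```python
import re

# Assumed, deliberately simple patterns for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")  # e.g. 555-123-4567


def redact_pii(text):
    """Replace email addresses and US-style phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact_pii("Call 555-123-4567 or mail jane@example.com."))
```

The same structure extends to other entity types by adding patterns, which is where trained entity extraction models usually take over from hand-written rules.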
Machine Learning
Machine learning algorithms iteratively use results of analysis to refine an outcome.
Outputs are fed back to refine how the algorithm produces an answer.
Example: spell correction on your smartphone can learn unique spellings of words
• Unsupervised learning – look at this data and identify patterns and anomalies – “make sense
of the information”
• Supervised learning – look for this particular pattern of information based on examples
Providing multiple examples of user questions (“I need to change my password”, “I
forgot my password”, “I can’t log in”, etc.) allows supervised learning to classify intent
– the user’s objective or goal.
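The supervised case can be sketched in a few lines. The example below is a minimal naive Bayes classifier with add-one smoothing, written with the standard library only. The password utterances come from the slide; the `check_order` intent, its utterances, and all function names are hypothetical and exist only to give the classifier something to contrast against.

```python
import math
import re
from collections import Counter, defaultdict

# Labeled examples: (utterance, intent). "check_order" is an invented intent.
TRAINING = [
    ("I need to change my password", "reset_password"),
    ("I forgot my password", "reset_password"),
    ("I can't log in", "reset_password"),
    ("Where is my order", "check_order"),
    ("Track my package", "check_order"),
    ("When will my order arrive", "check_order"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per intent and examples per intent."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    for text, intent in examples:
        intent_counts[intent] += 1
        word_counts[intent].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, intent_counts, vocab


def classify(text, model):
    """Return the intent with the highest smoothed log-probability."""
    word_counts, intent_counts, vocab = model
    total = sum(intent_counts.values())
    best_intent, best_score = None, -math.inf
    for intent, n in intent_counts.items():
        score = math.log(n / total)  # prior from example counts
        denom = sum(word_counts[intent].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[intent][word] + 1) / denom)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


MODEL = train(TRAINING)
print(classify("I forgot my login password", MODEL))
```

Adding more phrase variations per intent is what improves a classifier like this, which is why the quality and coverage of the example utterances matter more than the algorithm.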
What is Cognitive Computing?
Cognitive Computing “…makes a new class of problem computable”
• Ambiguous, unpredictable
• Conflicting data
• Require exploration, not searching
• Need to uncover patterns and surprises
• Shifting situation, goals, information
• Best answers based on context
• Problem solving: beyond information gathering
By using diverse data sources as “signals”
• Analyze BIG data
• Understand human language on multiple levels
• Analyze and merge all formats and sources of
information
• Uncover relationships across sources
• Understand and filter by context
• Find patterns in the data that are both expected and
unexpected
• Learn from new information, new interactions
Source: Sue Feldman, Synthexis
…by leveraging machine learning and AI
Perspective on Intelligent
Technologies
(There is no magic …)
Reality versus Aspiration
The marketplace is crowded and noisy
Vendor hype is difficult to separate from reality
A significant amount of functionality is aspirational
Vendors’ R&D will be at the customer’s expense
Technology is quickly evolving and capabilities will accelerate
Market Size vs Investment
Opus Research: Growth of
industry from $1 billion in 2016 to
$4.5 billion globally by 2021
CB Insights: $14.9 billion in investment
between 2012 and 2016
What is the implication of $15b in funding
for companies chasing a $1b market?
You will be receiving a
lot of phone calls.
However…
Most bot and virtual assistant vendors have not exploited scalable and portable knowledge
engineering approaches
We have not located any knowledge-base-driven solutions from credible vendors
Instead, content is embedded into, and fine-tuned for, highly custom configurations
The approach of “give us 6 months, $2mm and all of your content” ensures lock-in
The answer to “where does the data come from?”:
- “the customer has a knowledge base”
- “you need the right learning content”
Market Hype + Lack of Maturity = Many Failures
Intelligent Virtual Assistants Are Evolving…
“But even those personalities required
proficiency in other facets of the technology
such as an expertly developed domain
model”
“Because intelligent virtual assistants are
focused within a domain model, they benefit
from a clearly defined knowledge base and are
able to go much deeper and stay within those
bounds…”
Source: Analyst Gigaom Research https://gigaom.com/2014/09/01/the-next-step-for-intelligent-virtual-assistants-its-time-to-consolidate/
“…domain models and ontologies are important”
All Require Domain Modeling and Knowledge Base Development
There’s no magic … and Knowledge Engineering Requires Human Intervention …
[Figure captions: “IPSoft’s Amelia Example” and “Example EIS Knowledge Artifacts”. Diagram labels: Knowledge Engineer (three instances), Assistant Supervisor, Integration Engine, Governance Models, Domain Models, Knowledge Bases, Harmonized Metadata, Quality Data, Curated Content, Analytics Programs, Content Models]
The Knowledge Management Challenge
The Knowledge Management challenge is usually put into
language that confuses the issue:
Vendors say that they need to “train the AI”
What do you “train the AI” with?
…high value knowledge assets (quality data and curated content)
Bots need good content and may
not always get it right…
Pizza ordering bot is “brittle”
Facebook search degrades inelegantly
Watson Intelligent Assistant
Hey Facebook… …How About We Start with Search?
The Bot Factory
Most bot approaches are analogous to hand-built
automobiles at the turn of the 20th century.
Applications are brittle, content is not reusable and
the process is costly and labor-intensive.
Factories, standardized components and assembly
lines are needed to scale deployments.
This approach is one of a “bot factory”.
Virtual agent and bot technology
requires standardized treatment of
architecture, terminology and
reusable content driven by domain
specific use cases.
APPROACHES
…include the following:
• Natural language processing of search queries
• Normalization of content with consistent domain modeling
• Classification of intent based on phrase variations
• Use of crowd sourcing to gather phrase variations and derive terminology
• Incorporation of user paths into learning process
• Analysis of click streams to identify recommendations
• Tuning of clustering, auto categorization and entity extraction algorithms
• Inference engines for reasoning algorithms
• Metrics for manual improvements of performance and for incorporation into automated approaches
• Integration of ontology modeling into downstream systems
Even the short term challenges in managing bot information will quickly
multiply without proper planning
• It’s one thing to build a single bot; what about
when there are 10, 100, 1,000 or 10,000?
• Will there be a repeatable framework or a
series of projects?
• Without metrics, governance and design
elements abstracted from the tools,
organizations will find their bots out of control.
• Content and design elements (intents, entities,
utterances, responses) have to be managed
separately from the application
Organizations need to think of
bots not as a one-off but as a
series of channels and data
sources that have to be
managed over time
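One way to picture design elements managed separately from the application is a platform-neutral catalog that is exported to each bot tool on demand. The sketch below assumes a hypothetical `Intent` record and two invented export shapes (`export_flat`, `export_nested`); neither corresponds to any real vendor's import format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Intent:
    """Platform-neutral design element: intent, utterances, entities, response."""
    name: str
    utterances: List[str]
    response: str
    entities: List[str] = field(default_factory=list)


# The catalog lives outside any bot platform (hypothetical sample content).
CATALOG = [
    Intent(
        name="reset_password",
        utterances=["I forgot my password", "I can't log in"],
        response="You can reset your password from the sign-in page.",
        entities=["account_type"],
    ),
]


def export_flat(intents):
    """Hypothetical target A: one training row per utterance."""
    return [
        {"intent": intent.name, "utterance": utterance}
        for intent in intents
        for utterance in intent.utterances
    ]


def export_nested(intents):
    """Hypothetical target B: a single JSON document keyed by intent name."""
    return json.dumps({i.name: asdict(i) for i in intents}, indent=2, sort_keys=True)


print(export_flat(CATALOG))
print(export_nested(CATALOG))
```

Because both exports read from the same catalog, a change to an utterance or response is made once and reaches every bot that uses it.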
Elements of a scalable bot approach
Automation – organizations need a “bot factory” model rather than a one-off, special-purpose bot.
This approach enables:
Reusability
• Investments in training content and knowledge assets will be fully utilized
• Standardized assets and design elements can be repurposed in new bots
Scalability
• By using a combination of automated and manual approaches for extraction of ontologies and content
components, new bots can be deployed more quickly and cost effectively
• Managing design elements in a platform agnostic tool is the only way to control deployment of hundreds or
thousands of bots across multiple technologies
Portability
• Standardizing content, assets and design elements outside of the bot platform prevents vendor lock-in and allows
for new modules and best-of-breed components
• Refactored content and design elements will be managed in an ontology for migration into other bots, tools,
technologies
Why a bot “factory” approach?
• Large complex problems need to be broken into smaller pieces
• Bots will be solutions to specific problems
• Alexa has 6,000 “skills” – a skill is a set of intents, triggers and content
• These components need to be managed
• Rather than programming bots one by one, creating reusable components will
reduce costs and effort
Domain Models and Content Normalization Add Context
Invest in content, processes and knowledge architecture (industry specific domain models, ontologies, metadata, metrics and governance)
Questions are normalized into a vector space and matched
with responses from a knowledge base
Content gets refactored and componentized
User’s information need
(intended question)
Question variants also
form test use cases for
technology evaluation
Process efficiencies are achieved by refactoring and componentizing content for reuse
The better the content, the easier it will be to train the AI
A domain model is used to describe processes, products and organizational knowledge
structures
Pre-processing of content is required to add the correct knowledge context for AI programs to ingest
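The idea of normalizing questions into a vector space and matching them against a knowledge base can be sketched with word-count vectors and cosine similarity. This is a minimal stand-in, not the method used in any particular product: the knowledge base entries, the `threshold` value, and the function names are all assumptions for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: canonical question -> response.
KNOWLEDGE_BASE = {
    "How do I reset my password": "Use the 'Forgot password' link on the sign-in page.",
    "How do I update my billing address": "Edit the address under Account > Billing.",
    "How do I cancel my order": "Open the order and choose Cancel before it ships.",
}


def vectorize(text):
    """Normalize text to lowercase word counts (a bag-of-words vector)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_answer(question, kb=KNOWLEDGE_BASE, threshold=0.3):
    """Return the response for the closest canonical question, or None."""
    q = vectorize(question)
    score, key = max((cosine(q, vectorize(k)), k) for k in kb)
    return kb[key] if score >= threshold else None


print(best_answer("I want to reset my password"))
```

Question variants collected for training can double as test cases here: each variant should resolve to the response of its intended canonical question.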
Standardized/Normalized Content is Portable and Reusable
Source content: eLearning, FAQs, troubleshooting charts, support articles
Componentized content
Tagging for ingestion
Standardized domain-specific schemas for reuse (Field 1, Field 2, Field 3, … Field n)
Componentized content can be repurposed across tools and technologies
• Improved CSR information access
• Faster time to value for all information access scenarios
• Portability across AI and chatbot systems
• Improved customer self-service
• Metrics aligned with specific content performance
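Componentizing and tagging content against a schema might look like the sketch below, which splits a plain-text FAQ into one record per question. The `ContentComponent` fields, the `Q:`/`A:` input convention, and the `product` tag are assumptions chosen for this example rather than a schema taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentComponent:
    """One reusable unit of content with schema fields and tags."""
    component_type: str  # e.g. "faq"
    title: str           # the question
    body: str            # the answer
    tags: Dict[str, str] = field(default_factory=dict)


def componentize_faq(raw_text: str, product: str) -> List[ContentComponent]:
    """Split 'Q:' / 'A:' lines into tagged components, one per question."""
    components = []
    question = None
    for line in raw_text.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question:
            components.append(
                ContentComponent("faq", question, line[2:].strip(), {"product": product})
            )
            question = None
    return components


RAW = """Q: How do I reset my password?
A: Use the reset link on the sign-in page.
Q: How do I cancel an order?
A: Open the order and choose Cancel."""

for part in componentize_faq(RAW, product="web-portal"):
    print(part)
```

Once content is in records like these, the same components can feed a search index, a knowledge portal, or a bot without being rewritten for each.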
TRAINING AIDS
CALL LOGS
CUSTOMER PROFILE DATA
ANALYTICS & ACTIVITY
SOCIAL NETWORKS
DEMOGRAPHIC & ETHNOGRAPHIC DATA
SENTIMENT ANALYSIS
SERVICES & OFFERS
CUSTOMER EXPERIENCE ENRICHED BY KNOWLEDGE
BOT MATURITY & SCALABILITY
Combining Platform Independent Knowledge with
Agent-Bot Collaboration for Scalability & Customer Experience
CONTACT US
GENERAL INFORMATION
www.earley.com
PO Box 292, Carlisle, MA 01741
781-444-0287
Seth Earley, CEO/[email protected]
www.linkedin.com/in/sethearley
Jeanna Giordano, Client [email protected]
www.linkedin.com/in/jeannagiordano
Detailed Process View
[Figure: process diagram including a redacted view of a client ontology. Diagram labels: Document Decomposition & Learning & Componentization; WordMap ML; Ontology Manager; Knowledge Engineer; Dialog Designer; Dialog Development; eLearning, FAQs, troubleshooting charts, support articles; Componentization; Domain Modeling; AI Engineer; Content Processing; Content Analysis; Ingestion into Ontology; Dialog Tagging]
OUR CLIENTS
17
of the Fortune 10
Pharmaceuticals Companies
of the Fortune 10
Retailing Companies
of the Fortune 100
12
of the Internet Retailer Top 50
7
6
5
of the Fortune 20
Life Sciences Companies
A BROAD SPECTRUM OF SOLUTIONS
B2C Digital Commerce
B2B Digital Commerce
B2E Digital Workplace
CUSTOMER EXPERIENCE
DIGITAL ASSET MGMT.
CONTEXTUAL SEARCH
DIGITAL COMMERCE
WEB CONTENT MGMT.
METRICS & ATTRIBUTION
CONTENT MARKETING
MASTER DATA MGMT.
BIG DATA ANALYTICS
PRODUCT INFORMATION MGMT.
ENTERPRISE CONTENT MGMT.
BUSINESS INTELLIGENCE
CONTEXT-AWARE INFORMATION ARCHITECTURE
Strategy
Taxonomy
Metadata
Integration
Workflow
Governance
WHAT CLIENTS SAY ABOUT EIS
“This is awesome, you have exceeded my expectations on what I thought was possible.”
Mike Barton, President, Allstate Business Insurance
“We spent millions upgrading technology …. Looking back, I’d get the taxonomy right from the beginning.”
Chief Marketing Officer, $8B Scientific Equipment Maker
“The value that Earley brought was visible from the beginning - Helping us to arrive at a consensus and a path forward.”
VP Product Management, High-tech Manufacturer
1994 YEAR FOUNDED.
Boston HEADQUARTERED.
50+ SPECIALISTS & GROWING.
Earley Information Science is a specialized information agency. We support measurable
business outcomes by organizing your data, content and knowledge assets.
Our proven methodologies are designed specifically to address product data, content
assets, customer data, and corporate knowledge bases. We deliver scalable governance-driven
solutions to the world’s leading brands, driving measurable business results.
We make information more
useable, findable, and valuable.
AWARDS & RECOGNITION
2017 100 Companies that Matter in KM
2016 100 Companies that Matter in KM
2015 100 Companies that Matter in KM
2014 100 Companies that Matter in KM
2014 Trend-Setting Products Award
2013 Trend-Setting Products Award
2008 Trend-Setting Products Award (Wordmap)
2013 Applied Materials added to
InformationWeek 500 List of Business
Innovators
• “Cognitive Search Is Ready To Rev Up
Your Enterprise’s IQ”
• “Google-ize Your Site-Search Experience”
• “Polishing Up Your Products —
Why PIM Really Matters”
• “Artificial Intelligence Solution Landscape”
ANALYST MENTIONS
• “Unlocking the Hidden Value of
Information (Applied Materials)”
2015 KM Reality Award
(Allstate Business Insurance, ABIe project)
THOUGHT LEADERSHIP
Founded in 2005
>3,400 members
worldwide
Founded in 2015
800 attendees
Educational Courses
Information
Organization and Access
Enterprise IA and
Semantic Search
Podcast on
Information
Science topics