Company Confidential – For Internal Use Only. Copyright © SAS Institute Inc. All rights reserved.
Text Analytics in Action
Toronto Data Sciences Forum
2017.11.08
Cindy Zhong, Data Scientist, SAS Canada
What is Text Analytics?
“A set of linguistic, statistical, and machine learning techniques that model and structure
the information content of textual sources for business intelligence, exploratory data
analysis, research, or investigation.”
“Using technology to scale the human acts of reading, organizing, and quantifying unstructured text data in meaningful ways.”
Or, put more simply…
Text Analytics Overview
Common Textual Data Sources
• Call Center Notes
• Survey Feedback
• Online Forums
• Blogs
• Consumer Reviews
• Online News
• Social Networks
• Associate Comments
• Claims & Case Notes
• Research & Publications
• Live Chat
• Factory/Tech’n Notes
• Emails
• Medical/Health Records
• Contracts & Applications
Text Analytics in Action
What Can We Learn From Text?
What can we learn?
• Reviewer’s Attitude
• Reviewer’s Opinion on the Features (Display, Branding, Design, Price)
• Degree of their Opinion
• Competitor Information
Now, add these to your existing knowledge…
Text Analytics Overview
Steps for Text Analytics
• Define the problem!!!
• Obtain Relevant Text
• Preprocessing of Text: Text Parsing, Text Filtering
• Transformation
• Apply Text Mining Algorithm
• Analyze Obtained Output
• Is the data properly mined?
  • No: iterate on the earlier steps
  • Yes: Output Data for Further Analysis
• Deployment of Analytics: Dashboard, Event Stream Processing, Enriched Model, Case Management, Visualization & Reporting
• Feedback
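The "Transformation" step above is often understood as turning parsed text into a numeric representation that a mining algorithm can consume. The slides do not say which representation is used, so the sketch below assumes the simplest one, a term-by-document count matrix. The function name `term_document_matrix` and the sample reviews are hypothetical.

```python
from collections import Counter

def term_document_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    # Whitespace tokenization is an assumption; real parsing is more involved.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)  # a missing term counts as 0
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

# Hypothetical sample reviews, for illustration only.
docs = ["great screen great price", "poor screen"]
vocab, matrix = term_document_matrix(docs)
print(vocab)
print(matrix)
```

Each row is one document and each column one term, so structured methods (clustering, regression, and so on) can then be applied to text.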
Text Preprocessing
Structuring the Unstructured
• Dropping the stopwords
• Automatically detect misspellings and parts-of-speech
• Parsing the document into tokens
• Stem/lemmatize the term so different forms are seen as one
• Identify and extract known or discovered entities
• Look at what customers mention about the screen
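Three of the preprocessing steps above (tokenizing, dropping stopwords, and stemming) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the SAS implementation: the stopword list and suffix rules are hypothetical and deliberately tiny, and misspelling detection, part-of-speech tagging, and entity extraction are left out.

```python
import re

# Illustrative stopword list and suffix rules; both are assumptions.
STOPWORDS = {"the", "is", "a", "an", "and", "but", "on", "of", "it", "this"}
SUFFIXES = ("ing", "ed", "es", "s")  # naive rules, not a real stemmer

def tokenize(text):
    """Parse the document into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def drop_stopwords(tokens):
    """Remove tokens found in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    return [naive_stem(t) for t in drop_stopwords(tokenize(text))]

review = "The screen is cracked and the screens on this model keep cracking"
tokens = preprocess(review)
print(tokens)  # ['screen', 'crack', 'screen', 'model', 'keep', 'crack']
```

After stemming, "screen"/"screens" and "cracked"/"cracking" collapse to one term each, which is what lets later steps count what customers say about the screen.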
Exploratory Analysis on Text
Unsupervised Learning
• Look at the sentiment by review
• Look at the sentiment by review topics
• Look at what is mentioned in a topic
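The two sentiment views above (by review and by topic) can be sketched as follows. The word lists and the keyword-based topic assignment are hypothetical stand-ins: in practice topics would be discovered by a statistical method rather than looked up from fixed keywords.

```python
from collections import defaultdict

# Hypothetical sentiment lexicon and topic keywords, for illustration only.
POSITIVE = {"great", "love", "sharp", "good"}
NEGATIVE = {"poor", "bad", "dim", "expensive"}
TOPIC_KEYWORDS = {"display": {"screen", "display"}, "price": {"price", "cost"}}

def sentiment_score(tokens):
    """Count of positive words minus count of negative words."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def sentiment_by_topic(reviews):
    """Add each review's score to every topic whose keywords it mentions."""
    totals = defaultdict(int)
    for review in reviews:
        tokens = review.lower().split()
        score = sentiment_score(tokens)
        for topic, keywords in TOPIC_KEYWORDS.items():
            if keywords & set(tokens):
                totals[topic] += score
    return dict(totals)

reviews = ["great sharp screen", "poor screen and bad price", "good price"]
by_review = [sentiment_score(r.lower().split()) for r in reviews]
by_topic = sentiment_by_topic(reviews)
print(by_review)  # [2, -2, 1]
print(by_topic)   # {'display': 0, 'price': -1}
```

Note that a review mentioning two topics contributes its whole score to both, which is a simplification; scoring at the sentence or phrase level would attribute sentiment more precisely.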
Text Analytics Overview
What Are Others Using Text Analytics For?
Business Use Cases
Product
• Design Feature/Function
• Competitive Landscape
Marketing
• Reduce Attrition; Identify At-Risk Clients
• Enrich Customer segmentation models and “Path Analysis”
• Root Cause Analysis on Customer Complaints
Operations
• Improve Call Center Agent-Customer Interactions
• Improve Call/Support Agent Productivity and Workflow
• 1:many / 1:1 Messaging via Social Channels
Risk
• Detect regulatory and compliance violations
• Detect common themes in cases of fraudulent charges, ID theft, scams, and phishing schemes
• Enhance credit scoring & underwriting models
SAS Text Analytics Solution
Linguistic Rules Engines, Statistical Models, Natural Language Processing
Analytics
• Sentiment Analysis
• Information Retrieval
• Topic Detection
• Content Categorization
• Root Cause Analysis
• Trend Analysis
Deployment
• Enriched Model
• Event Stream Processing
• Risk Alert
• Competitive Alert
• Customer Interactions
• Case Management
• Sentiment Dashboard
• Community View
• Product Line View
• Region View
• Business Line View
• Visualization & Reporting
What Makes Text Analytics Hard?
And Interesting …
• Problem Specific
• Domain Specific
• Out-of-the-box tools do not work
• Ability to customize is critical
• Human Subjectivity
  • Same message conveyed in different ways
  • Exactly the same statement in a different context may convey a completely different meaning
• Language & Culture Specific
  • Requires deep knowledge about the language and the culture
• Social media as a new source of data
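To make the domain-specificity point concrete, here is a small sketch of why a generic, out-of-the-box lexicon can fail and why customization matters. The lexicon entries, the domain names, and the `score` function are all hypothetical.

```python
# Hypothetical generic lexicon: "unpredictable" is treated as negative by default.
GENERIC_LEXICON = {"unpredictable": -1, "great": 1, "poor": -1}

# Hypothetical domain customization: in movie reviews the same word is praise.
DOMAIN_OVERRIDES = {"movies": {"unpredictable": 1}}

def score(text, domain=None):
    """Sum word polarities, letting domain-specific entries override generic ones."""
    lexicon = {**GENERIC_LEXICON, **DOMAIN_OVERRIDES.get(domain, {})}
    return sum(lexicon.get(tok, 0) for tok in text.lower().split())

car = score("unpredictable steering", domain="cars")
movie = score("unpredictable plot", domain="movies")
print(car, movie)  # -1 1
```

The same word scores negative for steering and positive for a plot only because the movie domain carries its own override, which a fixed out-of-the-box lexicon cannot provide.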
What Should You Look For?
Fully Customizable, Balanced Approach to the Text Analytics Problem
In order to gain business value from unstructured text, we need:
• Structured + Unstructured
  • To gain the lift in predictive accuracy from text
• Supervised + Unsupervised
  • To shorten the Time to Value, minimizing the manual effort while maintaining granularity and specificity
• Linguistic Rules + Statistical Model
  • To get the desired level of granularity and ability to customize
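As one illustration of "Structured + Unstructured", the sketch below joins structured customer fields with simple indicators derived from a free-text note, producing a single feature row that a predictive model (for example, an attrition model) could consume. The field names, the vocabulary, and the `build_features` helper are hypothetical, and no claim is made here about the size of any accuracy lift.

```python
def build_features(record, vocabulary):
    """Combine structured fields with binary term indicators from free text."""
    tokens = set(record["note"].lower().split())
    # Structured part: copied through unchanged (field names are assumptions).
    features = {
        "tenure_months": record["tenure_months"],
        "num_calls": record["num_calls"],
    }
    # Unstructured part: 1 if the note mentions the term, else 0.
    for term in vocabulary:
        features[f"mentions_{term}"] = int(term in tokens)
    return features

record = {
    "tenure_months": 14,
    "num_calls": 3,
    "note": "Customer wants to cancel due to price",
}
features = build_features(record, vocabulary=["cancel", "price", "upgrade"])
print(features)
```

The text-derived columns sit alongside the structured ones, so an existing model can be retrained on the enriched row without changing its overall design.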