using matlab for sentiment analysis and text analytics€¦ · doc2 1 1 0 1 • word docs •...

Post on 23-Jan-2021

2 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1© 2017 The MathWorks, Inc.

Using MATLAB for Sentiment Analysis and Text Analytics

Seth DeLand

2

Sentiment Analysis

Understand sentiment of documents, emails, social media

Applications§ Predict prices, inform trading strategies§ Risk analytics§ Economics research§ Litigation

3

Text Analytics Toolbox

Media reported two trees blown down along I-40 in the Old Fort area.

Develop Predictive ModelsAccess and Explore Data Preprocess Data

Clean-up Text Convert to Numeric

media report two tree blown down i40 old fort area

cat dog run two

doc1 1 0 1 0

doc2 1 1 0 1

• Word Docs• PDF’s• Text Files

• Stop Words• Stemming• Tokenization

• Bag of Words• TF-IDF• Word Embeddings

• Latent DirichletAllocation

• Latent Semantic Analysis

4

StringsThe better way to work with text

§ Manipulate, compare, and store text data efficiently>> "image" + (1:3) + ".png"

1×3 string array

"image1.png" "image2.png" "image3.png"

§ Simplified text manipulation functions– Example: Check if a string is contained within another string

§ Previously: if ~isempty(strfind(textdata,"Dog"))

§ Now: if contains(textdata,"Dog")

§ Performance improvement– Up to 50x faster using contains with string than strfind with cellstr– Up to 2x memory savings using string over cellstr

5

Demo: Sentiment Analysis of Tweets

Goal:§ Analyze the sentiment of tweets for several

tech companies and compare the sentiment scores to stock prices

Approach§ Access data from Twitter§ Preprocess to clean-up text and deal with

domain-specific terms§ Predict sentiment from word embedding § Compare sentiment scores to prices

6

Additional Text Analytics Capabilities

§ Extract text from pdf’s and word documents

7

Bag of Words Models and TF-IDF

8

LSA and LDA

§ Unsupervised machine learning techniques§ Well-suited to wide feature matrices§ Both can be used for dimensionality reduction§ LDA is also useful for topic modeling

9

Learn More

https://www.mathworks.com/help/textanalytics/examples.html

10

Learn More

https://blogs.mathworks.com/loren/2017/09/21/math-with-words-word-embeddings-with-matlab-and-text-analytics-toolbox/

Math with Words – Word Embeddings with MATLAB and Text Analytics Toolbox

11© 2017 The MathWorks, Inc.

Questions?

top related