Company Confidential – For Internal Use Only
Copyright © SAS Institute Inc. All rights reserved.
Deep Learning for Text Analytics
SAS User Group Malaysia
3rd May 2018
Agenda
• The Natural Language in Machines
• What is it?
• Why do we need it?
• Deep Learning with Recurrent Neural Network (RNN)
• Basic RNN architecture
• Text classification
• Text generation
• Creating, training and scoring an RNN in Jupyter Notebook
Natural Language in Machines
SAS in a Chatbot
Source: https://becominghuman.ai/chatbots-using-aws-sas-viya-e8a7410ec256
Welcome! I’m your virtual assistant.
How can I help you?
In The Oil and Gas Industry
AI to Assist Customers
Provide answers on over 3,000 company products, drawing on 100,000 information data sheets, 1,000 different pack options and 1,100 different physical characteristics.
The information needed to answer each possible question was spread across a variety of sources, including an external vehicle database with over a million vehicle and engine combinations.
The assistant needs to recognize when a piece of information is missing and ask the customer further questions to clarify or confirm.
Natural Language Processing
Interaction
Natural Language Processing (NLP)
Natural Language Understanding (NLU)
Natural Language Generation (NLG)
Natural Language Interaction (NLI)
Natural Language Processing
NLP Layer (Natural Language Processing)
Knowledge Base (Source Content)
Data Storage (Interaction History & Analytics)
Natural Language Processing
Other Use Cases
• Text-based applications
• Searching for a certain topic in the database
• Extracting information from a large document
Deep Learning with Recurrent Neural Network (RNN)
Deep Learning
Recurrent Neural Network (RNN)
• Designed to handle sequential data
o Text
o Speech
o Time series
• Performs the same task for every element of a sequence
• The output for each element depends on the computations for the preceding elements
• Common variants
o Gated Recurrent Unit (GRU)
o Long Short-Term Memory (LSTM)
[Figure: RNN diagram with Input and Output labels]
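The recurrence described on this slide can be sketched in a few lines of plain Python. This is a minimal, single-unit illustration and not the SAS implementation shown in the talk: the function name `rnn_forward` and the weight values are hypothetical, chosen only to show that the same weights are reused at every step and that each output depends on the earlier elements.

```python
import math

def rnn_forward(inputs, w_x, w_h, bias):
    """Run a one-unit recurrent cell over a sequence of numbers.

    The same weights (w_x, w_h, bias) are applied at every step, and the
    hidden state h carries information forward from earlier elements.
    """
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + bias)
        states.append(h)
    return states

# Hypothetical weights, for illustration only.
states_a = rnn_forward([1.0, 0.0, 0.0], w_x=0.5, w_h=0.8, bias=0.0)
states_b = rnn_forward([0.0, 0.0, 0.0], w_x=0.5, w_h=0.8, bias=0.0)

# Both sequences end with the same input (0.0), yet the final states differ
# because the first sequence started with a non-zero element.
print(states_a[-1], states_b[-1])
```

GRU and LSTM cells replace the single `tanh` update with gated updates, but the loop over the sequence has the same shape.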
Text Classification
The 16th American President
number
order
entity
context
Word Vector
Unlabeled Corpus
The 15th American President
The 16th American President
The 17th American President
Alex reads this sentence
Alex read this sentence
Alex is reading this sentence
15th
17th
16th
read
reading
reads
Word Vector Algorithm
Words with similar context should have similar vectors
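The idea that words with similar context get similar vectors can be illustrated with simple co-occurrence counts over the six sentences on this slide. This is a count-based stand-in, not the word vector algorithm used in the talk (which the slides do not name); the helper names `context_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter, defaultdict

corpus = [
    "The 15th American President",
    "The 16th American President",
    "The 17th American President",
    "Alex reads this sentence",
    "Alex read this sentence",
    "Alex is reading this sentence",
]

def context_vectors(sentences):
    """Map each word to counts of the other words seen in the same sentence."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vecs = context_vectors(corpus)
print(round(cosine(vecs["15th"], vecs["16th"]), 3))      # identical contexts
print(round(cosine(vecs["reads"], vecs["reading"]), 3))  # overlapping contexts
print(round(cosine(vecs["16th"], vecs["reads"]), 3))     # no shared context
```

In this toy corpus the ordinals share exactly the same neighbours, so they end up with the same vector, while an ordinal and a verb form share none.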
Text Generation
Translating vector back to text
Convert text into vectorized input
RNN
Calculate vector weight
Vector representing a sentence based on the text
Use weight vector to refine model
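The two ends of this pipeline, converting text into a vectorized input and translating a vector back to text, can be sketched with a plain vocabulary lookup. This is a simplified assumption about the encoding: the talk's model would use learned word vectors rather than bare integer ids, and the names `build_vocab`, `encode` and `decode` are hypothetical.

```python
def build_vocab(texts):
    """Assign an integer id to every distinct token; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    """Convert text into a list of integer ids (the vectorized input)."""
    return [vocab.get(token, 0) for token in text.lower().split()]

def decode(ids, vocab):
    """Translate a list of ids back to text."""
    inverse = {idx: token for token, idx in vocab.items()}
    return " ".join(inverse[idx] for idx in ids)

texts = ["love this game", "i love this"]
vocab = build_vocab(texts)

ids = encode("love this game", vocab)
print(ids)                 # [1, 2, 3]
print(decode(ids, vocab))  # love this game
```

Words outside the vocabulary map to the `<unk>` id, which is one common way to handle unseen tokens at scoring time.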
Text Generation
Text Structure – Word Order
Who is the 16th American President
The 16th President who is American
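A quick check shows why word order matters for these two sentences: they contain exactly the same words, so a representation that ignores order cannot tell them apart, while a sequence model such as an RNN sees two different inputs. The sketch below uses only the two sentences from the slide; the helper name `tokens` is hypothetical.

```python
from collections import Counter

def tokens(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

question = "Who is the 16th American President"
statement = "The 16th President who is American"

same_bag = Counter(tokens(question)) == Counter(tokens(statement))
same_sequence = tokens(question) == tokens(statement)

print(same_bag, same_sequence)  # True False
```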
Creating, Training, Scoring an RNN
Using Deep Learning
Sample RNN Model
Loading the Action Sets
Sample RNN Model
The Dataset
Sample RNN Model
The Dataset
Sample RNN Model
Text Classification Model
Sample RNN Model
Training the Text Classification Model
Sample RNN Model
Scoring the Text Classification Model
Sample RNN Model
Word Order
Sample RNN Model
Text Generation Model
Sample RNN Model
Training the Text Generation Model
NOTE: The Synchronous mode is enabled.
NOTE: The total number of parameters is 3822440.
NOTE: The approximate memory cost is 19739.00 MB.
NOTE: Loading weights cost 0.00 (s).
NOTE: Initializing each layer cost 147.37 (s).
NOTE: The total number of threads on each worker is 32.
NOTE: The total number of minibatch size per thread on each worker is 16.
NOTE: The maximum number of minibatch size across all workers for the synchronous mode is 512.
NOTE: Target variable: title
NOTE: Number of input variables: 1
NOTE: Number of numeric input variables: 2
NOTE: Batch  nUsed  Learning Rate  Loss    Fit Error  Time (s) (Training)
NOTE: 0      512    0.05           11.679  1          0.79
NOTE: 1      512    0.05           11.414  1          0.77
NOTE: 2      512    0.05           11.2    1          1.39
NOTE: 3      512    0.05           11.093  1          0.72
NOTE: 4      512    0.05           10.858  0.9983     0.85
NOTE: 5      512    0.05           10.64   0.9835     1.18
NOTE: 6      512    0.05           10.64   0.9835     0.91
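The minibatch figures in this log can be cross-checked with a little arithmetic. The sketch below copies the numbers from the log; the worker count of 1 is an assumption, since the log does not state it, and it is simply the value that makes the per-thread and overall figures agree.

```python
threads_per_worker = 32    # "total number of threads on each worker"
minibatch_per_thread = 16  # "minibatch size per thread on each worker"
workers = 1                # assumption: not stated in the log

max_minibatch = threads_per_worker * minibatch_per_thread * workers
print(max_minibatch)  # 512, matching the nUsed column

# Loss values copied from the batch table in the log.
losses = [11.679, 11.414, 11.2, 11.093, 10.858, 10.64, 10.64]
non_increasing = all(later <= earlier for earlier, later in zip(losses, losses[1:]))
print(non_increasing)  # True
```

The loss never rises across the seven batches shown, although the fit error is still close to 1 at this point.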
Sample RNN Model
Scoring the Text Generation Model
Out[15]: Scoreinfo
   Descr                          Value
0  Number of Observations Read    9635
1  Number of Observations Used    9635
2  Misclassification Error (%)    90.71156
3  Loss Error                     9.162203
Sample RNN Model
Text Generation Output
review: Really Really enjoy playing this game. It makes you feel very smart as you solve some puzzles quickly and then the next one will be a real stumper… play it ALL the time.
ground truth: Crazy Addicted
prediction: love this game
review: Don’t bother with this app. I wish I could delete it from my purchased app history so I’m not reminded that it was every on my phone. I should have known better.
ground truth: Really?
prediction: i love this
Sample RNN Model
Text Generation Output
review: I really like this app! very easy to use and the audio is great. it is nice to see the spreading of God’s word is still free for some people. a++++.
ground truth: awesome!
prediction: very app
Useful Links
• What’s New In SAS Deep Learning (Documentation)
http://go.documentation.sas.com/?docsetId=casdlpg&docsetTarget=n0gv3jjm5obouun1uvducbzl8nlf.htm&docsetVersion=8.2&locale=en
• Understanding Recurrent Neural Networks
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
• RNN Simplified
https://www.youtube.com/watch?v=_aCuOwF1ZjU