GPU Accelerated Natural Language Processing, by Guillermo Molini
TRANSCRIPT
Powered by WAVECRAFTERS
Roadmap
• What’s NLP?
• Traditional Search
• Modern NLP. Vector Embeddings
• Speech to Text
• Demo
Natural Language Processing
Computational techniques used for analysing and representing text for the purpose of achieving human-like language processing.
Uses
• Searching
• Information Extraction
• Summarization
• Question Answering
• Customer Interaction
• Sentiment Analysis
• Speech to Text
How does traditional searching work?
• Stemming
• Synonyms
• Tags
• Misspelling support
• Ranking
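The ingredients listed above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular search engine works: the documents, the synonym table, the suffix list in `stem` and the `0.8` similarity cutoff are all assumptions made up for the example, and tags are left out.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical toy data, invented for illustration only.
DOCS = {
    1: "seals swimming near the shore",
    2: "football players kicking a ball",
    3: "investors watching the market",
}
SYNONYMS = {"soccer": "football"}  # hand-written synonym table (assumption)


def stem(word):
    """Very naive suffix stripping; real engines use proper stemmers."""
    for suffix in ("ing", "ers", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(word):
    """Lowercase, map synonyms to one canonical word, then stem."""
    word = word.lower()
    return stem(SYNONYMS.get(word, word))


# Index: document id -> counts of normalized terms.
INDEX = {doc_id: Counter(normalize(w) for w in text.split())
         for doc_id, text in DOCS.items()}
VOCABULARY = sorted({term for counts in INDEX.values() for term in counts})


def search(query):
    """Return document ids ranked by how often the query terms occur."""
    scores = Counter()
    for word in query.split():
        term = normalize(word)
        if term not in VOCABULARY:
            # Misspelling support: fall back to the closest known term.
            close = get_close_matches(term, VOCABULARY, n=1, cutoff=0.8)
            if not close:
                continue
            term = close[0]
        for doc_id, counts in INDEX.items():
            if counts[term]:
                scores[doc_id] += counts[term]
    return [doc_id for doc_id, _ in scores.most_common()]


print(search("soccer"))   # synonym lookup
print(search("markte"))   # misspelled query
```

Note that every step here matches words by their spelling, not by their meaning.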
How do Vector Embeddings work?
Each word is represented as a vector of n dimensions. For example:
Seal → [0.13, -0.01, 0.56, 0.32, 0.39, -0.79, 0.86, 0.55, 0.22, 0.19]
How do Vector Embeddings work? (III) Training
• Different training algorithms: GloVe (Pennington, Socher and Manning, Stanford University), Word2Vec (Mikolov et al., Google), Doc2Vec (Le and Mikolov, Google).
• We will shortly release our own GPU-based version of GloVe as open source.
How do Vector Embeddings work? (IV)
• The cosine of the angle between two vectors gives us their semantic closeness.
[Figure: vector pairs Ball / Mars and Ball / Football]
But we can also do much more! Adding, subtracting…
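As a minimal sketch of the cosine measure, assuming NumPy is available: the three-dimensional vectors below are invented for illustration and are not taken from any trained model, so only the mechanics carry over to real embeddings.

```python
import numpy as np

# Hypothetical 3-dimensional vectors, invented for illustration only;
# trained embeddings typically have many more dimensions.
VECTORS = {
    "ball": np.array([0.9, 0.1, 0.0]),
    "football": np.array([0.8, 0.3, 0.1]),
    "mars": np.array([0.0, 0.2, 0.9]),
}


def cosine(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


ball_football = cosine(VECTORS["ball"], VECTORS["football"])
ball_mars = cosine(VECTORS["ball"], VECTORS["mars"])
print(f"ball vs football: {ball_football:.3f}")
print(f"ball vs mars:     {ball_mars:.3f}")

# Adding or subtracting vectors yields another vector of the same size,
# which can be compared with the cosine in exactly the same way.
combined = VECTORS["ball"] + VECTORS["mars"]
print(f"(ball + mars) vs football: {cosine(combined, VECTORS['football']):.3f}")
```

With these made-up values the first cosine comes out higher than the second, which is the behaviour one would hope for from real vectors of related and unrelated words.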
Why aren’t Vector Embeddings widespread?
• Steep learning curve. Math can be complicated.
• Lots of computational power needed. Slow and expensive.
Advantages of GPUs (II)
Semantic closeness to 10,000,000 documents. Lower is better!
• GPU Execution Time: 115 ms
• CPU Execution Time: 11,632 ms
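The workload behind this comparison can be pictured as one query vector compared against a matrix with one row per document. The sketch below runs on the CPU with NumPy and does not reproduce the timings above; the random document matrix, its size (10,000 rows rather than 10,000,000) and the function name `closeness_to_all` are assumptions made for the example. The core of it is a single matrix-vector product, the kind of data-parallel operation that GPUs handle well.

```python
import numpy as np


def closeness_to_all(query, documents):
    """Cosine similarity between `query` and every row of `documents`."""
    doc_norms = np.linalg.norm(documents, axis=1)
    query_norm = np.linalg.norm(query)
    # One matrix-vector product scores all documents at once.
    return (documents @ query) / (doc_norms * query_norm)


# Hypothetical data: 10,000 random "document vectors" of 100 dimensions.
rng = np.random.default_rng(0)
documents = rng.standard_normal((10_000, 100))
query = documents[42]  # reuse a stored document as the query

scores = closeness_to_all(query, documents)
best = int(np.argmax(scores))
print(best, float(scores[best]))
```

Because the query is a copy of row 42, that row is its own closest match, which makes a convenient sanity check.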
Speech to Text.
Ability to automatically transcribe video / audio into its written form.
Speech to Text. Uses
• Information Extraction
• Closed Captioning
• Summarization
• Searching
Speech To Text (II). From Phonemes to Words
[Figure: candidate words (A, Ball, Ballet, Bull, Market, Mars, Marsh) with probabilities (0.15, 0.12, 0.10, 0.58, 0.24, 0.13, 0.05, 0.03, 0.07, 0.56)]
Speech To Text (III). From Phonemes to Words
[Figure: fragments (the, co, tea, mema, lo) → Dictionary of probabilities → "A ball market is a good chance for investors"]
Speech to Text (VI). Improving the Error Rate
• Do several rounds of processing.
• In each one, use NLP to find out the theme of the conversation, then produce a new Language Model (Dictionary) that fits the theme.
• Reprocess the input.
• Costly and slow!
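The rounds described above might look roughly like the following sketch. Everything in it is hypothetical: the probabilities, the two themes, the overlap-based `detect_theme` and the way acoustic and dictionary scores are multiplied are assumptions chosen to keep the example short, not the method used in the talk. It also assumes that a mis-recognition such as "ball market" for "bull market" is the kind of error the extra round is meant to fix.

```python
# Hypothetical numbers, invented for illustration only.
GENERAL_MODEL = {"ball": 0.30, "bull": 0.05, "market": 0.10,
                 "marsh": 0.02, "investors": 0.05}
THEME_MODELS = {
    "finance": {"bull": 0.3, "market": 0.4, "investors": 0.2, "stock": 0.1},
    "sports": {"ball": 0.5, "football": 0.3, "goal": 0.2},
}

# One dict per spoken word: candidate -> acoustic score from the recognizer.
SLOTS = [
    {"ball": 0.55, "bull": 0.45},
    {"market": 0.90, "marsh": 0.10},
    {"investors": 1.00},
]


def decode(slots, model):
    """Per slot, keep the candidate with the best acoustic x dictionary score."""
    return [max(cands, key=lambda w: cands[w] * model.get(w, 1e-6))
            for cands in slots]


def detect_theme(words):
    """Pick the theme whose dictionary shares the most words with the text."""
    return max(THEME_MODELS,
               key=lambda theme: sum(w in THEME_MODELS[theme] for w in words))


def transcribe(slots, rounds):
    """Decode, guess the theme, switch to its dictionary, and decode again."""
    model = GENERAL_MODEL
    words = []
    for _ in range(rounds):
        words = decode(slots, model)
        model = THEME_MODELS[detect_theme(words)]
    return words


print(transcribe(SLOTS, rounds=1))
print(transcribe(SLOTS, rounds=2))
```

Each extra round repeats the whole decoding pass, which is where the cost noted in the last bullet comes from.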