Hacking Human Language (PyCon Sweden 2015)
TRANSCRIPT
![Page 1: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/1.jpg)
Hacking Human Language
Hendrik Heuer
PyCon, Stockholm, Sweden
![Page 2: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/2.jpg)
Hacking?!
![Page 3: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/3.jpg)
“Access to computers – and anything which might teach you something about the way the world works – should be unlimited and total. Always yield to the Hands-On Imperative!”
– Hacker Ethics
Levy, Steven (2001). Hackers: Heroes of the Computer Revolution (updated ed.). New York: Penguin Books. ISBN 0141000511. OCLC 47216793.
![Page 4: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/4.jpg)
Agenda
• Computational Social Science
• Natural Language Processing
• Word Vector Representations
• Visualising and comparing my Google searches
![Page 5: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/5.jpg)
D. Crandall and N. Snavely, ‘Modeling People and Places with Internet Photo Collections’, Commun. ACM, vol. 55, no. 6, pp. 52–60, Jun. 2012. DOI: 10.1145/2184319.2184336
![Page 6: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/6.jpg)
Computational Social Science / Digital Humanities
• combines computer science & social sciences
• makes new research possible, e.g. the analysis of massive social networks and content of millions of books
immersion.media.mit.edu
![Page 7: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/7.jpg)
Massive-scale automated analysis of news-content
• 2.5 million articles from 498 different English-language news outlets (Reuters & New York Times Corpus)
• automatically annotated into 15 topic areas
• the topics were compared with regard to readability, linguistic subjectivity and gender imbalances
I. Flaounas, O. Ali, T. Lansdall-Welfare, T. De Bie, N. Mosdell, J. Lewis, and N. Cristianini, ‘Research Methods in the Age of Digital Journalism: Massive-scale automated analysis of news-content: topics, style and gender’, Digital Journalism, vol. 1, no. 1, 2013. DOI: 10.1080/21670811.2012.714928
![Page 8: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/8.jpg)
Linguistic Subjectivity
Adjectives (Part-of-Speech Tagging) & SentiWordNet
![Page 9: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/9.jpg)
“Low level of political interest and engagement could be connected to the lack of subjectivity (adjectival excess)”
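As a rough illustration of an adjective-based measure, the sketch below computes the share of adjective tokens in an already POS-tagged sentence. It is a simplified proxy and not the method of Flaounas et al., which also draws on SentiWordNet. The function name `adjective_share` and the tagged tokens are made up for this example.

```python
def adjective_share(tagged_tokens):
    """Return the fraction of tokens whose Penn Treebank tag starts with 'JJ'."""
    if not tagged_tokens:
        return 0.0
    adjectives = sum(1 for _, tag in tagged_tokens if tag.startswith("JJ"))
    return adjectives / len(tagged_tokens)


# Hand-written (word, tag) pairs in the format returned by nltk.pos_tag.
tagged = [("The", "DT"), ("brilliant", "JJ"), ("striker", "NN"),
          ("scored", "VBD"), ("a", "DT"), ("stunning", "JJ"),
          ("goal", "NN"), (".", ".")]

share = adjective_share(tagged)
print(f"{share:.2f}")
```

Comparing this share across topic areas would give a first, coarse impression of which topics use more adjectives.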
![Page 10: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/10.jpg)
Male-to-Female Ratio
Named Entity Recognition
![Page 11: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/11.jpg)
“Gender bias in sports coverage (...) females only account for between only 7 and 25 per cent of coverage”
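A ratio like this could be computed once person names have been extracted. The sketch below assumes the PERSON mentions are already available and uses a hypothetical, tiny first-name lookup (`FIRST_NAME_GENDER`). Both the lookup and the example names are invented for illustration, and this is not necessarily how the cited study assigned gender.

```python
from collections import Counter

# Hypothetical, tiny first-name lookup; a real study would need a much
# larger resource and a strategy for ambiguous or unknown names.
FIRST_NAME_GENDER = {"anna": "female", "karin": "female",
                     "erik": "male", "lars": "male"}


def gender_counts(person_names):
    """Count mentions per gender; names not in the lookup count as 'unknown'."""
    counts = Counter()
    for name in person_names:
        first = name.split()[0].lower()
        counts[FIRST_NAME_GENDER.get(first, "unknown")] += 1
    return counts


# Invented PERSON mentions, as a named entity recognizer might return them.
mentions = ["Erik Larsson", "Anna Berg", "Lars Holm",
            "Erik Larsson", "Sam Lind"]

counts = gender_counts(mentions)
female_share = counts["female"] / (counts["female"] + counts["male"])
print(counts["male"], counts["female"], female_share)
```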
![Page 12: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/12.jpg)
Libraries: scikit-learn, gensim, Natural Language Toolkit, spaCy, word2vec
Used for: Machine Learning, Text Processing, Topic Modeling
Visualization: d3.js, Google Chart API, Highcharts
![Page 13: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/13.jpg)
Introduction to Natural Language Processing
![Page 15: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/15.jpg)
Word Tokenization
Splitting a sentence into single words
>>> from nltk.tokenize import word_tokenize
>>> word_tokenize("All your base are belong to us")
['All', 'your', 'base', 'are', 'belong', 'to', 'us']
![Page 16: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/16.jpg)
Sentence Tokenization
Splitting a text into sentences
>>> from nltk.tokenize import sent_tokenize
>>> sent_tokenize("Hello, Mr. Anderson. We missed you!")
['Hello, Mr. Anderson.', 'We missed you!']
![Page 17: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/17.jpg)
Sentence Tokenization
Splitting a text into sentences
>>> import nltk
>>> import functools
>>> # Loads the Swedish Punkt model; the result is a tokenizer object with a .tokenize(text) method
>>> sent_tokenize = nltk.data.load("tokenizers/punkt/swedish.pickle")
![Page 18: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/18.jpg)
Stemming
Finding the word stem or root form
>>> import nltk
>>> porter = nltk.PorterStemmer()
>>> lancaster = nltk.LancasterStemmer()
>>> wnl = nltk.WordNetLemmatizer()
>>> [wnl.lemmatize(w) for w in ['investigation', 'women']]
['investigation', 'woman']
>>> [porter.stem(w) for w in ['investigation', 'women']]
['investig', 'women']
>>> [lancaster.stem(w) for w in ['investigation', 'women']]
['investig', 'wom']
![Page 19: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/19.jpg)
Part-of-Speech Tagging
Identifying nouns, verbs, adjectives…
>>> import nltk
>>> text = "In the middle ages Sweden had the same king as Denmark and Norway."
>>> words = nltk.word_tokenize(text)
>>> nltk.pos_tag(words)
[('In', 'IN'), ('the', 'DT'), ('middle', 'NN'), ('ages', 'NNS'),
 ('Sweden', 'NNP'), ('had', 'VBD'), ('the', 'DT'), ('same', 'JJ'),
 ('king', 'NN'), ('as', 'IN'), ('Denmark', 'NNP'), ('and', 'CC'),
 ('Norway', 'NNP'), ('.', '.')]
NN* Noun
VB* Verb
JJ* Adjective
RB* Adverb
DT Determiner
IN Preposition
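The tag legend above can be applied mechanically by matching tag prefixes. The following sketch groups the tagger output from this slide into coarse word classes. The helper name `coarse_class` and the catch-all class "Other" are assumptions made for the example.

```python
from collections import Counter

# Prefix-to-class table taken from the legend on the slide.
COARSE = [("NN", "Noun"), ("VB", "Verb"), ("JJ", "Adjective"),
          ("RB", "Adverb"), ("DT", "Determiner"), ("IN", "Preposition")]


def coarse_class(tag):
    """Map a Penn Treebank tag to a coarse word class, or 'Other'."""
    for prefix, name in COARSE:
        if tag.startswith(prefix):
            return name
    return "Other"


# Tagger output copied from the slide.
tagged = [('In', 'IN'), ('the', 'DT'), ('middle', 'NN'), ('ages', 'NNS'),
          ('Sweden', 'NNP'), ('had', 'VBD'), ('the', 'DT'), ('same', 'JJ'),
          ('king', 'NN'), ('as', 'IN'), ('Denmark', 'NNP'), ('and', 'CC'),
          ('Norway', 'NNP'), ('.', '.')]

class_counts = Counter(coarse_class(tag) for _, tag in tagged)
print(class_counts.most_common())
```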
![Page 20: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/20.jpg)
Named Entity Recognition
Identifying people, organizations, locations…
>>> import nltk
>>> text = "New York City is the largest city in the United States."
>>> words = nltk.word_tokenize(text)
>>> nltk.ne_chunk(nltk.pos_tag(words))
Tree('S', [Tree('GPE', [('New', 'NNP'), ('York', 'NNP'), ('City', 'NNP')]),
 ('is', 'VBZ'), ('the', 'DT'), ('largest', 'JJS'), ('city', 'NN'),
 ('in', 'IN'), ('the', 'DT'),
 Tree('GPE', [('United', 'NNP'), ('States', 'NNPS')]), ('.', '.')])
ORGANIZATION: Georgia-Pacific Corp., WHO
PERSON: Eddy Bonte, President Obama
LOCATION: Murray River, Mount Everest
DATE: June, 2008-06-29
TIME: two fifty a m, 1:30 p.m.
MONEY: GBP 10.40
PERCENT: twenty pct, 18.75 %
FACILITY: Washington Monument, Stonehenge
GPE (geo-political entity): South East Asia, Midlothian
![Page 21: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/21.jpg)
Sentiment Analysis
Tell if a sentence is positive or negative
![Page 22: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/22.jpg)
Stanford Core NLP Tools
![Page 23: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/23.jpg)
Vector Representations
![Page 24: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/24.jpg)
“You shall know a word by the company it keeps”
– J. R. Firth 1957
Firth, John R. 1957. A synopsis of linguistic theory 1930–1955. In Studies in linguistic analysis, 1–32. Oxford: Blackwell.
![Page 25: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/25.jpg)
Quoted after Socher
![Page 26: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/26.jpg)
![Page 27: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/27.jpg)
Vectors are directions in space
![Page 28: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/28.jpg)
Quoted after Socher
word2vec
Representing a word with a vector
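If word vectors are read as directions in space, two words can be compared by the angle between their vectors. A minimal sketch of cosine similarity follows. The three-dimensional vectors are made up for the example and do not come from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up 3-dimensional vectors; real word2vec vectors typically have
# a few hundred dimensions.
sweden = [0.9, 0.1, 0.3]
norway = [0.8, 0.2, 0.35]
banana = [0.1, 0.9, 0.0]

sim_close = cosine_similarity(sweden, norway)
sim_far = cosine_similarity(sweden, banana)
print(sim_close > sim_far)
```

Vectors pointing in nearly the same direction score close to 1, while unrelated directions score close to 0.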
![Page 29: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/29.jpg)
Vectors can encode relationships
[Figure labels: MAN, WOMAN, UNCLE, AUNT, KING, QUEEN]
T. Mikolov, K. Chen, G. Corrado, and J. Dean, ‘Efficient Estimation of Word Representations in Vector Space’, CoRR, vol. abs/1301.3781, 2013 [Online]. Available: http://arxiv.org/abs/1301.3781
![Page 30: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/30.jpg)
man is to woman as king is to ?
[Figure labels: KINGS, KING, QUEEN, QUEENS]
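The analogy question can be phrased as vector arithmetic: take the offset from "man" to "woman", add it to "king", and look for the nearest remaining word. The sketch below uses made-up two-dimensional vectors chosen so that the arithmetic works out cleanly. Trained vectors are noisier, and the helper `analogy` is a name invented for this example.

```python
import math

# Made-up 2-D vectors: first axis ~ "royalty", second axis ~ "gender".
vectors = {
    "man": [0.0, 1.0], "woman": [0.0, -1.0],
    "uncle": [0.5, 1.0], "aunt": [0.5, -1.0],
    "king": [1.0, 1.0], "queen": [1.0, -1.0],
}


def cosine(u, v):
    """Cosine similarity of two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' via the offset vec(b) - vec(a) + vec(c)."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words themselves are excluded from the candidates.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


answer = analogy("man", "woman", "king")
print(answer)
```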
![Page 31: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/31.jpg)
![Page 32: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/32.jpg)
Sweden
Most similar words
![Page 33: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/33.jpg)
![Page 34: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/34.jpg)
![Page 35: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/35.jpg)
![Page 36: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/36.jpg)
Harvard
Most similar words
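A "most similar words" list can be produced by ranking every other word in the vocabulary by its cosine similarity to the query word. The sketch below does this over a handful of made-up vectors. It only illustrates the ranking idea, and the vocabulary and numbers are assumptions, not output of a trained model.

```python
import math

# Made-up vectors for illustration only; they do not come from a trained model.
vectors = {
    "sweden": [0.9, 0.1, 0.3],
    "norway": [0.8, 0.2, 0.35],
    "denmark": [0.85, 0.05, 0.4],
    "harvard": [0.1, 0.9, 0.2],
    "yale": [0.15, 0.85, 0.25],
}


def cosine(u, v):
    """Cosine similarity of two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def most_similar(word, topn=2):
    """Return the topn other words ranked by cosine similarity to `word`."""
    scored = [(other, cosine(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


neighbours = most_similar("sweden")
print(neighbours)
```

With a trained model the same kind of query is available in gensim's word2vec interface, so a hand-written ranking like this is mainly useful for understanding what the query does.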
![Page 37: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/37.jpg)
![Page 38: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/38.jpg)
Link: https://radimrehurek.com/gensim/models/word2vec.html
![Page 39: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/39.jpg)
![Page 42: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/42.jpg)
spaCy: Dependency-Based Word representations by Levy and Goldberg
Gensim: word2vec by Mikolov et al.
![Page 43: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/43.jpg)
2 words context window
![Page 44: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/44.jpg)
5 words context window
2 words context window
![Page 45: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/45.jpg)
![Page 46: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/46.jpg)
![Page 47: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/47.jpg)
![Page 48: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/48.jpg)
Link: https://code.google.com/p/word2vec/#Pre-trained_entity_vectors_with_Freebase_naming
![Page 49: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/49.jpg)
Applications
![Page 50: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/50.jpg)
Machine Translation
T. Mikolov, Q. V. Le, and I. Sutskever, ‘Exploiting Similarities among Languages for Machine Translation’, CoRR, vol. abs/1309.4168, 2013 [Online]. Available: http://arxiv.org/abs/1309.4168
![Page 51: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/51.jpg)
Compare my Google searches
![Page 52: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/52.jpg)
Link: https://support.google.com/websearch/answer/6068625?hl=en
![Page 53: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/53.jpg)
{
  "event": [
    {"query": {"id": [{"timestamp_usec": "1317002730153183"}], "query_text": "google hangout"}},
    {"query": {"id": [{"timestamp_usec": "1316577601549660"}], "query_text": "eurokrise"}},
    {"query": {"id": [{"timestamp_usec": "1315592145720230"}], "query_text": "hoverboard"}}

parsed_json['event'][42]['query']['query_text']
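Reading the exported search history could look roughly like the sketch below. It uses a closed-off copy of the excerpt from the slide so that it parses as valid JSON, and it assumes that `timestamp_usec` holds microseconds since the Unix epoch, as the field name suggests.

```python
import json
from datetime import datetime, timezone

# A closed-off version of the slide excerpt, so that it parses as valid JSON.
raw = """
{"event": [
  {"query": {"id": [{"timestamp_usec": "1317002730153183"}],
             "query_text": "google hangout"}},
  {"query": {"id": [{"timestamp_usec": "1316577601549660"}],
             "query_text": "eurokrise"}},
  {"query": {"id": [{"timestamp_usec": "1315592145720230"}],
             "query_text": "hoverboard"}}
]}
"""

parsed_json = json.loads(raw)

searches = []
for event in parsed_json["event"]:
    query = event["query"]
    # Assumption: the string holds microseconds since the Unix epoch.
    usec = int(query["id"][0]["timestamp_usec"])
    when = datetime.fromtimestamp(usec / 1_000_000, tz=timezone.utc)
    searches.append((when, query["query_text"]))

for when, text in searches:
    print(when.date(), text)
```

Keeping the timestamp next to each query makes it straightforward to split the searches into periods later on.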
![Page 54: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/54.jpg)
1. Find Word Representations (word2vec)
2. Dimensionality Reduction (t-SNE)
3. Output (JSON)
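The three steps can be wired together as in the sketch below. It is a skeleton under stated assumptions, not the code used for the talk: the embeddings are a made-up dictionary instead of a trained word2vec model, and the default reducer `first_two_components` is only a stand-in that keeps the first two coordinates. It is not t-SNE. A real reducer, for example one wrapping scikit-learn's TSNE, could be passed in its place.

```python
import json


def first_two_components(vectors):
    """Stand-in reducer: keeps the first two coordinates. NOT t-SNE."""
    return [vec[:2] for vec in vectors]


def build_points(words, embeddings, reduce_to_2d=first_two_components):
    """Run the three steps: look up vectors, reduce to 2-D, serialise as JSON."""
    # Step 1: keep only words that have a vector.
    known = [w for w in words if w in embeddings]
    vectors = [embeddings[w] for w in known]
    # Step 2: project to two dimensions with the supplied reducer.
    coords = reduce_to_2d(vectors)
    # Step 3: one JSON record per word, ready for a plotting library.
    points = [{"word": w, "x": float(x), "y": float(y)}
              for w, (x, y) in zip(known, coords)]
    return json.dumps(points)


# Made-up embeddings; a real run would load vectors from a word2vec model.
embeddings = {"hoverboard": [0.2, 0.7, 0.1], "eurokrise": [0.9, 0.1, 0.4]}
output = build_points(["hoverboard", "eurokrise", "unknownword"], embeddings)
print(output)
```

Passing the reducer as a parameter keeps the lookup and output steps testable without a trained model or a numerical library.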
![Page 55: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/55.jpg)
![Page 56: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/56.jpg)
Link gensim: https://radimrehurek.com/gensim/
Link word2vec: https://code.google.com/p/word2vec/
![Page 57: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/57.jpg)
![Page 58: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/58.jpg)
![Page 59: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/59.jpg)
![Page 60: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/60.jpg)
![Page 61: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/61.jpg)
linguistics
![Page 62: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/62.jpg)
Link: http://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
![Page 63: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/63.jpg)
![Page 64: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/64.jpg)
Link: https://github.com/mbostock/d3/wiki/Gallery
![Page 65: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/65.jpg)
My Google Searches
Legend: Oct – Dec 2014 · Jul – Sep 2011 · Both, 2011 & 2014
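The three legend groups correspond to simple set operations on the search terms of the two periods. A minimal sketch follows, with partly made-up queries standing in for the real history. The helper name `compare_periods` and the splitting of queries on whitespace are assumptions for this example.

```python
def compare_periods(queries_a, queries_b):
    """Split search terms into 'only A', 'only B' and 'both' groups."""
    words_a = {w.lower() for q in queries_a for w in q.split()}
    words_b = {w.lower() for q in queries_b for w in q.split()}
    return {
        "only_a": sorted(words_a - words_b),
        "only_b": sorted(words_b - words_a),
        "both": sorted(words_a & words_b),
    }


# Partly made-up queries standing in for the two periods on the slide.
searches_2011 = ["google hangout", "eurokrise", "python tutorial"]
searches_2014 = ["python word2vec", "stockholm hotel"]

groups = compare_periods(searches_2011, searches_2014)
print(groups)
```

Each group can then be given its own colour when the words are plotted.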
![Page 66: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/66.jpg)
![Page 67: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/67.jpg)
![Page 68: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/68.jpg)
![Page 69: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/69.jpg)
![Page 70: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/70.jpg)
![Page 71: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/71.jpg)
Hendrik Heuer
http://hen-drik.de
@hen_drik
Thanks to Andrii, Jussi & Roelof
Slides: https://tinyurl.com/pycon-word2vec
![Page 72: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/72.jpg)
predict the current word
input: w_{i-2}, w_{i-1}, w_{i+1}, w_{i+2}
output: w_i
![Page 73: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/73.jpg)
predict the surrounding words
input: w_i
output: w_{i-2}, w_{i-1}, w_{i+1}, w_{i+2}
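These two setups are what Mikolov et al. call the continuous bag-of-words (CBOW) and skip-gram architectures. The sketch below only builds the input/output pairs for a context window of two words on each side. It does not train a model, and the function names are made up for the example.

```python
def context(tokens, i, window=2):
    """Words up to `window` positions left and right of position i."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens, window=2):
    """(context words -> current word) pairs."""
    return [(context(tokens, i, window), tokens[i]) for i in range(len(tokens))]


def skipgram_examples(tokens, window=2):
    """(current word -> one surrounding word) pairs."""
    return [(tokens[i], c)
            for i in range(len(tokens))
            for c in context(tokens, i, window)]


tokens = ["all", "your", "base", "are", "belong", "to", "us"]
cbow = cbow_examples(tokens)
skipgram = skipgram_examples(tokens)
print(cbow[2])
print(skipgram[:2])
```

At the edges of the sentence the context is shorter, because there are fewer than two neighbours on one side.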
![Page 74: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/74.jpg)
![Page 75: Hacking Human Language (PyCon Sweden 2015)](https://reader034.vdocuments.us/reader034/viewer/2022042716/55b6de5bbb61eb0c598b47fe/html5/thumbnails/75.jpg)