Artificial Intelligence
Natural Language Processing
Prof Alexiei Dingli
1
Aims of NLP?
• Trying to make computers talk
• Give computers the linguistic abilities of humans
2
1940s - 1950s
• Turing (1936) – model of algorithmic computation
• McCulloch-Pitts neuron (McCulloch and Pitts, 1943) – a simplified model of the neuron as a kind of computing element (propositional logic)
• Kleene (1951, 1956) – finite automata and regular expressions
• Shannon (1948) – applied probabilistic models of discrete Markov processes to automata for language
• Chomsky (1956) – finite state machines as a way to characterize a grammar
3
1940s - 1950s
Speech and language processing
• Shannon
– metaphor of the noisy channel
– entropy as a way of measuring the information capacity of a channel
• Foundational research in phonetics
• First machine speech recognizers (early 1950s)
– 1952, Bell Labs: statistical system that could recognize any of the 10 digits from a single speaker (Davis et al., 1952)
4
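The entropy measure mentioned on the Shannon slide can be illustrated with a minimal Python sketch. It estimates H = sum over symbols of p * log2(1/p) from the symbol frequencies of a sequence. The function name `entropy` and the sample strings are illustrative assumptions, not part of the original slides.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # p * log2(1/p), with p = count / total, summed over the distinct symbols
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

A sequence with a single repeated symbol carries no uncertainty, so `entropy("aaaa")` is 0 bits, while two equally likely symbols, as in `entropy("abab")`, give 1 bit per symbol.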
1940s - 1950s
Machine translation (MT): one of the earliest applications of computers
• Major attempts in the US and USSR
– Russian to English and reverse
• Georgetown University, Washington system:
– Translated sample texts in 1954
• The ALPAC report (1966; committee formed in 1964)
– Assessed research results of groups working on MT
• Concluded: MT not possible in the near future
• Funding should cease for MT!
• Basic research should be supported
• Word-to-word translation does not work
– Linguistic knowledge is needed
5
1950s - 1970s Symbolic paradigm
Formal language theory and generative syntax
• 1957: Noam Chomsky's Syntactic Structures
– A formal definition of grammars and languages
– Provides the basis for automatic syntactic processing of NL expressions
• 1967: Woods' procedural semantics
– A procedural approach to the meaning of a sentence
– Provides the basis for automatic semantic processing of NL expressions
6
1950s - 1970s Symbolic paradigm
Parsing algorithms
– top-down and bottom-up
– dynamic programming
– Transformations and Discourse Analysis Project (TDAP)
• Harris, 1962
• Joshi and Hopely (1999) and Karttunen (1999)
• cascade of finite-state transducers
7
1950s - 1970s Symbolic paradigm
AI
• Summer of 1956: John McCarthy, Marvin Minsky, Claude Shannon, and Nathaniel Rochester
– work on reasoning and logic
• Newell and Simon - the Logic Theorist and the General Problem Solver
Early natural language understanding systems
– Domains
– Combination of pattern matching and keyword search
– Simple heuristics for reasoning and question-answering
• Late 1960s - more formal logical systems
8
1950s - 1970s Statistical paradigm
• Bayesian method applied to the problem of optical character recognition
– Bledsoe and Browning (1959): Bayesian text recognition
• a large dictionary
• compute the likelihood of each observed letter sequence given each word in the dictionary, by multiplying the likelihoods for each letter
• Bayesian methods applied to the problem of authorship attribution on The Federalist papers
– Mosteller and Wallace (1964)
• Testable psychological models of human language processing based on transformational grammar
• Resources
– First online corpora: the Brown corpus of American English
– DOC (Dictionary on Computer): an on-line Chinese dialect dictionary
9
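The dictionary-based text recognition idea above (score each dictionary word by multiplying per-letter likelihoods, then pick the best word) can be sketched as follows. This is a minimal illustration, not Bledsoe and Browning's actual system: the confusion model in `letter_likelihood`, the value `p_correct=0.9`, and the sample dictionary are all hypothetical, and all words are given equal prior probability.

```python
import math


def letter_likelihood(observed, true, p_correct=0.9):
    """Hypothetical confusion model: P(observed letter | true letter).

    A letter is read correctly with probability `p_correct`; the remaining
    probability mass is spread evenly over the 25 other letters.
    """
    return p_correct if observed == true else (1.0 - p_correct) / 25


def word_log_likelihood(observed, word):
    """log P(observed letters | word): a product of per-letter likelihoods, computed as a sum of logs."""
    if len(observed) != len(word):
        return float("-inf")  # this toy model cannot explain insertions or deletions
    return sum(math.log(letter_likelihood(o, t)) for o, t in zip(observed, word))


def recognise_word(observed, dictionary):
    """Return the dictionary word that makes the observed letters most likely."""
    return max(dictionary, key=lambda w: word_log_likelihood(observed, w))
```

For example, `recognise_word("hcuse", ["house", "horse", "mouse", "hello"])` selects "house", because it differs from the observation in only one letter. Working in log space avoids numerical underflow when many small probabilities are multiplied.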
Symbolic vs statistical approaches
Symbolic
• Based on hand-written rules
• Requires linguistic expertise
• No frequency information
• More brittle and slower than statistical approaches
• Often more precise than statistical approaches
• Error analysis is usually easier than for statistical approaches
Statistical
• Supervised or unsupervised
• Rules acquired from large corpora
• Not much linguistic expertise required
• Robust and quick
• Requires large (annotated) corpora
• Error analysis is often difficult
10
1970-1983 Statistical paradigm
Speech recognition algorithms
• Hidden Markov model (HMM) and the metaphors of the noisy channel and decoding
– Jelinek, Bahl, Mercer, and colleagues at IBM’s Thomas J. Watson Research Center
– Baker at Carnegie Mellon University
11
1970-1983 Logic-based paradigm
• Q-systems and metamorphosis grammars (Colmerauer, 1970, 1975)
• Definite Clause Grammars (Pereira and Warren, 1980)
• Functional grammar (Kay, 1979)
• Lexical Functional Grammar (LFG) (Bresnan and Kaplan, 1982)
12
1970-1983 Natural Language Understanding
• SHRDLU system: simulated a robot embedded in a world of toy blocks (Winograd, 1972a)
– natural-language text commands
• Move the red block on top of the smaller green one
• complexity and sophistication
– first to attempt to build an extensive (for the time) grammar of English (based on Halliday’s systemic grammar)
13
1970-1983 Natural Language Understanding
• Yale School: series of language understanding programs
– conceptual knowledge (scripts, plans, goals...)
– human memory organization
– network-based semantics (Quillian, 1968)
14
1983-1993
• Return of state models
– Finite-state phonology and morphology (Kaplan and Kay, 1981)
– Finite-state models of syntax by Church (1980)
• Return of empiricism
– Probabilistic models throughout speech and language processing
• IBM Thomas J. Watson Research Center: probabilistic models of speech recognition
• Data-driven approaches
– Spread from speech into part-of-speech tagging, parsing, attachment ambiguities, semantics
• New focus on model evaluation
• Considerable work on natural language generation
15
1994-1999 Major changes
• Probabilistic and data-driven models had become quite standard
• Parsing, part-of-speech tagging, reference resolution, and discourse processing
– Algorithms incorporate probabilities
– Evaluation methodologies from speech recognition and information retrieval
• Increases in the speed and memory of computers
– commercial exploitation (speech recognition, spelling and grammar correction)
• Rise of the Web
– need for language-based information retrieval and information extraction
16
1994-1999 Resources and corpora
• Disk space becomes cheap
• Machine-readable text becomes common
• US funding emphasises large-scale evaluation on "real data"
• 1994: The British National Corpus is made available
– A balanced corpus of British English
• Mid 1990s: WordNet (Fellbaum & Miller)
– A computational thesaurus developed by psycholinguists
• The World Wide Web used as a corpus
17
2000-2008 Empiricist trends 1
• Spoken and written material widely available
– Linguistic Data Consortium (LDC) ...
– Annotated collections (standard text sources with various forms of syntactic, semantic, and pragmatic annotations)
• Penn Treebank (Marcus et al., 1993)
• PropBank (Palmer et al., 2005)
• TimeBank (Pustejovsky et al., 2003b)
• ...
– More complex traditional problems castable in supervised machine learning
• Parsing and semantic analysis
– Competitive evaluations
• Parsing (Dejean and Tjong Kim Sang, 2001)
• Information extraction (NIST, 2007a; Tjong Kim Sang, 2002; Tjong Kim Sang and De Meulder, 2003)
• Word sense disambiguation (Palmer et al., 2001; Kilgarriff and Palmer, 2000)
• Question answering (Voorhees and Tice, 1999) and summarization (Dang, 2006)
18
19
2000-2008 Empiricist trends 2
• More serious interplay with the statistical machine learning community
– Support vector machines (Boser et al., 1992; Vapnik, 1995)
– Maximum entropy techniques (multinomial logistic regression) (Berger et al., 1996)
– Graphical Bayesian models (Pearl, 1988)
20
2000-2008 Empiricist trends 2
Largely unsupervised statistical approaches
– Statistical approaches to machine translation (Brown et al., 1990; Och and Ney, 2003)
– Topic modelling (Blei et al., 2003)
• Effective applications could be constructed from systems trained on unannotated data alone
• Use of unsupervised techniques
21
Elements of a Language
• Phonemes
• Morphemes
• Syntax
• Semantics
22
From sounds to language
• Linked with language understanding
• Carried out by the auditory cortex
• Basic sounds of language are phonemes (sound)
– Smallest phonetic unit in a language
– Capable of conveying a distinction in meaning
• E.g. "m" in "man" and "c" in "can" are phonemes
– Every language has a discrete set of phonemes
– Describing all possible sounds
• Basic units of words are morphemes (to change form)
– A meaningful linguistic unit
– Consisting of a root word or a word element that cannot be divided into smaller meaningful parts
• E.g. "pick" and "s" in the word "picks" are morphemes
23
NATO Phonetic Alphabet
A - Alpha      K - Kilo       U - Uniform            0 - Zero
B - Bravo      L - Lima       V - Victor             1 - Wun (One)
C - Charlie    M - Mike       W - Whiskey            2 - Two
D - Delta      N - November   X - X-ray              3 - Tree (Three)
E - Echo       O - Oscar      Y - Yankee             4 - Fower (Four)
F - Foxtrot    P - Papa       Z - Zulu               5 - Fife (Five)
G - Golf       Q - Quebec                            6 - Six
H - Hotel      R - Romeo      . - decimal (point)    7 - Seven
I - India      S - Sierra     . - (full) stop        8 - Ait (Eight)
J - Juliet     T - Tango                             9 - Niner (Nine)
24
Exercise
Word       Morphemes    Phonemes
Bay        ?            ?
Pots       ?            ?
A          ?            ?
Teacher    ?            ?
25
Example
Word       Morphemes         Phonemes
Bay        Bay (1)           B + ay (2)
Pots       Pot + s (2)       P + o + t + s (4)
A          A (1)             A (1)
Teacher    Teach + er (2)    T + ea + ch + e + r (5)
26
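The morpheme column of the table above can be reproduced with a very small lexicon-based splitter. This is only a sketch: the root list `ROOTS` and suffix list `SUFFIXES` are hypothetical toy resources, and real morphological analysers handle spelling changes, prefixes, and ambiguity that this code ignores.

```python
ROOTS = {"bay", "pot", "a", "teach", "pick"}  # hypothetical toy lexicon of root words
SUFFIXES = ["er", "s"]                        # hypothetical list of known suffixes


def morphemes(word):
    """Split `word` into [root] or [root, suffix] using the toy lexicon."""
    word = word.lower()
    if word in ROOTS:
        return [word]
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in ROOTS:
            return [stem, suffix]
    return [word]  # unknown word: treated as a single morpheme
```

With these resources, `morphemes("Teacher")` gives `["teach", "er"]` and `morphemes("Pots")` gives `["pot", "s"]`, matching the counts in the table.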
Syntax: structure of language
• Languages have structure:
– not all sequences of words over the given alphabet are valid
– when a sequence of words is valid (grammatical), a natural structure can be induced on it
27
Syntax
• Describes the constituent structure of NL expressions
– (I (am sorry)), Dave, (I ((can’t do) that))
• Grammars are used to describe the syntax of a language
• Syntactic analysers and surface realisers assign a syntactic structure to a string/semantic representation on the basis of a grammar
28
Syntax
• It is useful to think of this structure as a tree:
– represents the syntactic structure of a string according to some formal grammar
– the interior nodes are labeled by non-terminals of the grammar, while the leaf nodes are labeled by terminals of the grammar
29
Syntax tree example
[Figure: syntax tree for the sentence "John often gives a book to Mary"; node labels: S, NP, VP, V, Adv, PP, Det, N, Prep]
30
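A syntax tree like the one for "John often gives a book to Mary" can be written down as nested tuples, with non-terminals as the first element of each tuple and words as plain strings at the leaves. The bracketing below is one plausible analysis chosen for illustration (an assumption, not recovered from the slide's figure), and the helper names `leaves` and `bracketed` are hypothetical.

```python
# Each node is (label, child, child, ...); a leaf is a plain string (a terminal).
TREE = ("S",
        ("NP", "John"),
        ("VP",
         ("Adv", "often"),
         ("V", "gives"),
         ("NP", ("Det", "a"), ("N", "book")),
         ("PP", ("Prep", "to"), ("NP", "Mary"))))


def leaves(tree):
    """Return the terminals (words) of the tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def bracketed(tree):
    """Render the tree in labeled bracket notation, e.g. (NP John)."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(bracketed(child) for child in tree[1:]) + ")"
```

Reading off the leaves with `leaves(TREE)` recovers the original sentence, which shows that interior nodes add structure without changing the string.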
Methods in syntax
Words → syntactic tree
– Algorithm: parser
• A parser checks for correct syntax and builds a data structure.
– Resources used: Lexicon + Grammar
– Symbolic: hand-written grammar and lexicon
– Statistical: grammar acquired from treebank
• Treebank: text corpus in which each sentence has been annotated with syntactic structure.
• Syntactic structure is commonly represented as a tree structure, hence the name treebank.
– Difficulty: coverage and ambiguity
31
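To make the "lexicon + grammar" idea concrete, here is a minimal sketch of a bottom-up, dynamic-programming recogniser (the CYK algorithm, one of several possible parsing algorithms). The tiny hand-written `LEXICON` and `RULES` are hypothetical and assume a grammar in Chomsky normal form; the function only checks grammaticality and does not build the tree.

```python
from itertools import product

# Hypothetical hand-written resources (symbolic approach), in Chomsky normal form.
LEXICON = {"John": {"NP"}, "Mary": {"NP"}, "loves": {"V"}}
RULES = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}


def cyk_recognise(words, start="S"):
    """Return True if `words` can be derived from `start` with LEXICON and RULES."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every non-terminal that spans words[i..j] inclusive
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):          # length of the span
        for i in range(n - span + 1):     # start of the span
            j = i + span - 1              # end of the span
            for k in range(i, j):         # split point between left and right part
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= RULES.get((left, right), set())
    return start in table[0][n - 1]
```

`cyk_recognise(["John", "loves", "Mary"])` succeeds, while a word missing from the lexicon makes recognition fail, which is the coverage difficulty listed above.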
Syntax applications
• For spell checking
– *its a fair exchange → no syntactic tree
– It’s a fair exchange → syntactic tree OK
• To construct the meaning of a sentence
• To generate a grammatical sentence
32
Syntax to meaning
John loves Mary
love(j,m)
33
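The step from "John loves Mary" to love(j,m) can be sketched for the simplest case of a subject-verb-object sentence. The two lookup tables and the function name `meaning` are hypothetical, and the code assumes exactly three known words; it only illustrates how word meanings are combined according to syntactic position.

```python
# Hypothetical toy lexicon: names map to constants, verbs map to predicates.
CONSTANTS = {"John": "j", "Mary": "m"}
PREDICATES = {"loves": "love"}


def meaning(sentence):
    """Map a subject-verb-object sentence to a predicate-argument string."""
    subject, verb, obj = sentence.split()
    return f"{PREDICATES[verb]}({CONSTANTS[subject]},{CONSTANTS[obj]})"
```

Swapping subject and object changes the result: `meaning("Mary loves John")` yields love(m,j), so the structure of the sentence, not just its words, determines the meaning.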
Semantics
– Where the hell'd you get that idea, HAL?
– Dave, although you took thorough precautions in the pod against my hearing you, I could see your lips move
34
Lexical semantics
Meaning of words
To get
1. come to have or hold; receive.
2. succeed in attaining, achieving, or experiencing; obtain.
3. experience, suffer, or be afflicted with.
4. move in order to pick up, deal with, or bring.
5. bring or come into a specified state or condition.
6. catch, apprehend, or thwart.
7. come or go eventually or with some difficulty.
8. move or come into a specified position or state ...
An idea
1. a thought or suggestion about a possible course of action.
2. a mental impression.
3. a belief.
4. (the idea) the aim or purpose.
The hell
1. a place regarded in various religions as a spiritual realm of evil and suffering, often depicted as a place of perpetual fire beneath the earth to which the wicked are sent after death.
2. a state or place of great suffering.
3. a swear word that some people use when they are annoyed or surprised
35
Lexical semantics
Who is the master?
- Context?
- Semantic relations?
36
Compositional semantics
• Where the hell did you get that idea?
– the hell: a swear word that some people use when they are annoyed or surprised or to emphasize something
– get that idea: have this belief
37
Semantics issues in NLP
• Definition and representation of meaning
• Meaning construction
• Semantic relations
• Interaction between semantics and syntax
38
Pragmatics
• Knowledge about the kind of actions that speakers intend by their use of sentences
– REQUEST: HAL, open the pod bay door.
– STATEMENT: HAL, the pod bay door is open.
– INFORMATION QUESTION: HAL, is the pod bay door open?
• Speech act analysis (politeness, irony, greeting, apologizing...)
39
Discourse
– Where the hell'd you get that idea, HAL?
– Dave and Frank were planning to disconnect me
→ Much of language interpretation is dependent on the preceding discourse/dialogue
40
Linguistic knowledge in NLP: summary
• Phonetics and Phonology: knowledge about linguistic sounds
• Morphology: knowledge of the meaningful components of words
• Syntax: knowledge of the structural relationships between words
• Semantics: knowledge of meaning
• Pragmatics: knowledge of the relationship of meaning to the goals and intentions of the speaker
• Discourse: knowledge about linguistic units larger than a single utterance
41
Ambiguity
I made her duck
• I cooked duck for her.
• I cooked duck belonging to her.
• I caused her to quickly lower her body.
42
Ambiguity
• Sound-to-text issues:
– Recognise speech.
• Speech act interpretation
– Can you switch on the computer?
• Question or request?
43
Ambiguity vs paraphrase
• Ambiguity: the same sentence can mean different things
• Paraphrase: there are many ways of saying the same thing
– Beer, please.
– Can I have a beer?
– Give me a beer, please.
– I would like beer.
– I’d like a beer, please.
44
Applications of NLP
• Information Extraction (IE)
• Information Retrieval (IR)
• Question Answering (QA)
• Summarization
• Sentiment Analysis
• Dialogue Systems
45
Questions?
46
Creating a conversational agent
• AIML
– Artificial Intelligence Markup Language
– Used in ALICE (Artificial Linguistic Internet Computer Entity)
– Won the
• Loebner Prize 3 times
• Chatterbox Challenge
47
Cycle
Human inputs sentence → Chatbot searches database → Chatbot produces response
48
Cycle
Human: “How are you doing?”
Chatbot: “How are *” … Template: I’m fine, thanks
Chatbot: I’m fine, thanks
49
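The cycle above (input sentence, database search, response) can be sketched in a few lines of Python. This is a simplification under stated assumptions and not how a real AIML interpreter is implemented: the `CATEGORIES` list stands in for the database, `*` is taken to match one or more words, and the names `normalise`, `matches`, and `respond` are hypothetical.

```python
import re

# Hypothetical category store: (pattern, template) pairs, searched in order.
CATEGORIES = [
    ("HELLO", "Hi! How are you?"),
    ("HOW ARE *", "I'm fine, thanks"),
]


def normalise(text):
    """Upper-case the input and replace punctuation with spaces."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split())


def matches(pattern, text):
    """True if the normalised `text` matches `pattern`."""
    parts = []
    for token in pattern.split():
        # "*" stands for one or more whole words; anything else must match literally
        parts.append(r"\S+(?: \S+)*" if token == "*" else re.escape(token))
    return re.fullmatch(" ".join(parts), text) is not None


def respond(user_input, default="I do not understand."):
    """Search the categories for the first matching pattern and return its template."""
    text = normalise(user_input)
    for pattern, template in CATEGORIES:
        if matches(pattern, text):
            return template
    return default
```

Calling `respond("How are you doing?")` normalises the input to "HOW ARE YOU DOING", finds the "HOW ARE *" pattern, and returns its template.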
Patterns
<aiml>
<category>
<pattern> HELLO </pattern>
<template> Hi! How are you? </template>
</category>
…
</aiml>
50
AIML shorthand
! A bot will not know this line is here
p Hello
t Hi there! I hope you are \
having a great day!
51
AIML Macros
! Remember...comments begin with the ! character
! The macro on the next line defines the bot's name
$ my_name Larry
p what is your name
t My name is $$my_name.
52
AIML Macros
$ i_am_a robot
p you are weird
t Please don't hold my $$i_am_aness against me :-(
53
Loops
pf hi hello "what is up" hola howdy "what is going on"
t Hi, it's very nice to meet you!
is equivalent to:
p hi
t Hi, it's very nice to meet you!
p hello
t Hi, it's very nice to meet you!
p what is up
t Hi, it's very nice to meet you!
p hola
t Hi, it's very nice to meet you!
…
54
Complex Loops
p I
pf really " "
pf like love
p being with
pf dogs cats animals
t Really? I enjoy being with animals, too!
55
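The effect of combining several `p` and `pf` lines can be illustrated by expanding them into every concrete pattern. This sketch assumes that each `pf` line contributes one alternative per item, that a quoted blank (`" "`) means "nothing here", and that consecutive lines are joined left to right. These semantics are inferred from the slides, and the function names are hypothetical.

```python
import shlex
from itertools import product


def parse_shorthand(lines):
    """Turn `p` / `pf` lines into a list of alternative lists (one per line)."""
    segments = []
    for line in lines:
        kind, _, rest = line.partition(" ")
        if kind == "p":
            segments.append([rest])               # a single fixed piece of pattern
        elif kind == "pf":
            segments.append(shlex.split(rest))    # several alternatives, quotes respected
    return segments


def expand(segments):
    """Build every pattern by picking one alternative from each segment."""
    patterns = []
    for combo in product(*segments):
        # blank alternatives (from a quoted space) are simply left out
        patterns.append(" ".join(piece for piece in combo if piece.strip()))
    return patterns
```

For the complex loop on the slide, `expand(parse_shorthand(["p I", 'pf really " "', "pf like love", "p being with", "pf dogs cats animals"]))` yields 2 × 2 × 3 = 12 patterns, from "I really like being with dogs" to "I love being with animals", all sharing the same template.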
Loops and Macros
$ like_terms like love
p i
pf $$like_terms
p fish
t You like fish? I HATE fish!
p i
pf $$like_terms
p dogs
t I like dogs, too!
56
Advanced Loops
p i
pf " " really
p love
pf dogs cats fish animals
t You love $$[1][*]? I love all kinds of $$[1][3]!
Loop index table (row = loop number, column = item number):
     0       1         2       3
0    " "     really
1    dogs    cats      fish    animals
57
Random Lists
p hello
t
r
l Hi! It's nice to meet you!
l Hello, how are you today?
l Greetings!
l Hi, what are you up to?
58
Random Lists
<pattern> HELLO </pattern>
<template>
<random>
<li> Hi! It's nice to meet you! </li>
<li> Hello, how are you today? </li>
<li> Greetings! </li>
<li> Hi, what are you up to? </li>
</random>
</template>
59
Conditions
p how do i look
t I think you look
cnv gender male handsome
cnv gender female pretty
t !
60
Exercise
Create a bus booking chatbot.
1. Greeting
2. From
3. To
4. Route
5. Bye
61