Dr. Radhika Mamidi ENG 270 Lecture 2


Post on 31-Mar-2015



Dr. Radhika Mamidi
ENG 270

Lecture 2


History: 1940s-1950s
Major influences on the development of CL

- Development of formal language theory (Chomsky, Kleene, Backus)
  - Formal characterization of classes of grammar (context-free grammar, regular grammar)
  - Association with relevant automata (finite state automaton)

- Probability theory: language understanding as decoding through a noisy channel (Shannon)
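The noisy-channel idea is often written as a single decoding formula. The version below is a standard textbook formulation, not taken from the slides; the symbols W (the intended word sequence) and O (the observed, noisy signal) are names assumed here for illustration.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid O)
        \;=\; \arg\max_{W} \frac{P(O \mid W)\,P(W)}{P(O)}
        \;=\; \arg\max_{W} P(O \mid W)\,P(W)
```

P(O) can be dropped because it does not depend on W. P(O | W) models the channel and P(W) models the language.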


1957-1983: Rule based vs. Statistical based

Rule based theory
- Use of formal grammars as basis for natural language processing and learning systems. (Chomsky, Harris)

Statistical based theory
- Probabilistic methods for early speech recognition, OCR

1983-1993: Return of Empiricism [Statistical techniques]

Use of statistical techniques for part of speech tagging, parsing, word sense disambiguation, etc.


1993-Present

Researchers are interested in both techniques.

Emphasis is on machine learning.

Advances in software and hardware create NLP needs for search engines, machine translation, spelling and grammar checking, speech recognition and synthesis.


CL vs NLP

CL and NLP are related, with the focus being different.

Computational Linguistics aims to model language as people do.

Natural Language Processing is processing language from a computational point of view in order to build different applications and tools.


Artificial Intelligence

- the branch of computer science which aims to create intelligence in machines
- "the study and design of intelligent agents"
- "the science and engineering of making intelligent machines"


CL and AI

One way of finding out a machine's intelligence is by finding out if it understands language.

AI and CL are related here.

'Chatbot' is a reflection of this:

- a computer program to simulate an intelligent conversation with one or more human users via auditory or textual methods.
- a program with artificial intelligence to talk to people through voices or typed words.


Language and Intelligence: Turing test

Turing test: A test to judge the intelligence of a machine. It involves three entities: machine, human, and human judge.

The judge asks questions of the computer and the human.
-- The machine's job is to act like a human.
-- The human's job is to convince the judge that he's not the machine.

The machine is judged "intelligent" if it can fool the judge.

Judgment of "intelligence" is linked to appropriate answers to questions from the system.


ELIZA, the first chatbot

- A simple "Rogerian Psychologist"
- Uses pattern matching to carry on a limited form of conversation.
- It gives a feeling that it is "human"
- Seems to pass the "Turing Test"
- It is one of the first chatbots.
- The answers it gave showed its 'intelligence'.
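The pattern-matching idea can be sketched in a few lines of Python. This is a minimal illustration, not ELIZA's original script: the rules in `RULES`, the fallback reply `DEFAULT`, and the function name `respond` are all invented here.

```python
import re

# Hypothetical rules: (pattern, reply template). Each captured group is
# echoed back to the user, which is the core trick of pattern-matching chatbots.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
DEFAULT = "Please go on."  # used when no rule matches


def respond(utterance):
    """Return the reply of the first matching rule, or the default reply."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT


print(respond("I am sad."))          # Why do you say you are sad?
print(respond("My mother hates me")) # Tell me more about your mother.
print(respond("Hello"))              # Please go on.
```

The program has no model of meaning: it only reuses the user's own words, which is why the conversation stays limited.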


What’s involved in an “intelligent” Answer?

Analysis of answers:
Decomposition of the signal (spoken or written) eventually into meaningful units.

This involves ...
- Discourse
- Sentences
- Words
- Sounds


Levels of Language Processing

- Phonology
- Morphology
- Syntax
- Semantics
- Pragmatics
- Discourse Analysis


Model of Language Processing

[Figure: INPUTS, Lexical Processing, Syntactic Processing, Semantic Processing, Discourse Processing, OUTPUTS. Memory: General Knowledge, Lexicon, Syntactic Rules, Semantic Rules, Discourse Rules.]

To derive meaning you need all kinds of rules - 'building', 'blocks'
Eg: The building blocks are made of plastic.
    The building blocks the sun.
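The 'building blocks' example can be made concrete with a toy sketch: the lexicon allows several categories per word, and a syntactic rule picks the reading that fits. The category names, the entries in `LEXICON`, and the two sentence patterns in `PATTERNS` are assumptions made for this illustration only.

```python
from itertools import product

# Hypothetical lexicon: 'building' and 'blocks' each have two possible categories.
LEXICON = {
    "the": {"Det"},
    "building": {"Noun", "Modifier"},
    "blocks": {"Noun", "Verb"},
    "are": {"Aux"},
    "made": {"Verb"},
    "of": {"Prep"},
    "plastic": {"Noun"},
    "sun": {"Noun"},
}

# Hypothetical syntactic rules, written as whole-sentence category patterns.
PATTERNS = {
    ("Det", "Modifier", "Noun", "Aux", "Verb", "Prep", "Noun"),
    ("Det", "Noun", "Verb", "Det", "Noun"),
}


def readings(sentence):
    """Try every category combination; keep those licensed by a pattern."""
    words = sentence.lower().rstrip(".").split()
    options = [sorted(LEXICON[w]) for w in words]
    return [tags for tags in product(*options) if tags in PATTERNS]


print(readings("The building blocks are made of plastic."))
print(readings("The building blocks the sun."))
```

Lexical processing alone leaves four candidate readings for "the building blocks"; only the syntactic pattern decides whether 'blocks' is a noun or a verb.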


Why is understanding language by a machine so difficult?

Human language is:
- Complex and Ambiguous
- We use language creatively
- We don't mean what we say!

Language Understanding needs contextual and general knowledge apart from linguistic knowledge.

To know what we mean, shared knowledge is necessary.

Representing all this knowledge computationally is THE challenge.


Human language is ambiguous

Ambiguity at different language levels:

- Pronounce "GHOTI" [one spelling - different sounds]
- I scream/ice-cream, a nameless man/an aimless man
- He showed me the mouse - rodent/object
- The leopard was spotted - verb/adjective
- She hit the boy with the umbrella
- I am reading a book on films - [now-a-days/right now]
- Mary promised Sally(i) to go to her(i) party
- Mary(i) persuaded Sally to go to her(i) party


Human language is complex

- teach - taught        *preach - praught
- he - his - him        *she - shis - shim
- ring - rang - rung    *bring - brang - brung
- slim chance = fat chance
- ?slim girl = fat girl

No consistency. No regularity.


Let’s analyze this spoken sentence:

I made her duck.

How many meanings does the sentence have?


1. Speech technologies

Applications:
1. Speech synthesis tools - Text to Speech conversion
2. Speech recognition tools - Speech to Text conversion

Requires knowledge of phonological patterns

Text <-> Speech conversion


Uses

Text to speech
Use: Public announcements (airports, railway stations), aids for blind people, proofreading, situations when the eyes are busy [drivers, writers etc.], speaking clocks etc.

Speech Recognition
Use: Pronunciation dictionaries, voice commands on a PC, voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), simple data entry (e.g., entering a credit card number) etc.


Some problems

Grapheme to Phoneme and vice-versa conversion

Different spellings - same pronunciation
Example: reed-read, bear-bare, ear-year, I-eye, peace-piece

Same spellings - different pronunciation
Example: read, bow, dove, does, minute, number

Numbers, Names, Acronyms - pronounced differently
1980 --- uttered differently as year, quantity, currency
St. --- street, saint
PSU --- public sector unit, Prince Sultan University

Give at least five ways of uttering your phone number.
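The "1980" problem can be sketched in Python: the same digit string is expanded differently depending on how it is meant to be read. The function names `as_year`, `as_quantity` and `as_digits` are hypothetical, and the year rule is deliberately simplified (years such as 2000 or 2005 are usually read differently and are not handled).

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Words for 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def as_quantity(n):
    """Words for 0..9999 read as a quantity (no 'and')."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest or not parts:
        parts.append(two_digits(rest))
    return " ".join(parts)


def as_year(n):
    """Simplified year reading for 1000..9999: split into two pairs."""
    high, low = divmod(n, 100)
    if low == 0:
        return two_digits(high) + " hundred"
    if low < 10:
        return two_digits(high) + " oh " + ONES[low]
    return two_digits(high) + " " + two_digits(low)


def as_digits(text):
    """Digit-by-digit reading, as for a phone number."""
    return " ".join(ONES[int(ch)] for ch in text)


print(as_year(1980))      # nineteen eighty
print(as_quantity(1980))  # one thousand nine hundred eighty
print(as_digits("1980"))  # one nine eight zero
```

A text-to-speech system has to choose among such readings from context, which the written form alone does not provide.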


2. Morphological Analysis

Inflectional morphology
: word variation reflects features like tense, number, degree, gender
: grammatical category remains same
eg. eat-eats, boy-boys, thin-thinner

Derivational morphology
: word variation changes grammatical category
eg. act-actor, boy-boyish
: word variation maintains grammatical category
eg. fair-unfair, like-dislike

Inflection follows Derivation: act - actor - actors

Tools built with morphological knowledge: Morphological analyzer [identifies roots and affixes] and Morphological generator [generates words from roots and affixes]
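A morphological analyzer and generator can be sketched with simple affix stripping. This is a toy under stated assumptions: the root list, the affix lists, and the function names `analyze` and `generate` are invented here, and spelling changes such as thin-thinner are not handled.

```python
# Hypothetical mini-lexicon built from the examples on this slide.
ROOTS = {"act", "boy", "eat", "fair", "like"}
PREFIXES = ["un", "dis"]
DERIVATIONAL = ["or", "ish"]
INFLECTIONAL = ["s"]


def analyze(word):
    """Split a word into (prefix, root, suffixes); return None if unknown."""
    prefix, stem, suffixes = "", word, []
    # Inflection is outermost (act - actor - actors), so strip it first.
    for group in (INFLECTIONAL, DERIVATIONAL):
        for suffix in group:
            if stem.endswith(suffix) and stem not in ROOTS:
                stem = stem[:-len(suffix)]
                suffixes.insert(0, suffix)
                break
    for candidate in PREFIXES:
        if stem.startswith(candidate) and stem not in ROOTS:
            prefix, stem = candidate, stem[len(candidate):]
            break
    return (prefix, stem, suffixes) if stem in ROOTS else None


def generate(prefix, root, suffixes):
    """Build a word from a root and its affixes (plain concatenation)."""
    return prefix + root + "".join(suffixes)


print(analyze("actors"))   # ('', 'act', ['or', 's'])
print(analyze("dislike"))  # ('dis', 'like', [])
print(generate("", "act", ["or", "s"]))  # actors
```

Stripping the inflectional suffix before the derivational one mirrors the ordering noted above: inflection follows derivation.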


3. Syntactic Parsing

Process of identifying syntactic structure of a valid sentence
Represented by trees, rules and networks

Syntax Components
- Phrase Structure Rules
- Transformational Rules

Tools built with syntactic knowledge:
- Syntactic Parsers [analyse a sentence automatically]
  e.g. Augmented Transition Networks


Syntax Component

Chomsky's (1965) model of language
- Phrase Structure rules generate deep structures
- Deep Structure holds all the syntactic information needed to derive the meaning of a sentence
  This is fed into the semantic component to obtain acceptable combinations
- Transformational rules map deep structures to surface structure
- Surface Structure has words in the right order
  The pronounced form is obtained after feeding surface structure into the phonological component


Chomsky’s model

[Figure: SYNTAX COMPONENT: Phrase Structure Rules, Deep structures, Transformational rules, Surface structures. SEMANTIC COMPONENT: Lexicon, Selection restriction rules. PHONOLOGICAL COMPONENT: Phonological rules.]


Representation
Eg: Riyadh is a beautiful city.

1. Rules:
   S  -> NP VP
   NP -> (Art) (Adj) N
   VP -> V NP

   Lexicon:
   Art - a
   Adj - beautiful
   N   - Riyadh, city
   V   - is
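These rules and the lexicon are small enough to run directly. The sketch below is one possible hand-written parser for exactly this grammar, assuming each word has a single category; the function names and the tuple-based tree format are choices made for illustration.

```python
# Lexicon from the slide, keyed by lowercase word.
LEXICON = {"a": "Art", "beautiful": "Adj", "riyadh": "N", "city": "N", "is": "V"}


def parse_np(tags, words, i):
    """NP -> (Art) (Adj) N. Return (tree, next_index) or None."""
    children = []
    for optional in ("Art", "Adj"):
        if i < len(tags) and tags[i] == optional:
            children.append((optional, words[i]))
            i += 1
    if i < len(tags) and tags[i] == "N":
        children.append(("N", words[i]))
        return ("NP", children), i + 1
    return None


def parse_vp(tags, words, i):
    """VP -> V NP. Return (tree, next_index) or None."""
    if i < len(tags) and tags[i] == "V":
        result = parse_np(tags, words, i + 1)
        if result:
            np_tree, j = result
            return ("VP", [("V", words[i]), np_tree]), j
    return None


def parse_sentence(sentence):
    """S -> NP VP. Return the tree, or None if the sentence is not covered."""
    words = sentence.rstrip(".").split()
    tags = [LEXICON.get(word.lower()) for word in words]
    result = parse_np(tags, words, 0)
    if not result:
        return None
    np_tree, i = result
    result = parse_vp(tags, words, i)
    if not result:
        return None
    vp_tree, j = result
    return ("S", [np_tree, vp_tree]) if j == len(words) else None


print(parse_sentence("Riyadh is a beautiful city."))
```

Sentences outside the grammar, such as "is Riyadh a city", return None.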


2. Representation by Networks

[Figure: three transition networks, each with states s1, s2, s3.
 S:  arcs labelled NP, VP.
 NP: arcs labelled article, noun, Empty; an Adj loop.
 VP: arcs labelled verb, NP.]
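A transition network of this kind can be sketched as a table of labelled arcs plus a small traversal routine. The arc placement below is an assumption, since the figure did not survive extraction: the Empty arc is taken to bypass the article, and the Adj loop is placed on state s2 of the NP network. The word categories reuse the lexicon of the rule-based representation.

```python
# Each network maps a state to its outgoing arcs: (label, target_state).
# A label is a word category, another network's name, or "EMPTY".
NETWORKS = {
    "S":  {"s1": [("NP", "s2")], "s2": [("VP", "s3")]},
    "NP": {"s1": [("article", "s2"), ("EMPTY", "s2")],
           "s2": [("adj", "s2"), ("noun", "s3")]},
    "VP": {"s1": [("verb", "s2")], "s2": [("NP", "s3")]},
}
FINAL = "s3"
CATEGORIES = {"a": "article", "beautiful": "adj", "riyadh": "noun",
              "city": "noun", "is": "verb"}


def traverse(net, state, words, i):
    """Yield every input position at which `net` can reach its final state."""
    if state == FINAL:
        yield i
        return
    for label, target in NETWORKS[net].get(state, []):
        if label == "EMPTY":            # jump without consuming a word
            yield from traverse(net, target, words, i)
        elif label in NETWORKS:         # push into a sub-network
            for j in traverse(label, "s1", words, i):
                yield from traverse(net, target, words, j)
        elif i < len(words) and CATEGORIES.get(words[i]) == label:
            yield from traverse(net, target, words, i + 1)


def accepts(sentence):
    """True if the S network consumes the whole sentence."""
    words = sentence.lower().rstrip(".").split()
    return any(end == len(words) for end in traverse("S", "s1", words, 0))


print(accepts("Riyadh is a beautiful city."))  # True
print(accepts("Riyadh a is city"))             # False
```

Because of the Adj loop, the network also accepts "Riyadh is a beautiful beautiful city", which the rule NP -> (Art) (Adj) N does not.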


3. Representation by Trees

S
  NP
    N: Riyadh
  VP
    V: is
    NP
      Art: a
      Adj: beautiful
      Noun: place


Example of automatic syntactic analysis by online ‘Link parser’.

Sentence given: Riyadh is a beautiful place.

Output:
(S (NP Riyadh)
   (VP is
       (NP a beautiful place))
   .)

http://www.link.cs.cmu.edu/link/submit-sentence-4.html
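Bracketed output like the one above is easy to read back into a tree. The sketch below is a generic reader for this bracket notation, not part of the Link parser; the function names `parse_brackets` and `leaves` are hypothetical.

```python
def parse_brackets(text):
    """Turn '(S (NP Riyadh) ...)' into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # tokens[pos] is "(" and tokens[pos + 1] is the constituent label.
        node = [tokens[pos + 1]]
        pos += 2
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)
                node.append(child)
            else:
                node.append(tokens[pos])
                pos += 1
        return node, pos + 1

    tree, _ = read(0)
    return tree


def leaves(tree):
    """Collect the words of the sentence from left to right."""
    words = []
    for child in tree[1:]:
        words.extend(leaves(child) if isinstance(child, list) else [child])
    return words


tree = parse_brackets("(S (NP Riyadh) (VP is (NP a beautiful place)) .)")
print(tree)
print(leaves(tree))
```

The nested-list form makes the constituents explicit: `tree[2]` is the whole VP, and `tree[2][2]` is the object NP.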