IX Language and Computer

Upload: meadow

Post on 14-Jan-2016


Page 1: IX     Language and Computer

IX Language and Computer

Page 2: IX     Language and Computer

Contents

10.0 Introduction

10.1 Computer-assisted language learning

10.2 Machine translation

10.3 Corpus linguistics

10.4 Computer mediated communication

Page 3: IX     Language and Computer

10.0 Introduction: Computational linguistics

A branch of applied linguistics dealing with the computer processing of human language (Johnson & Johnson 1999).

1. The analysis of language data so as to establish the order in which learners acquire various grammatical rules, or the frequency of occurrence of some particular item.

2. Electronic production of artificial speech and the automatic recognition of human speech.

3. Research on automatic translation between natural languages.

4. Text processing and communication between people and computers.

Page 4: IX     Language and Computer

10.1 Computer-assisted language learning

Page 5: IX     Language and Computer

10.1.1 CAI / CAL vs. CALL

CAI: computer-assisted instruction (计算机辅助教学), the use of computers in a teaching program.

1. A teaching program presented by a computer in sequence: the student responds, and the computer indicates whether the response is correct or not.

2. The use of computers to monitor students' progress and to offer directions to students.
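The CAI cycle described above (present items in sequence, judge each response, keep a record for monitoring) can be sketched roughly as follows. This is only an illustration: the `DrillItem` class, the function names, and the two-item lesson are all hypothetical, not taken from any particular CAI system.

```python
from dataclasses import dataclass


@dataclass
class DrillItem:
    prompt: str
    answer: str


def run_drill(items, responses):
    """Present items in a fixed sequence and judge each response.

    Returns a list of (prompt, response, is_correct) records, which a
    teacher could later inspect to monitor the student's progress.
    """
    records = []
    for item, response in zip(items, responses):
        is_correct = response.strip().lower() == item.answer.lower()
        records.append((item.prompt, response, is_correct))
    return records


def progress(records):
    """Fraction of responses judged correct so far."""
    if not records:
        return 0.0
    return sum(1 for _, _, ok in records if ok) / len(records)


# Hypothetical two-item lesson and a student's typed responses.
lesson = [
    DrillItem("Past tense of 'go'?", "went"),
    DrillItem("Plural of 'child'?", "children"),
]
records = run_drill(lesson, ["went", "childs"])
for prompt, response, ok in records:
    print(prompt, response, "correct" if ok else "not correct")
print(f"progress: {progress(records):.0%}")
```

Exact string matching is the simplest possible judgement; it is used here only to keep the sketch short.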

Page 6: IX     Language and Computer

CAL: computer-assisted learning (计算机辅助学习), emphasizing the use of computers in both teaching and learning in order to help learners achieve educational objectives through their own reasoning and practice, a reflection of the newly advocated autonomous learning.

1. Leading students through a learning task step by step, checking comprehension and offering further practice and materials.

2. Interaction through the exploration of a subject or problem

Page 7: IX     Language and Computer

CALL: computer-assisted language learning (计算机辅助语言学习). It refers to the use of computers in the teaching or learning of a second or foreign language.

1. Activities which parallel learning through other media but which use the facilities of the computer.

2. Activities which are extensions or adaptations of print-based or classroom-based activities.

3. Activities which are unique to CALL.

Page 8: IX     Language and Computer

10.1.2 Phases of CALL development

1. Large mainframe machines in institutions, accessed through a terminal; conventional, traditional grammatical explanation; audio-lingualism.

2. Small computers with tapes or floppy disks: portable, eclectic, pragmatic and student-oriented.

3. Cognitive problem-solving techniques and interaction among students in a group: the computer as a trigger.

4. Word processing enables students to compose and carry out their own writing; speech and moving video become available.

Page 9: IX     Language and Computer

10.1.3 Technology

Customizing, template, and authoring programs

---Teachers use the program to design their own lessons, which fit their own purposes.

Computer networks

---Local area network: more interaction between teachers and students

Compact disk technology

Digitized sound

USB (universal serial bus)

Page 10: IX     Language and Computer

10.2 Machine translation

The use of machines to translate texts from one natural language to another.

Unassisted MT takes pieces of text and translates them into output for immediate use with no human involvement.

Assisted MT uses a human translator to clean up after, and sometimes before, translation in order to get better-quality results.

Concerns: philosophical and religious; political; economic.

Page 11: IX     Language and Computer

10.2.1 History of Development

1. The independent work by MT researchers

--Early 1950s. Limitations: hardware, memory, slow access to storage, programming languages, and little assistance from linguistics.

--Crude dictionary-based approach, statistical methods. Low quality, hence human involvement.

--Both pre-editing and post-editing were required.
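To see why a crude dictionary-based approach needs both pre-editing and post-editing, consider a toy word-for-word translator. Everything here is a hypothetical illustration: the four-word French-to-English `LEXICON`, the function names, and the correction table stand in for human editors and for a real bilingual dictionary.

```python
import re

# Hypothetical toy lexicon (French to English); a real one would be far larger.
LEXICON = {
    "le": "the",
    "chat": "cat",
    "noir": "black",
    "dort": "sleeps",
}


def pre_edit(text):
    """Pre-editing, simulated: lowercase the input and strip punctuation."""
    return re.sub(r"[^\w\s]", "", text.lower())


def translate_word_for_word(text):
    """Replace each source word by its dictionary entry; keep unknown words."""
    return " ".join(LEXICON.get(word, word) for word in text.split())


def post_edit(text, corrections):
    """Post-editing, simulated by a table of phrase corrections."""
    for wrong, right in corrections.items():
        text = text.replace(wrong, right)
    return text


raw = translate_word_for_word(pre_edit("Le chat noir dort."))
print(raw)    # the cat black sleeps
final = post_edit(raw, {"cat black": "black cat"})
print(final)  # the black cat sleeps
```

The raw output keeps the source word order, which is exactly the kind of error a human post-editor had to repair.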

Page 12: IX     Language and Computer

2. Towards good-quality output

Improved hardware, the first programming languages, and developments in syntactic analysis.

Around 1960, good quality was believed to be achievable.

Assumption: the goal of MT must be the development of fully automatic systems producing high-quality translations; the use of human assistance was regarded as an interim arrangement, and post-editing would be needed less and less.

Emphasis of research was on the search for theories and methods for the achievement of “perfect” translation.

Bar-Hillel: critical of Fully Automatic High Quality Translation; proposed “man-machine symbiosis”.

Page 13: IX     Language and Computer

The development of translation tools

Since the 1970s, development has continued in three main strands:

1. Computer-based tools for translators

--1960s: real-time interactive computer environments; 1970s: word processing; 1980s: microcomputers with networking and large storage capacity

--Dictionaries and terminological databanks, multilingual word processing, management of glossaries and terminology resources, input and output communication

2. Operational MT systems involving human assistance in various ways

3. “Pure” theoretical research towards the improvement of MT methods

Page 14: IX     Language and Computer

10.2.2 Research methods

1. The linguistic approach

--A test-bed for any kind of linguistic theory which attempts to account for language or grammatical rules

2. The transfer approach

3. The interlingual approach

--An interlingua between any languages

4. The knowledge-based approach

--Linguistic knowledge independent of context: semantic features

--Linguistic knowledge that relates to context: pragmatic knowledge

--Common sense / real-world knowledge (non-linguistic knowledge)

Page 15: IX     Language and Computer

10.2.3 MT quality: still poor

Page 16: IX     Language and Computer

10.2.4 MT and the Internet

--An accelerating growth of real-time online translation on the Internet itself.

--The Internet has a further profound impact on MT: stand-alone PCs are replaced by network computers.

--Fewer “pure” MT systems but many more computer-based tools and applications where automatic translation is just one component.

Page 17: IX     Language and Computer

10.2.5 Speech translation: small-domain natural language applications.

Page 18: IX     Language and Computer

10.2.6 MT and human translation

They can and will co-exist in relative harmony.

MT: large-scale and rapid translation, repetitive documents, lower cost, cases where the quality of output is less important

Human translator: non-repetitive, linguistically sophisticated texts; one-off texts in specific, highly specialized technical subjects; one-to-one interchange of information; spoken language translation

Page 19: IX     Language and Computer

10.3 Corpus linguistics

Page 20: IX     Language and Computer

10.3.1 Definition

Corpus (plural corpora): a collection of linguistic data, compiled as written texts or as a transcription of recorded speech. The main purpose of a corpus is to verify a hypothesis about language, for example to determine how the usage of a particular sound, word, or syntactic construction varies.

Corpus linguistics deals with the principles and practices of using corpora in language study. A computer corpus is a large body of machine-readable texts.

--Crystal, David. 1992: 85. An Encyclopedic Dictionary of Language and Languages

Page 21: IX     Language and Computer

Another definition

CORPUS (plural corpora): (1) a collection of texts, especially if complete and self-contained: the corpus of Anglo-Saxon verse. (2) Plural also corpuses. In linguistics and lexicography, a body of texts, utterances or other specimens considered more or less representative of a language, and usually stored as an electronic database.

Corpus linguistics studies data in any such corpus.

Page 22: IX     Language and Computer

10.3.2 Criticism and revival of corpus linguistics

Chomsky: empiricism vs. rationalism

--Invalidated the corpus as a source of evidence in linguistic enquiry

--The description of rules in a language

--Emphasis on competence rather than performance

--Practicability

--Ungrammatical sentences vs. new sentences

Page 23: IX     Language and Computer

Revival of corpus linguistics

Quirk (191): Survey of English Usage (SEU)

Jan Svartvik (1975): London-Lund corpus (SEU and the Brown corpus)

Jan Svartvik: computerized the SEU

Page 24: IX     Language and Computer

10.3.3 Concordance (共现检索)

Definition: a way of sorting data, for example alphabetically, by the words occurring in the immediate context of the search word.

--Search for a particular word and retrieve all the examples of it.

--This is the tool most often used in corpus linguistics to examine corpora.

Usage:

--Comparing different usages of the same word

--Analyzing word frequencies

--Finding and analyzing phrases and idioms

--Creating indexes and word lists
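A keyword-in-context (KWIC) concordance of the kind described above might be sketched as follows. The function names, the three-word context width, and the miniature corpus are assumptions made for illustration; real concordancers work on much larger corpora and usually respect sentence boundaries, which this sketch does not.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit."""
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


def sort_by_right_context(hits):
    """Sort hits alphabetically by the words that follow the keyword."""
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical miniature corpus.
corpus = "She runs a shop. He runs every morning. The river runs deep."
tokens = tokenize(corpus)
hits = sort_by_right_context(concordance(tokens, "runs"))
for left, key, right in hits:
    print(f"{left:>20} [{key}] {right}")

# Word frequencies come almost for free from the same token list.
print(Counter(tokens).most_common(1))  # [('runs', 3)]
```

Sorting on the right-hand context groups similar usages together, which is what makes different senses of the same word easy to compare.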

Page 25: IX     Language and Computer

10.3.4 Text encoding and annotation

Annotated corpora are corpora which have been enhanced with various types of linguistic information.

--The implicit linguistic information has been made explicit through the process of concrete annotation.

--Claire_NP1 collects_VVZ shoes_NN2.

Page 26: IX     Language and Computer

Leech (1993): seven maxims in the annotation of corpora

1. Possible to remove the annotation

2. Possible to extract the annotation by itself from the text

3. Guidelines for the end-user

4. How/who carried out the annotation

5. Not infallible but potentially useful tool

6. Based on agreed and theory-neutral principles

7. No a priori standard
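The first two maxims can be illustrated with the tagged sentence from the previous slide. The sketch below assumes a simple `word_TAG` format in which every token carries exactly one tag after its last underscore; the function names are hypothetical, and the sentence-final full stop is left out to keep the example small.

```python
def parse_tagged(text):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Assumes every token contains an underscore before its tag.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


def remove_annotation(pairs):
    """Maxim 1: recover the raw text by dropping the tags."""
    return " ".join(word for word, _ in pairs)


def extract_annotation(pairs):
    """Maxim 2: pull out the tags by themselves."""
    return [tag for _, tag in pairs]


tagged = "Claire_NP1 collects_VVZ shoes_NN2"
pairs = parse_tagged(tagged)
print(remove_annotation(pairs))   # Claire collects shoes
print(extract_annotation(pairs))  # ['NP1', 'VVZ', 'NN2']
```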

Page 27: IX     Language and Computer

10.3.5 Roles of corpus data

Speech research

--A wide selection of variables: gender, age, class, etc.; generalization

--Variation within a spoken language

--A sample of naturalistic speech

--Large-scale quantitative study

Lexical studies

--Dictionaries

--Definitions

--Word combinations, co-occurring words
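Finding co-occurring words in a corpus comes down to counting the neighbours of a chosen node word. The following is a minimal sketch under stated assumptions: the `cooccurrences` function, the one-word window, and the three-sentence corpus are invented for illustration, and raw counts are used rather than any statistical association measure.

```python
import re
from collections import Counter


def cooccurrences(text, node, window=1):
    """Count words found within `window` tokens of each occurrence of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Hypothetical miniature corpus.
corpus = "Strong tea is nice. She drinks strong tea. He likes strong coffee."
counts = cooccurrences(corpus, "strong")
print(counts.most_common(1))  # [('tea', 2)]
```

A lexicographer could use counts like these as first evidence for typical word combinations before checking the examples themselves.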

Page 28: IX     Language and Computer

Semantics

--An objective approach to the study of semantics: semantic distinctions are context-related, and a corpus makes it possible to examine the context

--Fuzziness and absoluteness: gradable

Sociolinguistics: natural quantitative data

Psycholinguistics:

Page 29: IX     Language and Computer

10.4 Computer-mediated communication (计算机介入的信息交流)

It is distinguished by its focus on language and language use in computer-networked environments, and by its use of methods of discourse analysis to address that focus.

It takes a variety of forms whose linguistic properties vary depending on the kind of messaging system used and the social and cultural context embedding particular instances of use.

Mails and news

Page 30: IX     Language and Computer

PowerPoint: an application which enables users to create slide shows on their computer screens. It is presentation-authoring software for creating graphical presentations with or without audio.

PowerPoint as a tool can be used to write outlines or create presentation visuals on the slides.

PowerPoint as a text has been broadly understood as the product created visually, graphically, acoustically, or audio-visually.

PowerPoint as a genre refers to a recurring type of activity, just like a letter, a note, etc.

Page 31: IX     Language and Computer

Blog: mid-1990s

A weblog, or blog for short, is defined by Dan Gillmor as “an online journal comprised of links and postings in reverse chronological order, meaning the most recent posting appears at the top of the page”.

Features of blogs:

1. Post-centric

2. Arranged in chronological order

3. Serial and cumulative, open-ended

4. Brief and independent narratives, some fictional, some frame of the narratives

5. Great variety in quality, content and ambition

6. Free access

7. Style is personal and informal

8. Genuine human passion

Page 32: IX     Language and Computer

Chatroom

--A chat room is an online forum where people can chat online.

--Emoticons (表情符号) or smileys (笑眯眯): : ) :-) : ( : < : - > :c

--Less punctuation, and acronyms: U (you), 4 (for), r (are), brb (be right back)

--Short sentences, informal expressions

Page 33: IX     Language and Computer

Summary

CAI, CAL, CALL

MT

Corpus linguistics

Concordance

CMC
