8/2/2019 RK EASG Seminar 080211 Plus Demo
Using Corpora for Autonomous Correction and Improvement of
Academic Writing
Ramesh Krishnamurthy
Aston University
February 8th 2011
[REPORT ON WORK IN PROGRESS]
-
Abstract
1. All of our students need to improve their academic writing
skills. This is true for Home students as well as for the increasing
numbers of EU and International students.
2. This talk looks at the possibilities of using corpora in this
process, and specifically reports on a case study involving a
Chinese-speaking student using the ACORN (Aston Corpus
Network) corpora.
3. The method requires less teacher time, offers more scope for
autonomous student learning, and leads to a greater awareness
of academic writing as a cyclic editorial process rather than
merely as a product for assessment.
-
UG1 students need to improve their
academic writing skills - 1
Examples from UG1
The same article can be reported differently, depending on the type of newspaper it has been obtained from.
As the case when first reported concerned the death of a young baby due to neglect and abuse, which legally
the public were not allowed to be made aware of the
full name of the child.
As expected from a headline the text still reads as a statement as opposed to a structured sentence in order
to grab the audiences attention.
-
UG2 students need to improve their
academic writing skills
Extracts from Feedback to UG2
Your written language needs some more work, as your errors sometimes impede communication
some mistakes affect the clarity of the argument
mistakes in spelling and grammar
more noticeable are mistakes in the use of terminology
poor grammar makes the analysis difficult to understand
grammatical errors, and poor choice of words (especially terminology)
spelling mistakes
use of complex sentences affect the clarity of argument
some rather informal comments
-
UG3 students need to improve their
academic writing skills
Examples from UG3
From my initial reading on this matter, I have read within Richard Dawkins (1976) book The Selfish
Gene and this gave me a valuable insight into OxfordDictionaryies.com it is easily transferrable from any
subject that is of slight annoyance, to an accident
these memes are an interesting cause for study,
particularly, as they are most widely recognised in younger Internet communities The area in which I
propose to study is in politics and corpus
-
UG3 students need to improve their
academic writing skills
Extracts from Feedback to UG3
Some errors and weaknesses in expression (comprised from)
weak wordings cause loss of coherence
Weak academic style; poor proofreading; sometimes repetitive/tautologous wordings
Weak expression often obscures meaning
Grammar not clear; many seeming errors
Major weaknesses in expression and style, obscures meaning at times
Some weaknesses in use of terms
s for plurals
non-grammatical sentences
some poor wordings... errors and typos
poor wordings, including informal, non-academic phrases
weak grammar obscures meaning
-
Masters students need to improve
their academic writing skills
Extracts from Feedback to MA/MSc
Unfortunately, the presentation suffered considerably from poor wordings, weak academic style, and many typos and errors
quite a lot of minor slips already noticeable in the first page, often to do with word choice
The consistently poor quality of English throughout makes it very difficult to assess
frequent lack of linguistic clarity and cohesion
the content is largely obscured by the weaknesses in form
at times repetitive, or overladen with connectors
The main weakness is in English expression, which sometimes obscures the intended meaning
English style is often poor (the learnt from coursebooks language the employment of the founding exerting the whole dialogues)
problems in English expression sometimes cause difficulty for the reader
inconsistent and inaccurate use of terminology
very weak English academic writing style and expression throughout, often leading to considerable difficulty in comprehension
-
and it's not just me!
-
http://nexus.aber.ac.uk/xwiki/bin/view/Main/HEA+Annual+Conference+2009
Higher Education Academy Annual Conference 2009
The Wiki Way to Develop Academic Writing Competence
Dr Rob Spence (Edge Hill University)
This paper presented an account of an ongoing investigation into the use of wikis to develop students' academic writing skills
through collaborative work. Undergraduate students of English
were invited to collaborate on writing tasks with the specific aim of
developing their competence through peer review and appraisal.
The motivation for the wiki project arose from the widely-commented (if only anecdotal) decline in student writing
skills/literacy in HE. In particular, the wiki project sought to address three widely-perceived problems: students' lack of confidence,
students' inability to deal with complex issues, students'
substandard written work and the tendency to Wikipedia cut-and-paste.
-
http://www.humboldt.edu/english/GWPEGeneralInformation.htm
Humboldt State University, Department of English
History of and Rationale behind the Graduation Writing
Proficiency Examination Requirement
Because of a noticeable decline in student writing skills, the CSU Chancellor appointed a Task Force on
Student Writing Skills in 1975 to investigate the problem and recommend appropriate solutions. The major portion of the Task Force's recommendations, reviewed by the Educational Policies Committee and supported by the CSU Academic Senate, was accepted
by the Board of Trustees in 1976. One of the central aspects of this policy required the demonstration of writing proficiency at the upper-division level as a requirement for graduation from every campus within the CSU system.
-
Learner autonomy
autonomy 1620s, from Gk. autonomia "independence, living by one's own laws" from auto- "self" + nomos "custom, law"
[http://www.etymonline.com/]
moral and political philosophy > sociology > education
Holec (1979) Autonomy and Foreign Language Learning
Boud (ed) (1981) Developing Student Autonomy in Learning
Grenfell and James (2004) Change in the field - changing the
field: Bourdieu and the methodological practice of educational
research. British Journal of Sociology of Education, 25/4, 507-523
-
Learner autonomy: Holec (1979)
The autonomous language learner takes responsibility for the totality of his learning situation. He does this by determining his own objectives, defining the contents to be learned and the progression of the course, selecting methods and techniques to be used, monitoring this procedure, and evaluating what he has acquired. Objectives are specific to the learner, and the learner's communicative needs determine the verbal elements chosen. Learning thus proceeds from ideas to correct grammatical, lexical, and phonological form. The self-directed learner chooses the methods of instruction through trial-and-error. His selection is based on the objectives set and its applicability to internal and external constraints. The student evaluates his attainment through his objectives, and this evaluation helps him to plan subsequent learning. The concept of autonomous learning requires a redefinition of knowledge from an objective universal to a subjective individual knowledge determined by the learner. For teachers, it means new objectives which help the learner define his personal objectives and help him acquire autonomy. Several experiments in autonomous learning are described.
-
Learner autonomy: Grenfell and James (2004)
methodological practice in educational research from the perspective of Bourdieu's field theory (p507)
taking educational research itself to be a field (p508)
the briefest account of methodological developments in the twentieth century would describe a move away from a
positivist towards a more qualitative, naturalistic paradigm.
Up until the 1960s, what educational research that did take place was mostly small, part-time and based on
psychometric tests of pupils' intelligence and learning. The
alternative to this approach stemmed from a philosophical
critique of its founding assumptions to mimic the physical sciences and stressed instead the social and contextual
aspects of education (see Hirst, 1966, 1974). What emerged
was a definition of educational theory in terms of the so-called
`foundational disciplines': sociology, philosophy, history,
psychology.
-
Learner autonomy: Grenfell and James (2004)
The qualitative paradigm developed throughout the 1970s, 1980s and 1990s, giving rise to a range of ethnographic and naturalist
methodologies, including the postmodernist. However, a sustained attack (see Hillage et al., 1998; Tooley & Darby, 1998) against this research was
mounted during this last decade of the century; claiming to find its
methods insufficiently rigorous, its data collection small scale and its
outcomes biased. Moreover, it was argued that such research had little
impact on institutional practice; while what was needed was research of the nature that answered questions such as how to improve pupil
achievement. Researchers were urged to return to quantitative methods,
with experiments and randomized controlled trials seen as capable of
producing sufficiently `hard' evidence (see Fitz-Gibbon & Morris, 1987;
Boruch, 1997; Fitz-Gibbon, 2001). (p509)
avant-garde rear-garde process of time (p510)
There are other features that follow from the character of fields and the avant-garde. First is the question of autonomy (p510)
[NB NO mention anywhere in the article of learner autonomy!]
academic products structure practice (p510)
-
Focus on Product = Neglect of Process?
League tables
A level results
Marking systems (class distribution)
Equality (irrespective of motivation/performance)
Increasing instrumentality in attitudes to education
Grenfell and James (2004):
academic products structure practice (p510)
-
Initial Research: ACORN Case Studies 2006-7
This research was first reported in the ACORN Case Studies (2008), as ACORN Case Study 2: Self-Correction of Academic Writing
Case Study 4: Spanish Grammar Clinics has since been developed and published: Yepes, G.R. & Krishnamurthy, R. (2010). Corpus Linguistics
and Second Language Acquisition - the use of ACORN in the teaching of Spanish Grammar, Lebende Sprachen 55/1: 108-122
-
ACORN Case Study 2
Context: I worked closely with Steven, a Computer Science Placement Student on ACORN, a Chinese native-speaker, who came to the UK in 2002, did 9 months of English then 2 years of A-levels (Maths, Chinese, Physics) at an FE college, then started at Aston in 2005. He submitted a weekly 1-page report to me on his ACORN work.
Aims: To help Steven to improve his English and produce better reports; to trial the ACORN system with a view to software enhancements; to understand some of the pedagogic implications of the methodology
Procedure: This started very informally, but seemed to work extremely well, so we started to preserve the data. Very rough estimates are: I spent 2-3 minutes highlighting in green any marked usages; Steven spent 5-10 minutes correcting 10-15% silly mistakes, 30 minutes checking ACORN corpus and correcting 60-70% of other items; we spent 15 minutes going through the 15-20% remaining complex items, and 15 minutes discussing Chinese/English, corpus software design, and search procedures
Examples of items I highlighted in Steven's draft reports:
I will take a deep look into it next week
I replied him
He was not an expert with MySQL
The testing that I am doing does not affect any of the current functions on ACORN except adding new records to the ACORN log
The PHP engine on the server might out put an error message.
ACORN screenshots were provided for these items, showing how he found the correct wording to use
-
ACORN Case Study 2
Initial Evaluation: Steven enjoys this method: he finds it empowering, and incidentally learns other lexis and grammar; he perceives for himself the value of functions missing in the ACORN software (e.g. phrase search), and this motivates him to develop them. It saves me time, and turns a more mundane task into a stimulating experimental procedure
Afterthoughts: We have records of the marked pages and the corrected pages. We need to accurately record when Steven uses ACORN, which searches are quicker, which are impactful on his learning, which steps through the data require external prompting, etc.
An updated report from Steven suggests that, partly because of the restricted and repetitive nature of his reports, and partly due to his past experience, the proportions of corrections are changing. The range/variety of errors has been reduced. He now estimates that he is able to self-correct 30% of errors (e.g. omission of "the"; mismatch of tense sequences), only about 20% involve ACORN searches, and perhaps up to 50% require discussion.
I think this methodology could be used by many language teachers. It is quick for the teacher, and results in a high proportion of self-correction by the student, as well as some incidental learning. The procedure can of course also be used by students while drafting, rather than after correction, and for academic writing in French, German and Spanish.
-
AntConc software
for initial corpus analysis
Demo
-
AntConc: Word List: Drafts corpus = 15694 tokens [average length = 402], 1821 types
-
AntConc: Word List: Corrected corpus = 15643 tokens [average length = 401], 1806 types
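The token and type figures above come from AntConc's Word List tool. As a rough illustration of what such counts mean, the sketch below counts tokens (running words) and types (distinct word forms) in a short text. It assumes a letters-only, lower-cased definition of a token, which may not match the AntConc settings used for the slides, and the sample sentence is made up for the example.

```python
import re
from collections import Counter

def word_list(text):
    """Return (token count, type count, frequency table) for a text.

    Assumption: a token is a run of ASCII letters, compared in lower case.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    freq = Counter(tokens)
    return len(tokens), len(freq), freq

# Illustrative sample only, not taken from the Drafts or Corrected corpus.
sample = "The reason for that was because the table was ordered by frequency"
n_tokens, n_types, freq = word_list(sample)
print(n_tokens, "tokens,", n_types, "types")
```

Running the same function over the drafts and over the corrected reports would give a first, coarse comparison of the two corpora before any closer reading.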
-
DATASET 2: ACORN usage
monitoring programs
1. Createlog: the original monitor program
started 05/06/07 when ACORN was first released to staff/students
but only recorded concordance searches
Designed to allow download as Excel file
but dataset has now outgrown the Excel maximum record limit (c. 62k lines?)
-
DATASET 2: ACORN usage
monitoring programs
2. Monitor Log: records ALL queries within ACORN (i.e. frequencies, etc. as well as concordance)
but only started on 13/03/08 - written by Steven!
Was also supposed to allow saving as Excel file
But the Excel download does not work: it creates a file, but with only one line of data, always the same one!
-
Extracting Steven's searches
from the ACORN usage monitor logs
This was fairly straightforward, sorting the log
files on the username column
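The extraction step can also be scripted rather than done by sorting. The sketch below is only an illustration: it assumes each log record is a whitespace-separated line of record number, username, language, database, search term and date (the shape of the rows shown later in the talk), that search terms are single words, and that "chenz" is the username of interest. The field names and the third sample row are invented for the example.

```python
from typing import NamedTuple

class LogRow(NamedTuple):
    # Field names are hypothetical; the real log columns may be named differently.
    record_no: int
    username: str
    language: str
    database: str
    term: str
    date: str

def parse_row(line):
    """Split one whitespace-separated log line into its six fields."""
    record_no, username, language, database, term, date = line.split()
    return LogRow(int(record_no), username, language, database, term, date)

def searches_by_user(lines, username):
    """Keep only the log rows recorded for the given username."""
    return [row for row in map(parse_row, lines) if row.username == username]

log_lines = [
    "1376 chenz English eng_general_db reason 25/10/2007",
    "1377 chenz English eng_general_db compiled 25/10/2007",
    "9001 other_user English eng_general_db corpus 25/10/2007",  # invented row
]
selected = searches_by_user(log_lines, "chenz")
print([row.term for row in selected])
```

Multi-word search terms would break the simple `split()` used here, so a real export would need a proper delimiter (for example tab-separated fields).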
-
Aligning Steven's work and ACORN usage
for detailed analyses
This was slightly trickier!
START/END DATES of Steven's work: 06/08/07 - 12/06/08
Week 1 draft report = 06/08/07
Week 42 draft report = 29/05/08
[Week 45 draft report = 13/06/08]
Week 1 corrected report = 07/08/07
Week 42 corrected report = 12/06/08
BUT
(1) Corrected versions were often submitted in batches, whenever
Steven found the time in between his ACORN programming tasks,
hence the detailed analyses are also initially conducted in batches
(2) Change in ACORN usage monitoring program: As Steven only
launched the monitor log program on 13/03/08, I can only check
Steven's use of Concordances (and no other features) before that date
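The second constraint amounts to a date comparison against 13/03/08. A minimal sketch of that check follows, assuming dates are written day/month/year with either a two-digit or four-digit year, as they appear on these slides; the function names are made up for the illustration.

```python
from datetime import date, datetime

# Date on which the Monitor Log was launched, as stated in the talk.
MONITOR_LOG_START = date(2008, 3, 13)

def parse_uk_date(text):
    """Parse dd/mm/yy or dd/mm/yyyy into a date object."""
    for fmt in ("%d/%m/%y", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError("unrecognised date: " + text)

def coverage(day):
    """Say which monitoring data exists for a given day."""
    if day < MONITOR_LOG_START:
        return "Createlog only"
    return "Createlog + Monitor Log"

print(coverage(parse_uk_date("06/08/07")))  # Week 1 draft report
print(coverage(parse_uk_date("29/05/08")))  # Week 42 draft report
```

Tagging each report date this way makes it explicit which weeks can only be analysed through concordance searches.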
-
Week twelve
I updated the contents of the tutorial and case studies files by following Ramesh's corrections and then recreated them in new designed layout. And finally, I uploaded them to the server in order to allow Ramesh to them to show Professor Alison Halstead.
For the existing parallel text files on the server, there are marks between paragraphs, where # indicates the paragraph number, so that the parallel indexing program knows that where a new paragraph starts and what the paragraph number is.
However, for the new parallel indexing program which compiled last week, it recognizes a new paragraph by an empty line of String
and then increments the paragraph number by 1. The reason for why I did it this way was because if it gave me the correct paragraph
number, then I would not have to run the paraAlign.java program to produce the marks before running the parallel
indexing program, this would shorten the time required for the whole indexing processes.
The contents of the new created databas were changed slightly after using the new compiled program. The sequence of the values
under the field ID in table tokens used to be in numerical order, from 1 to the number of total tokens in the file. But after the new
compiled program was used, the sequence was not in numerical order. The reason for that was because the table's contents were
ordered by the frequency of tokens, which means the most frequent word appeared on the top of the table rather than the first token in
the file.
To test whether the new database could work properly with the parallelResult.php file, I had to upload the database and the parallel
text files from localhost to the live server and then move the existing parallel text files on the server to a different directory so that only
the new uploaded files were read, and finally test them by using the parallel function on the website. Unfortunately the test result
suggested that there were some problems because no text was shown on the parallelResult web page. While I was thinking what the
problems may be, I emailed Husman to explain what I have done and what the result was, to see if he knew what had gone wrong. The
reasons that I could think of were either there was something else that I had not yet done or the values under the field ID had to be in
numerical order. But I did not think the possibilities were high for both of the reasons.
I had a look at the parallelResult.php file and tried to find out what commands were used to retrieve the data from the database. But I have not resolved anything yet.
Steven's draft
-
Steven's draft with Ramesh's green highlights
-
Createlog ONLY: 05/06/07 - 12/03/08
Createlog + Monitor Log: 13/03/08 - 12/06/08

Week 12 draft: 25/10/07
Week 12 corrected: 26/10/07

Items highlighted by Ramesh in draft | Items searched in ACORN
following Ramesh's corrections | corrections
I did not think the possibilities were high for both of the reasons | reason
the new compiled program | compiled
the new uploaded files | [corrected by analogy?]
The reason for that was because | reason

No.   Username  Language  Database        Search term  Date        Note
1346  chenz     English   eng_general_db  research     18/10/2007  NOT IN WEEK 12 DRAFT
1359  chenz     English   eng_general_db  negative     21/10/2007  NOT IN WEEK 12 DRAFT
1366  chenz     English   eng_general_db  the          23/10/2007
1376  chenz     English   eng_general_db  reason       25/10/2007
1377  chenz     English   eng_general_db  compiled     25/10/2007
1378  chenz     English   eng_general_db  numerical    25/10/2007
1379  chenz     English   eng_general_db  may          25/10/2007
1380  chenz     English   eng_general_db  might        25/10/2007
1381  chenz     English   eng_general_db  top          25/10/2007
1382  chenz     English   eng_general_db  reason       25/10/2007
1383  chenz     English   eng_general_db  the          25/10/2007
1394  chenz     English   eng_general_db  corrections  26/10/2007
1395  chenz     English   eng_general_db  webpage      26/10/2007  NOT IN WEEK 12 DRAFT
1396  chenz     English   eng_general_db  text         26/10/2007
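The pairing of highlighted items with search terms was done by hand for this slide. A first pass could perhaps be automated along the lines below. This is a sketch under stated assumptions, not the method used in the study: a search term is taken to match a highlighted phrase if it equals one of the phrase's words or that word minus a final "s", and the very frequent word "the" is ignored so that it does not match every phrase. The function name and the matching rule are hypothetical.

```python
import re

def match_items(highlighted, searched, ignore=frozenset({"the"})):
    """Map each highlighted phrase to the searched terms that occur in it."""
    result = {}
    for phrase in highlighted:
        words = re.findall(r"[a-z]+", phrase.lower())
        hits = []
        for term in searched:
            if term in ignore or term in hits:
                continue
            # Crude plural handling: "reason" also matches "reasons".
            if any(w == term or w == term + "s" for w in words):
                hits.append(term)
        result[phrase] = hits
    return result

highlighted = [
    "following Ramesh's corrections",
    "I did not think the possibilities were high for both of the reasons",
    "the new compiled program",
    "the new uploaded files",
    "The reason for that was because",
]
# Search terms logged between 23/10/2007 and 26/10/2007, in log order.
searched = ["the", "reason", "compiled", "numerical", "may", "might",
            "top", "reason", "the", "corrections", "webpage", "text"]

matches = match_items(highlighted, searched)
for phrase, terms in matches.items():
    print(phrase, "->", terms or "no search found")
```

A phrase with no matching search, such as "the new uploaded files", is exactly the kind of case that still needs human judgement (here, a possible correction by analogy).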
-
Week twelve
I updated the contents of the tutorial and case studies files by implementing the corrections that Ramesh suggested, and then recreated them in a newly designed layout. And finally, I uploaded them to the server in order to allow Ramesh to show them to Professor Alison Halstead.
For the existing parallel text files on the server, there are marks between paragraphs, where # indicates the paragraph number, so that the parallel indexing program knows that where a new paragraph starts and what the paragraph number is. However, the new
parallel indexing program (compiled last week) recognizes a new paragraph by empty lines, and then increments the paragraph number by 1.
The reason for doing it this way was that if it gave me the correct paragraph number, then I would not have to run the paraAlign.java program
to produce the marks before running the parallel indexing program. This would shorten the time required for the whole
indexing process.
The contents of the newly created database were changed slightly after using the newly compiled program. The values under the field ID in
table tokens used to be in numerical order, from 1 to the number of total tokens in the file. But after the newly compiled program was used,
the sequence was not in numerical order. That was because the table's contents were ordered by the frequency of tokens, which means the
most frequent word appeared at the top of the table rather than the first token in the file.
To test whether the new database could work properly with the parallelResult.php file, I had to upload the database and the parallel text files
from localhost to the live server and then move the existing parallel text files on the server to a different directory so that only the newly
uploaded files were read, and finally test them by using the parallel function on the website. Unfortunately the test result suggested that there
were some problems because no text was displayed on the parallelResult webpage. While I was thinking what the problems might be, I
emailed Husman to explain what I have done and what the result was, to see if he knew what had gone wrong. The reasons that I could think
of were either there was something else that I had not yet done or the values under the field ID had to be in numerical order. But I did not
think the possibilities were high for either of the reasons.
I had a look at the parallelResult.php file and tried to find out what commands were used to retrieve the data from the database. But I have not resolved anything yet.
Steven's corrected version
-
NEXT STEPS:
I need to search the same items that Steven searched, and try to work out, by following his
search path, which screen displays could have led him to make successful corrections. This will help
me to evaluate the query strategy he used,
think of quicker/better strategies, ways to improve the user interface, and helpfiles to train
users in successful search strategies