Shandong University, China
Sunday, October 14
Keynote presentation
Using keystroke logging to better define
writing fluency in L1 and L2
www.inputlog.net
Van Waes, L. (2012). Using keystroke logging to better define writing fluency in L1 and L2. Paper presented at the Eighth International Symposium on EFL Writing Research and Teaching in China, Shandong University, Jinan, China.
Luuk Van Waes | University of Antwerp | Department of Management | [email protected] | www.ua.ac.be/luuk.vanwaes
Luuk Van Waes – University of Antwerp | Belgium
The Eighth International Symposium on EFL Writing Research and Teaching in China
Introduction
Department of Management
Master of Multilingual Professional Communication
Teaching: Business Communication (Dutch)
Research group on ‘Writing and Digital Media’
• writing from multiple (digital) sources
• reading during writing
• speech recognition & live subtitling
• online Writing Center (Calliope: www.calliope.be)
• Journal of Writing Research
• WritingPro
Mariëlle Leijten, PhD
Flanders Research Foundation | [email protected] www.ua.ac.be/marielle.leijten
?
V. Berninger
J.R. Hayes
K. Schriver
R. Kellogg
M. Stevenson
D. Galbraith T. Olive G. Rijlaarsdam A. Wengelin E. Lindgren
Keystroke logging
State of the art
Inputlog
Fluency: L1 vs. L2
Methods in writing research
• product analysis
• thinking aloud protocols: simultaneous or retrospective
• second or triple task techniques: direct or indirect
• video recording
• eyetracking
• neuro imaging (fMRI – PET – ERP)
keystroke logging
KSL programs
• Scriptlog
• Inputlog
• Translog
• uLog
• TraceIt/JEdit
• Eyewrite
• Eye&Pen
Writing process research
Process observation provides data for research on:
• cognitive writing processes
• writing strategies
• writing development
• pausing and revision behavior
• translation processes
• live subtitling
• manuscript (sub)versioning
• L1 versus L2 writing
• etc.
photo finish
course of the stage
Picasso | the process
Um zu erfahren, was im Kopf eines Malers vor sich geht, genügt es, seine Hand zu beobachten.
To find out what is going on in a painter's head, it is enough to watch his hand.
Picasso | the product
Inputlog
Inputlog 5
Windows (e.g. MS Word)
Writing modes
• keyboard and mouse: movements & clicks
• speech: Dragon Naturally Speaking
• focus: window monitoring
Analyses
Graphs
Pre and post processing
Play‐back
Free download for researchers
Eric Van Horenbeeck | Robbe Block | Joris Roovers | Tom Pauwaert
o New document
o Other document
o Previous document
Session identification
Record
Analyse
Selection of file and destination
Analyses: summary, general, process graph, pause, linear, source, revision, S‐Notation, (linguistic)
Beginning writer
[Process graph: document length, cursor position, characters produced, and pauses plotted over the course of the writing session.]
Professional writer
[Process graph: document length, cursor position, characters produced, and pauses plotted over the course of the writing session.]
Beginning writer: characters produced vs. document length
[Process graph of the beginning writer, repeated with the curves for characters produced and document length labelled.]
General logging file
Summary
Pause thresholds: > 2000 ms | > 200 ms
Linear text
S‐Notation
While composing a text, writers often reread their emerging text in order to comp{l}²ete what they intended to say, but when they detect an er[roor]¹|₁ror they sometimes prioritize error correction.|₂
REVISIONS
1: Deletion […]
2: Insertion {…}
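The S‐Notation example can be read mechanically: deletions in square brackets disappear, insertions in curly brackets stay, and the numbered break markers are dropped. A minimal Python sketch of that reading follows. It assumes revision indices written as plain ASCII digits directly after the bracket (for example comp{l}2ete); the function name `final_text` is hypothetical and is not part of Inputlog.

```python
import re

# Simplified S-notation markers with plain ASCII indices:
# [deleted]n, {inserted}n, and |n for a break in linear production.
DELETION = re.compile(r"\[[^\[\]{}]*\]\d+")
INSERTION = re.compile(r"\{([^\[\]{}]*)\}\d+")
BREAK = re.compile(r"\|\d+")


def final_text(s_notation: str) -> str:
    """Resolve an S-notation string to the text left after all revisions."""
    text = BREAK.sub("", s_notation)
    previous = None
    # Repeat so that nested revisions are resolved from the inside out.
    while text != previous:
        previous = text
        text = DELETION.sub("", text)       # deleted text is dropped
        text = INSERTION.sub(r"\1", text)   # inserted text is kept
    return text


example = (
    "in order to comp{l}2ete what they intended to say, "
    "but when they detect an er[roor]1|1ror they sometimes "
    "prioritize error correction.|2"
)
print(final_text(example))
```

The sketch is a simplification: text that happens to start with a digit right after a closing bracket would be misread as a revision index.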
Play
• Start ‐ stop
• Realtime or percentage of realtime speed
• Stepwise forward/backward (revision based)
• To end/beginning of document
中国 (China)
Collaboration to adapt Inputlog to Chinese script
● Antwerp
At the moment: only Roman and Greek script
Fluency: L1 vs. L2
Research study: Fluency in L1 vs L2
participants        68
proficiency in L2   EFR level C1
two tasks           L1 + L2, expository (knowledge telling)
time on task        10 min max
observation         Inputlog 5.0 ~ [230 000 observations]
analysis            GLM repeated measures
Study is used as a stepping stone to further define fluency.
L1 L2
Fluency
Distance 30km: Average speed of 25 km per hour
Characters per minute
            L1     L2
process     225    188    p < .01
product     180    146    p < .01
Ratio product/process
[Bar chart, L1 vs. L2, vertical axis 70 to 82. Categories: characters (incl. spaces), characters (excl. spaces), words. Data labels in chart order: L1 77.9, 79.6, 79.6; L2 75.3, 77.4, 77.4.]
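The product/process ratio relates the characters that survive in the final text to all characters typed during the process. The sketch below assumes the four bar labels on the "Characters per minute" slide read as process 225 (L1) and 188 (L2) and product 180 (L1) and 146 (L2); that assignment is an inference from the flattened chart, and the function name is hypothetical.

```python
def product_process_ratio(product_cpm: float, process_cpm: float) -> float:
    """Share of the characters typed per minute that end up in the final text."""
    if process_cpm <= 0:
        raise ValueError("process_cpm must be positive")
    return product_cpm / process_cpm


# Assumed reading of the chart labels (characters per minute).
l1_ratio = product_process_ratio(product_cpm=180, process_cpm=225)
l2_ratio = product_process_ratio(product_cpm=146, process_cpm=188)

print(f"L1: {l1_ratio:.1%}  L2: {l2_ratio:.1%}")  # L1: 80.0%  L2: 77.7%
```

These group-level figures are close to, but not identical with, the ratios shown on the ratio chart, which may have been computed per writer before averaging.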
Indicators for fluency
product | process
• duration; length in words / characters per min
ratio process / product
Different routes
Distance 30 or 40km: Average speed of 30 km per hour
Number of pauses per min (log scale)
                 L1     L2
all              251    213
> 200 ms         60     66
> 500 ms         17     20
> 1000 ms        6.3    7.3
> 2000 ms        2.4    2.9
200<500 ms       42     49
200<1000 ms      48     56
all pauses: L1 > L2 | pauses above a threshold: L1 < L2
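Counting pauses per minute at several thresholds can be sketched as follows. The sketch assumes the log has already been reduced to a list of inter-keystroke intervals in milliseconds and that the session length equals the sum of those intervals; a threshold of 0 stands for the "all" category. The function name and the input format are assumptions, not Inputlog's own output format.

```python
from typing import Dict, Sequence


def pauses_per_minute(
    intervals_ms: Sequence[float],
    thresholds_ms: Sequence[int] = (0, 200, 500, 1000, 2000),
) -> Dict[int, float]:
    """Number of inter-keystroke intervals above each threshold, per minute."""
    total_minutes = sum(intervals_ms) / 60_000
    if total_minutes <= 0:
        raise ValueError("the session must have a positive duration")
    return {
        threshold: sum(1 for iv in intervals_ms if iv > threshold) / total_minutes
        for threshold in thresholds_ms
    }


# Made-up session of 30 seconds: fluent typing, some hesitations, three long pauses.
session = [150] * 100 + [600] * 10 + [3000] * 3
rates = pauses_per_minute(session)
for threshold, rate in rates.items():
    print(f"> {threshold} ms: {rate:.0f} per min")
```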
Percentage of Pause Time
[Bar chart, L1 vs. L2, vertical axis 0 to 80 %, at pause thresholds > 200 ms, > 500 ms, > 1000 ms, > 2000 ms. Effect sizes shown in the chart: η² = .245, η² = .152, η² = .689, η² = .425.]
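The percentage of pause time at a given threshold can be sketched as the share of total process time spent in intervals longer than that threshold. As an assumption, the input is a list of inter-keystroke intervals in milliseconds whose sum is the total process time; the function name is hypothetical.

```python
from typing import Sequence


def pause_time_percentage(intervals_ms: Sequence[float], threshold_ms: float) -> float:
    """Percentage of total process time spent in pauses longer than the threshold."""
    total = sum(intervals_ms)
    if total <= 0:
        raise ValueError("the session must have a positive duration")
    paused = sum(iv for iv in intervals_ms if iv > threshold_ms)
    return 100 * paused / total


# Made-up session of 30 seconds.
demo_intervals = [150] * 100 + [600] * 10 + [3000] * 3
for threshold in (200, 500, 1000, 2000):
    share = pause_time_percentage(demo_intervals, threshold)
    print(f"> {threshold} ms: {share:.0f} % of process time")
```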
Pauses per level (ms ‐ log reconverted)
                     L1     L2
within words         130    150
between words        308    377
between sentences    829    891
significant at all levels
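"Log reconverted" is read here as: pause times are log-transformed, averaged, and the mean is converted back to milliseconds, which yields a geometric mean. That reading is an assumption about the slide, and the function name and the sample data below are hypothetical.

```python
import math
from typing import Dict, Sequence


def log_reconverted_mean(pauses_ms: Sequence[float]) -> float:
    """Mean of the log-transformed pauses, converted back to milliseconds."""
    if not pauses_ms:
        raise ValueError("at least one pause is required")
    logs = [math.log(p) for p in pauses_ms]  # pauses must be positive
    return math.exp(sum(logs) / len(logs))


# Made-up pauses (ms), grouped by the text level at which they occur.
pauses_by_level: Dict[str, Sequence[float]] = {
    "within words": [100, 120, 160, 400],
    "between words": [250, 300, 900],
    "between sentences": [600, 800, 2500],
}
for level, pauses in pauses_by_level.items():
    print(f"{level}: {log_reconverted_mean(pauses):.0f} ms")
```

Because pause times are strongly right-skewed, this mean is pulled less by a few very long pauses than the arithmetic mean would be.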
Indicators for fluency
product | process
• duration; length in words / characters per min
• pauses: length, number * [level] * [thresholds]
• percentage of pause time * [thresholds]
ratio process / product
Flat and hilly roads
Distance 40km: Average speed of 30 km per hour
… and resting
… and taking the wrong road
Distance 35km: Average speed of 20 km per hour
… and return
Distance 110km: Average speed of 90 km per hour
Indicators for fluency
product | process
• duration; length in words / characters per min
• pauses: length, number * [level] * [thresholds]
• percentage of pause time * [thresholds]
• P‐Bursts and R‐Bursts: length and number * [thresholds]
• intervals
ratio process / product
Process graph
Process: absolute benchmark
[Chart: values from 0 to 100 for each of the 10 intervals, L1 vs. L2.]
Interval benchmarking
1. Theoretical optimum
2. Personal optimum
10 intervals | 10 min | 38:20 min
interval             1      2      3      4      5      6      7      8      9      10
char per interval    576    738    630    900    792    468    936    432    144    414
char. per min        150    193    164    235    207    122    244    113    38     108
theor. optimum       400    400    400    400    400    400    400    400    400    400
perc. ~ opt.         0.38   0.48   0.41   0.59   0.52   0.31   0.61   0.28   0.09   0.27
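The figures on this slide can be reproduced with a short calculation, assuming that 38:20 min (2300 s) is the total process time divided into 10 equal intervals and that 400 characters per minute is the theoretical optimum. Both readings are inferred from the numbers themselves, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def interval_benchmark(
    chars_per_interval: Sequence[int],
    process_seconds: float,
    optimum_cpm: float = 400,
) -> List[Tuple[int, float]]:
    """Per interval: characters per minute and the proportion of the optimum."""
    minutes_per_interval = process_seconds / 60 / len(chars_per_interval)
    rows = []
    for chars in chars_per_interval:
        cpm = chars / minutes_per_interval
        rows.append((round(cpm), round(cpm / optimum_cpm, 2)))
    return rows


slide_chars = [576, 738, 630, 900, 792, 468, 936, 432, 144, 414]
rows = interval_benchmark(slide_chars, process_seconds=38 * 60 + 20)
for number, (cpm, proportion) in enumerate(rows, start=1):
    print(f"interval {number}: {cpm} char. per min, {proportion:.2f} of optimum")
```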
Interval benchmarking
2. Personal optimum
• divide process in 10 equal intervals
• allocate number of characters produced in each interval
• recalculate to characters per min
• define personal optimum
• calculate proportion
define personal optimum
• divide process in periods of 10 sec
• calculate characters produced within this period
• calculate moving average (3 periods) * 6
• identify maximum
characters per 10-sec period: 32 41 35 50 44 26 52 24 8 23 38 23 18
moving average (3 periods), first four values: 36 42 43 40
personal optimum: maximum 43 * 6 = 258 characters per min
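The personal optimum steps can be sketched in a few lines, using the thirteen 10-second counts shown on the slide. The function names are hypothetical; the window of 3 periods and the factor 6 (six 10-second periods per minute) come from the slide.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Simple moving average over consecutive windows of the given size."""
    if len(values) < window:
        raise ValueError("need at least one full window")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def personal_optimum(chars_per_period: Sequence[float],
                     window: int = 3,
                     periods_per_min: int = 6) -> float:
    """Highest moving average of characters per period, scaled to one minute."""
    return max(moving_average(chars_per_period, window)) * periods_per_min


slide_periods = [32, 41, 35, 50, 44, 26, 52, 24, 8, 23, 38, 23, 18]
print(moving_average(slide_periods)[:4])   # [36.0, 42.0, 43.0, 40.0]
print(personal_optimum(slide_periods))     # 258.0
```

Each interval's characters per minute can then be divided by this value to obtain the percentage of the personal optimum.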
Personal percentage of personal optimum
[Chart: percentage of personal optimum (0 to 100) for each of the 10 intervals, L1 vs. L2.]
Indicators for fluency
product | process
• duration; length in words / characters per min
• pauses: length, number * [level] * [thresholds]
• percentage of pause time * [thresholds]
• P‐Bursts and R‐Bursts: length and number * [thresholds]
• intervals: absolute and personal optimum
• …
ratio process / product
more than 400 variables
Principal Component Analysis
Exploration of main components:
• converted selected L1 variables to Z‐Scores
• checked correlations (singularity)
• built components iteratively
• applied varimax rotation
Result: 4 components based on 50 variables, explaining 84 % of variance
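The first two preparation steps (Z‐scores and a correlation check for near-singular variable pairs) can be sketched with the standard library alone. The cutoff of 0.9, the variable names, and the sample values are assumptions for illustration; the component extraction and the varimax rotation themselves are not shown.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def z_scores(values: Sequence[float]) -> List[float]:
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation, computed from the standardized values."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def near_singular_pairs(variables: Dict[str, Sequence[float]],
                        cutoff: float = 0.9) -> List[Tuple[str, str]]:
    """Variable pairs whose absolute correlation exceeds the (assumed) cutoff."""
    return [
        (a, b)
        for a, b in combinations(variables, 2)
        if abs(pearson(variables[a], variables[b])) > cutoff
    ]


# Hypothetical fluency variables for four writers.
sample = {
    "characters_per_min": [100, 150, 200, 250],
    "words_per_min": [20, 30, 40, 50],
    "pauses_over_2000ms": [5, 1, 4, 2],
}
print(near_singular_pairs(sample))
```

Pairs flagged this way carry nearly the same information, so one variable of each pair would normally be dropped before components are built.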
Fluency components
1 Process: number of pauses/words/characters per min; intervals; P‐bursts
2 Pauses > 200 ms: pause length and number; pauses between and within words; pause interval 200<500<1000
3 Pauses > 2000 ms: pause length and number; pauses between words and sentences
4 Ratio product/process: word/characters product‐process ratio; R‐bursts
But, writing is not only ... “writing”!
Multiple sources
Distance 40km: Average speed of 20 km per hour
?
GPS
Tweet vs. Email
Writing process tweet
Legend: Production | Cursor Position | Sources
sources | document
Writing process tweet
other | subscription | map | program | right tree | left tree | top banner | visual | body text | content
Main page
Writing process e‐mail
novice vs. expert
Writing process e‐mail: fragmentation
Pictures developed by Nikki Van De Keere
Helen Gilles
Fragmentation index
[Chart with two scales: Fragmentation % (0 to 100) and Characters per burst (0 to 70). Labels: sources, revisions, pauses, intervals; P-Burst, R-Burst, S-Burst.]
Conclusion: Fluency index
Fluency can/should be described from different perspectives:
1. process
2. pauses > 200 ms
3. pauses > 2000 ms
4. process/product
5. fragmentation
Result: Fluency index
[Bar chart, L1 vs. L2, scale 0 to 100, for the five components: process, pauses > 200 ms, pauses > 2000 ms, process/product, fragmentation. Data labels in chart order: 60, 55, 60, 78, 34, 42, 74, 62, 87, 48.]
Further research
explore fluency index using other data sets (knowledge telling – knowledge transforming)
add linguistic process characteristics (Inputlog 6)
incorporate fluency index in Inputlog
... start collaboration with Chinese writing researchers.
Thank you
… and also
• Mariëlle Leijten (FWO post‐doc researcher)
• Jasmine Robbé (Master student Professional Communication)
• Nikki Van De Keere (Master in Professional Communication)
• Eric Van Horenbeeck (technical coordinator Inputlog)
• Tom Pauwaert (programmer Inputlog)
More information
Luuk Van Waes, University of Antwerp, Belgium
[email protected] ~ www.ua.ac.be/luuk.vanwaes
Mariëlle Leijten, Flanders Research Foundation, Belgium
[email protected] ~ www.ua.ac.be/marielle.leijten
www.writingpro.eu
www.inputlog.net
www.jowr.org
Number of pauses per level
[Four bar charts, one per pause threshold (> 200 ms, > 500 ms, > 1000 ms, > 2000 ms): number of pauses within words, between words, between sentences, and between paragraphs.]
Pause length per level
[Four bar charts, one per pause threshold (> 200 ms, > 500 ms, > 1000 ms, > 2000 ms), scale 0 to 5: pause length within words, between words, and between sentences.]
Process: Fluency percentage of personal optimum
[Chart: percentage of personal optimum (0 to 100) for each of the 10 intervals, L1 vs. L2.]
P‐Burst
Number of P‐Bursts (log scale)
        2000 ms    5000 ms
L1      21         6
L2      26         7
Length of P‐Bursts (log scale)
        2000 ms    5000 ms
L1      80         284
L2      55         232
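A P‐burst is taken here to be a stretch of text production that ends when a pause reaches the threshold (2000 ms or 5000 ms on the slide). The sketch below counts bursts and their length in keystrokes from a list of inter-keystroke intervals, where each interval is the pause before one keystroke. The function name and input format are assumptions, and revisions (R‐bursts) are ignored.

```python
from typing import List, Sequence


def p_bursts(intervals_ms: Sequence[float], threshold_ms: float = 2000) -> List[int]:
    """Lengths (in keystrokes) of production bursts separated by long pauses."""
    bursts: List[int] = []
    current = 0
    for interval in intervals_ms:
        # A pause at or above the threshold closes the burst typed so far.
        if interval >= threshold_ms and current > 0:
            bursts.append(current)
            current = 0
        current += 1  # the keystroke that follows this interval
    if current > 0:
        bursts.append(current)
    return bursts


# Made-up session: two long pauses split eight keystrokes into bursts.
keystroke_gaps = [100, 150, 2500, 120, 130, 140, 6000, 100]
for threshold in (2000, 5000):
    lengths = p_bursts(keystroke_gaps, threshold)
    mean_length = sum(lengths) / len(lengths)
    print(f"{threshold} ms: {len(lengths)} bursts, mean length {mean_length:.1f}")
```

Raising the threshold merges bursts, so the number of P‐bursts falls and their mean length rises, which matches the direction of the values on the slide.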
Number of R‐bursts
[Bar chart, categories 1 and 2: 57.71 | 60.26]
Length of R‐bursts
[Bar chart, categories 1 and 2: 30.92 | 24.96]