Experimental research design and methodology in TPR
PhD Course in Translation Process Research, Copenhagen, July 2014
Outline
- Research design – basic concepts
- Experimental tools and methods to collect translation process data
- Examples of experimental TPR studies
- Some practical considerations about carrying out experiments
Design
- Starting point: I-wonder-question
- Rephrase as research question/hypothesis
- Consider
  - What type of question/hypothesis it is
  - Sample and population
  - Which variables are involved
Design: Research question/hypothesis
- Formulate your I-wonder-question clearly and unambiguously
- Make it falsifiable
- Consider whether your starting point is a
  - Question: Is there a difference between students and professional translators in terms of ST reading?
  - Open hypothesis: There is a difference between students and professional translators in terms of ST reading
  - Directional hypothesis: There is a difference between students and professional translators in terms of ST reading, such that professionals spend less time on the ST
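
The difference between an open and a directional hypothesis can be made concrete with a permutation test. The sketch below is not part of the course material: the reading times are invented, and the function name `permutation_p_value` and its parameters are assumptions chosen for illustration.

```python
import random
from statistics import mean


def permutation_p_value(students, professionals, directional=False,
                        n_permutations=5000, seed=1):
    """Approximate p-value for a difference in mean ST reading time.

    directional=False: open hypothesis (there is some difference).
    directional=True:  directional hypothesis (professionals spend less time).
    """
    rng = random.Random(seed)
    observed = mean(students) - mean(professionals)
    pooled = list(students) + list(professionals)
    n_students = len(students)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_students]) - mean(pooled[n_students:])
        if directional:
            extreme += diff >= observed
        else:
            extreme += abs(diff) >= abs(observed)
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical ST reading times in seconds (invented for illustration).
students = [62.0, 58.5, 71.2, 66.0, 69.3, 60.1]
professionals = [48.2, 55.0, 51.7, 59.9, 46.3, 53.4]

p_open = permutation_p_value(students, professionals)
p_directional = permutation_p_value(students, professionals, directional=True)
print(f"open: p = {p_open:.3f}, directional: p = {p_directional:.3f}")
```

The directional version only counts reshuffles in which students are at least as much slower than professionals as observed, so it commits to a direction before the data are seen.
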
Design: Type of question/hypothesis
- Differences
  - Repeated measures: measuring the effect of some difference within one group, e.g.
    - same translators working under different conditions, or
    - over a period of time (longitudinal study)
  - Independent groups: difference between different groups doing the same task, e.g.
    - students vs. professionals
    - training vs. control group
- Functional relations
  - Between response and some manipulated variable
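
The choice between repeated measures and independent groups changes how a difference is computed. A minimal sketch, assuming a t statistic is the measure of interest; the function names and the task times are hypothetical and not taken from the course.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(condition_a, condition_b):
    """t statistic for repeated measures: the same participants in two conditions.

    Works on the per-participant differences, so stable individual
    differences between translators cancel out.
    """
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def welch_t(group_a, group_b):
    """t statistic for independent groups (Welch's version, unequal variances)."""
    standard_error = sqrt(stdev(group_a) ** 2 / len(group_a)
                          + stdev(group_b) ** 2 / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical task times in minutes (invented for illustration).
same_translators_task_1 = [21.0, 25.5, 19.0, 30.0, 27.5]
same_translators_task_2 = [18.5, 24.0, 17.0, 26.5, 25.0]
student_times = [28.0, 31.5, 26.0, 33.0, 29.5]
professional_times = [22.0, 24.5, 20.0, 27.0, 23.5]

print("repeated measures:", round(paired_t(same_translators_task_1, same_translators_task_2), 2))
print("independent groups:", round(welch_t(student_times, professional_times), 2))
```
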
Design: Sampling and population
- Inferential statistics assumes random sampling
- In practice, balance between randomness and possibility
- Consider population and sample
  - Which population does my question pertain to?
  - Is it realistic to sample from that population?
  - Could a realistic sample pertain to a different, but still relevant population?
Design: Variables
- Two important distinctions
  - Independent/explanatory (EV), dependent (DV) and control (CV) variables
  - Categorical and numerical variables
Design: DVs
- Dependent/response variable (DV): what you are measuring or counting, e.g.
  - Translation time
    - Overall
    - Individual fixations
  - Translation quality
  - Number of occurrences of e.g.
    - metaphors
    - specific syntactic constructions
    - …
Design: EVs
- Independent/explanatory variables (EVs): variables which according to your hypothesis may have an effect on your DV
- Also called predictors
- Types
  - Item-related: e.g. task difficulty, translation direction, translation tool
  - Participant-related: e.g. sex/gender, L1, professional status, L2 experience
Design: CVs
- Control variables (CVs): variables to control in order to be sure that the EV is responsible for the effect on the DV
  - Experimental control
  - Statistical control
- Avoid confounds
Design: Categorical and numerical variables
- Categorical
  - Unordered categories (nominal): e.g. sex/gender, word class
  - Ordered categories (ordinal): e.g. lower/middle/upper class
- Numerical
  - Discrete
    - Integers, finite values
    - E.g. counts of words in a corpus
  - Continuous
    - Real numbers, infinitely many values on scale
    - E.g. reading time
Design: Categorical and numerical variables
Translation experience may be construed as
- Nominal scale: student / professional
- Ordinal scale: beginning student / advanced student / professional
- Discrete numerical: number of years of experience (1, 2, 3, 4…)
- Continuous numerical: amount of experience (time, output)
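
The four construals of translation experience can be written down as four encodings of the same participant. This is an illustrative sketch only: the `Participant` class, its field names and the use of hours translated as the continuous measure are assumptions, not part of the course material.

```python
from dataclasses import dataclass

# Ordered categories: position in the list gives the ordinal rank.
ORDINAL_LEVELS = ["beginning student", "advanced student", "professional"]


@dataclass
class Participant:
    status: str              # one of ORDINAL_LEVELS
    years_experience: int    # discrete numerical
    hours_translated: float  # continuous numerical (hypothetical measure)

    @property
    def nominal(self) -> str:
        """Nominal scale: student vs. professional, no order implied."""
        return "professional" if self.status == "professional" else "student"

    @property
    def ordinal_rank(self) -> int:
        """Ordinal scale: 0 = beginning student, 2 = professional."""
        return ORDINAL_LEVELS.index(self.status)


p = Participant(status="advanced student", years_experience=2, hours_translated=310.5)
print(p.nominal, p.ordinal_rank, p.years_experience, p.hours_translated)
```

Which encoding is used decides which questions can be asked: the nominal version supports a group comparison, while the numerical versions support questions about how the DV changes with experience.
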
Design: Categorical and numerical variables
Important ramifications for
- The questions asked
- The type of statistical test to be applied
Experimental TPR tools and methods
- eye-tracking
- keylogging
- audio recording (in Translog)
- (video recording)
- (think-aloud protocols)
- retrospective interviews and questionnaires
Eye-tracking
- eye-mind assumption (Just and Carpenter 1980)
- cognitive attention
- cognitive load
- areas of interest (AOI)
- eye-tracking measures
  - fixation count
  - total gaze time
  - fixation duration
  - pupil dilation
  - eye movements (transitions, attention shifts)
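
Several of the eye-tracking measures listed above can be derived from a simple list of fixations. A minimal sketch, assuming each fixation has already been assigned to an AOI (here "ST" or "TT"); the data and function names are hypothetical, and real eye-tracker exports contain far more fields.

```python
from collections import defaultdict

# Each fixation: (AOI label, duration in ms). Invented for illustration.
fixations = [("ST", 210), ("ST", 180), ("TT", 320),
             ("ST", 250), ("TT", 290), ("TT", 410)]


def aoi_measures(fixations):
    """Fixation count, total gaze time and mean fixation duration per AOI."""
    durations = defaultdict(list)
    for aoi, duration in fixations:
        durations[aoi].append(duration)
    return {
        aoi: {
            "fixation_count": len(values),
            "total_gaze_time": sum(values),
            "mean_fixation_duration": sum(values) / len(values),
        }
        for aoi, values in durations.items()
    }


def count_transitions(fixations):
    """Number of attention shifts, i.e. consecutive fixations in different AOIs."""
    labels = [aoi for aoi, _ in fixations]
    return sum(1 for previous, current in zip(labels, labels[1:])
               if previous != current)


measures = aoi_measures(fixations)
print(measures["ST"], measures["TT"], count_transitions(fixations))
```
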
Keylogging
- transient versions of target text
- revision/editing
- navigation
- pauses
- production speed
- final target texts
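
Pauses and production speed can be computed from timestamped keystrokes. The sketch below assumes a simplified log of (timestamp, character) pairs and a pause threshold of 1000 ms; both are assumptions for illustration and do not reflect Translog's actual log format or any threshold recommended in the course.

```python
# (timestamp in ms, character typed). Invented for illustration.
keystrokes = [(0, "D"), (180, "e"), (350, "t"), (2900, " "),
              (3050, "e"), (3200, "r"), (6400, " ")]


def pauses(keystrokes, threshold_ms=1000):
    """Inter-keystroke intervals at or above the threshold, in ms."""
    times = [t for t, _ in keystrokes]
    return [later - earlier for earlier, later in zip(times, times[1:])
            if later - earlier >= threshold_ms]


def production_speed(keystrokes):
    """Characters per minute between the first and the last keystroke.

    Assumes at least two keystrokes with different timestamps.
    """
    duration_ms = keystrokes[-1][0] - keystrokes[0][0]
    return len(keystrokes) / (duration_ms / 60000)


print(pauses(keystrokes), production_speed(keystrokes))
```

The threshold matters: a lower value turns ordinary typing rhythm into "pauses", so it is worth fixing it in the experiment protocol before analysing the data.
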
Audio recording (available in Translog)
- oral translations
- think-aloud comments
Questionnaires/retrospective interviews
- language background
- professional background
- perception of source text difficulty
- perception of different tasks
- translation challenges experienced
- etc.
Assessing the product
- translation quality assessments
- examination of translation of individual words (e.g. metaphors, terminology, specific word classes, number of alternative translation solutions, etc.)
Example 1: The Process of Post-Editing: a Pilot Study
Example 1: goal
- to find out (‘I wonder’) how translators with no post-editing training would perform when asked to post-edit MT-produced output, compared with a group of translators who translated the same texts manually, without any dictionary or technical assistance
Example 1: research questions
- what are the differences in quality between manual translations and post-edited MT output?
- do more corrections lead to higher quality in the post-edited texts?
- what are the time differences between manual translations and post-editing?
- what are the differences in allocation of cognitive resources between manual translation and post-editing?
Example 1: design/set-up
- experimental research design with manipulation of circumstances to measure the effect on participants’ behaviour
- lab environment simulating natural conditions
- translation rankings
Example 1: variables
- Dependent/response variables
  - translation time
  - translation quality
  - allocation of cognitive resources (ST vs. TT)
- Independent/explanatory variable
  - translation mode (manual translation vs. post-editing)
Example 1: variables
- Control variables
  - using the same participants with the same text for both tasks might have created an unintended repetition effect
  - using the same participants but different texts might have created an unintended effect of textual differences (e.g. one text more difficult than the other)
Example 1: experiments
- Modes
  - manual translation and post-editing
- Participants
  - 8 translators and 7 post-editors
- Texts
  - three English source texts (same for both groups)
Example 1: experiments
- one group of participants translated three texts (from scratch) from English into Danish, and
- one group of participants post-edited machine-translated (Google Translate) Danish versions of the same three source texts
Example 1: tools/methods
- eye-tracking
  - allocation of cognitive resources (total gaze time on ST vs. TT)
- keylogging
  - task time
  - keystrokes (edit distance)
  - final output
- translation evaluations
  - translation quality
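
Edit distance can be illustrated with the standard Levenshtein distance between raw MT output and its post-edited version. This is a sketch under assumptions: the slides do not say which edit-distance measure or unit (characters, words, keystrokes) the study used, and the example strings are invented.

```python
def edit_distance(source: str, target: str) -> int:
    """Character-level Levenshtein distance (insertions, deletions, substitutions)."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical MT output and post-edited version (invented for illustration).
mt_output = "Regeringen har besluttet at hæve skat"
post_edited = "Regeringen har besluttet at hæve skatten"
print(edit_distance(mt_output, post_edited))
```

A distance of this kind counts the minimum number of changes between two texts, which can be lower than the number of keystrokes a post-editor actually produced.
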
Example 1: quality assessment
- QA method and procedure
  - 7 evaluators
  - presentation of source sentence together with four candidate translations
  - two translations had been produced using manual translation and two using post-editing (randomised and blinded)
  - evaluators were instructed to rank candidate translations from best to worst quality (ties permitted)
- inter-rater and intra-rater agreement
  - did evaluators agree with each other?
  - were evaluators consistent in their rankings?
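
One common way to quantify agreement between two raters is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The slides do not state which agreement measure the study used, so this is a swapped-in illustration; the ranks are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same items.

    Undefined (division by zero) when chance agreement is 1, i.e. when both
    raters use a single identical category throughout.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ranks (1 = best) given to the same eight candidate translations.
evaluator_1 = [1, 2, 2, 3, 4, 1, 3, 4]
evaluator_2 = [1, 2, 3, 3, 4, 2, 3, 4]
print(round(cohens_kappa(evaluator_1, evaluator_2), 2))
```

The same function can be applied to one evaluator's first and repeated rankings of the same items to look at intra-rater consistency.
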
Example 1: design weaknesses
- sample size
- participant qualifications (not all worked as professional translators)
- quality assessments (assessment task too difficult, inter-rater and intra-rater agreement too low)
Example 2: Speaking your translation: students’ first encounter with speech recognition technology
Example 2: goal
- to measure the impact on the translation process and product of using an automatic speech recognition (ASR) system compared with typing a translation and producing a sight translation without ASR
- to measure the effect of training/practice with ASR on task time and quality of translations produced with ASR
Example 2: research questions (quantitative)
- What are the task times in the three translation modalities (written, sight, ASR)?
- Is there any difference in translation quality in the three modalities?
- Is there any difference in cognitive load?
- What is the effect on time and quality of participants training the system and gaining more experience using it?
Example 2: research questions (qualitative)
- What are the students’ own perceptions of working with an ASR system?
- What kind of strategies are employed by students who experience positive effects on time and quality?
Example 2: design/set-up
- experimental research design with manipulation of circumstances to measure the effect on participants’ behaviour
- lab environment
- analysis of process and product
- longitudinal study
- experimental group compared with control group
- qualitative analyses
Example 2: variables
- Dependent/response variables
  - translation time
  - translation quality
  - cognitive load (average fixation durations)
- Independent/explanatory variables
  - translation mode (written, sight, ASR)
  - training period
Example 2: variables
- Control variables
  - texts had to be as similar as possible to ensure that process/product differences across translation tasks were caused by the mode and not by the text
  - sequence of presentation was rotated to ensure that differences between the written and oral modalities were owing to the translation mode and not, for instance, to varying levels of difficulty
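
Rotating the sequence of presentation can be sketched as a simple cyclic rotation in which every mode appears once in every position. The exact rotation scheme used in the study is not given in the slides, so the function below and the participant labels are assumptions for illustration.

```python
def rotated_orders(conditions):
    """All cyclic rotations of the conditions (a basic Latin square)."""
    n = len(conditions)
    return [[conditions[(start + i) % n] for i in range(n)]
            for start in range(n)]


modes = ["written", "sight", "ASR"]
orders = rotated_orders(modes)

# Hypothetical participant labels; participant k gets order k modulo 3.
participants = [f"P{k:02d}" for k in range(1, 8)]
assignment = {p: orders[k % len(orders)] for k, p in enumerate(participants)}

for participant, order in assignment.items():
    print(participant, order)
```

A cyclic rotation balances position, but not which mode immediately precedes which; if carry-over between modes is a concern, a fully counterbalanced design uses all six possible orders.
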
Example 2: experiments
- participants
  - 14 translation students divided into two groups of seven: an experimental (training) group and a control group
- modes
  - written translation
  - sight translation
  - sight translation with speech recognition
- text
  - text excerpts taken from the same longer text to ensure the highest possible level of similarity
Example 2: experiments
Longitudinal study:
- phase 1 (baseline): all participants translated texts under three different conditions
- interim period: half of the participants (experimental group) worked with the ASR program at home (partly under controlled conditions) and the other half did not (control group)
- phase 2 (follow-up): all participants translated texts under three different conditions (similar to phase 1), and results from experimental group were compared with control group and related to phase 1
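
Comparing the experimental group with the control group and relating both to phase 1 amounts to comparing gain scores. A minimal sketch, assuming task time in the ASR condition as the measure; the numbers and function names are invented and do not report the study's results or its actual analysis.

```python
from statistics import mean


def mean_gain(phase1, phase2):
    """Mean change per participant from phase 1 to phase 2."""
    return mean([after - before for before, after in zip(phase1, phase2)])


def training_effect(exp_phase1, exp_phase2, ctl_phase1, ctl_phase2):
    """Change in the experimental group beyond the change in the control group."""
    return mean_gain(exp_phase1, exp_phase2) - mean_gain(ctl_phase1, ctl_phase2)


# Hypothetical ASR task times in minutes, one value per participant.
experimental_phase1 = [14.0, 16.5, 12.0, 15.5, 13.0, 17.0, 14.5]
experimental_phase2 = [10.5, 13.0, 10.0, 12.5, 11.0, 13.5, 11.5]
control_phase1 = [15.0, 13.5, 16.0, 14.0, 12.5, 15.5, 14.5]
control_phase2 = [14.0, 13.0, 15.5, 13.0, 12.0, 14.5, 14.0]

effect = training_effect(experimental_phase1, experimental_phase2,
                         control_phase1, control_phase2)
print(f"training effect on ASR task time: {effect:.2f} minutes")
```

Subtracting the control group's gain removes changes that would have happened anyway, such as familiarity with the lab setting in the second session.
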
Example 2: tools/methods
- eye-tracking
  - cognitive load (fixation duration)
- Translog
  - timings
  - oral and written output
  - transient versions of oral and written translations
- evaluations of translation output
  - translation quality
- retrospective interviews
  - students’ perceptions of ASR vs. written/sight
Example 2: quality assessment
- QA method and procedure
  - 3 evaluators
  - each evaluator assessed all three texts for all 14 participants (blinded with respect to mode)
  - global scores were given on a scale from 1 to 5
  - comments were provided to back up scores
- inter-rater agreement
  - did evaluators agree with each other?
Example 2: retrospective interviews
- All participants
  - general impression of using ASR
  - benefits and drawbacks/problems
- Experimental group (training group)
  - total training time
  - type of texts produced using ASR
  - general impression
  - problems encountered
Example 2: case studies
- small samples weaken the validity of quantitative results
- large individual differences between translators (individual translator profiles)
- in-depth analyses are extremely time-consuming
- identification of participants who confirm (reject) the hypothesis
- detailed analysis of strategies, gaze and keystroke patterns, choice of words etc.
Example 2: design weaknesses
- ecological validity
  - unfamiliar setting
- too little control during training period
Setting up the experiment
- make a detailed description (‘protocol’) of the experiment, including every action that needs to be carried out
- write instructions for participants (so that they all receive the same information)
- run pilot(s)
Running the experiment
- increase eye-tracking data quality by
  - calibrating subjects before each new session
  - optimising light conditions (no direct sunlight)
  - checking distance to screen
- check settings in Translog
- check audio quality if using audio recordings