Usability of Continuous Speech Recognition Programs
Hsin Eu
Committee: Alan Hedge, Ph.D.
Geri Gay, Ph.D.
Design and Environmental Analysis
Cornell University
2
Overview
Continuous speech recognition programs
were brought to market at the end of
1997, with claims that they could
accurately recognize users' continuous
speech and transcribe it into text
processing software.
3
Research Goal
The research goal was to determine the
critical factors that affect the usability of
speech recognition programs in order to
generate universal guidelines for the
future design of continuous speech
recognition software.
4
Literature Review
1. Speech Recognition Technology
• Terminology
• History of Speech Recognition
• Components of Speech Recognition
• Factors Influencing the Performance of Speech Recognition
5
Literature Review (Cont.)
2. Using Speech Recognition
• Strengths and Limitations
• Applications of Speech Recognition
6
Literature Review (Cont.)
3. Current Speech Recognition Software
• Setup, Training, and Dictation
• Features of Current Speech Recognition Programs
• Product Performance
7
Literature Review (Cont.)
4. Human Computer Interaction in Speech Recognition
• The Interaction between Users and Recognition Programs
• Program and Human Errors
• User Characteristics and Task Performance
8
Literature Review (Cont.)
Human Computer Interaction in Speech Recognition (cont.)
• Guidelines for the Interface Design (excerpted from McLeod, 1988)
- Procedures for developing and implementing an application to meet the needs of the users, including vocabulary design, feedback and error recovery strategies, and training techniques.
9
Literature Review (Cont.)
• Guidelines for the Interface Design (excerpted from McLeod, 1988)
- Procedures for identifying and controlling sources of inter- and intra-person variability.
- Consideration of the implications of the technology for the organization of working groups.
- Techniques for assessing the usability of a recognition system, including overall task performance, physical and mental workload, and users' subjective responses.
10
Research I: Web Survey
11
Research I: Web Survey (Cont.)
I-1. Methods
• Subjects: 351 respondents (including 143 CSRP-users)
Gender     CSRP-User      Non-User       Total
Female     34 (23.9%)     88 (42.9%)     122 (35.2%)
Male       108 (76.1%)    117 (57.1%)    225 (64.8%)
Total      142 (100.0%)   205 (100.0%)   347 (100.0%)

Age        CSRP-User      Non-User       Total
18-25      17 (12.0%)     59 (28.6%)     76 (21.9%)
26-50      95 (66.9%)     116 (56.3%)    211 (60.6%)
51 Plus    30 (21.1%)     31 (15.1%)     61 (17.5%)
Total      142 (100.0%)   206 (100.0%)   348 (100.0%)
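The percentages in the gender table appear to be column percentages (each count divided by the total of its user group). A minimal Python sketch that recomputes them from the reported counts is given here as a check; the function name `column_percentages` and the dictionary layout are illustrative assumptions, not part of the original study.

```python
# Counts copied from the gender table; the data layout is an assumption
# made for this sketch.
counts = {
    "Female": {"CSRP-User": 34, "Non-User": 88},
    "Male": {"CSRP-User": 108, "Non-User": 117},
}


def column_percentages(table):
    """Express each cell as a percentage of its column total.

    A "Total" entry is added per row: the row sum as a percentage
    of the grand total.
    """
    groups = list(next(iter(table.values())).keys())
    col_totals = {g: sum(row[g] for row in table.values()) for g in groups}
    grand_total = sum(col_totals.values())
    result = {}
    for label, row in table.items():
        pct = {g: round(100 * row[g] / col_totals[g], 1) for g in groups}
        pct["Total"] = round(100 * sum(row.values()) / grand_total, 1)
        result[label] = pct
    return result


percentages = column_percentages(counts)
for label, pct in percentages.items():
    print(label, pct)
```

Run on the counts above, the sketch gives the same one-decimal percentages as the gender table.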
12
Research I: Web Survey (Cont.)
• Survey Instrument
Section A: General Computer Use
13 questions / 45 items, completed by all respondents (approx. 3-5 minutes)
Section B: Usability of CSRP
31 questions / 201 items, completed by CSRP-users (approx. 15-20 minutes)
• Procedure
13
Research I: Web Survey (Cont.)
I-2. Results and Discussion on Findings
• General Computer Use

General Computer Use                             Subject Count (%)   Valid N
How long have you been using a computer?                             347
  Less than 1 year                               2 (0.6%)
  1-3 years                                      18 (5.1%)
  More than 3 years                              327 (93.2%)
How many days a week do you use a computer?                          346
  1-3 days                                       4 (1.2%)
  4-5 days                                       51 (14.6%)
  6-7 days                                       291 (82.9%)
14
Research I: Web Survey (Cont.)
• General Computer Use (Cont.)
The time a day computer use occurs in each place

Place      Mean (hour)   SD (hour)
Office     3.79          2.84
School     0.66          1.62
Home       2.16          1.96
Other      0.25          1.16

Total time a day on average computer use occurs

Time                  Users (%)      Valid N
1-3 hours             48 (13.7%)     351
4-6 hours             99 (28.2%)
7-9 hours             165 (47%)
More than 10 hours    39 (11.1%)
15
Research I: Web Survey (Cont.)
Tasks                         CSRP-users   Non-users   Significance               All respondents
Composing documents           65.03 %      36.76 %     T=8.233, df=332, p=.000    48.31 %
Database input                45.10 %      27.79 %     T=4.780, df=286, p=.000    34.84 %
Computer image manipulation   18.87 %      11.59 %     T=2.986, df=271, p=.003    14.54 %
Searching information         48.53 %      30.82 %     T=4.881, df=349, p=.000    38.03 %
Browsing information          49.30 %      30.29 %     T=5.174, df=349, p=.000    38.03 %
16
Research I: Web Survey (Cont.)
• Usability of CSRP
Dragon NaturallySpeaking    IBM ViaVoice                            L&H VoiceXpress (Kurzweil)
- Personal 2.0 with Corel   - Executive                             - Standard
- Preferred 2.0             - Gold                                  - Advanced
- Preferred 3.0             - Office                                - Professional
- Standard 3.0              - Home                                  - for Medicine, General Medicine Edition
- Professional              - Topic                                 - for Medicine, Specialty Edition
- for Teens                 - IBM MedSpeak/ Radiology
- Legal Suite               - Professional/ Specialty Vocabularies
- Medical Suite
- Developer Suite
17
Research I: Web Survey (Cont.)
• Usability of CSRP
CSRP Use                                          Users (%)     Valid N
How long have/ had you used your CSRP?                          141
  1-6 months                                      64 (44.8%)
  7-11 months                                     27 (18.9%)
  1-2 years                                       38 (26.6%)
  More than 2 years                               12 (8.4%)
How many days a week do/ did you use your CSRP?                 136
  1-2 days                                        42 (29.4%)
  3-5 days                                        28 (19.6%)
  6-7 days                                        66 (46.2%)
18
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
The time a day CSRP use occurs in each place

Place      Mean (hr)   SD (hr)
Office     1.50        1.76
School     0.14        0.64
Home       0.92        1.40
Other      0.20        0.91

Total time a day on average CSRP use occurs

Time                  Users (%)      Valid N
1-3 hours             93 (65.1%)     143
4-6 hours             28 (19.6%)
7-9 hours             10 (7.0%)
More than 10 hours    1 (0.7%)
19
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
Task                              Average Score (SD)
Dictation (speech to text)
  Emails                          2.76 (1.05)
  Letters                         2.92 (1.05)
  Notes                           2.79 (1.06)
  Reports or papers               2.82 (1.03)
  Slides                          2.08 (1.24)
Navigation (control)
  Within a specific program       2.45 (0.96)
  Between programs                2.12 (1.05)
Editing/ correcting documents     2.42 (1.01)
Read back (text to speech)
  Check content of documents      2.71 (1.13)
  Review my works                 2.72 (1.03)
20
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
CSRP Characteristics Average Score (SD)
Varieties of functions 2.92 (0.74)
Product accuracy 3.81 (0.63)
Vocabulary capacity 3.39 (0.70)
Dictation speed 3.42 (0.76)
Ability to expand vocabulary 3.53 (0.73)
Easy to use 3.44 (0.77)
Price 2.48 (0.86)
Compatibility with other software 3.30 (0.81)
User (technical) support 2.81 (0.86)
Program upgrade support 2.99 (0.86)
21
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
Aspects Average Score (SD)
Varieties of functions 2.66 (0.86)
Product accuracy 2.86 (1.03)
Vocabulary capacity 3.32 (0.84)
Dictation speed 2.92 (0.95)
Ability to expand vocabulary 3.33 (0.81)
Easy to use 2.99 (0.92)
Price 2.59 (1.01)
Compatibility with other software 2.76 (0.90)
User (technical) support 2.44 (1.01)
Program upgrade support 2.72 (0.97)
Overall satisfaction 2.91 (0.99)
22
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
Task                               Most preferred (%)   2nd most preferred (%)     Valid N
Composing documents                Voice (51.0%)        Voice & Keyboard (25.9%)   137
Correcting mistakes in documents   Keyboard (24.5%)     Voice (23.1%)              138
Editing documents                  Voice (23.8%)        Keyboard (23.1%)           136
Database input                     Keyboard (42.7%)     Voice (23.1%)              123
23
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
Task                           Most preferred (%)   2nd most preferred (%)     Valid N
Computer image manipulation    Mouse (30.1%)        Keyboard & Mouse (19.6%)   113
Searching & browsing           Keyboard (25.2%)     Mouse (16.8%)              132
Navigating within a program    Voice (25.2%)        Mouse (21.0%)              135
Navigating between programs    Voice (21.7%)        Mouse (21.0%)              133
24
Research I: Web Survey (Cont.)
• Usability of CSRP (Cont.)
Task                          DNS-Preferred 3.0   DNS-Professional   Significance
Composing documents           59.35%              74.24%             T=-2.063, df=62, p<0.05
Database input                45.48%              46.67%             Not significant
Computer image manipulation   13.87%              23.44%             Not significant
Searching information         37.10%              58.18%             T=-2.874, df=62, p<0.01
Browsing information          40.00%              58.18%             T=-2.203, df=62, p<0.05
25
Research I: Web Survey (Cont.)
I-3. Discussion
• Limitations
- Survey distribution
• Future Research
- Survey length
- Survey format
- Qualitative information
26
Research II: Usability Testing
II-1. Methods
• Subjects: 10 Cornell students
- 5 females and 5 males
- 8 CSRP-novices and 2 CSRP-users
- Age ranged 21-30
• Setting and Instruments
- MVR computer lab
- Dell Pentium II MMX PC/ Windows 98
- Dragon NaturallySpeaking Preferred 3.0
27
Research II: Usability Testing (Cont.)
II-1. Methods (cont.)
• Procedure
- Setup and training
28
Research II: Usability Testing (Cont.)
• Procedure (cont.)
- Research design

                                           Method of Transcription / Editing / Readability of Document*
Subj. #   Level of Experience with CSRP   Section 1            Section 2            Section 3
1         Novice                          Dictate/ Type/ Ec    Type/ Type/ Ea       Type/ Type/ D
2         Novice                          Type/ Type/ Ec       Dictate/ Type/ Ea
3         Novice                          Dictate/ Type/ Ea    Type/ Type/ Ec
4         Novice                          Type/ Type/ Ea       Dictate/ Type/ D     Dictate/ Type/ Ec
29
Research II: Usability Testing (Cont.)
- Research design (cont.)
                                           Method of Transcription / Editing / Readability of Document*
Subj. #   Level of Experience with CSRP   Section 1               Section 2               Section 3
5         Some                            Dictate/ Type/ Ec       Type/ Type/ Ea
6         Novice                          Type/ Type/ Ec          Dictate/ Type/ Ea       Dictate/ Type/ D
7         Novice                          Dictate/ Type/ Ea       Type/ Type/ Ec
8         Novice                          Type/ Type/ Ea          Dictate/ Type/ Ec
9         Much                            Dictate/ Voicing/ D     Dictate/ Voicing/ Ea
10        Novice                          Dictate/ Type/ Ec       Type/ Type/ D           Type/ Type/ Ea
30
Research II: Usability Testing (Cont.)
II-1. Methods (cont.)
• Procedure (cont.)
- Dependent variables
1. Transcription time
2. Number of transcription errors
3. Editing time
4. Total completion time
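The results slides report these four variables per word (sec/word and errors/word), and the reported means are consistent with total completion time being the sum of transcription time and editing time. A minimal Python sketch of that normalization follows; the `Trial` record, its field names, and the example values are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One transcription trial (hypothetical record layout)."""
    words: int                  # number of words in the source document
    transcription_sec: float    # time spent dictating or typing
    transcription_errors: int   # errors left after transcription
    editing_sec: float          # time spent correcting the errors


def per_word_measures(trial):
    """Normalize the four dependent variables by document length."""
    if trial.words <= 0:
        raise ValueError("document must contain at least one word")
    transcription = trial.transcription_sec / trial.words
    editing = trial.editing_sec / trial.words
    return {
        "transcription_time": transcription,            # sec/word
        "transcription_errors": trial.transcription_errors / trial.words,
        "editing_time": editing,                         # sec/word
        # Assumption: total completion time = transcription + editing.
        "total_completion_time": transcription + editing,
    }


# Hypothetical trial, for illustration only.
example = Trial(words=200, transcription_sec=105.0,
                transcription_errors=26, editing_sec=212.0)
measures = per_word_measures(example)
print(measures)
```

Dividing by document length makes trials on documents of different lengths comparable.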
31
Research II: Usability Testing (Cont.)
II-2. Results and Discussion on Findings
• Modality of Transcription
• Gender

Measure                                          Dictating      Typing         Significance
Transcription Time (sec/ word)                   .526 (.105)    1.069 (2.60)   t=-6.428, df=7, p=.000
Number of Transcription Errors (errors/ word)    .131 (.067)    .041 (.022)    t=-3.636, df=7, p=.008
Type-Editing Time (sec/ word)                    1.058 (.244)   .577 (.198)    t=-5.444, df=7, p=.001
Total Completion Time (sec/ word)                1.584 (.209)   1.645 (.394)   Not significant
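The slides do not name the test behind the significance column. With eight novice subjects and df = 7, the values are consistent with a paired-samples t-test (df = n - 1), in which each subject's dictating score is compared with the same subject's typing score. A minimal Python sketch of that statistic follows, using only the standard library; the per-subject numbers are hypothetical, since the slides report only means and standard deviations.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired-samples t statistic for two equal-length score lists.

    Returns (t, df), where t = mean(d) / (sd(d) / sqrt(n)) for the
    per-subject differences d = a - b, and df = n - 1.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length lists with n >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical transcription times (sec/word) for eight subjects.
dictating = [0.45, 0.52, 0.61, 0.40, 0.58, 0.66, 0.49, 0.50]
typing = [0.95, 1.10, 1.30, 0.80, 1.05, 1.40, 0.90, 1.05]

t, df = paired_t(dictating, typing)
print(f"t = {t:.3f}, df = {df}")
```

A negative t here means the first condition (dictating) has the lower mean; a p-value would still have to be looked up from the t distribution with the returned df.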
32
Research II: Usability Testing (Cont.)
II-2. Results and Discussion on Findings (cont.)
• Modality of Editing
Modality of Editing, Easy Documents

Measure                                          Edit by Typing (N=8)   Edit by Voicing (N=1)
Transcription Time (sec/ word)                   0.526 (.105)           .438
Number of Transcription Errors (errors/ word)    .131 (.067)            .128
Editing Time (sec/ word)                         1.058 (.244)           1.203
Total Completion Time (sec/ word)                1.584 (.209)           1.641
33
Research II: Usability Testing (Cont.)
II-2. Results and Discussion on Findings (cont.)
• Modality of Editing (cont.)
Modality of Editing, Difficult Documents

Measure                                          Edit by Typing (N=2)   Edit by Voicing (N=1)
Transcription Time (sec/ word)                   0.611 (.017)           .549
Number of Transcription Errors (errors/ word)    .195 (.084)            .111
Editing Time (sec/ word)                         1.633 (.554)           2.062
Total Completion Time (sec/ word)                2.244 (.536)           2.611
34
Research II: Usability Testing (Cont.)
II-2. Results and Discussion on Findings (cont.)
• Experience with CSRP/DNS
Transcription by Dictating

                                                 Experience with CSRP/DNS
Measure                                          None (N=8)     Some (N=1)   Much (N=1)
Transcription Time (sec/ word)                   .526 (.105)    .480         .438
Number of Transcription Errors (errors/ word)    .131 (.067)    .037         .128
Type-Editing Time (sec/ word)                    1.058 (.244)   .471         N. A.
Total Completion Time (sec/ word)                1.584 (.209)   .951         N. A.
35
Research II: Usability Testing (Cont.)
II-2. Results and Discussion on Findings (cont.)
• Experience with CSRP/DNS (cont.)
Transcription by Typing

                                                 Experience with CSRP/DNS
Measure                                          None (N=8)     Some (N=1)
Transcription Time (sec/ word)                   1.069 (2.60)   1.191
Number of Transcription Errors (errors/ word)    .041 (.022)    .036
Type-Editing Time (sec/ word)                    .577 (.198)    .414
Total Completion Time (sec/ word)                1.645 (.394)   1.606
36
Research II: Usability Testing (Cont.)
II-2. Results and Discussion on Findings (cont.)
• Readability of Articles
Transcription by Typing

                                                 Readability of Article (N=2)
Measure                                          Easy           Difficult      Significance
Transcription Time (sec/ word)                   1.211 (.383)   1.525 (.440)   Not significant
Number of Transcription Errors (errors/ word)    .030 (.031)    .051 (.003)    Not significant
Type-Editing Time (sec/ word)                    .510 (.332)    .943 (.272)    Not significant
Total Completion Time (sec/ word)                1.721 (.716)   2.467 (.713)   t=-397.339, df=1, p=.002
37
Research II: Usability Testing (Cont.)
II-3. Discussion
• Compare Findings to Previous Research

Research             Tested program      Accuracy (%)   Corrected words per minute
Present study        DNS Preferred 3.0   < 95%          29.8
Karat et al., 1999   DNS Preferred 2.0   N. A.          25.1
Poor, 1998           DNS Preferred 2.0   < 95%          N. A.
Linderholm, 1998     DNS Preferred 3.0   N. A.          43.0
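The comparison uses corrected words per minute, while the earlier results slides report seconds per word. The slides do not state how the corrected words-per-minute figure was derived, so the Python sketch below covers only the unit conversion between the two scales (60 seconds per minute); the function names are illustrative assumptions.

```python
def sec_per_word_to_wpm(sec_per_word):
    """Convert a per-word time in seconds to words per minute."""
    if sec_per_word <= 0:
        raise ValueError("sec_per_word must be positive")
    return 60.0 / sec_per_word


def wpm_to_sec_per_word(wpm):
    """Convert words per minute to seconds per word."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 60.0 / wpm


# Corrected words per minute from the comparison table, expressed as
# seconds per word: roughly 2.01, 2.39, and 1.40.
for label, wpm in [("Present study", 29.8),
                   ("Karat et al., 1999", 25.1),
                   ("Linderholm, 1998", 43.0)]:
    print(f"{label}: {wpm_to_sec_per_word(wpm):.2f} sec/word")
```

Putting both scales in the same unit makes the cross-study figures easier to read next to the per-word results, without implying that they measure the same tasks.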
38
Research II: Usability Testing (Cont.)
II-3. Discussion (cont.)
• Limitations
- Sample size
• Future Research
- CSRP-users
- Testing time
- Article readability
- Human performance vs. program performance
39
Conclusion
• Critical Factors that affect CSRP usability
- Program accuracy
- Program reliability
- Requirement of user-dependent training
- Requirement of memorization
- Ease of error correction
- Ability to learn from mistakes
- Accommodation for people with disabilities
- Hardware compatibility
- Environmental noise level
40
Conclusion (Cont.)
• Guidelines for Future Design
A continuous speech recognition program should
- have high program accuracy
- have high program reliability
- eliminate the requirement of user-dependent training
- reduce the requirement of memorization
- maximize the ease of error correction
- have the ability to learn from mistakes
- accommodate the needs of people with disabilities
- provide a wide range of hardware compatibility
- minimize the sensitivity to environmental noise