

IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, VOL. 44, NO. 4, AUGUST 2014 511

Novice and Expert Performance of KeyScretch: A Gesture-Based Text Entry Method for Touch-Screens

Vittorio Fuccella, Mattia De Rosa, and Gennaro Costagliola, Member, IEEE

Abstract—KeyScretch is a text entry method for devices equipped with touch-screens, based on a menu-augmented soft keyboard. In these keyboards, a menu containing a small number of frequent characters is shown while a key is pressed, allowing further character entry by menu selection. KeyScretch improves the previously studied menu-based methods by enabling the interpretation of compound strokes, which allow the input of text chunks longer than two characters. The performance of the method is analyzed on different kinds of touch-screens: First, we present a 25-session user study on a stylus-based device, showing that an instance of the method optimized for Italian can be learned in a reasonable time by the users and significantly outperforms the traditional method based on the tapping interaction. Then, we define and validate a model for predicting expert text entry rates on finger-based devices. The predicted rates for instances of KeyScretch optimized for different Western languages range from about 44 to 50 words/min on the Qwerty layout, enabling improvements in the range of 30–49% as compared with the traditional method.

Index Terms—KeyScretch, menu, soft keyboard, stroke, text entry.

I. INTRODUCTION

The spread of interactive surfaces has prompted new solutions for text entry on touch-screens. The most common technique to enter text on smartphones and tablets is tapping on soft keyboards. Alternative techniques include handwriting [1] and speech recognition. The use of gestures (instead of simple taps) on soft keyboards is an alternative to the aforementioned techniques, and, according to some (e.g., [2]), gesturing is a more natural interaction than tapping for a human user.

A soft keyboard is drawn on a touch-screen and is characterized by a layout that establishes a mapping between characters and 2-D regions on the keyboard (the keys). The keys can have different shapes and dimensions. As the devices are small, a pointer (pen or finger) or two thumbs are generally used for text entry, resulting in a low text entry speed. Although more efficient keyboard layouts exist [3]–[6], the traditional Qwerty layout is the most common, due to user familiarity.

Manuscript received November 20, 2013; revised February 6, 2014, March 7, 2014, and March 19, 2014; accepted March 27, 2014. Date of publication April 25, 2014; date of current version July 11, 2014. This paper was recommended by Editor-in-Chief E. J. Bass.

The authors are with the Dipartimento di Informatica, Università degli Studi di Salerno, 84084 Fisciano, Italy (e-mail: vfuccella@unisa.it; matderosa@unisa.it; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/THMS.2014.2314447

The use of gestures on a soft keyboard has been referred to as gesture-based [7] or stroke-based [8], [9] text entry. In [9], it has been modeled as the alternation of strokes and jumps on the keyboard. A stroke starts with a pointer-down event and ends with a pointer-up event and produces a sequence of characters. A jump brings the pointer from the location of the pointer-up event of the previous stroke to the location of the pointer-down event of the subsequent one and does not produce any character.

Depending on the text entry method, an input gesture can produce an entire word (as in Shapewriter [10]) or a shorter text chunk (as in menu-augmented soft keyboards [11]). Both have pros and cons. With the former methods, the input of the (particularly frequent) space character can be spared, since it is automatically entered at the end of a gesture. Nevertheless, a one-to-one mapping between strokes and words would include many gestures, likely more than could be easily learned [12]. With the latter methods, the number of different gestures to learn is limited, but text must be suitably segmented by the user, inevitably expanding the text entry time in the early stages of practice. Another disadvantage of having a large number of gestures concerns the greater ambiguity in their interpretation (e.g., in Shapewriter, the disambiguation of a gesture similar to more than one template is obtained through an additional selection of a word from a list of candidates).

KeyScretch is a gesture-based text entry method that improves menu-augmented soft keyboards [11]. In these keyboards, a radial menu is shown around a character as soon as the character is pressed. Each menu item is associated with a character. The previous menu-based methods enabled the input of digraphs through linear gestures called flicks. The innovation of KeyScretch lies in the introduction of compound strokes that connect several menu characters in sequence, so that a text longer than a simple digraph can be entered with a single stroke. An example of a soft keyboard augmented with menus containing four items associated with the first four vowels is shown in Fig. 1. The gestures used in KeyScretch are also referred to as “scretches” (a portmanteau of the terms “scratch” and “sketch”).

In this paper, we present the method and an evaluation of its levels of performance on different types of miniaturized soft keyboards. One user study lasts 25 sessions and is carried out using an instance of KeyScretch optimized for the Italian language on a stylus-based device. Costagliola et al. [13] introduced the text entry method and tested it with six participants. Here, we include ten participants and extend the models based on Fitts’ law [14] to predict the expert performance of KeyScretch with different languages and keyboard layouts on finger-based devices. The model is empirically validated in typing tasks where a user must transcribe a short phrase. Then, the model is used to estimate the speed for entering long input text through a simulation.

2168-2291 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 1. Qwerty keyboard layout augmented with a menu (an Android implementation of an Italian instance).

The remainder of this paper is organized as follows: The next section presents the main text entry methods on touch-screens. Section III describes the text entry method in detail. Section IV presents the user study and its results. Section V presents the predictive model. Finally, a discussion concludes this paper.

II. RELATED WORK

Text entry on devices equipped with a touch-screen is often inefficient, because of the difficulty of using multiple fingers. A widespread method is the use of a soft keyboard (e.g., with a Qwerty layout) on which the user has to tap to enter a single character at a time.

On some devices, handwriting recognition (see [1] for a survey) is proposed as an alternative to the use of the soft keyboard. The recognition of one character at a time, implemented in methods such as Unistrokes [15] and Graffiti [16], is suitable for small devices and facilitates recognition. However, the difficulty in effectively recognizing handwriting and its relatively slow speed (in the 15–25 words/min (wpm) range [17], [18]) have limited the spread of such a technique.

Predictive input methods [19] reduce the effort required to enter text by predicting what the user is entering. A typical predictive handwriting system allows the user to select an item from a list of candidates, in order to skip manual writing. Predictive text entry in English is not necessarily faster than simply finishing typing the words [20]. Predictive methods have been applied to several systems for text entry and handwriting recognition [20], [21].

A. Layout Optimization

The Qwerty layout was designed for mechanical keyboards, not soft keyboards. An efficient keyboard layout should minimize the distance between keys with a high probability of being tapped consecutively for the target languages. The frequency of the digraphs in the target languages (e.g., English) has been analyzed and has resulted in the proposal of specific layouts, such as Fitaly [3], OPTI and OPTI II [4], Metropolis [5], and Atomik [6]. Although these layouts enable faster and more accurate entry as compared with the Qwerty layout, they had a limited adoption because of a lack of user familiarity [22]. A compromise between performance and familiarity is the quasi-Qwerty keyboard layout, where letters are located at most one step (key) away from their original positions on the Qwerty layout [23].

In [24], languages including German, French, Spanish, and Chinese were considered; an optimized layout for each of them and an optimized common layout were demonstrated.

B. Gesture-Based Text Entry on Soft Keyboards

Another factor influencing typing speed and comfort is the type of entry interaction. Surfaces enable gesture-based interactions in addition to key presses. The introduction of keyboards supporting gestural input dates back to 1982 [2]. In such a keyboard design, gestures could only connect contiguous characters. Although the layout was designed to maximize the number of characters per stroke (cps), in most cases only a word chunk could be entered through a single gesture.

In order to allow the users to enter a whole word (and a trailing space) with a single stroke (word-level unistroke) without ambiguity, a circular layout was introduced with the Cirrin soft keyboard [25]. The problem of ambiguity was solved with the proposal of the SHARK2 system [26] (also known as Shapewriter [27]). The method does not require a specialized layout: Disambiguation is performed by storing a template for each word in a lexicon. The paradigm supports a gradual and seamless transition from the novice mode based on visually guided tracing to the expert mode in which recall-based gesturing prevails. Inspired by SHARK2, a design has been proposed in [8] to ramp the unistroke letters into word-level unistrokes through autocompletion. Autocompletion is invoked through the seamless continuation of the stroke in a pigtail and by traversing one of the four sides of the bounding box of the stroke.

For performance reasons, some text entry methods require the user to segment the text into chunks not necessarily corresponding to words. An example is shorthand, whose purpose is to accelerate writing by associating an entire syllable with a single sign. Since a large set of signs can require long learning times, some methods introduced schemes indicating the path the pointer should follow, such as the pentagrid used in VirHKey [28]. Menu-augmented soft keyboards [29] make it possible to enter digraphs through a simple pointer stroke: The first character is the one associated with the key where the pointer is pressed; the second is identified by the selected menu item. A menu with eight items was tested by Isokoski [11]. A previous menu-based text entry method is T-Cube [30], enabling the input of a single character per menu item selection. In two different designs, soft keyboards have been augmented with multitouch gestures to support the input of nonalphanumeric characters [31] and to facilitate the editing of text [32]. Another approach, which enables the entry of two characters with a single stroke by combining a conventional soft keyboard with a Unistroke recognizer, is presented in [33]. Some methods allow the user to enter text without ever lifting the pointer from the surface, such as Quikwriting [34] and Dasher [35].


III. KeyScretch TEXT ENTRY METHOD

KeyScretch is a text entry method based on a menu-augmented soft keyboard. It introduces the use of compound strokes in the radial menu surrounding a key to enter character sequences. The menu is shown while the key is pressed and contains a small number of frequent characters, which can be entered by touching them in succession. Besides the direct use of the space bar, a space character can be inserted by ending a stroke inside the character key area. The strokes are scale invariant, i.e., they can be performed with a user-desired size.

In KeyScretch, using a menu containing $n$ characters $x_1, \ldots, x_n$, a stroke produces a text chunk described by the following regular expression:

$$ .\,[x_1 x_2 \ldots x_n]+\,[\ ]? \tag{1} $$

The aforementioned pattern matches a text chunk starting with any character (specified in (1) by the starting “.”) chained to a sequence of one or more characters from the set $\{x_1, x_2, \ldots, x_n\}$ (specified by $[x_1 x_2 \ldots x_n]+$), possibly ending with a space character ($[\ ]?$).

Although radial menus containing up to 12 characters have been tested in the scientific literature, we will only consider menus with four items. This choice is based on previous findings [36] in menu item selection which show that a small and even number of items is preferable. Furthermore, a reasonably small number of items facilitates the recognition of the compound strokes. Another design choice is to have the same menu layout for all keys, except for one or at most two carefully chosen exceptions, to improve the learnability of the method.

When shown, a menu with $n$ items divides the keyboard area into $n + 1$ zones: those corresponding to the menu items and the central key area. By convention, we establish a numbering for the zones, assigning 0 to the key area and a progressive number starting from 1 with the topmost sector and proceeding clockwise. Using such a numbering, a scretch template can be uniquely identified by a string. The string is composed of the concatenation of the identifiers of the zones selected by the scretch. Some of the scretches produced using a menu with four items and their identifiers are shown in Fig. 2. In the figure, the scretches are grouped by the number of segments. One of the samples shown in the figure (the scretch identified by the string 0330) shows that a text chunk containing a double letter can be entered by selecting the same item twice. Except for those producing a text containing double letters, the scretches composed of $k$ segments produce text chunks whose length is $k + 1$ characters.

KeyScretch can be instantiated differently for different languages. In particular, the instances can differ in the number of menu items, the characters to associate to the menu items, and the menu layout. Although it is possible to associate a different menu with each key, such a choice may lead to long learning times. In the following, we use the function $M(c)$ to identify the characters associated with the menu of the key corresponding to the character $c$. In particular, $M(c) = \langle x_1 x_2 \ldots x_i \ldots x_n \rangle$ will indicate that the character $x_i$ is associated with the sector $i$ of the menu of the key corresponding to the character $c$.
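The zone numbering and the function M(c) can be tied together in a short sketch. The Python below is only an illustration under stated assumptions: the function names `scretch_id` and `scretch_text` are hypothetical, the menu is passed as a plain string listing the characters of sectors 1 to n, and a trailing space is encoded as a final 0 (a stroke ending inside the key area).

```python
def scretch_id(chunk: str, menu: str) -> str:
    """Return the zone string of the stroke that enters `chunk`.

    `menu` is M(c) for the key of the first character, written as a
    string whose i-th character (1-based) sits in sector i.
    Hypothetical helper, not the authors' implementation.
    """
    rest = chunk[1:]
    # A trailing space is only possible after at least one menu character.
    ends_with_space = len(rest) >= 2 and rest.endswith(" ")
    if ends_with_space:
        rest = rest[:-1]
    zones = "0"  # every stroke starts on the key itself (zone 0)
    for ch in rest:
        zones += str(menu.index(ch) + 1)  # ValueError if ch is not in the menu
    return zones + ("0" if ends_with_space else "")


def scretch_text(key: str, zones: str, menu: str) -> str:
    """Inverse mapping: rebuild the text chunk from a key and a zone string."""
    text = key
    for z in zones[1:]:
        text += " " if z == "0" else menu[int(z) - 1]
    return text


print(scretch_id("ciao ", "aeio"))         # 03140
print(repr(scretch_text("c", "03140", "aeio")))  # 'ciao '
```

The sketch assumes menus with at most nine items, so that each zone fits in one digit.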

Fig. 2. Scretches and their identifiers using a menu with four items.

Fig. 3. Four strokes needed to enter the text “ciao gente” (stroke identifiers shown in the figure: 03140, 02, 0, 02).

We adopted qualitative criteria to maximize text entry performance. An instance of KeyScretch must:

1) maximize the use of the menu, i.e., maximize the number of characters matched by the pattern (1) in the text;

2) be easily remembered by users;

3) minimize the length of the jumps.

For example, using the Italian language as a target, we may assign the first four vowels of the alphabet to the menu items, $M(c) = \langle aeio \rangle$ for each $c$. This choice is compliant with the three enumerated properties: The vowels are the most frequent letters in the target language, accounting for a cumulative frequency of about 45% [37]; furthermore, they can be easily remembered by the users; finally, they have the property of being peripheral in the Qwerty layout: the user can more easily select a menu item rather than jumping to a far key. A possible layout using the chosen set, which may perform well for the Italian language, is shown in Fig. 1. A systematic way to find the optimal menu layout for a target language is described in Section V-C1.

With the chosen instance, the interaction sequence to enter the Italian text ciao gente (hello folks, in English) is shown in Fig. 3. The string is ten characters long, but it can be entered with a sequence of four strokes (taps or scretches). The strokes correspond to the following sequence of text chunks: {ciao }{ge}{n}{te}. Three text chunks out of four are entered through a scretch and only one through a tap. The first stroke alone enters five characters.
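The segmentation in this example can be reproduced directly from pattern (1). The following Python sketch assumes the uniform menu ⟨aeio⟩ for every key and falls back to a single character (a tap) when no menu character follows; the function name `segment` is hypothetical.

```python
import re


def segment(text: str, menu: str = "aeio") -> list[str]:
    """Split `text` into chunks matching pattern (1), else single taps.

    Assumes the same menu for every key (no per-key exceptions).
    """
    # ".": any starting character; "[menu]+": one or more menu characters;
    # " ?": optional trailing space; "|.": fallback to a plain tap.
    pattern = re.compile(r".[" + re.escape(menu) + r"]+ ?|.", re.DOTALL)
    return pattern.findall(text)


chunks = segment("ciao gente")
print(chunks)                            # ['ciao ', 'ge', 'n', 'te']
print(len("ciao gente") / len(chunks))   # 2.5 characters per stroke
```

Because the quantifiers are greedy, each stroke absorbs the longest run of menu characters that follows its starting key.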

A. Interpretation of the Scretches

The interpretation of the scretches is performed through a recognizer, which takes as input a stroke and returns the associated text. The simplest recognizer is the target-based one: The text is produced on the basis of the sequence of menu items touched by the user. Since this recognizer is sensitive to small inaccuracies in menu selection (e.g., articulatory errors [36]), we proposed two more effective recognizers: the StrokeMatcher and the Geometric recognizers [38].

The Geometric recognizer, which is used in the prototype evaluation here, uses stroke-based techniques consisting of decomposing the stroke into a sequence of basic segments, each representing the movement of the pointer to reach the desired menu item. This is performed by first interpreting the stroke as a polyline. In the subsequent steps, the polyline is cleaned by identifying and correctly handling segments which do not represent complete movements and segments produced by error (tails). Finally, the segments of the clean polyline are translated into a chain code [39] and interpreted in order to produce text. The Geometric recognizer reduces the average error by about 30% over the simple target-based one.
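A target-based recognizer can be sketched as a point-to-zone classifier followed by the removal of consecutive repeats. The geometry below is an assumption made only for illustration, not the layout used by the authors: the key is a square centred at the origin, the menu sectors are equal angular wedges with sector 1 centred on the upward direction, coordinates are in millimetres, and y grows upward. The names `zone_of` and `zone_string` are hypothetical.

```python
import math
from itertools import groupby


def zone_of(x: float, y: float, key_half: float = 2.7, n_items: int = 4) -> int:
    """Zone of a pointer sample relative to the key centre (assumed geometry)."""
    if abs(x) <= key_half and abs(y) <= key_half:
        return 0  # still inside the key area
    # Angle measured clockwise from the upward direction, in [0, 360).
    angle = math.degrees(math.atan2(x, y)) % 360.0
    width = 360.0 / n_items
    # Shift by half a sector so that sector 1 is centred on "up".
    return int(((angle + width / 2) % 360.0) // width) + 1


def zone_string(points, key_half: float = 2.7, n_items: int = 4) -> str:
    """Sequence of zones touched by a stroke, consecutive repeats collapsed."""
    zones = [zone_of(x, y, key_half, n_items) for x, y in points]
    return "".join(str(z) for z, _ in groupby(zones))


# A stroke that starts on the key and moves down into sector 3.
print(zone_string([(0, 0), (0.5, 0.2), (1, -8), (2, -9)]))  # 03
```

Collapsing repeats means this sketch cannot represent a double selection of the same item (such as 0330), and any sample that strays into a neighbouring wedge changes the output, which illustrates the sensitivity to small inaccuracies mentioned above.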

IV. PROTOTYPE EVALUATION

We ran a prototype evaluation aimed at comparing the performance of KeyScretch with that of the traditional method based on tapping on a soft keyboard with a Qwerty layout (referred to as “the traditional method” in the following). Besides evaluating the text entry speed and accuracy of the method, our goal was to understand the efficiency and the learnability of the scretches.

The experiment lasted 25 sessions. The first 20 were more focused on evaluating the performance with and learning of the scretches. In particular, we evaluated whether the users could correctly segment the text and quickly execute the gestures associated with the text chunks. In these sessions, the text associated with the scretches was produced using the target-based recognizer described in Section III-A.

To test the performance of the method in a real scenario, we programmed five sessions with a slightly modified experimental procedure: The phrase set was enlarged and we allowed the participants to correct errors. In these sessions, the Geometric recognizer described in Section III-A was used. In the following, we will refer to the earliest 20 sessions and to the latest five sessions as the learning sessions and the typing sessions, respectively.

A. Participants

Our participants were seven male and three female Italian university students, aged between 21 and 30 (M = 25.0, SD = 2.9) with no previous experience with KeyScretch. Five of them had some previous experience of text entry on a soft keyboard. All of them were habitual computer and mobile phone users and only one was left-handed. They were recruited voluntarily and received a gadget as a reward.

B. Apparatus

The experiment was executed on a SMART Podium ID250 Interactive Pen Display connected to a Dell Precision T5400 workstation equipped with an Intel Xeon CPU at 2.50 GHz running the Microsoft Windows XP operating system and the Java Runtime Environment 6.

A Java application equipped with the interfaces of both methods was developed for the experiment. The dimension of the keyboard was 71 × 36 mm; the side of each square key was 5.4 mm with a key spacing of 1.5 mm. These dimensions are comparable with those of recent devices called phablets (a portmanteau of the words phone and tablet). The instance optimized for Italian, with the menu $M(c) = \langle aeio \rangle$, was used for each character $c$. We included the following two exceptions: $M(\text{q}) = \langle aeiu \rangle$ and $M(\text{e}) = \langle a\,\text{'}\,i\,o \rangle$. The former is based on the observation that the “qu” digraph is much more frequent than “qo”; the latter enables a shortcut to enter the “è” character (typed as “e” followed by an apostrophe), also particularly frequent in Italian.

C. Procedure

In each session (lasting about 35–40 min), the participants entered text with both methods (15 min each), with a 5-min break between the two half-sessions. The order of the methods was counterbalanced across the sessions. The participants had to enter text continuously, with no relevant temporal breaks, during each half-session. The task consisted of copying short text phrases randomly selected from the sets described next. The participants were asked to carefully read each presented phrase and remember it. Then, they had to copy it as quickly and accurately as possible. Before the beginning of the experiment, the participants went through an induction phase in which the methods and the experimental procedure were explained to them.

In the learning sessions, the participants were asked to ignore errors and to keep writing. If the error rate was greater than 5%, they were asked to be more accurate in the next session. In the typing sessions, they were required to correct the typing errors through the use of a backspace key.

D. Dependent Measures

As in previous experiments on text entry, the main dependent measures were text entry speed and error rate. In addition to the overall performance of the method, we investigated the learnability of the interaction using KeyScretch and the efficiency with which it can produce the associated text. To this aim, we measured the speed of the scretches, grouped by the number of segments they are composed of. We also measured the effectiveness with which the users segmented the text. This ability was evaluated by comparing the users' segmentation during typing with the ideal segmentation (obtained through a text scanner) of the written text.
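One plausible form of such a text scanner is a greedy left-to-right pass that honours the per-key menus of the Italian instance. The sketch below is an assumption, not the authors' scanner: the names `ideal_chunks` and `chars_per_stroke` are hypothetical, and the second item of the menu of “e” is taken to be an apostrophe.

```python
DEFAULT_MENU = "aeio"
# Per-key exceptions of the Italian instance, as read from Section IV-B;
# treating the second item of M("e") as an apostrophe is an assumption.
EXCEPTIONS = {"q": "aeiu", "e": "a'io"}


def ideal_chunks(text: str) -> list[str]:
    """Greedy segmentation of `text` into strokes (scretches or taps)."""
    chunks, i = [], 0
    while i < len(text):
        menu = EXCEPTIONS.get(text[i], DEFAULT_MENU)
        j = i + 1
        while j < len(text) and text[j] in menu:
            j += 1  # extend the stroke over the run of menu characters
        # A trailing space is absorbed only if a menu item was selected.
        if j > i + 1 and j < len(text) and text[j] == " ":
            j += 1
        chunks.append(text[i:j])
        i = j
    return chunks


def chars_per_stroke(text: str, n_strokes: int) -> float:
    """Characters per stroke (cps) for a text entered with `n_strokes` strokes."""
    return len(text) / n_strokes


ideal = ideal_chunks("ciao gente")
print(ideal)                                      # ['ciao ', 'ge', 'n', 'te']
print(chars_per_stroke("ciao gente", len(ideal)))  # 2.5
```

Comparing `chars_per_stroke` computed from the number of strokes a participant actually used with the value obtained from `ideal_chunks` gives a simple measure of how close the participant's segmentation is to the ideal one.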

TABLE I. TEN STATEMENTS OF THE SUS QUESTIONNAIRE USED TO EVALUATE USER SATISFACTION WITH KeyScretch

For the text entry speed, we used the wpm metric: basically, the entry speed is calculated by dividing the total number of entered characters (the first entered character is excluded) by the time to enter them. In this metric, a word is conventionally composed of five characters [40].
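The wpm metric described above can be written as a small function. This is a minimal sketch that assumes the time is given in seconds and is measured from the entry of the first character to the entry of the last one; the function name is hypothetical.

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Entry speed in wpm, with a word conventionally being five characters."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    chars = len(transcribed) - 1  # the first entered character is excluded
    return chars * 60.0 / (5.0 * seconds)


# 101 characters in one minute: (101 - 1) / 5 = 20 words per minute.
print(words_per_minute("x" * 101, 60.0))  # 20.0
```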

The experiment was a 2 × 25 within-subjects factorial design. The two factors were:

1) Method {Traditional, KeyScretch};

2) Session {25 sessions}.

A System Usability Scale (SUS) [41] questionnaire was administered to the participants during the experiment, in order to evaluate their satisfaction with KeyScretch. SUS includes ten statements (see Table I), to which respondents had to specify their level of agreement using a five-point Likert scale. Furthermore, the questionnaire included five questions aimed to compare the feelings of the users on both methods: The participants had to declare their preference for one of the two methods with regard to speed, ease of use, accuracy, comfort, and overall impression. The participants were also asked whether they would like to use KeyScretch on their palmtop/smartphone. Freeform comments were elicited. No identifying information was collected.
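For reference, the conventional SUS scoring rule can be sketched as follows. The text above does not state how the responses were scored, so this is the standard rule rather than a confirmed part of the study: odd-numbered statements contribute the response minus 1, even-numbered statements contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score between 0 and 100.

```python
def sus_score(responses: list[int]) -> float:
    """Standard SUS score from ten responses on a 1-5 agreement scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("expected ten responses between 1 and 5")
    total = 0
    for i, r in enumerate(responses):
        # Index 0, 2, ... are statements 1, 3, ... (positively worded).
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5


print(sus_score([3] * 10))    # 50.0
print(sus_score([5, 1] * 5))  # 100.0
```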

E. Phrase Sets

The phrase sets contain short and easy-to-remember text phrases. The phrases only contain lowercase letters, apostrophes, and spaces. The accented letters were substituted with the same unaccented letter followed by an apostrophe. This choice was made to test a basic version of the Qwerty keyboard.

An initial set (Set1) of 80 Italian phrases was used in the learning sessions. A larger set (Set2) of 250 phrases, containing some phrases from Set1, was used in the typing sessions. Features of the two phrase sets are reported in Table II. The phrases were chosen to be highly representative of the Italian language. We measured this feature through the Pearson product–moment correlation coefficient, between the set and an Italian corpus [42], for the frequency of letters and digraphs. We also considered the frequency of the text chunks that can be entered with a single scretch. Sample phrases from the extended phrase set are shown in Fig. 4.
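The representativeness check can be illustrated with a short sketch that correlates letter frequencies of a phrase set with those of a reference corpus. The helper names and the alphabet argument are hypothetical, and the same approach would apply to digraphs by counting character pairs instead of single letters.

```python
import math
from collections import Counter


def letter_freqs(text: str, alphabet: str) -> list[float]:
    """Relative frequency of each alphabet letter in `text`."""
    counts = Counter(ch for ch in text if ch in alphabet)
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [counts[ch] / total for ch in alphabet]


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


def representativeness(phrases: str, corpus: str, alphabet: str) -> float:
    """Correlation between the letter frequencies of a phrase set and a corpus."""
    return pearson(letter_freqs(phrases, alphabet), letter_freqs(corpus, alphabet))
```

A value close to 1 indicates that the phrase set reproduces the letter distribution of the corpus.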

TABLE II. FEATURES OF THE TWO PHRASE SETS USED IN OUR EXPERIMENTS

Fig. 4. Sample phrases from the extended phrase set (composed of 250 phrases in total).

Fig. 5. Overall text entry speed by method and session (mean entry rate (wpm) and 95% confidence intervals) of KeyScretch and the traditional method through the learning sessions.

F. Results

Each participant completed the 25 sessions. The duration for each participant for each method is estimated to be about 6 h and 15 min (5 h for the learning sessions plus 1 h and 15 min for the typing sessions). The phrases not completely transcribed or containing errors because of the substitution of entire words were removed from the logs before analysis. The removals were infrequent and caused an average decrease in the error rate of about 0.2% on both methods. We separately analyzed the results for the learning sessions and the typing sessions.

1) Learning the Scretches: Fig. 5 illustrates that, as expected, KeyScretch was slower than the traditional method in the initial sessions and faster in the final ones. The performance crossover was reached between the 10th and 11th sessions, after about two and a half hours of use. The best average session speeds were 41.6 and 37.2 wpm for KeyScretch and the traditional method, respectively. The repeated-measures ANOVA of text entry speed showed no main effect for method. There was a significant effect of session ($F_{19,171} = 76.7$, $p < 0.00001$) and a significant method-by-session interaction ($F_{19,171} = 20.2$, $p < 0.00001$).


Fig. 6. Average speed of the scretches, grouped by their number of segments.

Fig. 7. Average number of cps for KeyScretch measured across the learning sessions and its ideal value.

The accuracy of KeyScretch in the learning sessions is affected by the use of the target-based recognizer. Since the users did not correct typing errors, the error rate was measured through the minimum string distance [43]. For the traditional method, the average error was 2.24% (SD = 0.47%, range = 1.43–3.13%), while for KeyScretch, it was 2.96% (SD = 0.41%, range = 2.24–3.66%). An ANOVA revealed a significant difference in error rates between the two keyboard designs ($F_{1,9} = 8.21$, $p = 0.019$).
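For readers unfamiliar with the metric, the following sketch computes an error rate from the minimum string distance (MSD) between the presented and the transcribed phrase. It assumes the common definition in which the MSD is divided by the length of the longer string; the text only cites [43] and does not restate the formula, and the function names are hypothetical.

```python
def msd(a: str, b: str) -> int:
    """Minimum string distance (Levenshtein distance) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]

def msd_error_rate(presented: str, transcribed: str) -> float:
    """MSD error rate in percent, normalized by the longer of the two strings."""
    longest = max(len(presented), len(transcribed))
    if longest == 0:
        return 0.0
    return 100.0 * msd(presented, transcribed) / longest
```

A single wrong character in a four-character phrase therefore yields an error rate of 25%.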

Regarding the performance of the scretches, we have calculated the text entry speed of a single scretch as n/t, where n is the length of the produced text chunk and t is the duration of the time interval between two pointer-up events (i.e., including the time for both the stroke and the jump immediately preceding it). The speed has been converted to wpm. The average speeds for the scretches through the learning sessions are reported, weighted by the value of t, in Fig. 6. The data show that entering text using the scretches is consistently more efficient than using simple taps. Furthermore, the compound scretches (composed of two or three segments) are more efficient than simple flicks.

As a measure of the user’s ability to segment text correctly, we report the number of characters per stroke (cps, in the following) measured across the sessions and its ideal value estimated through a text scanner. Fig. 7 shows the measured cps, the ideal cps, and its average over all 20 sessions. The measured values are averaged per user and then per session. The measured cps trends toward the ideal cps over the sessions. The largest improvement is obtained between the first two sessions.

Fig. 8. (a) Speed and (b) accuracy by method and session through the typing sessions (21–25). Ninety-five percent confidence intervals are reported. For the accuracy, both the total error rate (the dashed lines) and the not corrected error rate (the continuous lines) have been reported.

2) Speed: A more precise comparison of the speeds obtained with the two methods comes from the analysis of the typing sessions. Fig. 8(a) shows the speed of both methods through sessions 21–25. The tested instance of KeyScretch is consistently faster in all sessions. The average session speed ranges from 34.5 to 37.4 wpm for KeyScretch, and from 30.8 to 31.8 wpm for the traditional method. The ANOVA of text entry speed showed a significant effect of the method ($F_{1,9} = 13.248$, $p = 0.005$). There was a significant effect of session ($F_{4,36} = 5.766$, $p = 0.001$) and a trend for method-by-session interaction ($F_{4,36} = 2.549$, $p = 0.056$).

3) Accuracy: In the typing sessions, we evaluated both the total and the not corrected (errors left in the transcribed text) error rate [43] [see Fig. 8(b)]. The dashed lines report the total error rate series, while the continuous lines represent the not corrected error rate. The overall error rate for KeyScretch (3.80%) is greater than that of the traditional method (3.47%). However, the difference is not statistically significant. The not corrected error rate for KeyScretch is slightly greater than the one for the traditional method (0.48% versus 0.37%) and the difference is statistically significant ($F_{1,9} = 5.458$, $p = 0.044$).
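The two error rates can be illustrated with a small sketch. It assumes the usual keystroke classification into correct characters (C), incorrect characters left in the text (INF), and incorrect characters that were later fixed (IF), with both rates normalized by C + INF + IF. The paper cites [43] without restating these formulas, so the definitions and names below should be read as assumptions.

```python
def error_rates(correct: int, incorrect_not_fixed: int, incorrect_fixed: int) -> dict:
    """Total, not corrected, and corrected error rates in percent.

    Assumed definitions: each rate is a share of C + INF + IF.
    """
    total_chars = correct + incorrect_not_fixed + incorrect_fixed
    if total_chars <= 0:
        raise ValueError("at least one character must have been entered")
    not_corrected = 100.0 * incorrect_not_fixed / total_chars
    corrected = 100.0 * incorrect_fixed / total_chars
    return {
        "total": not_corrected + corrected,
        "not_corrected": not_corrected,
        "corrected": corrected,
    }
```

With 96 correct characters, one error left in the text, and three errors fixed with backspaces, the total error rate is 4% and the not corrected error rate is 1%.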

KeyScretch entails more correction activity than the traditional method. Nevertheless, the overall number of corrections (intended as the sequences of backspace characters) was lower with KeyScretch (1721) than with the traditional method (2045). The greater correction activity is, not surprisingly, due to the position of the character to correct: With KeyScretch, an error can occur in any position of the entered text chunk. Thus, there is a greater probability of performing a correction with a longer sequence of backspaces (or of not noticing the error at all). In fact, the average length of the backspace sequences is 2.13 for KeyScretch and 1.59 for the traditional method.

TABLE III. USER SATISFACTION ON THE TWO METHODS

4) User Satisfaction: The scores of the SUS questionnaire for our ten participants range from 65 to 85, with an average value of 73.5. The responses of the users to the further five questions are shown in Table III. The table reports the number of participants who favored one method or the other or were neutral. The results show that KeyScretch is the users’ favorite method. The participants mostly appreciated the speed and the comfort of the method, remarking that entering text with KeyScretch is much less tiring than with the traditional method. The results were analyzed through a permutation version of the $\chi^2$ test. There was significant evidence for a difference in the preference of methods with respect to the features ($\chi^2 = 24.8502$, $p \approx 0.001$).

Seven participants out of ten responded that they would like to use KeyScretch on their palmtop/smartphone. Some freeform comments were gathered: One user pointed out some problems with the accuracy related to the precision of the pointer movements.

V. PREDICTING EXPERT PERFORMANCE WITH DIFFERENT LANGUAGES

In this section, we investigate the performance an expert user can achieve on finger-based touch surfaces with some Western languages. To estimate the typing speed of expert users, we establish a predictive model based on the work by Rick [9]. Using the general idea of predictive models [44], a complex task is decomposed into simpler atomic operations. Here, in particular, the model predicts the time needed to perform the basic interactions occurring while typing, i.e., strokes (including both taps and scretches) and jumps. Unfortunately, Rick’s model does not support accurate modeling of scale-invariant movements. Thus, we directly use an empirically obtained execution time for a set of the most frequent scretches and exploit Rick’s model only for the remaining, unavailable scretches. The decomposition into basic operations follows from text segmentation. Here, we assume that an expert user knows how to segment the text with KeyScretch. This assumption is not so far from reality, as shown in Section IV-F1.

The resulting model is validated in a limited study with two different instances of KeyScretch: one optimized for English and one optimized for Italian on the Qwerty layout. The procedure of this experiment is similar to the one used in [45]. For comparison purposes, the same user performance is also measured with shape writing.

The model is used to predict expert performance for KeyScretch in different Western languages, including English, French, German, Italian, Portuguese, and Spanish.

The construction of the model is described in Section V-A, its validation in Section V-B, and the prediction of expert performance in Section V-C.

A. Estimating Expert Performance for KeyScretch

Our model makes predictions on the execution time of three different types of movements: jumps, taps, and scretches.

For taps and jumps, we use the classic Shannon formulation of Fitts’ law:

$$MT = a + b \log_2\left(\frac{D}{W} + 1\right). \qquad (2)$$

Here, $D$ is the distance from the starting point to the center of the target, and $W$ is the width of the target measured along the axis of motion. Values for the coefficients $a$ and $b$ on finger-based surfaces can be found in [9]. To distinguish a simple jump from a jump followed by a tap, we use $a = 49.03$ and $a = 128.90$, respectively. The value of the $b$ coefficient does not vary and is $b = 114.88$ in both cases.
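Equation (2) with the coefficients quoted above can be evaluated as in the sketch below. The coefficients are taken from the text; the unit of the result is assumed to be milliseconds (consistent with the movement times reported later in Table IV), and the function and constant names are hypothetical.

```python
from math import log2

# Coefficients quoted in the text (values from [9]); milliseconds are assumed.
B = 114.88
A_JUMP = 49.03        # simple jump
A_JUMP_TAP = 128.90   # jump followed by a tap

def fitts_time(distance: float, width: float, followed_by_tap: bool) -> float:
    """Movement time from the Shannon formulation of Fitts' law, Eq. (2)."""
    if width <= 0 or distance < 0:
        raise ValueError("width must be positive and distance non-negative")
    a = A_JUMP_TAP if followed_by_tap else A_JUMP
    return a + B * log2(distance / width + 1)
```

For instance, a jump followed by a tap onto a key three key-widths away (D/W = 3) gives 128.90 + 114.88 × 2 = 358.66 under these coefficients.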

To estimate the execution time of the scretches, we gathered empirical data, taking advantage of the rather limited size of the set of the 32 most frequently used differently shaped scretches. These include: all scretches composed of a single segment, i.e., the set $\{01, 02, 03, 04\}$; the 16 scretches composed of two segments, i.e., the set $\{0ij \mid 1 \le i \le 4,\ 0 \le j \le 4,\ i \ne j\}$; and the 12 scretches composed of three segments and ending in the central area, i.e., the set $\{0ij0 \mid 1 \le i \le 4,\ 1 \le j \le 4,\ i \ne j\}$.
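The three sets can be enumerated directly from their definitions, which also confirms the count of 4 + 16 + 12 = 32 shapes. In the sketch below, the digit 0 is read as the central key area and the digits 1 to 4 as the menu items; this reading follows from the statement that the shapes 0ij0 end in the central area. The function name is hypothetical.

```python
def scretch_shapes():
    """Enumerate the scretch identifiers defined by the three sets in the text."""
    items = range(1, 5)  # the four menu items; 0 denotes the central key area
    single = [f"0{i}" for i in items]
    double = [f"0{i}{j}" for i in items for j in range(0, 5) if i != j]
    triple = [f"0{i}{j}0" for i in items for j in items if i != j]
    return single, double, triple

single, double, triple = scretch_shapes()
print(len(single), len(double), len(triple))  # 4 16 12
```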

The execution times were gathered in a typing task in which nine participants (seven males, two females) from 23 to 34 years old (M = 25; SD = 3.4) were asked to type a list of words using KeyScretch on a finger-based tablet device (a Samsung Galaxy Tab running Android, equipped with a 7-in screen). In order to simulate expert performance, the participants were asked to perform trials on each word, until they were satisfied with their speed; then they were asked to enter the word repeatedly eight times in a row. Twenty-four samples were gathered for each participant and for each scretch. The participants were asked to type as accurately and quickly as possible and were allowed to retype the misspelled words. Prior to performing the task, the participants familiarized themselves with KeyScretch by playing with a typing video game designed to teach the method [46]. The trials were performed on a prototypical keyboard with a Qwerty layout, augmented with a menu containing the first four vowels. The use of a specific language (Italian) and of a fixed character set for the menu did not prevent obtaining a reliable estimation of the movement time. The expertise gained in the repeated trials made the speed dependent on the movement control rather than on other factors, such as the knowledge of the language or the visual search time. However, each participant was provided a different random ordering of the word list and an instance of KeyScretch with a different random mapping between the letters and the menu items. The movement time for each scretch was calculated as the average of the median time of each participant. Table IV reports, for each scretch, its identifier and the average movement time. Standard deviations are reported in parentheses.

TABLE IV. AVERAGE MOVEMENT TIMES (IN MILLISECONDS) FOR THE SCRETCHES

Fig. 9. Model of a key augmented with a four-item menu.

The movement time for the unavailable scretches was approximated using Rick’s model. Basically, the model decomposes a stroke into segments corresponding to simpler straight movements, and their individual contributions are summed to calculate the overall stroke time. The segments of a stroke are classified into one of three possible types: Beginning (the first one), End (the last one), or Middle (segments between the first and the last one). The execution time of these simple movements is calculated using an equation which fits data better than Fitts’ law. In the new equation, two additional parameters are taken into account: $F_\alpha$, a multiplicative adjustment depending on the magnitude of the angle formed by the considered segment and the horizontal axis, and $F_\delta$, an additive adjustment depending on the magnitude of the angle formed by the considered segment and the previous segment. To elaborate, the equation used in the model to estimate the movement time for a stroke segment is the following:

$$MT_s = \left(a + b \log_2\left(\frac{D}{W} + c\right)\right) F_\alpha + F_\delta. \qquad (3)$$

The values for the $a$, $b$, $c$, $F_\alpha$, and $F_\delta$ parameters are determined on the basis of the segments’ types and of the magnitude of the angles, using data from [9]. To assign a value to the $D/W$ ratio, we establish a simplified geometrical model of a key augmented with a menu containing four items (see Fig. 9). Here, the key area is square, while those of the menu items have the shape of an isosceles trapezoid having a height equal to the measure of the key side. Through simple reasoning about the geometrical properties, we assign to the ratio the following values:

1) $D/W = D_1/W_1 = 1$ for movements from the central key area to a menu item (or in the opposite direction);

2) $D/W = D_2/W_2 = 1$ for movements between contiguous menu items;

3) $D/W = D_3/W_3 = 2$ for movements between opposite menu items.
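The per-segment estimate of (3) and its summation over the segments of a stroke can be sketched as follows. The D/W ratios are the three values listed above. The parameter values for a, b, c, F_α, and F_δ come from [9] and are not reproduced in this paper, so the numbers in the usage example are arbitrary placeholders chosen only to make the arithmetic easy to follow; all names are hypothetical.

```python
from dataclasses import dataclass
from math import log2

# D/W ratios assigned by the simplified key-and-menu geometry in the text.
D_OVER_W = {
    "center_to_item": 1.0,    # central key area <-> menu item
    "contiguous_items": 1.0,  # between contiguous menu items
    "opposite_items": 2.0,    # between opposite menu items
}

@dataclass(frozen=True)
class SegmentParams:
    """Parameters of Eq. (3) for one segment; real values must come from [9]."""
    a: float
    b: float
    c: float
    f_alpha: float  # multiplicative adjustment (angle with the horizontal axis)
    f_delta: float  # additive adjustment (angle with the previous segment)

def segment_time(d_over_w: float, p: SegmentParams) -> float:
    """Movement time of a single stroke segment according to Eq. (3)."""
    if d_over_w + p.c <= 0:
        raise ValueError("D/W + c must be positive")
    return (p.a + p.b * log2(d_over_w + p.c)) * p.f_alpha + p.f_delta

def stroke_time(segments) -> float:
    """Sum the contributions of (d_over_w, params) pairs, one per segment."""
    return sum(segment_time(ratio, params) for ratio, params in segments)

# Placeholder parameters, NOT taken from [9].
begin = SegmentParams(a=10.0, b=20.0, c=1.0, f_alpha=1.5, f_delta=5.0)
end = SegmentParams(a=10.0, b=20.0, c=2.0, f_alpha=1.5, f_delta=5.0)
total = stroke_time([(D_OVER_W["center_to_item"], begin),
                     (D_OVER_W["opposite_items"], end)])
```

In the model, the parameter set of each segment would be selected according to its type (Beginning, Middle, or End) and to the angles it forms, which the sketch leaves to the caller.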

Although the scretches are scale invariant, a fixed location for the pointer-up event of a stroke must be chosen as the starting point of the next jump movement. We adopted a simplified model: For the scretches ending in the central key area, we considered the center of the key; and for the scretches ending in a menu item, we considered the center of the key contiguous to the pressed one in the direction of the last movement.

B. Empirical Validation of the Model and Comparison With Shapewriter

To validate the model, we investigated to what extent the completion time predicted by the model deviates from the empirically measured value. The empirical measurement was carried out in typical typing tasks consisting of transcribing an input phrase for which the prediction of the model had been estimated. We tested two different instances of KeyScretch: one optimized for English and one optimized for Italian on the Qwerty layout. For comparison purposes, the measurement was also made with shape writing.

1) Participants: Sixteen participants (four females), aged between 22 and 35 (M = 27.4, SD = 3.8), were recruited for the experiment.

2) Apparatus: The trials were executed on an LG P760 Optimus L9 mobile phone, equipped with a 4.7-in display with 540 × 960 resolution and running the Android 4.0.4 operating system.

We used a prototypical implementation of KeyScretch for Android, with an interface similar to Fig. 1. To cope with occlusion, a magnified view of the pressed key (key preview) is shown in a higher location than the pressed key and the menu is shown around the key preview. The size of the keyboard, when shown on the smartphone, is 59 × 37 mm. The Italian instance used M(c) = ⟨aeio⟩ for each character c with the exception M(“q”) = ⟨aeuo⟩, while the English instance used M(c) = ⟨osea⟩ with M(“t”) = ⟨ohea⟩ as an exception. For shape writing, two different implementations were used: Shapewriter and TouchPal for English and Italian, respectively. The choice of using two different commercial products was because of the unavailability of the Italian dictionary in Shapewriter, which was our first choice implementation for English.

Furthermore, we used an application showing a phrase and recording the typing speed, similar to that described in [47]. The keyboard prototype was implemented as a customization of the Android SoftKeyboard sample project.

3) Study Design and Procedure: The participants were divided into two equally sized groups: Each group carried out the experiment with a single instance of KeyScretch, English or Italian, and with shape writing in the same language.

TABLE V. CALCULATED AND OBSERVED COMPLETION TIMES FOR THE TASKS OF THE EXPERIMENT

A total of 20 phrases were used to validate the model: Ten were chosen from the English set [48] and ten from the Italian set of 250 phrases described in Section IV-E. In both cases, the choice was random. Before executing the experiment, the participants were exposed to several learning sessions on both methods. These sessions were carried out autonomously by the participants on their own mobile devices. They practiced each method for about 6 h. An application was installed on the devices to control the use of the keyboards and the proper execution of the learning phase. These hours of practice were useful to provide the participants with a basic experience. In order to simulate expert performance, the participants were asked to perform some trials on each phrase, until they were satisfied with their speed; then, they were asked to enter the phrase repeatedly five times in a row. The participants were asked to type as accurately and as quickly as possible but they were told to ignore typing errors. The misspelled phrases were retyped in order to gather data only from correctly executed tasks. This procedure was repeated with each of the ten phrases entered by each participant with each method. The order of the phrases was different and randomly chosen for each participant. The participants were instructed to keep the phone in one hand and to use the keyboard with the finger of the other hand. The order of the methods (KeyScretch and shape writing) was counterbalanced.

4) Results: Each experimental session lasted about one hour and a half. This time included short breaks after each phrase, and a mandatory break of at least 10 min between the first and the second method. Thus, each participant performed a total of 20 tasks: ten for each method.

Fig. 10. Predicted versus observed completion times (in seconds) in the experiment.

TABLE VI. TYPING SPEEDS (MEAN ± S.D.) OBTAINED IN THE EXPERIMENT WITH KEYSCRETCH AND SHAPE WRITING

Table V reports, in each column, the presented text phrase, the prediction of the model, the observed task completion time, and the percentage with which the prediction deviates from the observed value. The comparison of the predicted versus the observed data is also shown in Fig. 10. Averaging over the tasks, the root-mean-square error is 10.01% of the predicted typing speed ($R^2 = 0.77$). The prediction is as reliable with the English instance as with the Italian instance, even though the observed performance with the Italian instance shows a greater standard deviation.
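The text does not give the exact formulas behind the two fit statistics. The sketch below shows one plausible reading: the root-mean-square of the per-task deviations expressed relative to the predicted value, and $R^2$ computed as the squared Pearson correlation between predicted and observed values. Both definitions, and the function names, are assumptions.

```python
from math import sqrt

def relative_rmse_percent(predicted, observed):
    """Root-mean-square of (observed - predicted) / predicted, in percent."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    squares = [((o - p) / p) ** 2 for p, o in zip(predicted, observed)]
    return 100.0 * sqrt(sum(squares) / len(squares))

def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_p, mean_o = sum(predicted) / n, sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("R^2 is undefined for a constant series")
    return cov * cov / (var_p * var_o)
```

For example, predictions of 10 and 20 against observations of 11 and 18 deviate by 10% each, giving a relative root-mean-square error of 10%.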

Table VI reports the average typing speeds recorded with both methods for the comparison with shape writing. While the performance recorded with the Italian instance of KeyScretch has a speed comparable with that recorded with shape writing, the English instance of KeyScretch is slower in most tasks. In the former case, four tasks out of ten are faster with KeyScretch, with efficiency advantages for the one or the other method lower or equal to 10%; in the latter case, instead, shape writing has a tangible advantage in most cases, while KeyScretch prevails only in two tasks out of ten. With Italian, the typing speeds (averaged among participants) were 51.74 and 52.63 with KeyScretch and shape writing, respectively. A two-tailed paired t-test showed that this difference was not statistically significant. With English, the speeds were 43.45 and 47.80 and the difference was not significant.

TABLE VII. CHARACTERISTICS OF THE INPUT TEXT USED FOR THE COMPARISON

Fig. 11. Keyboard variants.

To replace the misspelled phrases, each phrase was repeated on average 3.75 and 2.74 times for the Italian and English instances of KeyScretch, respectively, and 1.73 and 3.81 for shape writing. The longest phrases required a greater number of repetitions.

C. Predicting Typing Speeds for Different Instances

The model supported the prediction of the typing speeds for different instances of the method, by varying keyboard and menu layout. A software simulator implementing the model was developed in Java, in order to calculate the time required to enter an input text. In addition to a file containing the text, the simulator took as input an XML file containing the definition of one or more keyboard layouts, each specifying the coordinates and the size of the keys.

We produced text in six different languages (English, French, German, Italian, Portuguese, and Spanish). The text was obtained from Wikipedia, by extracting random paragraphs from random articles. Table VII reports the size of the text files in terms of number of characters, words and unique words, and the correlation of the input text to text corpora [42] in terms of letter, digraph, and word frequency.

The keyboard layouts shown in Fig. 11 have been reproduced for our trials. These include: the classic Qwerty layout and its French and German variants [see Fig. 11(a)]; the optimized layouts Opti II [4] and K-5 [see Fig. 11(b)]; the four K-English, K-French, K-German, and K-Spanish layouts, optimized for specific Western languages [24] [see Fig. 11(c)].

TABLE VIII. SPEED (IN WPM) OF DIFFERENT TEXT ENTRY METHODS WITH DIFFERENT LANGUAGES USING THE QWERTY KEYBOARD LAYOUT AND ITS VARIANTS

The layouts only contain the 26 characters of the European languages and a space character. We excluded other characters (e.g., the SHIFT key, punctuation, or diacritics) since the vendors of the most widespread mobile phones make customized choices to handle the input of infrequent characters. In their original definition [24], the K-5 layout and the four layouts in Fig. 11(c) do not contain the space key, required in our simulation. We added it at the end of the fourth key row, which is its position in the ATOMIK [6] layout for Shapewriter, from which the five layouts are derived.

In order to run the simulation, the input text was lowercased, diacritic characters were substituted with their closest alphabetic neighbor, and punctuation and nonalphabetic letters were ignored. We simulated text entry with KeyScretch, the traditional method (Baseline) and with the simple menu-based method only supporting flicks. Different language/keyboard layout combinations were tested.
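A possible implementation of this preprocessing step is sketched below. The authors' simulator was written in Java and its normalization code is not shown, so this is only an approximation: it strips diacritics through Unicode decomposition, keeps the 26 letters and the space, treats any whitespace as a space, and collapses repeated spaces. Characters without a decomposition, such as the German "ß", are simply dropped here; a faithful reimplementation would need an explicit substitution table for them.

```python
import unicodedata

def normalize_for_simulation(text: str) -> str:
    """Lowercase, strip diacritics, and keep only the letters a-z and spaces."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    kept = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue              # drop the accent, keep the base letter
        if "a" <= ch <= "z":
            kept.append(ch)
        elif ch.isspace():
            kept.append(" ")      # any whitespace becomes a plain space
        # punctuation, digits, and other symbols are ignored
    return " ".join("".join(kept).split())  # collapse runs of spaces
```

For instance, "Perché no?" becomes "perche no".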

The instances of KeyScretch have been chosen so as to maximize the performance. Although several complex methods for keyboard layout optimization have been described in the literature (including genetic algorithms [49] and models derived from physics [5]), we opted for a brute-force algorithm to select the menu characters. In particular, the set of letters associated with the menu items was chosen by testing all of the permutations of every four-letter combination drawn from the eight most frequent letters in each language. This restriction limits the test to 24 × 70 = 1680 different permutations (i.e., menu layouts) for each keyboard layout/language combination and appears to be reasonable, considering that the presence of an infrequent character in the menu is useless.
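The size of the search space follows from choosing 4 of 8 letters (70 combinations) and ordering them (24 permutations each). The sketch below enumerates the candidates and picks the best one according to a caller-supplied scoring function; `simulate_speed` is a hypothetical stand-in for the model-based simulator, and the eight letters in the usage example are illustrative rather than taken from the paper. The sketch considers a single menu shared by all keys and does not cover per-key exceptions.

```python
from itertools import combinations, permutations

def candidate_menus(frequent_letters):
    """Yield every ordered four-letter menu drawn from eight distinct letters."""
    letters = list(frequent_letters)
    if len(letters) != 8 or len(set(letters)) != 8:
        raise ValueError("exactly eight distinct letters are required")
    for combo in combinations(letters, 4):   # 70 combinations
        for perm in permutations(combo):     # 24 orderings each
            yield "".join(perm)

def best_menu(frequent_letters, simulate_speed):
    """Return the menu with the highest simulated speed (brute-force search)."""
    return max(candidate_menus(frequent_letters), key=simulate_speed)

menus = list(candidate_menus("eaionlrt"))  # illustrative letter set
print(len(menus))  # 1680
```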

1) Results: The results for the Qwerty layout and its variants are shown in Table VIII, while those for the optimized layouts are shown in Table IX. The tables report the language of the input text, the keyboard layout, the best performing menu layout used for the menu-based methods (Simple Menu and KeyScretch), and the simulated speed (in wpm) for each text entry method. The improvement over the baseline method gained with each method is reported in parentheses.

The main result of our simulation is that KeyScretch is the fastest method with all languages. Using a Qwerty layout or its variants, the improvements over the baseline method are in the range of about 30–49% and vary according to the target language. The best results are obtained with Italian: the tested instance of the method achieves an improvement close to 50%. High improvements, greater than 40%, can also be obtained with Portuguese and Spanish. The method consistently outperforms the simple menu-augmented keyboard with a good margin. With the optimized layouts, the improvement over the baseline method is in the lower range of 21–38%. KeyScretch still outperforms the simple menu-augmented keyboard.

TABLE IX. SPEED (IN WPM) OF DIFFERENT TEXT ENTRY METHODS WITH DIFFERENT LANGUAGES USING OPTIMIZED KEYBOARD LAYOUTS

TABLE X. COMPARISON OF THE AVERAGE JUMP LENGTH AMONG THE TESTED METHODS

TABLE XI. LANGUAGE-RELATED PARAMETERS OF THE INSTANCES OF KEYSCRETCH OPTIMIZED FOR THE QWERTY LAYOUT (AND VARIANTS) FOR DIFFERENT LANGUAGES

In Table X, we report a comparison of the average jump length among the tested methods on a set of layout/language combinations. The data show that the menu-based methods always reduce the average jump length on the Qwerty keyboard layout (and its variants) with respect to the baseline method. The optimized layouts are already designed to shorten the jumps, and with them the use of a menu does not enable further reductions.

Table XI reports, for each instance of KeyScretch, the number of different scretches associated with a legal text chunk in the input text of the target language, the relative frequency of the scretches out of the total number of strokes (scretches + taps), the percentage of text which can be entered through the scretches and the cumulative frequency of the 25 most frequent scretches out of the whole set, and the average number of characters entered through a single stroke (cps). The results show that Italian, Portuguese, and Spanish are better suited for use with KeyScretch than the other three tested languages. In particular, the whole text can be entered through a reasonably small set of scretches and the cumulative frequency of the 25 most frequent of them is close to 100%. The use of the scretches is also greater in these three languages, resulting in a higher number of cps.

The last result we report is the cumulative frequency of the 32 scretches whose speed has been empirically obtained to establish the model (see Section V-A). This result confirms that our choice of simply measuring the execution times of a limited number of scretches led to a good approximation of the estimate of the final text entry speed of the various instances. This is particularly true for the Italian, Portuguese, and Spanish languages, for which the empirically measured scretches have a cumulative frequency close to 100% (99.9%, 99.7%, and 99.98%, respectively). The values obtained with English, French, and German are slightly lower: 99.4%, 98.2%, and 98.1%, respectively.

VI. DISCUSSION

In the previous sections, we have described two main experiments, the first aimed at evaluating the learning of KeyScretch and the second aimed at evaluating the performance of expert users. The two experiments were also conducted on devices that use a different type of interaction: with the stylus and with the finger, respectively. Both experiments show a reasonably good performance for KeyScretch and for the scretch, the new interaction introduced by the method.

Although learning was evaluated only for a single instance of the method (optimized for Italian), the availability of a rather reliable model for the prediction of expert performance and the analysis carried out on different languages (shown in Section V-C1) lead us to expect that learning the method can also be a good investment for users who type in Portuguese or Spanish and are familiar with the Qwerty layout.

The efficiency of the compound strokes introduced by the method, the scretches, has been demonstrated in Section IV-F1. The usefulness of this interactive technique can be generalized to situations other than text entry in which multiple selections of items from a radial menu are required.

With respect to shape writing, KeyScretch offers the advantage of not being tied to the use of a dictionary, avoiding the problem of the input of out-of-dictionary words. This does not preclude that the method may be improved through the use of a dictionary, e.g., for spell checking. For the previous reasons, we can state that KeyScretch is definitely a valid replacement of the traditional tapping method and can compete with shape writing, at least on some of the tested languages. Although KeyScretch performed worse, we have to report that the procedure of the experiment put shape writing in ideal conditions which may have produced an overestimation of its real speed: the absence of out-of-dictionary words and the continuous repetition of the same phrase. In particular, the repetition updated the language model of the shape writing keyboards, which is used as a channel for gesture recognition [26]. Unfortunately, in the experiment, we did not have full control of this feature on the commercial keyboards. Furthermore, we did not carry out a longitudinal study in order to compare the learning of both methods. However, some of the participants who practiced on both methods at the same time experienced discomfort and confused the two methods in the initial phases. Although this did not prevent them from gaining a basic expertise on the two methods, and therefore did not affect the validity of the results, it alerted us to the risks of comparing the two methods using a within-subjects design.

Despite the positive results of our prototype evaluation, it is possible that users could be discouraged by the low initial performance and abandon the use of the method. We designed a typing video game to avoid this situation [46].

The predictive model is able to tell us, with a certain margin of error, only the speed of an expert user. Unfortunately, at present we are not able to predict the learning times with precision. However, parameters such as those reported in Table XI can offer some insight into the learning time of an instance.

The user studies presented in this paper were performed onsmall screen views, typical of mobile devices. Nevertheless,the keyboard view must be carefully designed, since the useof the menu requires that some more space is available on thesides of the peripheral keys. Thus, the size of a single key mustbe reduced to some extent. As a consequence, the use of themethod on very tiny devices might be uncomfortable or errorprone. Both the user study and the simulation were carried outconsidering one-pointer text entry: The scenario where usershold and enter text with the same hand, which is preferred byseveral users [50], was not tested in our experiments.

Occlusion is another issue to take into account in the design of the keyboard on finger-based devices of reduced size: the menu can be hidden by the user’s fingers. To cope with this problem, in the design of the Android implementation, we preferred to show the menu above the key, rather than around it. The participants in the experiment did not show particular difficulties in using the menu in the learning stages, although some of them complained that the menu sometimes hid part of the keyboard. This was because of our choice of leaving it active even after the key was released, in order to give the user feedback on the recognition of the gesture. Special care should be taken in choosing the right timing.

VII. CONCLUSION

We have presented KeyScretch, a text entry method for devices equipped with a touch-screen. The method is based on menu-augmented keyboards, improving them through the possibility of performing compound strokes to enter text chunks of variable length. We discovered that the use of the scretch is a definite advantage in terms of typing efficiency. A factor influencing the learnability of the method is the segmentation of the text induced by the method. We showed that the users are able to learn the correct segmentation in a short time. The method enables a fast expert text entry with many configurations, in particular with the most commonly used keyboard layouts. The language structures of Italian, Portuguese, and Spanish make the method more suitable for these languages than for the other tested languages.

The method can still be improved. In particular, the results of the study with Italian users indicated that the method requires more correction activity. To tackle the accuracy problem, we are currently working on different techniques: a chunk-level correction mechanism to reduce the correction activity, an ad hoc designed spell corrector, and an effective feedback mechanism telling the user the recognized template scretch, so that the user is immediately aware of possible typing errors. We also plan to further investigate the performance of the method with differently sized devices and different languages.

ACKNOWLEDGMENT

The authors would like to thank D. De Stefano for his support in the statistical analysis and M. Quaranta for his help in the model validation.

REFERENCES

[1] R. Plamondon and S. N. Srihari, “On-line and off-line handwriting recognition: A comprehensive survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 1, pp. 63–84, Jan. 2000.

[2] E. Montgomery, “Bringing manual input into the 20th century: New keyboard concepts,” Computer, vol. 15, no. 3, pp. 11–18, Mar. 1982.

[3] (1998). Textware solutions: The fitaly one-finger keyboard. [Online]. Available: http://fitaly.com/fitaly/fitaly.htm

[4] I. S. MacKenzie and S. X. Zhang, “The design and evaluation of a high-performance soft keyboard,” in Proc. Extended Abstr. Human Factors Comput. Syst., 1999, pp. 25–31.

[5] S. Zhai, M. Hunter, and B. A. Smith, “The metropolis keyboard—An exploration of quantitative techniques for virtual keyboard design,” in Proc. 13th Annu. ACM Symp. User Interface Softw. Technol., 2000, pp. 119–128.

[6] S. Zhai, M. Hunter, and B. A. Smith, “Performance optimization of virtual keyboards,” Human-Comput. Interact., vol. 17, pp. 229–270, 2002.

[7] I. S. MacKenzie and R. W. Soukoreff, “Text entry for mobile computing: Models and methods, theory and practice,” Human-Comput. Interact., vol. 17, pp. 147–198, 2002.

[8] J. O. Wobbrock, B. A. Myers, and D. H. Chau, “In-stroke word completion,” in Proc. 19th Annu. ACM Symp. User Interface Softw. Technol., 2006, pp. 333–336.

[9] J. Rick, “Performance optimizations of virtual keyboards for stroke-based text entry on a touch-based tabletop,” in Proc. 23rd Annu. ACM Symp. User Interface Softw. Technol., 2010, pp. 77–86.

[10] S. Zhai and P. O. Kristensson, “The word-gesture keyboard: Reimagining keyboard interaction,” Commun. ACM, vol. 55, no. 9, pp. 91–101, 2012.

[11] P. Isokoski, “Performance of menu-augmented soft keyboards,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2004, pp. 423–430.

[12] S. Zhai and P.-O. Kristensson, “Shorthand writing on stylus keyboard,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2003, pp. 97–104.

[13] G. Costagliola, V. Fuccella, and M. Di Capua, “Text entry with keyscretch,” in Proc. 16th Int. Conf. Intell. User Interface, 2011, pp. 277–286.

[14] P. M. Fitts, “The information capacity of the human motor system in controlling the amplitude of movement,” J. Exp. Psychol., vol. 47, pp. 381–391, 1954.

[15] D. Goldberg and C. Richardson, “Touch-typing with a stylus,” in Proc. INTERACT Human Factors Comput. Syst., 1993, pp. 80–87.

[16] C. H. Blickenstorfer, “Graffiti: Wow!!!!,” Pen Comput. Mag., vol. 1, pp. 30–31, 1995.

[17] D. Devoe, “Alternatives to handprinting in the manual entry of data,” IEEE Trans. Human Factors Electron., vol. HFE-8, no. 1, pp. 21–32, Mar. 1967.

[18] P. O. Kristensson and L. C. Denby, “Text entry performance of state of the art unconstrained handwriting recognition: A longitudinal user study,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2009, pp. 567–570.

[19] C. E. Shannon, “Prediction and entropy of printed English,” Bell Syst. Tech. J., vol. 30, pp. 50–64, 1951.

[20] K. Kurihara, M. Goto, J. Ogata, and T. Igarashi, “Speech pen: Predictive handwriting based on ambient multimodal recognition,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2006, pp. 851–860.

[21] I. S. MacKenzie, J. Chen, and A. Oniszczak, “Unipad: Single stroke text entry with language-based acceleration,” in Proc. 4th Nordic Conf. Human-Comput. Interact., Extending Boundaries, 2006, pp. 78–85.

[22] L. Magnien, J. L. Bouraoui, and N. Vigouroux, “Mobile devices: Soft keyboard text-entry enhanced by visual cues,” in Proc. 1st French-Speaking Conf. Mobility Ubiquity Comput., 2004, pp. 158–165.

[23] X. Bi, B. A. Smith, and S. Zhai, “Quasi-qwerty soft keyboard optimization,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2010, pp. 283–286.

[24] X. Bi, B. A. Smith, and S. Zhai, “Multilingual touchscreen keyboard design and optimization,” Human-Comput. Interact., vol. 27, no. 4, pp. 352–382, 2012.

[25] J. Mankoff and G. D. Abowd, “Cirrin: A word-level unistroke keyboard for pen input,” in Proc. 11th Annu. ACM Symp. User Interface Softw. Technol., 1998, pp. 213–214.

[26] P.-O. Kristensson and S. Zhai, “Shark2: A large vocabulary shorthand writing system for pen-based computers,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2004, pp. 43–52.

[27] Shapewriter. (2010). [Online]. Available: http://www.shapewriter.com/

[28] B. Martin, “Virhkey: A virtual hyperbolic keyboard with gesture interaction and visual feedback for mobile devices,” in Proc. 7th Int. Conf. Human Comput. Interact. Mobile Devices Services, 2005, pp. 99–106.

[29] N. Jhaveri, “Two characters per stroke—A novel pen-based text input technique,” New Interaction Techniques’03. University of Tampere Tech. Rep., 2003.

[30] D. Venolia and F. Neiberg, “T-cube: A fast, self-disclosing pen-based alphabet,” in Proc. Extended Abstr. Human Factors Comput. Syst., 1994, pp. 265–270.

[31] L. Findlater, B. Lee, and J. Wobbrock, “Beyond qwerty: Augmenting touch screen keyboards with multi-touch gestures for non-alphanumeric input,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2012, pp. 2679–2682.

[32] V. Fuccella, P. Isokoski, and B. Martin, “Gestures and widgets: Performance in text editing on multi-touch capable mobile devices,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2013, pp. 2785–2794.

[33] P. Isokoski, B. Martin, P. Gandouly, and T. Stephanov, “Motor efficiency of text entry in a combination of a soft keyboard and unistrokes,” in Proc. 6th Nordic Conf. Human-Comput. Interact., Extending Boundaries, 2010, pp. 683–686.

[34] K. Perlin, “Quikwriting: Continuous stylus-based text entry,” in Proc. 11th Annu. ACM Symp. User Interface Softw. Technol., 1998, pp. 215–216.

[35] D. J. Ward, A. F. Blackwell, and D. J. C. MacKay, “Dasher—A data entry interface using continuous gestures and language models,” in Proc. 13th Annu. ACM Symp. User Interface Softw. Technol., 2000, pp. 129–137.

[36] G. P. Kurtenbach, A. J. Sellen, and W. A. S. Buxton, “An empirical evaluation of some articulatory and cognitive aspects of marking menus,” Human-Comput. Interact., vol. 8, no. 1, pp. 1–23, 1993.

[37] S. Singh, The Code Book: The Science of Secrecy From Ancient Egypt to Quantum Cryptography. New York, NY, USA: Doubleday, 1999.

[38] G. Costagliola, V. Fuccella, and M. Di Capua, “Interpretation of strokes in radial menus: The case of the keyscretch text entry method,” J. Vis. Lang. Comput., vol. 24, no. 4, pp. 234–247, 2013.

[39] H. Maurer, G. Rozenberg, and E. Welzl, “Chain code picture languages,” in Graph-Grammars and Their Application to Computer Science (Lecture Notes in Computer Science), vol. 153, H. Ehrig, M. Nagl, and G. Rozenberg, Eds. Berlin, Germany: Springer, 1983, pp. 232–244.

[40] I. S. MacKenzie. (2014). A note on calculating text entry speed. [Online]. Available: http://www.yorku.ca/mack/RN-TextEntrySpeed.html

[41] J. Brooke, “SUS: A quick and dirty usability scale,” in Usability Evaluation in Industry, P. W. Jordan, B. Weerdmeester, A. Thomas, and I. L. Mclelland, Eds. London, U.K.: Taylor & Francis, 1996.

[42] (2014). Large Corpora used in CTS. [Online]. Available: http://corpus.leeds.ac.uk/list.html

[43] R. W. Soukoreff and I. S. MacKenzie, “Metrics for text entry research: An evaluation of MSD and KSPC, and a new unified error metric,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2003, pp. 113–120.

[44] S. K. Card, T. P. Moran, and A. Newell, “The keystroke-level model for user performance time with interactive systems,” Commun. ACM, vol. 23, no. 7, pp. 396–410, Jul. 1980.

[45] P. O. Kristensson, “Discrete and continuous shape writing for text entry and control,” Ph.D. dissertation, Linkoping Univ., Linkoping, Sweden, 2007.

[46] G. Costagliola, M. De Rosa, V. Fuccella, and F. Torre, “Typejump: A typing game for keyscretch,” in Proc. IEEE Symp. Vis. Lang. Human-Centric Comput., 2012, pp. 249–250.

[47] S. J. Castellucci and I. S. MacKenzie, “Gathering text entry metrics on android devices,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2011, pp. 1507–1512.

[48] I. S. MacKenzie and R. W. Soukoreff, “Phrase sets for evaluating text entry techniques,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2003, pp. 754–755.

[49] M. Raynal and N. Vigouroux, “Genetic algorithm to generate optimized soft keyboard,” in Proc. Extended Abstr. Human Factors Comput. Syst., 2005, pp. 1729–1732.

[50] A. K. Karlson, B. B. Bederson, and J. L. Contreras-Vidal, “Understanding one-handed use of mobile devices,” in Handbook of Research on User Interface Design and Evaluation for Mobile Technology. Hershey, PA, USA: IGI Global, 2008, pp. 86–101.

Vittorio Fuccella received the Laurea (cum laude) and Ph.D. degrees in computer science from the University of Salerno, Fisciano, Italy, in 2003 and 2007, respectively.

He is currently a Research Fellow with the Department of Informatics, University of Salerno, and serves as a Teaching Assistant with the University of Naples “Federico II,” Napoli, Italy. His research interests include human–computer interaction, information visualization, e-learning, and web engineering.

Mattia De Rosa received the Laurea (cum laude) and the Ph.D. degrees in computer science from the University of Salerno, Fisciano, Italy, in 2010 and 2014, respectively.

He is currently a Research Fellow with the Department of Informatics, University of Salerno. His research interests include human–computer interaction, sketch recognition, and e-learning.

Gennaro Costagliola (M’91) received his Laurea degree in computer science magna cum laude from the University of Salerno, Italy, in 1987. From 1989 to 1993, he was a post-graduate and a visiting researcher at the University of Pittsburgh, PA, USA, where he earned his MS in 1991.

Since 2001, he has been a Full Professor of Computer Science at the University of Salerno and the Director of the Web Technologies and e-Learning Lab. His research interests include the theory, implementation, and applications of visual languages, human-computer interaction, web engineering and e-learning.

Prof. Costagliola is Associate Editor of the Journal of Visual Languages and Computing and a member of the Steering Committee of the IEEE Symposium on VL/HCC.