spectral analysis of keystroke streams: towards effective...

Spectral Analysis of Keystroke Streams: Towards Effective Real-TimeContinuous User Authentication

Abdullah Alshehri, Frans Coenen and Danushka BollegalaDepartment of Computer Science, University of Liverpool, Liverpool, United Kingdom

{a.a.alshehri, coenen, danushka.bollegala}@liverpool.ac.uk

Keywords: Keystroke Time Series, Continuous Authentication, Keystroke Streams, Behavioral Biometric

Abstract: Continuous authentication using keystroke dynamics is significant for applications where continuous moni-toring of a user identity is desirable, for example in the context of the online assessments and examinationsfrequently encountered in eLearning environments. In this paper, a novel approach to realtime keystrokecontinuous authentication is proposed that is founded on a sinusoidal signals based approach that takes intoconsideration the sequencing of keystrokes. Three alternative time series representations are considered andcompared: Keystroke Time Series (KTS), Discrete Fourier Transform (DFT) and Discrete Wavelet Transform(DWT). The proposed process is fully described and analysed using three keystroke dynamics datasets. Theevaluation also includes a comparison with the established Feature Vector Representation (FVR) approach.The reported evaluation demonstrates that the proposed method, coupled with the DWT representation, out-performs other approaches to keystroke continuous authentication with a best accuracy of 99.22%; a clearindicator that the proposed keystroke continuous authentication using time series analysis has significant po-tential.

1 INTRODUCTION

Keystroke dynamics are a form of behavioural bio-metrics which can be used to authenticate keyboard(keypad) users (Gaines et al., 1980; Alshehri et al.,2016b). Broadly, we can identify two forms ofkeystroke authentication: (i) static authentication and(ii) continuous authentication. The first is used in thecontext of one-time authentication, for example pass-word or pin number access to a system; thus in thecontext of fixed texts. Some examples, from the lit-erature, concerning this form of authentication can befound in (Bleha et al., 1990; Killourhy and Maxion,2009; Syed, 2014). The second form of authentica-tion is typically applied in the context of continuousfree text where it is desirable to continuously monitorthe identity of a user; examples regarding this form ofauthentication can be found in (Shepherd, 1995; Mon-rose and Rubin, 1997; Dowland and Furnell, 2004;Gunetti and Picardi, 2005; Ahmed and Traore, 2014).An application where continuous authentication is ap-plicable is in the case of students completing onlineassessments as part of distance and online learningsystems.The focus of the work presented in this paper is con-tinuous authentication. The reasons for this are as

follows: (i) there is little reported work concerningcontinuous authentication using keystroke dynamicsdue to the challenges involved, and (ii) the increas-ing prevalence of internet facilitated distance learning(eLearning, Massive Open Online Courses and so on)where continuous authentication is desirable.

In this paper, we introduce a novel mechanismfor keystroke continuous authentication, namelyKeystroke Continuous Authentication based SpectralAnalysis (KCASA) model. The proposed model ismotivated by conceptualizing the process of keyboardusage as a continuous stream of keystroke events,thus as a time series which can be transformed intothe spectral domain to extract typing patterns. Morespecifically, the idea is to convert a given keystrokestream from the temporal domain (raw data) to thesinusoidal (frequency) domain. The intuition is thatsuch transformations for time series streams leadto faster, and more accurate, detection of patterns(Chan and Fu, 1999; Keogh et al., 2001). There-fore, keystroke streams can be effectively employedfor real-time/continuous user authentication. In thisstudy, two types of spectral transform are considered:(i) Discrete Fourier Transformation (DFT) and (ii)Discrete Wavelet Transform (DWT).

The remainder of this paper is structured as follows.

In Section 2, we provide a problem statement and dis-cuss current issues with respect to keystroke contin-uous authentication. This is followed with Section3 where definitions and preliminaries concerning theproposed model are given. Section 4 then discussesthe proposed process of finding similarity betweenkeystroke sinusoidal signals, while in Section 5 theproposed KCASA model is presented. The evalua-tion of the proposed approach is given in Section 6.Finally, the paper is concluded with a summary of themain findings and some recommendations for futurework in Section 7.

2 PREVIOUS WORK

The fundamental approach of using keystroke dy-namics for user authentication is founded on twokeystroke timing features: (i) key hold time (KHt ),the elapsed time between a key press and a key re-lease; and (ii) flight time (F t ), the time between nconsecutive key presses (releases), also sometimesreferred to as flight time latency or simply latency(Obaidat and Sadoun, 1997). Both can be indexedusing either a temporal or a consecutive numeric ref-erence. Whatever the case both flight time and holdtime can be used to construct a distinctive typing pro-file associated with individual users (Gaines et al.,1980). These profiles are typically encapsulated usinga feature vector representation of some form. In otherwords, typing profiles are frequently constructed us-ing vectors of statistical values, such as the averageand standard deviation of hold times, or the digraphflight time latency of selected frequently occurring di-graphs. Authentication is then operated by compar-ing the similarity between stored feature vector rep-resented typing (reference) profiles, which are knownto belong to a specific user, and a previously unseenprofile that is claimed to belong to a particular user.Although there has been only limited reported workdirected at keystroke continuous keystroke authenti-cation, what reported work there has been has used afeature vector representation; this has met with somesuccess.However, there are some limitations regarding the uti-lization of the feature vector representation in the con-text of keystroke continuous authentication. One ofthe main limitations is that the size of the feature vec-tors, a significant number of digraphs and/or trigraphshas to be considered which is infeasible in the contextof real-time continuous authentication. In (Monroseand Rubin, 1997) the feature vectors were composedof the flight time means of all digraphs in the trainingdataset. The continuous authentication was then con-

ducted by repeatedly generating “test” feature vectorsfor a given user, one every minute, and comparingwith stored reference profiles. If a statistically simi-lar match was found, then this was considered to bean indication of user authentication. Although thetyping profile was composed of all digraph features,the overall reported accuracy was a surprising 23%.Similarly, in (Dowland and Furnell, 2004) the meanand Standard Deviation (SD) of the flight times forall digraphs and trigraphs in the training dataset wereused. Thus, an average of 6,390 digraphs was neededto make up a sufficient typing profile.

Some researchers have attempted to use an abstrac-tion of features to decrease the size of feature vectors.In (Gunetti and Picardi, 2005) the flight time, for fre-quent n-graphs, was used, although the approach wasused in the context of user identification, as opposedto authentication. Thus, given a previously unseensample, the shared n-graphs in the sample and thestored n-graphs were identified and collected in sep-arate arrays. The elements in the arrays were thenordered according to flight time and the differencebetween the arrays computed by considering the or-derings of the elements; a measure referred to as thedegree of disorder was used (an idea motivated bySpearman’s rank correlation coefficient (Zar, 1972)).Identifying a new sample required comparison withall stored sample profiles (reference profiles), a com-putationally expensive process. In the reported evalu-ation, 600 reference profiles were considered (gener-ated from 40 users, each with 15 samples); the timetaken for a single match, in this case, was 140 sec-onds (using a Pentium IV, 2.5 GHz). However, con-struction typing profile using the average flight timeof only shared n-graphs contained in the training datamight not be representative of the n-graphs in the sam-ples to be authenticated. This can, in turn, affect theauthentication accuracy, especially in the context ofreal-time continuous authentication where typing pat-terns are extracted from free text; thus a substantialamount of n-graphs are expected to be typed in thecurrent session. Furthermore, it can be observed fromthe study presented in (Gunetti and Picardi, 2005) thatthe authentication of one sample relies on all othersamples in the training data. This can also lead to anefficiency issue where, in the context of continuousauthentication, the current sample needs to be com-pared against the claimed user’s reference profile.

In (Ahmed and Traore, 2014) an Artificial NeuralNetwork classifier was used to build a predictionmodel to overcome the limitation of (Gunetti and Pi-cardi, 2005) work. Key-down time was used togetherwith average digraph and monograph flight times topredict missing digraphs based on the limited infor-

mation in the training data; thus no need to involve agreat number of keystroke features while constructingthe typing profile. This mechanism worked reason-ably well in the context of static authentication in acontrolled setting (homogeneous); typing of the sametext using the same keyboard layout in an allocatedenvironment. Thus the work on continuous authenti-cation remains an open area for further investigation.A general criticism of the feature vector approach isthat the feature vector values are either typing patternabstractions (for example average hold times) or onlyrepresent a subset of the data (for example only fre-quently occurring digraphs).It is argued in this work that the feature vector repre-sentation may not be the most appropriate representa-tion for keystroke continuous authentication. There-fore, it is conjectured that representing keystroke fea-tures as time series signals, and transforming thesesignals to the frequency domain, can lead to a betterunderstating of typing patterns with respect to real-time continuous authentication using keystroke dy-namics. To the best knowledge of the authors, noprior work in the literature has considered the conceptof sinusoidal representation for keystroke dynamicsin the context of continuous keyboard authentication.Note that in (Alshehri et al., 2016b) the authors firstproposed the idea of keyboard continuous authenti-cation using time series, but with respect to statictext. In (Alshehri et al., 2016a) it was suggested thatthis could also be applied in the context of contin-uous text, although only hold time was considered.This paper presents a much more sophisticated real-isation and analysis of the approach encompassing:(i) the idea of transforming the keystroke timing fea-tures into the sinusoidal (frequency) domain, (ii) us-ing additional keystroke timing features to enhancethe effectiveness of the authentication, (iii) usage ofa transformed sinusoidal sliding windows to achievethe desired continuous authentication, (iv) a processfor cleaning keystroke streaming data before authen-tication is conducted and (iv) a dynamic method forcalculating similarity thresholds calibrated to individ-ual users.

3 KEYSTROKE TIME SERIESREPRESENTATION

As already noted, the process of typing producesa Keystroke time series Kts = {e1,e2, . . . ,en} whereen is an independent data event, and n ∈ N is thelength of the time series. Each data event ei rep-resents a tuple of the form of 〈ti,ki〉 where: (i) tiis a temporal index of some form, and (ii) ki de-

notes some associated attribute (feature) value. Thus,Kts = {〈t1,k1〉,〈t2,k2〉, . . . ,〈ti,ki〉}. Such a time se-ries can be viewed as a 2D plot with t along the x-axis and attribute value k along the y-axis (Figure 1).With respect to the work presented in this paper, thevalue for ti is set to be a sequential ID number (se-quence of key presses), whilst k records either flighttime (F t ) or hold time (KHt ). Note that in this pa-per only univariate time series representation is con-sidered, that is, in the evaluation section, we haveadopted F t and KHt features independently to deter-mine the effectiveness of each on the proposed model.Figure 1 shows four pairs of Kts sequences, each fea-turing n = 300 keystrokes using F t feature. The fig-ure shows four (random) subjects selected from thedatasets used for evaluation purposes as reported onin Section 6. Inspection of the figure clearly indi-cates that individual subjects have distinct keystrokestreams and thus that they can be used to generate dis-tinct typing profiles. Note that we also represent typ-ing streams using KHt , in the same manner; however,because of space limitations these are not included inthe figure.The generated keystroke time series can be used di-rectly as described in (Alshehri et al., 2016b). How-ever, as already noted, the usage of such “raw” timeseries is expensive in terms of efficiency and storagecapacity (Agrawal et al., 1993). Thus the idea pre-sented in this paper is to use some forms of transfor-mation of the time series; it is conjectured that thiswill yield accurate results more efficiently. As notedin the introduction to this paper, two transformationsare considered: (i) Discrete Fourier Transform (DFT),and (ii) Discrete Wavelet Transform (DWT). Each isdiscussed in further detail in the following two sub-sections.

3.1 DFT for Keystroke Time Series

The Discrete Fourier Transform (DFT) has beenwidely adopted with respect to time series data of allkinds (see for example (Agrawal et al., 1993; Vla-chos et al., 2004)). In this paper, DFT has been usedto transform keystroke time series data from the tem-poral domain to the frequency domain. The idea isthen to compact the keystroke data points without los-ing any salient information. The compression is con-ducted by representing the keystroke stream as a lin-ear combination of sinusoidal coefficients. Then thesimilarity is computed between the transformed coef-ficients for any pairs of corresponding signals.Let’s assume that we have a keystroke time series,such that Kts = {e1,e2, . . . ,en}, where ki ∈ en is ei-ther a F t or a KHt value, and n is the length of the

0

2

1

2

1

0

F t(ms)Subject 1 Subject 2 Subject 3 Subject 4

a) a) a) a)

b) b) b) b)

Figure 1: Examples time series (n = 300) for four subjects, two examples per subject, writing unspecified free text.

keystroke time series. The DFT transform then com-presses Kts into a linear set of sinusoidal functionswith amplitudes p, q and phase w:

Kts =N

∑i=1

(piCos(2πwkF ti )+qiSin(2πwiF t

i )) (1)

Note that the time complexity to transform (each)Kts is O(n log n) using the radix 2 DFT algorithm(Janacek et al., 2005; Cooley and Tukey, 1965).Using the DFT transform, the obtained Kts is com-posed of a new magnitude (the amplitude of the dis-crete coefficients) and phase spectral shape in whichthe similarity can be computed between pairs of trans-formed Kts frequencies. Similarity measurement willbe discussed in further detail in Section 4. For fur-ther detail concerning the DFT, interested readers arereferred to (Harris, 1978).

3.2 DWT for Keystroke Time Series

The Discrete Wavelet Transform (DWT) is an alter-native form of time series representations that con-siders the time span over which different frequen-cies are present in a time series. DWT is some-times claimed to provide a better transformation thanDFT in that it retains more information (Chan and Fu,1999). DWT can be applied to time series accord-ing to different scales, orthogonal (Haar, 1910) andnonorthogonal (Gabor, 1946). In this paper, an or-thogonal scale is used for the DWT, more specificallythe well known Haar transform was adopted (Haar,1910) as described in (Chan and Fu, 1999). Fun-damentally, a Haar Wavelet is simply a sequence offunctions which together form a wavelet comprisedof a series of square shapes. The Haar transform isconsidered to be the simplest form of DWT; however,it has been shown to offer advantages with respect totime series analysis where the time series feature sud-den changes. The transformation is usually describedas per Equation 2 where, in the context of this paper,x is a keystroke timing feature.

φ(x) =

1, if 0 < t < 12

−1, if 12 < t < 1

0, otherwise(2)

The time complexity for the Haar transform is O(n)for each Kts. For space limitations, we omit thefull mathematical explanation of the Haar transform;however interested readers may refer to (Edwards,1991) and (Burrus et al., 1997) for further detail.

4 SIMILARITY MEASUREMENT

To compare transformed keystroke streams, it is nec-essary to use some kind of similarity measure. Typ-ically, given two keystroke streams (time series), S1and S2 of the same length, the simplest way to com-pare them is to find the Euclidean absolute distancesbetween all pairs of corresponding points in S1 and S2and compute the average distance. If the average dis-tance is 0, S1 and S2 are identical. However, this sim-ple approach does not take into account offsets (phaseshifts). For the process of the KCASA model, dis-cussed in the following section, Dynamic Time Warp-ing (DTW) was therefore adopted. The reason is thatDTW takes into consideration phase shifting betweenpairs of signals more accurately than Euclidean dis-tance measurement (Ye and Keogh, 2009).DTW operates as follows. For twogiven (transformed) keystroke time se-ries S1 = {a1,a2, . . . ,ai, . . . ,ax} and S2 ={b1,b2, . . . ,b j, . . . ,by}, where x and y are thelength of the two series respectively, and (ai and b j)are DFT or DWT coefficients, the elements of eachseries are constructed in a matrix M of size x×y. Thevalue for each element mi j ∈M is then computed bycalculating the distance from each element ai ∈ S1 toeach element b j ∈ S2:

mi j =√

(ai−b j)2 (3)

Top

0 2 4 6 8

0

2

4

6

8

(a)

0 2 4 6 8

0

2

4

6

8

(b)Bottom

0 2 4 6 8

0

2

4

6

8

(c)

0 2 4 6 8

0

2

4

6

8

(d)Figure 2: WPs examples. Top: WPs obtained from comparing two keystroke sinusoidal signals from same subject typingdifferent texts, (a) DFT and (b) DWT. Bottom WPs obtained from comparing two keystroke sinusoidal signals from twodifferent subjects writing different texts, (c) DFT and (d) DWT.

A Warping Path (WP = {mi1, j1 ,mi2, j2 , . . .}) is thena sequence of matrix elements (locations), mi j, suchthat each location is immediately above, to the rightof, or above and to the right of, the previous location(see Figure 2). For each location, the next locationis chosen so as to minimise the accumulated warpingpath length. The “best” warping path is the one thatserves to minimise the distance from m1,1 to mx,y. Theidea is then to find the path with the shortest ”warp-ing distance” (wd) between the two series calculatedas follows:

wd =i=|WP|∑i=1

mi ∈WP (4)

The value of wd is thus an indicator of the similar-ity between two keystroke signals; if wd = 0 the twokeystroke signals are identical.To further illustrate the concept of DTW, Figure 2presents four WPs, resulting from application of theDTW process. Figures 2(a) and 2(b) show WPs ob-tained when DTW was applied to keystroke sinu-soidal signals for the same subject writing different

unknown texts; Figure 2(a) using DFT and 2(b) us-ing DWT. In contrast, Figures 2(c) and 2(d) show theWPs obtained when comparing keystroke sinusoidalsignals associated with two different subjects, writingdifferent texts; Figure 2(c) using DFT and 2(d) usingDWT.

5 KEYSTROKE CONTINUOUSAUTHENTICATION BASEDSPECTRAL ANALYSIS(KCASA) OPERATION

The proposed KCASA model operates using a win-dowing approach, continuously sampling keystrokestream subsequences Kw ⊂ Kts. The window size w ispredefined by the user. Thus Kw = {ei,ei+1, . . . ,ew}where i is a “start” time stamp. The keystroke streamsubsequences can be made up of either flight time (F t )or hold time (KHt ) values and can be processed sim-ply as a straight forward time series, the Keystroke

Time Series (KTS) representation. Alternatively, asproposed in this paper, the time series can be trans-formed, using the DFT or DWT representation as de-scribed above. In the evaluation presented later inthis paper, the effectiveness of the DFT and DWTrepresentations is compared with the operation of thestraight-forward KTS representation.

5.1 User Profile Calculation

A user profile Up is a set of m non-overlappingkeystroke streams (windows), or simply keystrokesinusoidal windows, Up = {W1,W2, . . . ,Wm}, whereeach window W has a length of ω. Note that |Up|needs to be substantially greater than the windowlength ω, so that a number of subsequences (win-dows) can be extracted. Note also that the gener-ated windows are prepared for the next transforma-tion using DFT and DWT. Note also that ω is userdefined. For the experiments reported on later in thispaper, a range of ω values was considered from 25to 150 key presses increasing in steps of 25, that isω = {25,50,75,100,125,150}. By doing so, we canexamine the effect of ω on performance in terms ofaccuracy. It was anticipated that a small window sizewould provide efficiency gains; that is desirable in thecontext of real-time continuous authentication.The set Up is also used to generate a bespoke σ

threshold value. This is calculated by comparing allsubsequences in Up using DTW, and obtaining an av-erage warping distance w̄d which is used as the valuefor σ:

σ = w̄d =1|Up|

|Up|∑i=2

DTW (Wi−1,Wi) (5)

It has been shown that averaging the warping dis-tances of time series lead to fast and accurateclassification of streaming data (Niennattrakul andRatanamahatana, 2009).

5.2 Subsequence Preprocessing andNoise Reduction

Before the KCASA authentication process can com-mence, each newly collated keystroke time seriesmust be cleaned. The issue here is that F t values canbe large, for example when the subject has pausedtyping or as a consequence of (say) an “away fromkeyboard” event. A limit is therefore placed on F t

values using a maximum flight time threshold valueϕ. Given a F t value in excess of ϕ, the value will bereduced to ϕ. For the evaluation presented later in this

paper, a range of values for ϕ were considered, rang-ing from 0.750 to 2.00 seconds increasing in steps of0.25 seconds, that is:

ϕ = {0.75,1.00,1.25,1.50,1.75,2.00}With respect to key hold time KHt , the time wherebya key is held down is normally no longer than 1 sec-ond. Inspection of the datasets used in the study pre-sented in this paper indicated that the highest recordedvalue of KHt was 0.950 seconds. Consequently, itwas felt that no maximum hold time threshold wasrequired in this case.

5.3 The KCASA Algorithm

The pseudo code for KCASA process is presented inAlgorithm 1. As already noted, the principle ideais, as typing proceeds, to collect non-overlappingkeystroke sinusoidal windows, each of length ω, andcompare these to previously obtained keystroke si-nusoidal signals. On start up, it is first necessaryto confirm that the user is who they say they are bycomparing the first collected sinusoidal windows withthe user profile Up as described in Sub-section 5.1.As the session proceeds, continuous authentication isundertaken by comparing the most recent sinusoidalwindows Wi with the previously collected sinusoidalwindows Wi−1. Algorithm 1 takes the following in-puts: (i) window size ω, (ii) a similarity threshold σ

(derived as described above in Sub-Section 5.1) and(iii) a ϕ threshold for F t . The process operates con-tinuously in a loop until the typing session is termi-nated (the user completes the assessment, times outor logs-out) (lines 4-6). Values for k are recorded assoon as the typing session starts (line 7). Note that inthe case of flight time the value will be checked, andif necessary replaced, according to ϕ (lines 8 to 10).The k value is then appended to the keystroke streamKts. The counter is monitored, and sub-sequences areextracted whenever ω keystrokes have been obtained.For the first collected window (W1 ∈ Kts) this is thestartup time series; each subsequent sinusoidal win-dow Wi is then compared, using DTW, with the previ-ous Wi−1 sinusoidal window.

6 EVALUATION

A series of experiments were conducted to evaluatethe proposed KCASA model to determine how wellit performed in terms of the detection of imperson-ators. Comparisons were also undertaken with respectto a Feature Vector Representation (FVR), the estab-lished approach from the literature to keystroke con-

Table 1: Summary of datasets.

Dataset # Sub. Env. Lang. Features Ave. size SDACB 30 Free English F t ,KHt 4625 1207GP 31 Free. Italian F t 7157 1095VHHS 39 Lab. English F t ,KHt 4853 1021

Algorithm 1 KCASA algorithm

Input: ω, σ, ϕ.Output: Continuous authentication commentary.

1: counter = 02: Kts = /0

3: loop4: if terminated signal received then5: break6: end if7: k = keystroke feature (e.g. F t or KHt )8: if Flight time & k > ϕ then9: k = ϕ . Noise reduction.

10: end if11: Kts = Kts∪〈counter,k〉12: counter++13: if REM(counter/ω) == 0 then14: Wi = subsequence{Ktscounter−ω

. . .Ktscounter}15: if counter = ω then . Start up situation16: Trans f orm(W ) . Transform W to

(DFT)/(DWT)17: Start up: authenticate Wi w.r.t Up and

σ, and report18: else19: Authenticate Wi w.r.t. Wi−1 and σ, and

report20: end if21: end if22: end loop

tinuous authentication. The metrics used for the eval-uation were: (i) authentication accuracy (Acc.), (ii)the False Acceptance Rate (FAR) and (iii) the FalseRejection Rate (FRR)1. In more detail, the objectivesof the evaluation were:

1. Authentication Performance using the KCASAModel: To compare the effectiveness of the DFTand DWT representations in the context of theproposed KCASA approach, and the usage of thesimple KTS representation (as prposed in (Al-shehri et al., 2016b)), in terms of accuracy, FARand FRR.

2. Effect on Authentication Performance using

1FAR and FRR are the traditional metrics used to mea-sure the performance of Biometric systems (Polemi, 1997).

Different Parameters: To determine the effect ofusing different values for ω (the sampling windowsize) and ϕ (the maximum flight time thresholdvalue).

3. Efficiency: to compare the run time efficiency ofKCASA in the context of the three representationsconsidered (DFT, DWT and KTS).

4. Comparison with Feature Vector Approach: Tocompare the operation of KCASA with the estab-lished feature vector based approach for keystrokecontinuous authentication.

Note that the evaluation was conducted using flighttime and hold time so as to also analyse which featureyielded the better results.The rest of this section is organised as follows. Thedatasets used for the evaluation are introduced in Sub-section 6.1. The results with respect to the first set ofexperiments are considered in Sub-section 6.2, whilethose with respect to the second set of experimentsin Sub-section 6.3. Efficiency is considered in Sub-section 6.4; and the comparison with the feature vec-tor based approach is presented in Sub-section 6.5.

6.1 Datasets

Three datasets were used with respect to the reportedexperiments (Gunetti and Picardi, 2005; Vural et al.,2014; Alshehri et al., 2016b) to conduct. For easeof presentation the three data sets are identified hereusing acronyms made up of the authors’ surnames:GP (Gunetti and Picardi, 2005), VHHS (Vural et al.,2014) and ACB (Alshehri et al., 2016b).GP dataset comprised 31 subjects typing free text inItalian (that used in (Gunetti and Picardi, 2005) had40 subjects, but some records are not available in thepublic version). The VHHS dataset was collected inlaboratory conditions. The subjects were asked totype both predefined text and free text (in English);however, only the free text part was used with respectto the experiments reported on in this paper. Note alsothat for the GP dataset only the F t feature was avail-able, whilst for the remaining two datasets both F t

and KHt were collected. Therefore the performanceof KCASA using KHt could not be evaluated usingthe GP dataset.The ACB comprises 30 subjects although the origi-nal dataset consisted of 17 subjects, but the number

of subjects has increased to 30 in the public version.Each subject provided free text samples (in English)in a simulated online assessment environment; theaim being to mimic the mode of typing when usingan eLearning platform. Thus, the subjects used what-ever keyboard they had at hand.Table 1 provides a summary of the three datasets used;the table also includes some statical measurementsconcerning the average length of the time seres ineach data collection and the associated Standard De-viation (SD). For evaluation purpose, each record ineach data set was divided into two where the first halfwas used to generate the typing profile Up, and thesecond half for the continuous authentication evalua-tion.

6.2 Authentication Performance usingthe KCASA Model

The results obtained with respect to the evaluation di-rected at comparing the DFT, DWT and KTS KCASArepresentations, using either F t or KHt , are givenin Tables 2 to 5; Tables 2 and 4 show the accuracy(Acc.), FAR and FRR results obtained using F t , whileTables 3 and 5 presents the results, using the samemetrics, obtained using KHt . For the reported experi-ments, ω = 75 keystrokes and ϕ = 1.25 seconds wereused as default settings. These parameters were usedbecause experiments, reported on in the followingsub-section, had indicated that these produced best re-sults.From Table 2, it can be observed that the DWT rep-resentation produced the best overall accuracy (aver-age accuracy of 98.24% with an associated StandardDeviation-SD of 1.07) when using F t . With respectto FAR, we can observe from Table 4 that DWT alsoproduced best results, except in the case of the GPdatasets where DFT was recorded as producing thebest result. It can also be noted from Table 4 that theDWT representation gave the best FRR results withan average of 1.50 and an associated SD of 0.14.With respect to KHt (Tables 3 and 5), a best accuracyresults of 95.66% was obtained using DFT (with anassociated SD of 2.40). Inspection of Table 5 showsthat the best average FAR result was 0.04 when us-ing the DFT representation, and the best average FRRresult was 1.56 using DWT. Recall that evaluation us-ing KHt could not be conducted using the GP datasetbecause KHt was not recorded in this case.To support a comparison summary, the results listedin Tables 2 to 5 are presented in summary form in Ta-ble 6. From this summary table, it can be observedthat the simple KTS representation did not performwell compared to the DFT and DWT representations.

Also, from the results presented in this table, an argu-ment can be made in favor of the DWT representation,coupled with F t , which gave the best overall perfor-mance in terms of Acc, FAR and FRR.

6.3 Effect on AuthenticationPerformance using DifferentParameters

The results presented in the previous sub-section as-sumed a window size ω of 75 and a maximumF t threshold value ϕ of 1.25. Recall that the lat-ter is only applicable in the context of F t . Toevaluate the effect of these parameters, experimentswere conducted using a range of values for ω andϕ; {25,50,75,100,125,150} key presses for ω, and{0.75,1.00,1.25,1.50,1.75,2.00} seconds for ϕ.

0

20

40

60

80

150

100

125100

VHHS dataset

7550

25 DWTDFT

KTS

Figure 3: The effect of the ω parameter on accuracy usingKHt feature for VHHS dataset.

0

20

40

60

80

150

100

125100

ACB dataset

7550

25 DWTDFT

KTS

Figure 4: The effect of the ω parameter on accuracy usingKHt feature for ACB dataset.

The accuracy results using KHt , as the keystroke dy-namics, are shown in the form of 3D bar charts inFigures 3 and 4 for the VHHS and ACB datasets re-spectively. From the figure, it can be seen that ω = 75produced better accuracy results for the two datasetsin terms of all three KCASA representations, withthe exception of the KTS representation in the ACBdataset where ω = 100 has produced better accuracy.

Table 2: Accuracy results obtained using the three differ-ent KCASA representations when using Ft (best resultsin bold font).

DatasetFlight time F t

AccuracyKTS DFT DWT

ACB 96.20 97.43 99.22GP 95.47 96.94 98.41VHHS 94.83 97.43 97.09Average 95.50 97.27 98.24SD 0.68 0.28 1.07

Table 3: Accuracy results obtained using the three dif-ferent KCASA representations when using KHt (best re-sults in bold font).

DatasetKey hold time KHt

AccuracyKTS DFT DWT

ACB 96.15 97.36 95.09VHHS 94.33 93.69 95.75Average 95.24 95.66 95.42SD 1.29 2.40 0.47

Table 4: FAR and FRR results obtained using the threedifferent KCASA representations when using Ft (bestresults in bold font).

DatasetFlight time F t

FAR FRRKTS DFT DWT KTS DFT DWT

ACB 0.050 0.030 0.026 1.96 1.50 1.37GP 0.039 0.034 0.035 1.98 1.72 1.48VHHS 0.030 0.022 0.016 1.97 1.85 1.65Ave. 0.040 0.029 0.026 1.97 1.69 1.50SD 0.010 0.006 0.010 0.01 0.17 0.14

Table 5: FAR and FRR results obtained using the threedifferent KCASA representations when using KHt (bestresults in bold font).

DatasetKey hold time KHt

FAR FRRKTS DFT DWT KTS DFT DWT

ACB 0.06 0.04 0.45 2.01 1.61 1.38VHHS 0.03 0.02 0.04 1.97 1.91 1.74Ave. 0.05 0.04 0.25 1.99 1.76 1.56SD 0.02 0.01 0.29 0.02 0.22 0.25

Table 6: Summary of results presented in Tables 2 to 5.

Metric F t Feature KHt FeatureKTS DFT DWT KTS DFT DWT

Acc 95.50 97.27 98.24 95.24 95.66 95.42FAR 0.040 0.029 0.026 0.05 0.04 0.25FRR 1.97 1.69 1.50 1.99 1.76 1.56

The accuracy results obtained using F t as thekeystroke dynamics are presented, again in the formof 3D bar charts, in Figure 5. From this Figure, itcan be seen that ω and ϕ values of 75 and 1.25, re-spectively, tended to produce better results, althoughthe selection of ϕ does not seem to have had as muchimpact as the selection of ω. Note also that accuracy“levels off” as ω is increased.

6.4 Efficiency

To compare the efficiency of the considered KCASArepresentations, experiments were conducted in termsof the time to generate the user profiles in each case.For the experiments, ω was set to a range of values, asdescribed earlier, whilst ϕ was kept constant at 1.25because earlier experiments, reported on above, haddemonstrated that the value of ϕ was less significant.The efficiency performance using F t is presented inFigure 6 with respect to each of the three datasets con-sidered. From the Figure, it can be seen that as ω in-

creased the run time also increased. This was to beexpected because the computation time required forthe DTW would increase as the size of the windowω increases. Interestingly, there are well-known solu-tions to mitigate against the complexity of DTW (seefor example (Itakura, 1975; Sakoe and Chiba, 1978));however, no such mitigation was applied with respectto the experiments reported on in this paper althoughthis could clearly be done. We left this for further fu-ture investigation.

Overall the results indicated that when using the pro-posed transformations efficiency gains were madewith respect to the simple KTS representation, withDFT producing better runtime results than DWT. Fur-thermore, comparing the runtime performance ob-tained with the feature vector approach to keystrokeauthentication, it is interesting to note that in (Gunettiand Picardi, 2005) the time taken to construct a userprofile was given as 140 second, a significant differ-ence to when using the proposed KCASA method.

Note that in the context of KHt , similar runtime re-sults were produced to those presented in Figure 6,because both are using the same DTW similarity mea-sure.

GP dataset

0

20

40

60

150

80

100

125

KTS

2.001001.7575

1.50501.2525

1.000.750

0

20

40

60

150

80

100

125

DFT

2.001001.7575

1.50501.2525

1.000.750

0

20

40

60

150

80

100

125

DWT

2.001001.7575

1.50501.2525

1.000.750

VHHS dataset

0

20

40

60

150

80

100

125

KTS

2.001001.7575

1.50501.2525

1.000.750

(a)

0

20

40

60

150

80

100

125

DFT

100 2.001.7575

1.50501.2525

1.000.750

(b)

0

20

40

60

150

80

100

125

DWT

2.001001.7575

1.50501.2525

1.000.750

(c)ACB dataset

0

20

40

60

150

80

100

125

KTS

2.001001.7575

1.50501.2525

1.000.750

0

20

40

60

150

80

100

125

DFT

100 2.001.7575

1.50501.2525

1.000.750

0

20

40

60

150

80

100

125

DWT

2.001001.7575

1.50501.2525

1.000.750

Figure 5: The accuracy results obtained for KTS, DFT and DWT using different values ω and ϕ.

6.5 Comparison with Feature VectorApproach

From the literature, previous work on keystroke con-tinuous authentication has been conducted using Fea-ture Vector Representation (FVR). It has already beennoted that the proposed KCASA model has signifi-cant runtime advantages over the feature vector basedapproach (see above). However, it was felt appro-priate to conduct further experiments comparing theoperation of KCASA with the feature vector basedapproach in terms of authentication accuracy. Us-ing both F t and KHt appropriate feature vectors weregenerated. Concequently, further comparison couldbe made with the mechanism proposed in (Gunettiand Picardi, 2005) (see Section 2). The reason forselecting the mechanism presented in (Gunetti andPicardi, 2005) was that the mechanism, to the bestknowledge of the authors, had produced the best re-ported FAR and FRR results to date. However, itshould be noted that the code for that mechanism isnot available; thus we encoded the mechanism our-

selves according to the description given in the origi-nal study. So as to conduct a fair comparison only F t

was considered, because the study in (Gunetti and Pi-cardi, 2005) used F t values. The average accuracyresults obtained, when comparing the operation ofFVR with the KTS, DFT and DWT representations,in terms of F t , are given in Figure 7. The best ac-curacy result obtained for FVR was 90.15%, signifi-cantly worse than the accuracy results obtained usingKCASA representations which yielded a best accu-racy result of 98.24% (when using the DWT repre-sentation).

7 CONCLUSION

In this paper, a novel mechanism for realtime con-tinuous keystroke authentication, called KeystrokeContinuous Authentication using Spectral Analysis(KCASA) has been proposed, whereby authentica-tion of user typing patterns is conducted by captur-ing keystroke dynamics in the form of spectral (fre-

KTS

DFT

DWT

(a)

KTS

DFT

DWT

(b)

KTS

DFT

DWT

(c)Figure 6: Runtime (seconds) comparison using flight time and the three KCASA representations with respect to each of thethree datasets, (a) GP, (b) VHHS and (c) ACB.

Figure 7: The obtained average accuracy using the threerepresentations (KTS, DFT, DWT and FVR) with respect tothe three datasets used. DWT shows a comparative perfor-mance with respect to KCASA model.

quency) streams. KCASA efficiently operates us-ing either flight time F t or hold time KHt keystroketiming features. Two spectral transformations wereconsidered to represent keystroke timing features:(i) Discrete Fourier Transform (DFT) and (ii) Dis-crete Wavelet Transform (DWT). Keystroke spec-tral streams similarity was conducted using DynamicTime Warping (DTW), although alternative time se-ries comparison techniques could equally well havebeen applied. The KCASA model operates by con-tinuously extracting non-overlapped keystroke sinu-soidal signals captured using a sliding window of sizeω. The most appropriate size for ω was found tobe 75 keystrokes for both timing features (flight timeF t and key hold time KHt ). In the case of flighttime, an issue was discovered with excessive flighttimes; flight times were thus capped with a maximumvalue defined by a parameter ϕ, the most appropriatevalue for ϕ was found to be 1.25 seconds. The re-ported experimentation and evaluation indicated that

the most accurate representation was DWT using theF t keystroke feature, while the most efficient wasfound to be DFT. Experiments were also reported onindicating that the proposed KCASA model outper-formed the feature vector based approach used bycomparator systems such as that reported in (Gunettiand Picardi, 2005). For future work, the authors in-tend to investigate the usage of multivariate keystroketime series (incorporating F t and KHt timing featurestogether) within the context of the proposed KCASAmodel. Furthermore, the time complexity of DTW, inthe context of the proposed representations, remainsan open topic for future work.

REFERENCES

Agrawal, R., Faloutsos, C., and Swami, A. (1993). Efficientsimilarity search in sequence databases. In Interna-tional Conference on Foundations of Data Organiza-tion and Algorithms, pages 69–84. Springer.

Ahmed, A. A. and Traore, I. (2014). Biometric recognitionbased on free-text keystroke dynamics. Cybernetics,IEEE Transactions on, 44(4):458–472.

Alshehri, A., Coenen, F., and Bollegala, D. (2016a). Key-board usage authentication using time series analy-sis. In International Conference on Big Data Analyticsand Knowledge Discovery, pages 239–252. Springer.

Alshehri, A., Coenen, F., and Bollegala, D. (2016b). To-wards keystroke continuous authentication using timeseries analytics. In Proc. AI 2016, Research andDevelopment in Intelligent Systems XXXIII, Springer,pp275-287., pages 325–338. Springer.

Bleha, S., Slivinsky, C., and Hussien, B. (1990). Computer-access security systems using keystroke dynamics.Pattern Analysis and Machine Intelligence, IEEETransactions on, 12(12):1217–1222.

Burrus, C. S., Gopinath, R. A., and Guo, H. (1997). Intro-duction to wavelets and wavelet transforms: a primer.

Chan, K.-P. and Fu, A. W.-C. (1999). Efficient time se-ries matching by wavelets. In Data Engineering,1999. Proceedings., 15th International Conferenceon, pages 126–133. IEEE.

Cooley, J. W. and Tukey, J. W. (1965). An algorithm for themachine calculation of complex fourier series. Math-ematics of computation, 19(90):297–301.

Dowland, P. S. and Furnell, S. M. (2004). A long-term trialof keystroke profiling using digraph, trigraph and key-word latencies. In Security and Protection in Informa-tion Processing Systems, pages 275–289. Springer.

Edwards, T. (1991). Discrete wavelet transforms: Theoryand implementation. Universidad de.

Gabor, D. (1946). Theory of communication. part 1: Theanalysis of information. Journal of the Institution ofElectrical Engineers-Part III: Radio and Communica-tion Engineering, 93(26):429–441.

Gaines, R. S., Lisowski, W., Press, S. J., and Shapiro, N.(1980). Authentication by keystroke timing: Somepreliminary results. Technical report, DTIC Docu-ment.

Gunetti, D. and Picardi, C. (2005). Keystroke analysis offree text. ACM Transactions on Information and Sys-tem Security (TISSEC), 8(3):312–347.

Haar, A. (1910). Zur theorie der orthogonalen funktionen-systeme. Mathematische Annalen, 69(3):331–371.

Harris, F. J. (1978). On the use of windows for harmonicanalysis with the discrete fourier transform. Proceed-ings of the IEEE, 66(1):51–83.

Itakura, F. (1975). Minimum prediction residual principleapplied to speech recognition. IEEETrans. Acoustics,Speech, and Signal Processing, pages 52–72.

Janacek, G. J., Bagnall, A. J., and Powell, M. (2005). Alikelihood ratio distance measure for the similarity be-tween the fourier transform of time series. In Pacific-

Asia Conference on Knowledge Discovery and DataMining, pages 737–743. Springer.

Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S.(2001). Dimensionality reduction for fast similaritysearch in large time series databases. Knowledge andinformation Systems, 3(3):263–286.

Killourhy, K. S. and Maxion, R. A. (2009). Compar-ing anomaly-detection algorithms for keystroke dy-namics. In Dependable Systems & Networks, 2009.DSN’09. IEEE/IFIP International Conference on,pages 125–134. IEEE.

Monrose, F. and Rubin, A. (1997). Authentication viakeystroke dynamics. In Proceedings of the 4th ACMconference on Computer and communications secu-rity, pages 48–56. ACM.

Niennattrakul, V. and Ratanamahatana, C. A. (2009). Shapeaveraging under time warping. In Electrical Engineer-ing/Electronics, Computer, Telecommunications andInformation Technology, 2009. ECTI-CON 2009. 6thInternational Conference on, volume 2, pages 626–629. IEEE.

Obaidat, M. S. and Sadoun, B. (1997). Verification of com-puter users using keystroke dynamics. Systems, Man,and Cybernetics, Part B: Cybernetics, IEEE Transac-tions on, 27(2):261–269.

Polemi, D. (1997). Biometric techniques: review and evalu-ation of biometric techniques for identification and au-thentication, including an appraisal of the areas wherethey are most applicable. Reported prepared for theEuropean Commision DG XIIIC, 4.

Sakoe, H. and Chiba, S. (1978). Dynamic programmingalgorithm optimization fro spoken word recognition.IEEE Trans. Acoustics, Speech, and Signal Process-ing, pages 43–49.

Shepherd, S. (1995). Continuous authentication by anal-ysis of keyboard typing characteristics. In Securityand Detection, 1995., European Convention on, pages111–114. IET.

Syed, Z. A. (2014). Keystroke and Touch-dynamics BasedAuthentication for Desktop and Mobile Devices. PhDthesis, West Virginia University.

Vlachos, M., Meek, C., Vagena, Z., and Gunopulos, D.(2004). Identifying similarities, periodicities andbursts for online search queries. In Proceedings ofthe 2004 ACM SIGMOD international conference onManagement of data, pages 131–142. ACM.

Vural, E., Huang, J., Hou, D., and Schuckers, S. (2014).Shared research dataset to support development ofkeystroke authentication. In Biometrics (IJCB), 2014IEEE International Joint Conference on, pages 1–8.IEEE.

Ye, L. and Keogh, E. (2009). Time series shapelets: anew primitive for data mining. In Proceedings ofthe 15th ACM SIGKDD international conference onKnowledge discovery and data mining, pages 947–956. ACM.

Zar, J. H. (1972). Significance testing of the spearman rankcorrelation coefficient. Journal of the American Sta-tistical Association, 67(339):578–580.

spectral analysis of keystroke streams: towards effective...

Documents