an energy-based similarity measure for time series

8
Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 2008, Article ID 135892, 8 pages doi:10.1155/2008/135892 Research Article An Energy-Based Similarity Measure for Time Series Abdel-Ouahab Boudraa, 1, 2 Jean-Christophe Cexus, 2 Mathieu Groussat, 1 and Pierre Brunagel 1 1 IRENav, Ecole Navale, Lanv´ eoc Poulmic, BP600, 29240 Brest-Arm´ ees, France 2 E3I2, EA 3876, ENSIETA, 29806 Brest Cedex 9, France Correspondence should be addressed to Abdel-Ouahab Boudraa, [email protected] Received 27 August 2006; Revised 30 March 2007; Accepted 24 July 2007 Recommended by Jose C. M. Bermudez A new similarity measure, called SimilB, for time series analysis, based on the cross-Ψ B -energy operator (2004), is introduced. Ψ B is a nonlinear measure which quantifies the interaction between two time series. Compared to Euclidean distance (ED) or the Pear- son correlation coecient (CC), SimilB includes the temporal information and relative changes of the time series using the first and second derivatives of the time series. SimilB is well suited for both nonstationary and stationary time series and particularly those presenting discontinuities. Some new properties of Ψ B are presented. Particularly, we show that Ψ B as similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset and compared to the CC and the ED measures. Copyright © 2008 Abdel-Ouahab Boudraa et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. INTRODUCTION A Time Series (TS) is a sequence of real numbers where each one represents the value of an attribute of interest (stock or commodity price, sale, exchange, weather data, biomedical measurement, etc.). TS datasets are common in various fields such as in medicine, finance, and multimedia. For example, in gesture recognition and video sequence matching using computer vision, several features are extracted from each im- age continuously, which renders them TSs [2]. Typical appli- cations on TSs deal with tasks like classification, clustering, similarity search, prediction, and forecasting. These applica- tions rely heavily on the ability to measure the similarity or dissimilarity between TSs [3]. Defining the similarity of TSs or objects is crucial in any data analysis and decision mak- ing process. The simplest approach typically used to define a similarity function is based on the Euclidean distance (ED) or some extensions to support various transformations such as scaling or shifting. The ED may fail to produce a correct similarity measure between TSs because it cannot deal with outliers and it is very sensitive to small distortions in the time axis [4]. The Pearson correlation coecient (CC) is a popu- lar measure to compare TSs. Yet, the CC is not necessarily coherent with the shape and it does not consider the order of time points and uneven sampling intervals. Furthermore, similarity measures using the ED or the CC do not include temporal information and the relative changes of the TSs. Thus, clustering algorithms based on these metrics, such as k-means, fuzzy c-means, or hierarchical clustering, cannot cluster TSs correctly [5]. In this paper, we introduce a new similarity measure, noted SimilB, which includes the tempo- ral information and relative change of the TS. SimilB is based on the Ψ B operator [1], a nonlinear similarity function which measures the interaction between two time-signals including their first and second derivatives [6]. Furthermore, the link established between Ψ B operator and the cross Wigner-Ville distribution shows that Ψ B and consequently SimilB are well suited to study nonstationary signals [1]. 2. THE Ψ B OPERATOR To measure the interaction between two real time signals, the cross Teager-Kaiser operator (CTKEO) has been defined [7]. This operator has been extended to complex-valued sig- nals noted Ψ C , in [1]. The CTKEO, applied to signals x(t ) and y(t ), is given by [xy] ˙ xy x ˙ y, where [xy] is the Lie bracket which measures the instantaneous dierences in the relative rate of change between x and ˙ y. In the general case, if x and y represent displacements in some generalized motions, [xy] has dimensions of energy (per unit mass), it

Upload: ivan-borodavka

Post on 16-Jan-2016

17 views

Category:

Documents


0 download

DESCRIPTION

An Energy-Based Similarity Measure for Time Series

TRANSCRIPT

Page 1: An Energy-Based Similarity Measure for Time Series

Hindawi Publishing CorporationEURASIP Journal on Advances in Signal ProcessingVolume 2008, Article ID 135892, 8 pagesdoi:10.1155/2008/135892

Research ArticleAn Energy-Based Similarity Measure for Time Series

Abdel-Ouahab Boudraa,1, 2 Jean-Christophe Cexus,2 Mathieu Groussat,1 and Pierre Brunagel1

1 IRENav, Ecole Navale, Lanveoc Poulmic, BP600, 29240 Brest-Armees, France2 E3I2, EA 3876, ENSIETA, 29806 Brest Cedex 9, France

Correspondence should be addressed to Abdel-Ouahab Boudraa, [email protected]

Received 27 August 2006; Revised 30 March 2007; Accepted 24 July 2007

Recommended by Jose C. M. Bermudez

A new similarity measure, called SimilB, for time series analysis, based on the cross-ΨB-energy operator (2004), is introduced. ΨB

is a nonlinear measure which quantifies the interaction between two time series. Compared to Euclidean distance (ED) or the Pear-son correlation coefficient (CC), SimilB includes the temporal information and relative changes of the time series using the firstand second derivatives of the time series. SimilB is well suited for both nonstationary and stationary time series and particularlythose presenting discontinuities. Some new properties of ΨB are presented. Particularly, we show that ΨB as similarity measure isrobust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset and compared to the CCand the ED measures.

Copyright © 2008 Abdel-Ouahab Boudraa et al. This is an open access article distributed under the Creative CommonsAttribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work isproperly cited.

1. INTRODUCTION

A Time Series (TS) is a sequence of real numbers where eachone represents the value of an attribute of interest (stock orcommodity price, sale, exchange, weather data, biomedicalmeasurement, etc.). TS datasets are common in various fieldssuch as in medicine, finance, and multimedia. For example,in gesture recognition and video sequence matching usingcomputer vision, several features are extracted from each im-age continuously, which renders them TSs [2]. Typical appli-cations on TSs deal with tasks like classification, clustering,similarity search, prediction, and forecasting. These applica-tions rely heavily on the ability to measure the similarity ordissimilarity between TSs [3]. Defining the similarity of TSsor objects is crucial in any data analysis and decision mak-ing process. The simplest approach typically used to define asimilarity function is based on the Euclidean distance (ED)or some extensions to support various transformations suchas scaling or shifting. The ED may fail to produce a correctsimilarity measure between TSs because it cannot deal withoutliers and it is very sensitive to small distortions in the timeaxis [4]. The Pearson correlation coefficient (CC) is a popu-lar measure to compare TSs. Yet, the CC is not necessarilycoherent with the shape and it does not consider the orderof time points and uneven sampling intervals. Furthermore,

similarity measures using the ED or the CC do not includetemporal information and the relative changes of the TSs.Thus, clustering algorithms based on these metrics, such ask-means, fuzzy c-means, or hierarchical clustering, cannotcluster TSs correctly [5]. In this paper, we introduce a newsimilarity measure, noted SimilB, which includes the tempo-ral information and relative change of the TS. SimilB is basedon theΨB operator [1], a nonlinear similarity function whichmeasures the interaction between two time-signals includingtheir first and second derivatives [6]. Furthermore, the linkestablished between ΨB operator and the cross Wigner-Villedistribution shows that ΨB and consequently SimilB are wellsuited to study nonstationary signals [1].

2. THE ΨB OPERATOR

To measure the interaction between two real time signals,the cross Teager-Kaiser operator (CTKEO) has been defined[7]. This operator has been extended to complex-valued sig-nals noted ΨC, in [1]. The CTKEO, applied to signals x(t)and y(t), is given by [x, y] ≡ xy − xy, where [x, y] is theLie bracket which measures the instantaneous differences inthe relative rate of change between x and y. In the generalcase, if x and y represent displacements in some generalizedmotions, [x, y] has dimensions of energy (per unit mass), it

Page 2: An Energy-Based Similarity Measure for Time Series

2 EURASIP Journal on Advances in Signal Processing

is viewed as a cross-energy between x and y [7]. Based onΨC function, a symmetric and positive function, called cross-ΨB-energy operator, is defined [1]. We have shown that time-delay estimation problem between two signals is an exampleof interaction measure between these two signals by ΨB [6].Let x and y be two complex signals, ΨB is defined as [1]

ΨB(x, y) = 12

[ΨC(x, y) +ΨC(y, x)

], (1)

where ΨC(x, y) = (1/2)[x∗ y + x y∗]− (1/2)[xy∗ + x∗ y]. TheΨB(x, y) of complex signals x and y is equal to the sum ofΨB(x, y) of their real and imaginary parts [1]:

ΨB(x, y) = ΨB(xr , yr

)+ΨB

(xi, yi

), (2)

where x(t) = xr(t)+ jxi(t) and y(t) = yr(t)+ j yi(t) and j de-notes the imaginary unit. Subscripts r and i indicate the realand imaginary parts of the complex signal. According to (2),theΨB(x, y) is a real quantity, as expected for an energy oper-ator. To compute the analytic signals x(t) or y(t), the Hilberttransform is used. In the following we give the expression ofΨB for analytic signals.

3. EXPRESSION OFΨB FOR ASSOCIATEDANALYTIC SIGNALS

Complex signals are used in various areas of signal process-ing. In the continuous time, they appear, for example, in thedescription for narrow-band signals. Indeed, the appropriatedefinition of instantaneous phase or amplitude of such sig-nals requires the introduction of the analytic signal, whichis necessarily complex. Let x and y be two real signals, andxA and yA, respectively, their corresponding analytic signals:xA = x + jH(x) and yA = y + jH(y), where H(·) is theHilbert transform.1 By applying the relation

uv − 12

(uv + vu) = 2uv − 12d2uv

dt2(3)

in (2), for (u, v) = (x, y) and (u, v) = (H(x), H(y)), respec-tively, it comes that ΨB(xA, yA) is expressed directly in termsof x, y, H(x) and H(y) as

ΨB(xA, yA

) = 2[x y + H(x)H(y)

]

− 12d2

dt2[xy + H(x)H(y)

].

(4)

Equation (4) is used to calculate the interaction between con-tinuous TSs.

4. DISCRETIZING THE CONTINUOUS-TIMEΨB OPERATOR

Discretized derivatives are combined to obtain from the con-tinuous version ofΨB an expression closely related to discrete

1 H(x) = h� x, where the frequency response of h is h( f ) = − jsign( f ).

1 2 3 4 5 6 7 8 9 10

Time

0.5

1

1.5

2

2.5

3

3.5

4

4.5

Am

plit

ude

f1f2f3

Figure 1: Three sampled TSs with different shapes.

Table 1: SimilB, the ED, the CC between f2 and f1, and f2 and f3 inFigure 1.

TSs ED CC SimilB(f2, f1

)3.9955 0.0917 0.930

(f2, f3

)3.9955 0.0917 0.750

Table 2: Classification errors of clustering task using the SimilB, theED, and the CC for CBF.dat dataset.

SimilB ED CC

0.222 0.888 0.888

form of the operator noted ΨBd and operating on discrete-time signals x(n) and y(n). Three sample differences are ex-amined. For simplicity, we replace t by nTs (Ts is the sam-pling period), x(t) with x(nTs) or simply x(n). Using thesame reasoning as in [8] we obtain the following relations.

(i) Two-sample backward difference:

x(t) �−→[xk(n)− xk(n− 1)

]

Ts,

x(t) �−→[xk(n)− 2xk(n− 1) + xk(n− 2)

]

T2s

,

ΨB(xk(t), yk(t)) �−→ xk(n− 1)yk(n− 1)T2s

− 0.5[xk(n)yk(n− 2) + yk(n)xk(n− 2)

]

T2s

,

ΨB(xk(t), yk(t)

) �−→ ΨBd

(xk(n− 1), yk(n− 1)

)

T2s

, k ∈ {i, r}.(5)

Page 3: An Energy-Based Similarity Measure for Time Series

Abdel-Ouahab Boudraa et al. 3

Table 3: Estimated TB value versus SNR signals s1(t) and s2(t) using SimilB.

SimilB SNR = −6 dB SNR = −2 dB SNR = 1 dB SNR = 3 dB SNR = 5 dB SNR = 9 dB(s1(t), r1(t)

)300± 1 300± 1 300 300 300 300

(s2(t), r2(t)

)300± 2 300± 1 300± 1 300± 1 300± 1 300

Finally, the discrete form of ΨB(x(t), y(t)) is given by

ΨB(x(t), y(t)

)

�−→[ΨBd

(xr(n−1), yr(n−1)

)+ΨBd

(xi(n−1), yi(n−1)

)]

T2s

,

(6)

where �−→ denotes the mapping from continuous to discrete.(ii) Two-sample forward difference:

x(t) �−→[xk(n + 1)− xk(n)

]

Ts,

x(t) �−→[xk(n + 2)− 2xk(n + 1) + xk(n)

]

T2s

,

ΨB(xk(t), yk(t)

) �−→ xk(n + 1)yk(n + 1)T2s

− 0.5[xk(n + 2)yk(n) + yk(n + 2)xk(n)

]

T2s

,

ΨB(xk(t), yk(t)

) �−→ ΨBd

(xk(n + 1), yk(n + 1)

)

T2s

, k ∈ {i, r}.(7)

Thus, from ΨB we obtain ΨBd shifted by one sample tothe right and scaled by T−2

s . Finally, the discrete form ofΨB(x(t), y(t)) is given by

ΨB(x(t), y(t)

)

�−→[ΨBd

(xr(n +1), yr(n +1)

)+ΨBd

(xi(n +1), yi(n +1)

)]

T2s

.

(8)

Note that for both asymmetric two-sample differences, ΨB

is shifted by one sample and scaled by T−2s . If we ignore the

one-sample shift and the scaling parameter, one can trans-form ΨB(x(t), y(t)) into ΨBd (x(n), y(n)) as follows:

ΨB(x(t), y(t)

) �−→ ΨBd

(xr(n), yr(n)

)+ΨBd

(xi(n), yi(n)

),(9)

ΨBd

(xk(n), yk(n)

)

= xk(n)yk(n)− 0.5[xk(n + 1)yk(n− 1)

+ yk(n + 1)xk(n− 1)], k ∈ {i, r}.

(10)

0 20 40 60 80 100 120 140

Time

−5

0

5

10

Am

plit

ude

Cylinder

(a)

0 20 40 60 80 100 120 140

Time

−5

0

5

10

Am

plit

ude

Bell

(b)

0 20 40 60 80 100 120 140

Time

−5

0

5

10

Am

plit

ude

Funnel

(c)

Figure 2: The Cylinder-Bell-Funnel dataset (CBF.dat) [10].

(iii) Three-sample symmetric difference:

x(t) �−→[xk(n + 1)− xk(n− 1)

]

2Ts,

x(t) �−→[xk(n + 2)− 2xk(n) + xk(n− 2)

]

4T2s

,

ΨB(xk(t), yk(t)

)

�−→ 2xk(n)yk(n)4T2

s−[xk(n+1)yk(n−1)+yk(n+1)xk(n−1)

]

4T2s

,

xk(n−1)yk(n−1)4T2

s− 0.5

[xk(n)yk(n−2) + yk(n)xk(n−2)

]

4T2s

+xk(n+1)yk(n+1)

4T2s

− 0.5[xk(n+2)yk(n)+yk(n + 2)xk(n)

]

4T2s

,

ΨB(xk(t), yk(t)

)

�−→ [ΨBd

(xk(n+1), yk(n+1)

)+2ΨBd

(xk(n), yk(n)

)

+ΨBd

(xk(n− 1), yk(n− 1)

)]/4T2

s , k ∈ {i, r}.(11)

Page 4: An Energy-Based Similarity Measure for Time Series

4 EURASIP Journal on Advances in Signal Processing

1 7 2 5 8 9 3 6 4

Labels

12

14

16

18

20

22

24

26

28

30

32

Thr

esho

ldEuclidean

(a)

1 7 2 5 8 9 3 6 4

Labels

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Thr

esho

ld

Correlation

(b)

1 2 3 4 5 6 7 8 9

Labels

300

350

400

450

500

550

Th

resh

old

SimilB

(c)

Figure 3: Comparison of the SimilB, the ED, the CC on a clustering task. Labels (1,2,3), (4,5,6), and (7,8,9) correspond to Cylinder, Bell,and Funnel classes, respectively.

Compared to asymmetric two-sample differences, the three-sample symmetric difference leads to more complicatedexpression. Expression (11) corresponds to three-sampleweighted moving average of ΨBd (xk(n), yk(n)). Note if x =y, ΨBd is reduced to the Teager-Kaiser operator (TKO):ΨBd (x(n), x(n)) = x2(n)− x(n+ 1)x(n− 1) (see [9]). Finally,the asymmetric approximation is less complicated for imple-mentation and is faster than the symmetric one.

5. PROPERTIES OFΨB

We provide here some new properties of ΨB [1]. We denoteΨB of x(t) and y(t) by ΨB(x, y; t) and denote by “← ” theaffectation operation.

Similarity measure:

ΨB(x, y; t) = ΨB(y, x; t). (12)

This is a basic requirement for most of similarity or distancemeasures.

Time shift:

x1(t) ←− x(t − t0

),

y1(t) ←− y(t − t0

).

(13)

It is trivial that ΨB is time-shift invariant, that is,ΨB(x1, y1; t) = ΨB(x, y; t − t0). This property states that anytime translations in the signals, x(t) and y(t), should bepreserved in their measure of interaction, ΨB(x, y; t). Thus,ΨB(x, y; t) is robust to time shifts.

Amplitude scale:

x1(t) ←− α·x(t),

y1(t) ←− β·y(t).(14)

It is easy to verify that ΨB(x1, y1; t) = α·βΨB(x, y; t). Thus,the time where ΨB peaks, corresponding to the maximumof interaction between x(t) and y(t), is robust to amplitudescale.

Time scale:

x1(t) ←− x(at),

y1(t) ←− y(at).(15)

It is easy to verify that ΨB(x1, y1; t) = a2ΨB(x, y; t). Thisproperty states that if the time of the two signals is com-pressed by a scale a, then the energy of interaction is com-pressed by a2.

Page 5: An Energy-Based Similarity Measure for Time Series

Abdel-Ouahab Boudraa et al. 5

0 50 100 150 200

Times

−1

−0.5

0

0.5

1Si

gnalX

(a)

0 50 100 150 200

Times

0.05

0.1

0.15

0.2

0.25

Nor

mal

ized

freq

uen

cy

Signal X

Intersection frequency

(b)

0 50 100 150 200

Times

−1

−0.5

0

0.5

1

Sign

alY

(c)

0 50 100 150 200

Times

0.05

0.1

0.15

0.2

0.25

Nor

mal

ized

freq

uen

cy

Signal Y

Intersection frequency

(d)

Figure 4: Linear chirp TSs (parabolic phase).

0 50 100 150 200

Times

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

Am

plit

ude

Ψ�(Y ,Y ) Ψ�(X ,X)

Ψ�(X ,Y )

Intersection frequency

Figure 5: Similarity measure using SimilB with a sliding windowanalysis.

5.1. ΨB-based similarity measure

A similarity measure S(x(t), y(t)) is a function to comparethe TSs x(t) and y(t). Conventionally, this measure is a

0 50 100 150 200

Time

−0.2

−0.15

−0.1

−0.05

0

0.05

0.1

0.15

Am

plit

ude

Figure 6: Similarity measure using CC with a sliding window anal-ysis.

symmetric function whose value is large when x and y aresomehow similar. The proposed similarity measure based onΨB(x, y), between x(t) and y(t), uses their interaction. A

Page 6: An Energy-Based Similarity Measure for Time Series

6 EURASIP Journal on Advances in Signal Processing

0 50 100 150 200 250 300 350 400 450 500

Time

0 50 100 150 200 250 300 350 400 450 500

−2

0

2

4

6

Am

plit

ude

−1

−0.5

0

0.51

Am

plit

ude

s2(t)

s1(t)

(a) Signals s1(t) and s2(t)

0 50 100 150 200 250 300 350 400 450 500

Time

−500−400−300−200−100

0100200300400500

Am

plit

ude

CC

(b) CC with a sliding window analysis

0 50 100 150 200 250 300 350 400 450 500

Time

00.10.20.30.40.50.60.70.80.9

1

Am

plit

ude

SimilB

(c) SimilB with a sliding window analysis

Figure 7: Similarity measure using SimilB and CC of sinusoidal TSs.

larger value indicates more interaction in energy betweenTSs. If the input variables (or samples) of the TS x(t) (ory(t)) have large range, then this can overpower the other in-put variables of y(t) (or x(t)). Therefore, the proposed sim-ilarity measure, SimilB, is a normalized version of ΨB(x, y)and is defined as follows:

SimilB(x, y) =√

2∫TΨB(x, y)dt

∫T

√Ψ2

B(x, x) +Ψ2B(y, y)dt

. (16)

T is the TS duration or the size of sliding window analysis.The similarity is symmetric when comparing two TSs:

SimilB(x, y) = SimilB(y, x) ∀(x, y) ∈ C2. (17)

It is a basic requirement for most of similarity or distancemeasures. Note that if x = y then SimilB(x, y) = 1.

6. RESULTS

SimilB (equation (17)) is combined with relations (10) and(11), and relation (3) or (4) to process discrete (Figure 2) andcontinuous (Figures 1, 4, 7, and 8) data, respectively. The ef-fects of temporal information and the inclusion of the signalderivatives are shown on nonstationary and stationary syn-thetic TSs. Figure 1 shows three TSs with different shapes toillustrate the limit of the ED and the CC. Since f1, f2, and f3have different shapes, then an appropriate similarity measure

would show, for example, that the similarity values betweenf1 and f2 and that between f3 and f2 are different. Results ofthe SimilB, the ED, and the CC between f2 and f1 and thatbetween f2 and f3 are reported in Table 1. These results showthat SimilB is the unique measure which properly capturethe temporal information in the comparison of the shapes.The most studied TS classification/clustering problem is theCylinder-Bell-Funnel dataset (noted CBF.dat) [10]. It is a 3-class problem. Typical examples of each class are shown inFigure 2. The classes are generated by the equations [10]

c(t) = (6 + η)·X[a,b](t) + ε(t) // Cylinder class,

b(t) = (6 + η)·X[a,b](t).(t − a)(b− a)

+ ε(t) // Bell class,

f (t) = (6 + η)·X[a,b](t).(b− t)(b− a)

+ ε(t) // Funnel class,

X[a,b] = 1 if a ≤ t ≤ b, elseX[a,b] = 0,(18)

where η and ε(t) are drawn from a standard normal distribu-tion N (0, 1), a is an integer drawn uniformly from the range[16, 32], and (b − a) is an integer drawn uniformly from therange [32, 96] (Figure 2). The task is to classify a TS as oneof the three classes, Cylinder, Bell, or Funnel. We have per-formed an experiment classification on CBF.dat dataset con-sisting of 3 TSs of each class. TSs are clustered using group-average hierarchical clustering. The dendrograms are formedwith nearest neighbor linkage for three of each type of TSsusing SimilB measure, the ED, and the CC. We have averaged

Page 7: An Energy-Based Similarity Measure for Time Series

Abdel-Ouahab Boudraa et al. 7

0 20 40 60

Time

0

0.5

1

s 1(t

)

(a)

0 20 40 60 80

Time

−1

0

1

s 2(t

)

(d)

0 200 400 600

Time

−1

0

1

r 1(t

)

(b)

0 200 400 600

Time

−1

0

1

r 2(t

)

(e)

0 200 400 600

Time

−0.5

0

0.5

1

������

(s1(t

),r 1

(t))

T�

(c)

0 200 400 600

Time

−1

0

1

������

(s2(t

),r 2

(t))

T�

(f)

Figure 8: Similarity measure using SimilB of TSs of nonequal length.

the classification results over 45 runs. Figure 3 shows the re-sult of these averaged runs where both the ED and the CCfail to differentiate between the three classes. SimilB distin-guishes the three original classes as shown in Figure 3. Clas-sification errors reported in Table 2 show that SimilB is moreeffective than the ED and the CC. These results are expectedsince the ED and the CC are not able to include the tempo-ral information while SimilB using derivatives of the TS cap-tures this kind of information. Moreover, these results maybe due to the fact that ΨB is local operator [1, 6] while theED and the CC are global ones. Figure 4 shows an exam-ple of nonstationary TSs (two linear FM signals), x(t) andy(t). The instantaneous frequency (IF) of x(t) increases lin-early with time while that of y(t) decreases with time. Thepoint where the IFs intercept (Figure 4), noted Q, is locatedat t = 125. Figure 5 shows the energy of each TS and the en-ergy of their interaction obtained with a sliding window anal-ysis of T = 15. The point Q corresponds to the maximumof similarity and also where the energy of x(t) (SimilB(x, x))and that of y(t) (SimilB(y, y)) are equal. Away from Q, theamplitude of interaction decreases because there is less sim-ilarity between TSs (the TSs tend to be more and more dif-ferent). As the IFs converge from the time origin to Q (theTSs tend to be equal), the interaction intensity of the TSs in-creases and the maximum of similarity is achieved at t = 125.

Figure 6 shows that the maximum of similarity given by CCis located at t = 240. Thus, the CC fails to point out, as ex-pected (Figure 4), the maximum of similarity at Q. The in-teraction measure using SimilB and CC is performed usinga sliding window analysis of size T . Different T values rang-ing from 3 to 91 have been tested. Globally, we found com-parable results. The CC is calculated with the same slidingwindow as for SimilB. Furthermore, as the IFs converge to Qor diverge from Q, the CC function has, globally, the samebehavior and thus the similarity study of such TSs is diffi-cult. This example shows that the SimilB is more effectiveto study nonstationary TSs than the CC. This may be duethe fact that the ΨB is nonlinear operator while the CC islinear one. Figure 7(a) shows an example of two sinusoidalTSs, s1(t) and s2(t), of the same frequency and amplitude. TSs2(t) presents a discontinuity located at t = 200. Both CCand SimilB are calculated with T set to 17. CC measure failsto detect the discontinuity and shows a maximum of interac-tion at t = 262 (Figure 7(b)). The result of SimilB is expected(Figure 7(c)). Indeed, excepted for data point at t = 200, s1(t)and s2(t) are equal and ΨB behaves toward these two signalsas the TKO applied to s1(t) (s2(t)) and thus giving a constantoutput (square of the amplitude times the frequency) [9].This example shows the interest of SimilB to track disconti-nuities (Figure 7(c)). Two synthetic signals, s1(t) and s2(t), of

Page 8: An Energy-Based Similarity Measure for Time Series

8 EURASIP Journal on Advances in Signal Processing

nonequal lengths with size window observation T of 65 and81, respectively, are shown in Figures 8(a) and 8(d). Thesetwo signals are time shifted by 300 samples and corruptedby additive Gaussian noise. The obtained signals, r1(t) andr2(t), are shown in Figures 8(b) and 8(e), respectively. Theattenuation coefficient is set to 0.7. For both signals r1(t) andr2(t), a similarity measure would show, in theory, a maxi-mum of interaction located at t = 300. No warping pro-cess is used. We use the smallest TS length as a sliding win-dow and calculate SimilB, inside this window, between twoTSs of the same length. Outputs of SimilB are shown in Fig-ures 8(c) and 8(f) indicating a net maximum at t = TB.As expected, both SimilB(s1(t), r1(t)) and SimilB(s2(t), r2(t))peak to TB = 300. Table 3 lists the TB values calculated forSimilB(s1(t), r1(t)) and SimilB(s2(t), r2(t)) for different SNRsranging from −6 dB to 9 dB. Each value of Table 3 corre-sponds to the average of an ensemble of twenty five trials ofTB estimation. These results show that the performances ofSimilB are very close to that of the theory and also that SimilBworks correctly for moderately noisy TSs.

7. CONCLUSION

Relative change of amplitude and the corresponding tempo-ral information are well suited to measure similarity betweenTSs. In this paper, a new nonlinear similarity measure for TSanalysis, SimilB, which takes into account the temporal in-formation is introduced. Using the first and second deriva-tives of the TS, SimilB is able to capture temporal changesand discontinuities of the TS. Some new properties of ΨB

are presented showing, particularly, that the interaction mea-sure is robust both to time shift and amplitude scale. It is alsoshown that if the time of the signals is scaled by a factor, thecorresponding interaction energy is proportional to that ofthe original ones. Thus, the time corresponding to the max-imum of interaction is unchanged by time scale. Note thatSimilB is not a unique measure of similarity based on ΨB op-erator. Different similarity based on ΨB can be constructed.To process continuous analytic TSs an expression of ΨB isprovided. The discrete version of ΨB, for its implementation,is presented and three derivative approximations are exam-ined. Only the asymmetric approximation which is less com-plicated and less time consuming is implemented. Results ofdifferent synthetic TSs (stationary and nonstationary) showthat SimilB performs better than the ED and the CC andshow the interest to take into account the relative changesof the TSs. Compared to generative models (HMM, GMM,. . . ) or distance kernel-based methods, SimilB is nonpara-metric approach that does not require the specification of akernel or the selection of a probability distribution. Further-more, SimilB is fast and easy to implement. SimilB may beviewed as a data-driven approach because no a priori infor-mation about the signals or parameters setting is required.The processed TSs are either noiseless or moderately noisy.For very noisy TSs, the robustness of SimilB must be studied.In a future work, we plan to use smooth splines to give morerobustness to SimilB [11]. We also plan to include the Sim-ilB measure in a clustering process or algorithm such as fuzzyc-means or k-means for classification of TSs in different clus-

ters. To confirm the presented results, a large class of real TSsdatasets must be studied as well as the results compared toother methods particularly those including the temporal in-formation.

REFERENCES

[1] J.-C. Cexus and A.-O. Boudraa, “Link between cross-Wignerdistribution and cross-Teager energy operator,” Electronics Let-ters, vol. 40, no. 12, pp. 778–780, 2004.

[2] J. Alon, S. Sclaroff, G. Kollios, and V. Pavlovic, “Discoveringclusters in motion time-series data,” in Proceedings of IEEEComputer Society Conference on Computer Vision and PatternRecognition (CVPR ’03), vol. 1, pp. 375–381, Madison, Wis,USA, June 2003.

[3] R. Agrawal, C. Faloutsos, and A. Swami, “Efficient similaritysearch in sequence databases,” in Proceedings of the 4th Inter-national Conference on Foundations of Data Organization andAlgorithms (FODO ’93), vol. 730 of Lecture Notes in ComputerScience, pp. 69–84, Chicago, Ill, USA, October 1993.

[4] S. Chu, E. Keogh, D. Hart, and M. Pezzani, “Iterative deepen-ing dynamic time warping for time series,” in Proceedings of the2nd SIAM International Conference on Data Mining, Arlington,Va, USA, April 2002.

[5] C. S. Moller-Levet, F. Klawonn, K. H. Cho, and O. Wolken-hauer, “Fuzzy clustering of short time-series and unevenlydistributed sampling points,” in Proceedings of the 5th Inter-national Symposium on Intelligent Data Analysis (IDA ’03),vol. 2810 of Lecture Notes in Computer Science, pp. 330–340,Berlin, Germany, August 2003.

[6] Z. Saidi, A.-O. Boudraa, J.-C. Cexus, and S. Bourennane,“Time-delay estimation using cross-ΨB-energy operator,” In-ternational Journal of Signal Processing, vol. 1, no. 1, pp. 28–32,2004.

[7] P. Maragos and A. Potamianos, “Higher order differential en-ergy operators,” IEEE Signal Processing Letters, vol. 2, no. 8, pp.152–154, 1995.

[8] P. Maragos, J. F. Kaiser, and T. F. Quatieri, “On amplitudeand frequency demodulation using energy operators,” IEEETransactions on Signal Processing, vol. 41, no. 4, pp. 1532–1550,1993.

[9] J. F. Kaiser, “Some useful properties of Teager’s energy opera-tors,” in Proceedings of IEEE International Conference on Acous-tics, Speech, and Signal Processing (ICASSP ’93), vol. 3, pp. 149–152, Minneapolis, Minn, USA, April 1993.

[10] N. Saito, Local feature extraction and its application using a li-brary of bases, Ph.D. thesis, Yale University, New Haven, Conn,USA, 1994.

[11] D. Dimitriadis and P. Maragos, “An improved energy demod-ulation algorithm using splines,” in Proceedings of IEEE Inter-national Conference on Acoustics, Speech, and Signal Process-ing (ICASSP ’01), vol. 6, pp. 3481–3484, Salt Lake, Utah, USA,May 2001.