
Time Series (2)

Information Systems M

Prof. Paolo Ciaccia

http://www-db.deis.unibo.it/courses/SI-M/

The “alignment” problem

• Euclidean distance, like the other Lp-norms, is not robust to even small contractions/expansions of the signal along the time axis
  - E.g., speech signals, music
• Intuitively, we need a distance measure that is able to “match” a point of time series s even with “surrounding” points of time series q
  - Alternatively, we may view the time axis as a “stretchable” one
• Such a distance exists, and is called “Dynamic Time Warping” (DTW)


[Figure: two alignments of the same pair of series]
  - Fixed time axis: sequences are aligned “one to one”
  - “Warped” time axis: non-linear alignments are possible


The DTW distance

• The DTW distance is well-defined even for series of different length
• It is parametric in a “ground” distance d, which measures the difference between samples of the series
  - Usually, d_{i,j} ≡ d(s_i, q_j) = (s_i - q_j)^2 or d(s_i, q_j) = |s_i - q_j|
• Let s[1:n] and q[1:m] be two series of length n and m, respectively
• The definition of DTW is recursive:

\[
\mathrm{DTW}(s[1:n],q[1:m])^2 = d_{n,m} + \min\{\, \mathrm{DTW}(s[1:n],q[1:m-1])^2,\ \mathrm{DTW}(s[1:n-1],q[1:m])^2,\ \mathrm{DTW}(s[1:n-1],q[1:m-1])^2 \,\}
\]
\[
\mathrm{DTW}(s[1:1],q[1:1])^2 = d_{1,1}
\]

• For undefined terms (e.g., s[1:0]) substitute ∞ in the above equation


The DTW distance: an example

• Consider the two time series (n = m = 6):

    i |  1   2   3   4   5   6
    --+------------------------
    s |  1   2   5   4   3   7
    q |  2   3   2   1   3   4

  L2(s,q) = √29

• This is the d matrix of ground distances, d_{i,j} = (s_i - q_j)^2 (rows: samples of s, with s_1 at the bottom; columns: samples of q):

    s \ q |  2   3   2   1   3   4
    ------+------------------------
      7   | 25  16  25  36  16   9
      3   |  1   0   1   4   0   1
      4   |  4   1   4   9   1   0
      5   |  9   4   9  16   4   1
      2   |  0   1   0   1   1   4
      1   |  1   4   1   0   4   9

• This is the (squared) “DTW” matrix, DTW_{i,j} = DTW(s[1:i],q[1:j])^2, with the same layout:

    s \ q |  2   3   2   1   3   4
    ------+------------------------
      7   | 40  22  31  43  24  15
      3   | 15   6   7  11   8   6
      4   | 14   6   9  18   8   5
      5   | 10   5  11  18   7   5
      2   |  1   2   2   3   4   8
      1   |  1   5   6   6  10  19
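The recursive definition can be evaluated bottom-up by filling the DTW matrix cell by cell. The following is a minimal Python sketch under stated assumptions: the function name `dtw_squared` and the extra row and column of ∞ values (used for the undefined terms) are illustrative choices, not part of the slides.

```python
import math

def dtw_squared(s, q, d=lambda a, b: (a - b) ** 2):
    """Squared DTW distance between s and q, with ground distance d."""
    n, m = len(s), len(q)
    # D[i][j] = DTW(s[1:i], q[1:j])^2; row 0 and column 0 hold the
    # undefined terms, which the definition replaces with infinity.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0  # so that D[1][1] = d(s_1, q_1), the base case
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = d(s[i - 1], q[j - 1]) + min(
                D[i][j - 1], D[i - 1][j], D[i - 1][j - 1]
            )
    return D[n][m]

s = [1, 2, 5, 4, 3, 7]
q = [2, 3, 2, 1, 3, 4]
dtw2 = dtw_squared(s, q)                          # top-right cell of the DTW matrix
l2_squared = sum((a - b) ** 2 for a, b in zip(s, q))
print(dtw2, l2_squared)
```

On the two series of the example this reproduces the top-right cell of the DTW matrix (15), against a squared Euclidean distance of 29.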


What DTW computes (1)

• Consider the d matrix, and the following “game”:
  - Start from (1,1) and end in (n,m)
  - Take one step at a time
  - At each step, move only by increasing i, j, or both
    - I.e., never go back!
    - “Jumps” are not allowed!
  - Sum all distances you have found in the “warping path”
    - This is 21 in the figure below


[Figure: the d matrix of the example, with a warping path w highlighted. The goal: find the optimal (minimum cost) warping path!]
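To make the “game” concrete, the sketch below enumerates every warping path of the running example by brute force and keeps the cheapest one. It is only an illustration: the helper names (`count_paths`, `all_paths`, `cost`) are hypothetical, and the approach is feasible only because the series are tiny.

```python
from functools import lru_cache

s = [1, 2, 5, 4, 3, 7]
q = [2, 3, 2, 1, 3, 4]
n, m = len(s), len(q)

@lru_cache(maxsize=None)
def count_paths(i, j):
    """Number of warping paths from (1,1) to (i,j)."""
    if i == 1 or j == 1:
        return 1  # only one way along the border
    return (count_paths(i - 1, j) + count_paths(i, j - 1)
            + count_paths(i - 1, j - 1))

def all_paths(i, j):
    """Yield every warping path ending in (i,j), as a list of cells."""
    if (i, j) == (1, 1):
        yield [(1, 1)]
        return
    for pi, pj in ((i - 1, j), (i, j - 1), (i - 1, j - 1)):
        if pi >= 1 and pj >= 1:
            for prefix in all_paths(pi, pj):
                yield prefix + [(i, j)]

def cost(path):
    """Sum of the ground distances d_{i,j} met along the path."""
    return sum((s[i - 1] - q[j - 1]) ** 2 for i, j in path)

best = min(all_paths(n, m), key=cost)
diagonal = [(k, k) for k in range(1, n + 1)]  # the "Euclidean path"
print(count_paths(n, m), cost(best), cost(diagonal))
```

Even for n = m = 6 there are 1683 distinct warping paths, which is why exhaustive enumeration does not scale to realistic series lengths.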

What DTW computes (2)

• Each warping path is a way to “align” (match) s and q, such that every sample is matched with at least one sample of the other series
• The DTW distance is the (square root of the) cost of the optimal warping path
• The “Euclidean path” moves only along the main diagonal, and costs 29 in the example
• The recursive definition allows DTW to be computed in O(n×m) time, even though the number of warping paths is exponential!
  - This is an example of a dynamic programming algorithm
• Once the DTW matrix has been filled, the optimal warping path can be recovered by going back from DTW_{n,m}
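Going back from DTW_{n,m} can be sketched as follows: from each cell, move to the predecessor with the smallest accumulated cost. This is a minimal sketch, not necessarily the procedure used in the course; the name `dtw_with_path` and the tie-breaking rule (the diagonal predecessor is preferred) are assumptions.

```python
import math

def dtw_with_path(s, q):
    """Return (squared DTW distance, optimal warping path as 1-based cells)."""
    n, m = len(s), len(q)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = (s[i - 1] - q[j - 1]) ** 2 + min(
                D[i - 1][j - 1], D[i - 1][j], D[i][j - 1]
            )
    # Trace back from (n, m) to (1, 1), always moving to the cheapest
    # predecessor; min() keeps the first candidate on ties (the diagonal).
    i, j = n, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        candidates = [
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i, j))
    path.reverse()
    return D[n][m], path

s = [1, 2, 5, 4, 3, 7]
q = [2, 3, 2, 1, 3, 4]
dist2, path = dtw_with_path(s, q)
print(dist2, path)
```

Each cell (i, j) of the returned path says that s_i is matched with q_j. When several predecessors have the same cost, more than one optimal path exists and the tie-breaking rule decides which one is returned.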


[Figure: the DTW matrix of the example (DTW_{6,6} = 15), from which the optimal warping path is traced back]


Visualizing the optimal warping path


[Figure: the series s and q, with the optimal warping path between them]

A real-world graphical example


Power-Demand time series: each sequence corresponds to a week’s demand for power in a Dutch research facility in 1997

[Figure labels: “Monday was a holiday”, “Wednesday was a holiday”]


Fast searching with DTW

• We now face 2 problems if we want to use DTW for searching:
  1. Computing the DTW distance is very time-consuming
  2. How to index it?
• Both problems can be solved:
  1. Use a lower-resolution approximation of the time series
     - However, the method can introduce false dismissals

[Figure labels: 1.3 sec, 22.7 sec]


DTW with global constraints

• Usually, also in order to avoid “pathological” alignments, one sets some “global constraint” on the allowed warping paths
• One of them, suitable for the case n=m, is the so-called Sakoe-Chiba band of width b, which reduces the complexity to O(n×b)

[Figure: the Sakoe-Chiba band of width b=4 on the s-q matrix]

Our example with b=2 (cells outside the band are marked with a dot):

    s \ q |  2   3   2   1   3   4
    ------+------------------------
      7   |  .   .   .  43  24  17
      3   |  .   .   7  11   8   8
      4   |  .   6   9  18   8   7
      5   | 10   5  11  18   7   .
      2   |  1   2   2   3   .   .
      1   |  1   5   6   .   .   .
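A band constraint only changes which cells of the DTW matrix are filled. The sketch below assumes that “width b” means |i - j| ≤ b, which is consistent with the b=2 matrix of the example; the function name `dtw_squared_band` is illustrative.

```python
import math

def dtw_squared_band(s, q, b):
    """Squared DTW distance restricted to the cells with |i - j| <= b."""
    n, m = len(s), len(q)
    # For simplicity the full matrix is allocated, so only the running time
    # (not the memory) is O(n*b); cells outside the band stay at infinity.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(1, n + 1):
        for j in range(max(1, i - b), min(m, i + b) + 1):
            D[i][j] = (s[i - 1] - q[j - 1]) ** 2 + min(
                D[i - 1][j - 1], D[i - 1][j], D[i][j - 1]
            )
    return D[n][m]

s = [1, 2, 5, 4, 3, 7]
q = [2, 3, 2, 1, 3, 4]
for b in (0, 2, 5):
    print(b, dtw_squared_band(s, q, b))
```

With b=0 only the main diagonal is allowed and the result is the squared Euclidean distance (29); b=2 gives 17, as in the matrix above; a band that covers the whole matrix gives back the unconstrained value (15).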


Lower-bounding the DTW distance (1)

• An effective indexing technique for DTW has been proposed in [Keo02]
• The method requires a global constraint (e.g., Sakoe-Chiba band)
• The basic idea of LB_Keogh is to create an “envelope” of the query q, Env(q) = (Low(q),Up(q)), whose “thickness” depends on b:

\[
\mathit{Up}_i = \max\{q[i-b:i+b]\}, \qquad \mathit{Low}_i = \min\{q[i-b:i+b]\}
\]

[Figure: the query q enclosed between Up and Low]
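The envelope can be computed with a sliding window of 2b+1 samples. In the sketch below the window is clipped at the two ends of the series; this is an assumption, since the slides do not say how the boundaries are handled, and the name `envelope` is illustrative.

```python
def envelope(q, b):
    """Return (Low, Up), where Low[i] and Up[i] are the min and max of
    q[i-b : i+b] (bounds inclusive, clipped to the series)."""
    n = len(q)
    low, up = [], []
    for i in range(n):
        window = q[max(0, i - b): min(n, i + b + 1)]
        low.append(min(window))
        up.append(max(window))
    return low, up

q = [2, 3, 2, 1, 3, 4]
low, up = envelope(q, 2)
print(low)  # lower envelope for b = 2
print(up)   # upper envelope for b = 2
```

For b=0 the envelope collapses onto q itself; larger values of b make it thicker.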

Lower-bounding the DTW distance (2)

• Env(q) provides the means to derive a Euclidean-like lower bound to the DTW distance between q and any series s
• For this, each sample of s is aligned exactly once
• There are 3 cases to consider:
  - s_i > Up_i: since s_i must be aligned with at least one sample of q[i-b:i+b], it follows that this alignment costs at least (s_i - Up_i)^2
  - s_i < Low_i: since s_i must be aligned with at least one sample of q[i-b:i+b], it follows that the cost of this alignment is at least (Low_i - s_i)^2
  - s_i ∈ [Low_i,Up_i]: in this case only the trivial bound 0 applies
• Thus, putting it all together:

\[
LB_{\mathrm{Keogh}}(s,q)^2 = \sum_{i=1}^{n}
\begin{cases}
(s_i - \mathit{Up}_i)^2 & \text{if } s_i > \mathit{Up}_i \\
(s_i - \mathit{Low}_i)^2 & \text{if } s_i < \mathit{Low}_i \\
0 & \text{otherwise}
\end{cases}
\]
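The three cases translate directly into a single pass over s. A minimal sketch, assuming the envelope (Low, Up) of the query has already been computed and is passed in as two lists; the name `lb_keogh_squared` is illustrative.

```python
def lb_keogh_squared(s, low, up):
    """Squared LB_Keogh bound of s against the envelope (low, up) of a query."""
    total = 0
    for s_i, low_i, up_i in zip(s, low, up):
        if s_i > up_i:            # s_i lies above the envelope
            total += (s_i - up_i) ** 2
        elif s_i < low_i:         # s_i lies below the envelope
            total += (s_i - low_i) ** 2
        # otherwise s_i is inside the envelope and contributes 0
    return total

s = [1, 2, 5, 4, 3, 7]
# Envelope of q = [2, 3, 2, 1, 3, 4] for b = 2, with windows clipped at the ends
low = [2, 1, 1, 1, 1, 1]
up = [3, 3, 3, 4, 4, 4]
print(lb_keogh_squared(s, low, up))
```

On the running example with b=2 the squared bound is 14, which indeed does not exceed the squared DTW distance of 17 obtained with the same band.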


Visualizing LBKeogh

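• The sum of the (squared) lengths of the vertical lines equals LB_Keogh(s,q)^2

[Figure: s plotted against the envelope (Up, Low) of q, with vertical lines where s lies outside the envelope]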

Indexing the DTW (1)

• Because of high dimensionality, an approximate representation is needed for indexing time series
• The second step of Keogh’s method computes, for each time series s in the DB, its PAA (Piecewise Aggregate Approximation) approximation, s’, using a suitable window size W
• Let n’ = n/W be the dimensionality of s’:

\[
s'_i = \frac{1}{W} \sum_{t=(i-1)W+1}^{iW} s_t , \qquad i = 1,\dots,n'
\]

[Figure: a series s and its PAA approximation s’ with window size W]
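PAA simply replaces each window of W consecutive samples with their mean. A minimal sketch, assuming that W divides the length of the series (as n’ = n/W suggests); the name `paa` is illustrative.

```python
def paa(s, W):
    """PAA approximation of s: the mean of each window of W samples."""
    if len(s) % W != 0:
        raise ValueError("the window size W must divide the series length")
    return [sum(s[k:k + W]) / W for k in range(0, len(s), W)]

s = [1, 2, 5, 4, 3, 7]
print(paa(s, 2))  # 3 values, one per window of 2 samples
print(paa(s, 3))  # 2 values, one per window of 3 samples
```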



• PAA approximations are indexed with an R-tree
• The region of a node N, Reg(N), is completely defined by the coordinates of the two opposite vertices L and H (l_i ≤ h_i):

    L = {l_1, l_2, …, l_n’}
    H = {h_1, h_2, …, h_n’}

• What remains to be done is to compute a PAA approximation of Env(q)…

[Figure: the region Reg(N), delimited by the values l_i and h_i]


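• The PAA approximation of the envelope, Env’(q) = (Low’(q),Up’(q)), “encloses the envelope”:

\[
\mathit{Up}'_i = \max\{\mathit{Up}_{(i-1)W+1},\dots,\mathit{Up}_{iW}\}
\]
\[
\mathit{Low}'_i = \min\{\mathit{Low}_{(i-1)W+1},\dots,\mathit{Low}_{iW}\}
\]

[Figure: the query q with the piecewise-constant bounds Up’ and Low’]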
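Unlike the PAA of a series, the reduced envelope takes the maximum (for Up’) and the minimum (for Low’) over each window rather than the mean, so that it still encloses the original envelope. A minimal sketch, assuming W divides the envelope length; the name `paa_envelope` is illustrative.

```python
def paa_envelope(low, up, W):
    """Reduce an envelope (low, up) to n/W dimensions, window by window."""
    low_p = [min(low[k:k + W]) for k in range(0, len(low), W)]
    up_p = [max(up[k:k + W]) for k in range(0, len(up), W)]
    return low_p, up_p

# Envelope of q = [2, 3, 2, 1, 3, 4] for b = 2, with windows clipped at the ends
low = [2, 1, 1, 1, 1, 1]
up = [3, 3, 3, 4, 4, 4]
low_p, up_p = paa_envelope(low, up, 2)
print(low_p, up_p)
```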



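• The final step consists in defining a “MinDist” between Env’(q) and Reg(N), which results in:

\[
d_{\min}(\mathrm{Reg}(N),\mathrm{Env}'(q))^2 = W \sum_{i=1}^{n'}
\begin{cases}
(l_i - \mathit{Up}'_i)^2 & \text{if } l_i > \mathit{Up}'_i \\
(h_i - \mathit{Low}'_i)^2 & \text{if } h_i < \mathit{Low}'_i \\
0 & \text{otherwise}
\end{cases}
\]

• It can be proved that, if s ∈ Reg(N), this distance lower-bounds the DTW distance between s and q

[Figure: the region [l_i, h_i] of a node compared with the bounds Up’ and Low’]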
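The MinDist has the same three-case structure as LB_Keogh, applied to the two vertices of the node region and scaled by W. The sketch below is an illustration under assumptions: the name `mindist_squared` and its argument layout are hypothetical, and the region is given as two lists l and h.

```python
def mindist_squared(l, h, low_p, up_p, W):
    """Squared MinDist between a region [l, h] and a reduced envelope
    (low_p, up_p); W is the PAA window size."""
    total = 0.0
    for l_i, h_i, low_i, up_i in zip(l, h, low_p, up_p):
        if l_i > up_i:            # the whole interval lies above Up'
            total += (l_i - up_i) ** 2
        elif h_i < low_i:         # the whole interval lies below Low'
            total += (h_i - low_i) ** 2
        # otherwise the interval overlaps the envelope and contributes 0
    return W * total

# Reduced envelope of q = [2, 3, 2, 1, 3, 4] for b = 2 and W = 2
low_p, up_p = [1, 1, 1], [3, 4, 4]
# A degenerate region containing only the PAA of s = [1, 2, 5, 4, 3, 7]
point = [1.5, 4.5, 5.0]
print(mindist_squared(point, point, low_p, up_p, 2))
```

For this degenerate region the squared MinDist is 2.5, which is consistent with the chain of bounds: it is below the squared LB_Keogh value (14) and the banded squared DTW distance (17) of the running example.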

Some experimental results (from [Keo02])

• Two datasets (left: synthetic; right: a mix of real data)

[Figure: normalized CPU cost (0 to 1) vs. DB size (2^10 to 2^20) for Seq. scan and LB_Keogh; left panel: Random Walk II, right panel: Mixed Bag]


Final considerations

• We have just seen some basic techniques to deal with (large) time series databases
• Several other distance measures have been proposed, besides DTW, to deal with the “alignment” problem
• Many other relevant problems exist and have attracted interest, among which:
  - Searching for similar sub-sequences
  - Searching for multi-dimensional time series (i.e., trajectories)


Applying time series techniques to shape matching

The WARP case


WARP (Bartolini, Ciaccia & Patella, 2005)

• WARP (= WARP: Accurate Retrieval based on Phase) is a shape matching technique characterized by the following features:
  - DFT-based dimensionality reduction
  - Modification of Fourier coefficients so as to achieve invariance w.r.t. a series of transformations
  - DTW distance for comparing shape descriptors

[Figure: the steps of WARP: edge detection, parameterization, DFT, invariance transformations, dimensionality reduction; the descriptors are shown as amplitude and phase]

Boundary parameterization

• Starting from a boundary description obtained through a shape extraction algorithm, WARP parameterizes the boundary to obtain a complex discrete-time periodic signal whose period N equals the number of points in the object boundary
• The output of this step is therefore the signal

    z(t) = x(t) + j y(t),   t = 0,…,N-1

[Figure: edge detection followed by parameterization]


DFT

• From the z signal, a frequency representation is obtained using DFT:

\[
Z_m = \sum_{t=0}^{N-1} z(t)\, e^{-j 2\pi m t / N} = R_m e^{j\Theta_m}, \qquad m = -N/2,\dots,-1,0,1,\dots,N/2-1
\]

  where R_m and Θ_m are the amplitude and the phase of the m-th DFT coefficient, respectively

[Figure: the DFT step, producing amplitude and phase]
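The definition above can be evaluated literally with Python’s `cmath` module. The sketch below is a naive O(N²) evaluation meant only to illustrate the formula (a real system would use an FFT); the function names `dft` and `amplitude_phase` are illustrative.

```python
import cmath

def dft(z):
    """Z_m = sum_t z(t) * exp(-j*2*pi*m*t/N) for m = -N/2, ..., N/2 - 1."""
    N = len(z)
    return {
        m: sum(z[t] * cmath.exp(-2j * cmath.pi * m * t / N) for t in range(N))
        for m in range(-(N // 2), N - N // 2)
    }

def amplitude_phase(z):
    """Map each frequency m to the pair (R_m, Theta_m)."""
    return {m: (abs(Z_m), cmath.phase(Z_m)) for m, Z_m in dft(z).items()}

# Boundary of a square centred in the origin, visited counter-clockwise
square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
Z = dft(square)
for m, (R, theta) in sorted(amplitude_phase(square).items()):
    print(m, round(R, 6), round(theta, 6))
```

For this 4-point square all the energy falls on the coefficient m = 1, whose phase is π/4; for coefficients with (numerically) zero amplitude the phase value is not meaningful.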


• A lower-dimensional approximation is obtained by retaining only the M Fourier coefficients closest to frequency 0 (m = -M/2,…,-1,0,1,…,M/2-1)
• M=32 (90% of the energy) is used in the experiments

[Figure: fraction of energy (0 to 1) vs. no. of coefficients (1 to 1000)]



• The purpose of this step is to modify DFT coefficients so as to obtain invariance with respect to:
  1. Translation (= changing the object position)
  2. Scaling (= changing the object size)
  3. Rotation
  4. Starting point of parameterization
• Invariance w.r.t. 3. and 4. can be trivially obtained by discarding the phase of DFT coefficients, but this negatively influences the accuracy of retrieval
• The transformations lead to a set of normalized DFT (N-DFT) coefficients Z^N

The effect of phase

• By using only the amplitude spectra, and comparing them using L2, (a) turns out to be more similar to (b) than to (c) (!?)


1. Translation

• Each point z(t) is shifted by a constant value c: z’(t) = z(t)+c
• The DFT of the z’ signal is:

\[
Z'_m = \sum_{t=0}^{N-1} (z(t)+c)\, e^{-j 2\pi m t / N} = Z_m + c \sum_{t=0}^{N-1} e^{-j 2\pi m t / N}
\]

  - m ≠ 0: the sum is 0
  - m = 0: the sum is ≠ 0 (it equals N)
• Translation only influences the DC coefficient, which is indeed proportional to the average of the signal
• The 0-frequency DFT coefficient is ignored:

\[
Z^N_0 = 0
\]

[Figure: z(t) and z(t) + c, displaced by c]

2. Scaling

• If z’(t) = k z(t), the DFT of z’ is:

\[
Z'_m = \sum_{t=0}^{N-1} k\, z(t)\, e^{-j 2\pi m t / N} = k\, Z_m
\qquad\Rightarrow\qquad R'_m = k\, R_m ;\quad \Theta'_m = \Theta_m
\]

• Scaling only influences the amplitude of DFT coefficients
• To obtain scale invariance, the amplitude of the N-DFT coefficients is normalized:

\[
R^N_m = R_m / R_1
\]

[Figure: z(t) and k z(t)]
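The first two invariances can be checked numerically: dropping the 0-frequency coefficient and dividing every amplitude by R_1 should give the same values for a shape and for a translated, scaled copy of it. The sketch below uses a naive DFT and a made-up 6-point boundary; the names `dft` and `normalized_amplitudes` and the test values (k = 3, c = 5 - 2j) are illustrative assumptions.

```python
import cmath

def dft(z):
    """Z_m = sum_t z(t) * exp(-j*2*pi*m*t/N) for m = -N/2, ..., N/2 - 1."""
    N = len(z)
    return {
        m: sum(z[t] * cmath.exp(-2j * cmath.pi * m * t / N) for t in range(N))
        for m in range(-(N // 2), N - N // 2)
    }

def normalized_amplitudes(z):
    """R^N_m = R_m / R_1, with the 0-frequency coefficient set to 0."""
    Z = dft(z)
    r1 = abs(Z[1])
    if r1 == 0:
        raise ValueError("R_1 is zero: the normalization is undefined")
    return {m: (0.0 if m == 0 else abs(Z_m) / r1) for m, Z_m in Z.items()}

# A made-up boundary, and a copy scaled by k = 3 and translated by c = 5 - 2j
z = [0, 2, 3 + 1j, 2 + 3j, 1j, -1 + 0.5j]
z2 = [3 * v + (5 - 2j) for v in z]
a, a2 = normalized_amplitudes(z), normalized_amplitudes(z2)
print(max(abs(a[m] - a2[m]) for m in a))  # difference is at rounding level
```

Dividing by R_1 presumes that the first coefficient is not zero, hence the explicit check in the sketch.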


3. Rotation

• If the image is rotated counter-clockwise by an angle θ, it is z’(t) = z(t) e^{jθ} and

\[
Z'_m = \sum_{t=0}^{N-1} z(t)\, e^{j\theta} e^{-j 2\pi m t / N} = Z_m e^{j\theta}
\qquad\Rightarrow\qquad R'_m = R_m ;\quad \Theta'_m = \Theta_m + \theta
\]

• Rotation adds a constant value to the phase of each coefficient
• To achieve rotation invariance, the phase of the N-DFT coefficients is obtained as, e.g.:

\[
\Theta^N_m = \Theta_m - \Theta_1
\]

• The term to be subtracted is actually different, since it also depends on the last invariance…

[Figure: z(t) and z(t) e^{jθ}, rotated by θ]

4. Starting point

• Changing the starting point corresponds to a shift in the time domain
• Thus, it is z’(t) = z(t-t0) (z’ is “anticipated” by t0)
• The “shift theorem” of DFT yields:

\[
Z'_m = \sum_{t=0}^{N-1} z(t-t_0)\, e^{-j 2\pi m t / N} = Z_m e^{-j 2\pi m t_0 / N}
\qquad\Rightarrow\qquad R'_m = R_m ;\quad \Theta'_m = \Theta_m - 2\pi m t_0 / N
\]

• Thus, all coefficients are rotated by an amount that is linear in the frequency (m)…

[Figure: z(t) with starting point p0, and z(t - t0)]


3. and 4. together

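• Combining 3. and 4. we get: z’(t) = z(t-t0) e^{jθ} and

\[
\Theta'_m = \Theta_m + \theta - 2\pi m t_0 / N
\]

• In particular, it is:

\[
\Theta'_1 = \Theta_1 + \theta - 2\pi t_0 / N, \qquad
\Theta'_{-1} = \Theta_{-1} + \theta + 2\pi t_0 / N
\]

• The normalized phase is:

\[
\Theta^N_m = \Theta_m - \frac{\Theta_{-1} + \Theta_1}{2} + m\, \frac{\Theta_{-1} - \Theta_1}{2}
\]

• To see why this works, consider the N-DFT coefficient of the z’ signal:

\[
\begin{aligned}
\Theta'^N_m &= \Theta'_m - \frac{\Theta'_{-1} + \Theta'_1}{2} + m\, \frac{\Theta'_{-1} - \Theta'_1}{2} \\
&= \Theta_m + \theta - 2\pi m t_0 / N
 - \frac{(\Theta_{-1} + \theta + 2\pi t_0 / N) + (\Theta_1 + \theta - 2\pi t_0 / N)}{2}
 + m\, \frac{(\Theta_{-1} + \theta + 2\pi t_0 / N) - (\Theta_1 + \theta - 2\pi t_0 / N)}{2} \\
&= \Theta_m - \frac{\Theta_{-1} + \Theta_1}{2} + m\, \frac{\Theta_{-1} - \Theta_1}{2} = \Theta^N_m
\end{aligned}
\]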
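The invariance of the normalized phase can also be checked numerically. The sketch below applies a rotation and a change of starting point to a made-up boundary and compares the normalized phases of the two signals. Two assumptions are worth stating: the names and test values (θ = 0.7, t0 = 2) are illustrative, and, since `cmath.phase` returns angles wrapped into (-π, π], halving Θ_{-1} and Θ_1 can leave an offset of π for even m; the comparison is therefore made on doubled phases, which are not affected.

```python
import cmath

def dft(z):
    """Z_m = sum_t z(t) * exp(-j*2*pi*m*t/N) for m = -N/2, ..., N/2 - 1."""
    N = len(z)
    return {
        m: sum(z[t] * cmath.exp(-2j * cmath.pi * m * t / N) for t in range(N))
        for m in range(-(N // 2), N - N // 2)
    }

def normalized_phases(z):
    """Theta^N_m = Theta_m - (Theta_-1 + Theta_1)/2 + m*(Theta_-1 - Theta_1)/2."""
    theta = {m: cmath.phase(Z_m) for m, Z_m in dft(z).items()}
    a, b = theta[-1], theta[1]
    return {m: theta[m] - (a + b) / 2 + m * (a - b) / 2
            for m in theta if m != 0}

# A made-up boundary with no vanishing coefficient among those compared
z = [0, 2, 3 + 1j, 2 + 3j, 1j, -1 + 0.5j]
N = len(z)
rot, t0 = 0.7, 2  # rotation angle and starting-point shift (illustrative)
z2 = [z[(t - t0) % N] * cmath.exp(1j * rot) for t in range(N)]

p, p2 = normalized_phases(z), normalized_phases(z2)
# Compare doubled phases on the unit circle to avoid wrap-around artefacts
max_diff = max(abs(cmath.exp(2j * p[m]) - cmath.exp(2j * p2[m])) for m in p)
print(max_diff)
```

The printed difference is at the level of rounding errors, as the derivation above predicts. Coefficients with (almost) zero amplitude have no meaningful phase and would need to be excluded from such a comparison.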

Comparing images

• Images are compared using DTW in the time domain by applying the Inverse DFT (IDFT) to the N-DFT descriptors

[Figure: the descriptors (amplitude, phase) of the query image and of a DB image are each transformed with IDFT, and the results are compared with DTW]


Experimental results (1)

• NoPhase-EU: Phase is discarded, Euclidean distance
• WARP-EU: Phase is retained, Euclidean distance

[Figure: precision gain (%) vs. recall for NoPhase-EU and WARP-EU]

Experimental results (2)

• MAXC: parameterization method using maximum curvature points

[Figure: precision vs. recall for WARP and MAXC]
