Cluster-Centric Anomaly Detection and
Characterization in Spatial Time Series
Dr. Hesam Izakian
October 2014
2
Outline
Spatial time series
Problem formulation
Anomaly detection in spatial time series: questions
Overall scheme of the proposed method
  o Time series segmentation
  o Spatial time series clustering
  o Assigning anomaly scores to clusters
  o Visualizing the propagation of anomalies
An outbreak detection scenario
Application
Conclusions
3
Spatial time series
Structure of data
  o A set of spatial coordinates
  o One or more time series for each point
Examples
  o Daily average temperature at different climate stations
  o Stock market indexes in different countries
  o Number of absent students in different schools
  o Number of emergency department visits in different hospitals
  o Measured signals in different parts of the brain
4
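Problem formulation
There are N spatial time series
$$\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N, \quad \mathbf{x}_i \in \mathbb{R}^{n+r}$$
$$\mathbf{x}_i = [\mathbf{x}_i(s) \mid \mathbf{x}_i(t)]^T$$
$$\mathbf{x}_i(s) = x_{i1}(s), x_{i2}(s), \dots, x_{ir}(s) \quad (\text{usually } r = 2)$$
$$\mathbf{x}_i(t) = x_{i1}(t), x_{i2}(t), \dots, x_{in}(t)$$
Objective: Find a spatial neighborhood of data
$$\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_p, \quad p < N$$
in a time interval
$$t_q, t_{q+1}, \dots, t_{q+l}, \quad q > 0, \; q + l \le n, \; l < n$$
containing a high level of unexpected changes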
5
Anomaly detection in spatial time series: questions
Spatial neighborhood of data
  o Size of the neighborhood
  o Overlapping neighborhoods
Unexpected changes (anomalies)
  o What kinds of changes are expected or not expected?
  o How to evaluate the level of unexpected changes?
Anomaly visualization
Anomaly characterization
  o What was the source of the anomaly?
  o How does the anomaly propagate over time?
6
Overall scheme of the proposed method
Revealing the structure of data in various time intervals
Comparing the revealed structures
[Diagram: Spatial time series data -> Sliding window -> $W_1, W_2, \dots, W_K$ -> Spatial time series clustering -> $U_1, U_2, \dots, U_K$ -> Anomaly scores $s_1, s_2, \dots, s_K$ and Fuzzy relations $R_{1,2}, R_{2,3}, \dots, R_{K-1,K}$]
7
Time series part segmentation
Sliding window
  o Spatio-temporal subsequences
  o Local view of time series part
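The segmentation step can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `sliding_windows` and `attach_coordinates`, the random data, and the choice to keep only windows that fit entirely inside the series are all assumptions. Each window keeps the spatial coordinates and only a slice of the time series part.

```python
import numpy as np

def sliding_windows(series, length, step):
    """Cut the temporal part (N x n array) into overlapping windows.

    Returns a list of (start, block) pairs; each block has shape (N, length).
    Assumption: only windows that fit entirely inside the series are kept.
    """
    series = np.asarray(series, dtype=float)
    n = series.shape[1]
    starts = range(0, n - length + 1, step)
    return [(q, series[:, q:q + length]) for q in starts]

def attach_coordinates(coords, block):
    """Build spatio-temporal subsequences [x(s) | window of x(t)]."""
    return np.hstack([np.asarray(coords, dtype=float), block])

# Hypothetical data: 5 locations with 2 coordinates and 100 time points each.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(5, 2))
series = rng.normal(size=(5, 100))

windows = sliding_windows(series, length=20, step=10)
W = [attach_coordinates(coords, block) for _, block in windows]
print(len(W), W[0].shape)  # 9 (5, 22)
```

With 100 time points, a window length of 20 and a movement of 10, this convention yields 9 windows, each holding 2 coordinates plus 20 time samples per location.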
9
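Fuzzy C-Means clustering: visual illustration
[Figure: data points grouped into two clusters, A and B, with the membership of each point in the two clusters]
1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1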
10
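Fuzzy C-Means clustering: visual illustration
[Figure: data points grouped into two clusters, A and B, with the membership of each point in the two clusters]
0.91 0.96 1.00 0.95 0.70 0.30 0.05 0.00 0.04 0.09
0.09 0.04 0.00 0.05 0.30 0.70 0.95 1.00 0.96 0.91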
11
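Fuzzy C-Means clustering…
Partitions N data
$$\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N, \quad \mathbf{x}_i \in \mathbb{R}^d$$
into c clusters (1 < c < N)
Result:
$$\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_c, \quad \mathbf{v}_i \in \mathbb{R}^d$$
$$U = \begin{bmatrix} u_{11} & \cdots & u_{1N} \\ \vdots & \ddots & \vdots \\ u_{c1} & \cdots & u_{cN} \end{bmatrix}, \quad u_{ik} \in [0, 1], \quad \sum_{i=1}^{c} u_{ik} = 1 \;\; \forall k$$
Objective function:
$$J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|\mathbf{v}_i - \mathbf{x}_k\|^2$$
Minimization:
$$\mathbf{v}_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, \mathbf{x}_k}{\sum_{k=1}^{N} u_{ik}^{m}}$$
$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\|\mathbf{v}_i - \mathbf{x}_k\|}{\|\mathbf{v}_j - \mathbf{x}_k\|} \right)^{2/(m-1)}}$$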
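The two update formulas can be alternated until the partition matrix stops changing. The sketch below is one plain NumPy way to do that. The function name `fcm`, the random initialization, the stopping tolerance, and the test data are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-Means by alternating the prototype and membership updates.

    X: (N, d) data. Returns prototypes V (c, d) and partition matrix U (c, N).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                                  # columns sum to 1
    for _ in range(n_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)    # prototype update
        # Squared distances ||v_i - x_k||^2, shape (c, N).
        d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                      # guard against zeros
        # Membership update; with squared distances the exponent is 1/(m-1).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    Um = U ** m
    V = (Um @ X) / Um.sum(axis=1, keepdims=True)        # prototypes for final U
    return V, U

# Hypothetical data: two well-separated groups of 20 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
V, U = fcm(X, c=2)
labels = U.argmax(axis=0)
```

Each column of `U` holds the memberships of one data point and sums to 1, matching the constraint on the partition matrix.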
12
Spatial time series clustering
Reveals available structure within data
  o In the form of partition matrices
Challenges
  o Different sources: spatial part vs. temporal part
  o Different dimensionality in each part
  o Different structure within each part
13
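Spatial time series clustering…
In spatial time series, we define
$$d^2(\mathbf{v}_i, \mathbf{x}_k) = \|\mathbf{v}_i(s) - \mathbf{x}_k(s)\|^2 + \lambda \, \|\mathbf{v}_i(t) - \mathbf{x}_k(t)\|^2, \quad \lambda \ge 0$$
Adopted FCM objective function
$$J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, d^2(\mathbf{v}_i, \mathbf{x}_k)$$
Characteristics
  o When λ = 0: only the spatial part of data is used in clustering
  o A higher value of λ: a higher impact of the time series part in clustering
  o Optimal value of λ: optimal impact of each part in clustering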
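The composite distance can be computed directly from its definition. The sketch below assumes that the first r columns of each vector hold the spatial part and the remaining columns hold the time series part; the function name `composite_sq_distance` and the small arrays are hypothetical.

```python
import numpy as np

def composite_sq_distance(V, X, r, lam):
    """d^2(v_i, x_k) = ||v_i(s) - x_k(s)||^2 + lam * ||v_i(t) - x_k(t)||^2.

    Assumption: the first r columns are spatial, the rest temporal.
    Returns a (c, N) array.
    """
    V = np.asarray(V, dtype=float)
    X = np.asarray(X, dtype=float)
    diff = V[:, None, :] - X[None, :, :]
    ds = (diff[:, :, :r] ** 2).sum(axis=2)   # spatial term
    dt = (diff[:, :, r:] ** 2).sum(axis=2)   # temporal term
    return ds + lam * dt

# Hypothetical example: r = 2 coordinates followed by 3 time samples.
X = np.array([[0.0, 0.0, 1.0, 1.0, 1.0],
              [3.0, 4.0, 1.0, 2.0, 3.0]])
V = np.array([[0.0, 0.0, 1.0, 1.0, 1.0]])

d_spatial_only = composite_sq_distance(V, X, r=2, lam=0.0)  # values 0 and 25
d_weighted = composite_sq_distance(V, X, r=2, lam=0.5)      # values 0 and 27.5
```

For λ > 0 the same value is obtained by multiplying the temporal columns by √λ and taking an ordinary squared Euclidean distance, so a standard FCM routine can be reused on rescaled data. That is an observation about the formula, not a statement about how the method was implemented.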
14
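Spatial time series clustering: optimal value of λ
$$\hat{\mathbf{x}}_k = \frac{\sum_{i=1}^{c} u_{ik}^{m} \, \mathbf{v}_i}{\sum_{i=1}^{c} u_{ik}^{m}}$$
$$E(\lambda) = \sum_{k=1}^{N} \|\hat{\mathbf{x}}_k - \mathbf{x}_k\|^2$$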
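The reconstruction error measures how well the prototypes and memberships reproduce the original data. A minimal sketch follows, with hypothetical function names and toy values. Whether the error is taken over the full vector or over the spatial and temporal parts separately is not stated on the slide; the full vector is assumed here.

```python
import numpy as np

def reconstruct(V, U, m=2.0):
    """x_hat_k = sum_i u_ik^m v_i / sum_i u_ik^m, returned as an (N, d) array."""
    Um = np.asarray(U, dtype=float) ** m                  # (c, N)
    return (Um.T @ np.asarray(V, dtype=float)) / Um.sum(axis=0)[:, None]

def reconstruction_error(X, V, U, m=2.0):
    """E = sum_k ||x_hat_k - x_k||^2 for one clustering result."""
    X = np.asarray(X, dtype=float)
    return float(((reconstruct(V, U, m) - X) ** 2).sum())

# Hypothetical check: two prototypes, three data points.
V = np.array([[0.0, 0.0], [2.0, 2.0]])
U = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]])
X = np.array([[0.0, 0.0], [2.0, 2.0], [1.0, 2.0]])
E = reconstruction_error(X, V, U)
print(E)  # 1.0
```

One way to use this for choosing λ is to cluster the data for several candidate values and keep the value with the smallest error.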
16
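Assigning anomaly scores to clusters in different time windows
Assign an anomaly score to each single subsequence based on historical data
$$f_k, \quad k = 1, 2, \dots, N$$
Aggregating anomaly scores inside revealed clusters
$$W_2 \rightarrow U_2, \; \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_c$$
$$s(\mathbf{v}_i, W_j) = \frac{\sum_{k=1}^{N} u_{ik} \, f_k}{\sum_{k=1}^{N} u_{ik}}, \quad i = 1, 2, \dots, c$$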
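The aggregation is a membership-weighted average of the subsequence scores. The sketch below takes the scores f_k as given, since the slide does not say how they are derived from historical data; the function name `cluster_scores` and the numbers are hypothetical.

```python
import numpy as np

def cluster_scores(U, f):
    """s_i = sum_k u_ik f_k / sum_k u_ik for every cluster i of one window."""
    U = np.asarray(U, dtype=float)   # (c, N) partition matrix
    f = np.asarray(f, dtype=float)   # (N,) anomaly score of each subsequence
    return (U @ f) / U.sum(axis=1)

# Hypothetical window with 2 clusters and 4 subsequences.
U = np.array([[0.9, 0.8, 0.1, 0.2],
              [0.1, 0.2, 0.9, 0.8]])
f = np.array([0.0, 0.0, 1.0, 1.0])   # the last two subsequences look anomalous
s = cluster_scores(U, f)             # approximately [0.15, 0.85]
```

The second cluster, which contains mostly the anomalous subsequences, receives the higher score.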
18
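Visualizing the propagation of anomalies: fuzzy relations
Objective: quantifying relations between clusters
$$W_1: \quad U_1 = \begin{bmatrix} ux_{1,1} & \cdots & ux_{1,k} & \cdots & ux_{1,N} \\ ux_{2,1} & \cdots & ux_{2,k} & \cdots & ux_{2,N} \\ \vdots & & \vdots & & \vdots \\ ux_{c_1,1} & \cdots & ux_{c_1,k} & \cdots & ux_{c_1,N} \end{bmatrix}, \quad \mathbf{ux}_k = [ux_{1,k}, ux_{2,k}, \dots, ux_{c_1,k}]^T$$
$$W_2: \quad U_2 = \begin{bmatrix} uy_{1,1} & \cdots & uy_{1,k} & \cdots & uy_{1,N} \\ uy_{2,1} & \cdots & uy_{2,k} & \cdots & uy_{2,N} \\ \vdots & & \vdots & & \vdots \\ uy_{c_2,1} & \cdots & uy_{c_2,k} & \cdots & uy_{c_2,N} \end{bmatrix}, \quad \mathbf{uy}_k = [uy_{1,k}, uy_{2,k}, \dots, uy_{c_2,k}]^T$$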
19
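Visualizing the propagation of anomalies…
Objective function to construct relation
$$Q = \sum_{k=1}^{N} \|\mathbf{uy}_k - \mathbf{ux}_k \circ R\|^2 = \sum_{k=1}^{N} \sum_{j=1}^{c_2} \left( uy_{j,k} - \max_{i=1,2,\dots,c_1} \left( ux_{i,k} \; t \; r_{i,j} \right) \right)^2$$
$$R = [r_{i,j}], \quad i = 1, 2, \dots, c_1, \quad j = 1, 2, \dots, c_2, \quad r_{i,j} \in [0, 1]$$
Optimization
$$r_{s,t}(iter + 1) = r_{s,t}(iter) - \alpha \, \frac{\partial Q}{\partial r_{s,t}(iter)}$$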
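The relation can be fitted with simple gradient-style updates. The sketch below makes several assumptions that the slides leave open: the product is used as the t-norm, the gradient passes only through the maximizing term (a subgradient of the max), entries are clipped to [0, 1] after each step, and the step size, starting point, and iteration count are arbitrary. All function names are hypothetical.

```python
import numpy as np

def compose(UX, R):
    """Max-t composition with the product as t-norm (an assumption).

    UX: (c1, N) memberships, R: (c1, c2) relation.
    Returns pred (c2, N) and the maximizing index i for each (j, k).
    """
    prod = UX[:, None, :] * R[:, :, None]      # prod[i, j, k] = ux_ik * r_ij
    return prod.max(axis=0), prod.argmax(axis=0)

def relation_error(UX, UY, R):
    """Q = sum_k sum_j (uy_jk - max_i(ux_ik t r_ij))^2."""
    pred, _ = compose(UX, R)
    return float(((UY - pred) ** 2).sum())

def fit_relation(UX, UY, alpha=0.005, n_iter=500):
    """Updates r <- r - alpha * dQ/dr, clipped to [0, 1]."""
    c1, N = UX.shape
    c2 = UY.shape[0]
    cols = np.arange(N)
    R = np.full((c1, c2), 0.5)                 # hypothetical starting point
    best_R, best_Q = R.copy(), relation_error(UX, UY, R)
    for _ in range(n_iter):
        pred, winner = compose(UX, R)
        err = pred - UY
        grad = np.zeros_like(R)
        for j in range(c2):
            # Only the maximizing i contributes to the derivative.
            np.add.at(grad, (winner[j], j), 2.0 * err[j] * UX[winner[j], cols])
        R = np.clip(R - alpha * grad, 0.0, 1.0)
        Q = relation_error(UX, UY, R)
        if Q < best_Q:                         # keep the best relation seen
            best_R, best_Q = R.copy(), Q
    return best_R, best_Q

# Hypothetical partition matrices for two successive windows (2 clusters each).
rng = np.random.default_rng(0)
UX = rng.dirichlet([1.0, 1.0], size=40).T      # (2, 40), columns sum to 1
UY = UX[::-1].copy()                           # same structure, labels swapped
Q0 = relation_error(UX, UY, np.full((2, 2), 0.5))
R, Q = fit_relation(UX, UY)
print(np.round(R, 2), Q0, Q)
```

In this toy setup the clusters of the second window are the clusters of the first with their labels exchanged, so a fitted relation is expected to link each cluster mainly to its relabeled counterpart.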
20
Example
An outbreak
  o In the southern part of Alberta
  o Simulated using NAADSM for 100 days
21
Example…
A sliding window is used
  o Length: 20
  o Movement: 10
[Figure: generated spatio-temporal subsequences]
Application
Implemented for Agriculture and Rural Development (Government of Alberta)
Using KNIME (Konstanz Information Miner)
Animal health surveillance in Alberta
Anomaly detection
Data visualization
28
Conclusions
A framework for anomaly detection and characterization in spatial time series is developed
A sliding window to generate a set of spatio-temporal subsequences is considered
Clustering is used to discover the available structure within the spatio-temporal subsequences
An anomaly score is assigned to each revealed spatio-temporal cluster
A fuzzy relation technique is proposed to quantify the relations between clusters in successive time steps
29
Thank you