Outline Introduction Mining by Learning Conclusion
A Unified Approach to Mining Complex Time-Series Data for Various Kinds of Patterns
Yi Wang¹, J.H. Feng¹, J.Y. Wang¹, Z.Q. Liu²
¹ Department of Computer Science, Tsinghua University, Beijing, 100084, China
² School of Creative Media, City University of Hong Kong, Hong Kong
IEEE ICDM Conference, 2007
Wang, et al Mining Complex Time-Series Data
1 Introduction
    Aspects of Sequential Data Mining
    Various Approaches or A Unified One
2 Mining by Learning
    Learning the Temporal Structure as A Graph
    Various Kinds of Hidden Markovian Models
    Learning VLHMM
3 Conclusion
    Mining Various Kinds of Patterns
    Contributions
Aspects of Sequential Data Mining
Various Sequence Types:
univariate/multivariate,
integer (discrete)/real (continuous).
Various Mining Goals:
periodic pattern,
search-by-example,
frequent atomic pattern.
Difficulties:
uncertainty on the y-axis (e.g., noise),
uncertainty on the x-axis (e.g., time scale).
[Figure: two periodic patterns, one with 3 realizations, the other with 2.]
[Figure annotation: match?]
[Figure annotation: matches with which? Or matches with both?]
Various Approaches or A Unified One
Previous Research:
Various approaches
Our Work:
A unified approach
The Unified Approach:
Learns various types of sequences by hidden Markovian models;
represents the temporal structure by a graph; and
mines various patterns by well-studied graph algorithms.
[Diagram, previous research: various types of sequences and difficulties → various mining algorithms → various mining results.]
[Diagram, our work: various types of sequences and difficulties → learning hidden Markovian model → temporal structure as directed graph → graph algorithms for mining → various mining results.]
Learning the Temporal Structure as A Graph
Hidden Markov Model (HMM)
Given the number of states, S, the number of contexts is S (each context is a single previous state).
Short contexts → inaccurate modeling.
Fixed nth-order Hidden Markov Model (n-HMM)
Given the number of states, S, and the length of context, n, the number of contexts is S^n.
Long contexts → accurate modeling, but inefficient learning.
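To make the cost of long contexts concrete, here is a minimal sketch that tabulates the number of contexts S^n and the number of transition probabilities S^(n+1) (one distribution over S next states per context). The function names and the example values of S and n are assumptions chosen for illustration, not figures from the talk.

```python
def num_contexts(num_states: int, order: int) -> int:
    """Number of distinct contexts of a fixed order-n model over S states: S**n."""
    if num_states < 1 or order < 1:
        raise ValueError("num_states and order must be positive integers")
    return num_states ** order


def num_transition_params(num_states: int, order: int) -> int:
    """Each context carries a distribution over S next states: S**(n + 1) entries."""
    return num_contexts(num_states, order) * num_states


# Illustrative values only (assumed, not taken from the slides).
for s in (3, 10):
    for n in (1, 2, 3, 5):
        print(f"S={s:2d} n={n}: contexts={num_contexts(s, n)}, "
              f"transition entries={num_transition_params(s, n)}")
```

With n = 1 this reduces to the plain HMM case of S contexts; the count grows exponentially in n, which is the source of the inefficient learning noted above.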
Variable-length Hidden Markov Model (VLHMM)
Not all contexts have to be extended to a fixed length of n;
Contexts have variable lengths: the shortest that is still long enough to accurately determine the next state;
Learning the minimum set of contexts for accurate modeling.
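One way to picture variable-length contexts is as a set of state-history suffixes, where the context in force at each step is the longest stored suffix of the history so far. The sketch below illustrates only that lookup; it is not the authors' implementation, and the context set `CONTEXTS` and the helper `match_context` are hypothetical.

```python
from typing import Sequence, Set, Tuple

# Hypothetical context set over states {1, 2, 3}. States 1 and 2 are assumed
# to determine the next state on their own; state 3 is assumed ambiguous,
# so it is extended by one more step of history.
CONTEXTS: Set[Tuple[int, ...]] = {(1,), (2,), (1, 3), (2, 3), (3, 3)}


def match_context(history: Sequence[int],
                  contexts: Set[Tuple[int, ...]]) -> Tuple[int, ...]:
    """Return the longest context that is a suffix of `history` (oldest first)."""
    max_len = max(len(c) for c in contexts)
    for length in range(min(max_len, len(history)), 0, -1):
        suffix = tuple(history[-length:])
        if suffix in contexts:
            return suffix
    raise KeyError(f"no context matches history {list(history)}")


print(match_context([1, 2, 3], CONTEXTS))  # context that ends the history 1, 2, 3
print(match_context([3, 3, 1], CONTEXTS))  # a short context suffices after state 1
```

In this toy set, 5 contexts cover every history of length two or more, compared with 3^2 = 9 contexts for a fixed second-order model over the same three states.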
[Figure: example context graphs for three models: HMM, n-HMM, and VLHMM.]
Learning Variable-length Hidden Markov Model (VLHMM)
The number of contexts is unknown before learning, even when the number of states, S, is given;
This situation is called “unknown model structure” in learning theory, and is the most difficult of the four types of learning problems;
As the EM algorithm cannot learn the model structure, we derived a structural-EM algorithm to learn the model;
Optimizing a Minimum-Entropy criterion to learn the minimum set of contexts, and
optimizing the Maximum-Likelihood criterion to estimate the model parameters.
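The structural-EM algorithm itself is not reproduced here. As a much simpler illustration of the minimum-entropy idea only, the sketch below assumes the state sequence is directly observed (which is not the case in a VLHMM) and extends a context only while the entropy of its next-state distribution exceeds a threshold. The function names, the threshold `max_entropy`, the cap `max_len`, and the toy sequence are all assumptions.

```python
import math
from collections import Counter, defaultdict


def next_state_counts(seq, length):
    """For every context of the given length, count the states that follow it."""
    counts = defaultdict(Counter)
    for i in range(length, len(seq)):
        counts[tuple(seq[i - length:i])][seq[i]] += 1
    return counts


def entropy(counter):
    """Shannon entropy (bits) of the empirical distribution in a Counter."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def select_contexts(seq, max_len=3, max_entropy=0.1):
    """Keep the shortest contexts whose next state is (almost) determined.

    Simplification: states are assumed observed; a real VLHMM has hidden states.
    """
    tables = {n: next_state_counts(seq, n) for n in range(1, max_len + 1)}
    selected = set()
    frontier = list(tables[1])
    while frontier:
        ctx = frontier.pop()
        n = len(ctx)
        if n >= max_len or entropy(tables[n][ctx]) <= max_entropy:
            selected.add(ctx)
            continue
        # Ambiguous context: replace it by its one-step-longer extensions.
        longer = [c for c in tables[n + 1] if c[1:] == ctx]
        if longer:
            frontier.extend(longer)
        else:
            selected.add(ctx)
    return selected


toy_sequence = [1, 2, 3, 3] * 20  # assumed toy data with period 4
print(sorted(select_contexts(toy_sequence)))
```

On this toy sequence, states 1 and 2 each determine their successor, so they stay as length-1 contexts, while state 3 is split into two length-2 contexts.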
Mining Various Kinds of Patterns
Align sequence with temporal structure
The Viterbi algorithm can set up a mapping from each element in the sequence to a context in the graph.
(Partial) Periodic Pattern
Finding cyclic paths in the graph. Many algorithms have been developed to do this.
Search-by-Example
Feed the example to the Viterbi algorithm, which outputs the path in the graph that most likely generated the example.
Frequent Atomic Pattern
Select those contexts that frequently appear in the training sequence.
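The three mining steps above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the authors' code: a plain discrete-output HMM stands in for the learned VLHMM (so graph nodes are states rather than contexts), and the matrices, the observation sequence, the thresholds, and the function names are all made up for the example.

```python
import math
from collections import Counter


def viterbi(obs, start, trans, emit):
    """Most likely state path of a discrete-output HMM, computed in the log domain."""
    def log(p):
        return math.log(p) if p > 0 else -math.inf

    n = len(start)
    score = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][o]))
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def find_cycles(trans, min_prob=0.2):
    """Enumerate simple cycles (length > 1) over transitions with prob >= min_prob."""
    n, cycles = len(trans), []

    def dfs(start, node, path):
        for nxt in range(n):
            if trans[node][nxt] < min_prob:
                continue
            if nxt == start:
                if len(path) > 1:  # ignore self-loops
                    cycles.append(path[:])
            elif nxt > start and nxt not in path:  # root each cycle at its smallest node
                dfs(start, nxt, path + [nxt])

    for s in range(n):
        dfs(s, s, [s])
    return cycles


# Assumed toy model: three states arranged in a ring, each with a self-transition.
start = [1 / 3, 1 / 3, 1 / 3]
trans = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
emit = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
obs = [0, 0, 0, 1, 2, 0, 0, 1, 2]

path = viterbi(obs, start, trans, emit)       # alignment / search-by-example
cycles = find_cycles(trans)                   # candidate periodic patterns
frequent = [s for s, c in Counter(path).items() if c >= 3]  # frequent nodes
print(path, cycles, frequent)
```

The same `viterbi` call serves both alignment of the training sequence and search-by-example; only the input sequence differs.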
Our Contribution
A unified framework – mining by learning
Mining from the learned temporal structure using well-studied graph algorithms;
“Hidden” model supports learning various kinds of sequences;
Probabilistic transitions (esp. self-transitions) encode uncertainty in time scale; output p.d.f.s encode noise.
VLHMM for efficient and accurate learning and mining
Optimizing two criteria simultaneously by developing a structural-EM algorithm;
Minimum-Entropy criterion → minimum number of parameters, efficient and effective learning;
Maximum-Likelihood criterion → accurate learning of the temporal structure.
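To see how a self-transition can absorb variation in time scale: in a standard HMM, a state with self-transition probability a is occupied for a geometrically distributed number of steps, so the same pattern can be realized at different speeds. The short sketch below computes that distribution; the helper names and the example probabilities are assumptions for illustration.

```python
def duration_pmf(self_prob: float, d: int) -> float:
    """P(state is occupied for exactly d steps) = a**(d - 1) * (1 - a)."""
    if d < 1:
        raise ValueError("duration must be at least 1 step")
    return self_prob ** (d - 1) * (1.0 - self_prob)


def expected_duration(self_prob: float) -> float:
    """Mean of the geometric duration distribution: 1 / (1 - a)."""
    return 1.0 / (1.0 - self_prob)


# Example values only: a larger self-transition probability means a slower time scale.
for a in (0.5, 0.9):
    print(f"a={a}: expected duration = {expected_duration(a):.1f} steps, "
          f"P(d=1) = {duration_pmf(a, 1):.2f}")
```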
Thank You for Your Attention
More details and demos can be accessed online at: http://dbgroup.cs.tsinghua.edu.cn/wangyi/vlhmm