mining change events in large datasets
1.
- Hashmat Rohian, Jiashu Zhao
2.
- Discover patterns whose frequency dramatically changes over time or any other dimension (FP mining extension)
- Discover new rules associating changes (Financial markets)
- Predict changes in one variable based on changes in other dimensions (Outbreak detection)
3.
- Design a practical and useful approach to discovering novel and interesting change knowledge from large databases
- Analyze and present the knowledge mined in a clear and coherent manner
- Evaluate the knowledge based on a gold standard
4.
- Qian's CPD (Change Point Detection) algorithm: based on Qian's measure
- Improved CPD1 (Divide and Conquer): using Divide and Conquer with global ratios
- Improved CPD2 (Divide and Conquer): using Divide and Conquer with local ratios
- Binomial method
- The Kolmogorov-Smirnov test (KS-test)
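As an illustration of the last item, here is a minimal sketch of how a two-sample KS statistic could be used to locate a single change point. It is not the presenters' implementation: the scan-every-split strategy, the function names `ks_statistic` and `best_split`, the `min_size` parameter, and the toy series are all assumptions made for this example, and no p-value is computed.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed values.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def best_split(series, min_size=3):
    """Return (index, statistic) for the split with the largest KS statistic.

    min_size is a hypothetical guard that keeps both segments non-trivial.
    """
    best_idx, best_d = None, -1.0
    for i in range(min_size, len(series) - min_size + 1):
        d = ks_statistic(series[:i], series[i:])
        if d > best_d:
            best_idx, best_d = i, d
    return best_idx, best_d


# Toy series (made up for illustration) with a level shift after the fifth value.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]
idx, d = best_split(series)
print(idx, d)
```

A divide-and-conquer variant could apply `best_split` recursively to each segment, but the stopping rule would need a significance threshold that the slides do not specify.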
5.
- Level-wise search
- k-itemsets (itemsets with k items) are used to explore (k+1)-itemsets from transactional databases
- First, the set of frequent 1-itemsets is found (denoted L1)
- L1 is used to find L2, the set of frequent 2-itemsets
- L2 is used to find L3, and so on, until no frequent k-itemsets can be found
- Generate strong association rules from the frequent itemsets
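The level-wise search above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the system described in the talk: the function names `apriori` and `rules`, the absolute `min_support` count, the `min_conf` threshold, and the toy transactions are assumptions chosen for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count the transactions containing each candidate, keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    level = count({frozenset([i]) for t in transactions for i in t})  # L1
    frequent = dict(level)
    k = 1
    while level:
        k += 1
        prev = set(level)
        # Join: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = count(candidates)  # Lk
        frequent.update(level)
    return frequent


def rules(frequent, min_conf):
    """Return (lhs, rhs, confidence) for rules meeting the confidence threshold."""
    out = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = n / frequent[lhs]
                if conf >= min_conf:
                    out.append((lhs, itemset - lhs, conf))
    return out


# Toy transactions, made up for illustration.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
freq = apriori(baskets, min_support=2)
strong = rules(freq, min_conf=1.0)
print(len(freq), strong)
```

Here the candidate {bread, milk, butter} is pruned before counting because its subset {milk, butter} is not frequent, which is the step that keeps the level-wise search tractable.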
6.
- Transitional ratio
- First Derivative
- Second Derivative: the rate of change of the rate of change
- Etc.
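For a discrete series, the first and second derivatives are usually approximated by finite differences. The sketch below assumes evenly spaced observations; the function names and the sample values are made up for illustration. The transitional ratio is not defined in this transcript, so it is left out rather than guessed.

```python
def first_difference(values):
    """Approximate the first derivative: change between consecutive points."""
    return [b - a for a, b in zip(values, values[1:])]


def second_difference(values):
    """Approximate the second derivative: the change of the change."""
    return first_difference(first_difference(values))


# Hypothetical index levels at evenly spaced times.
levels = [100, 102, 106, 112, 112]
d1 = first_difference(levels)   # [2, 4, 6, 0]
d2 = second_difference(levels)  # [2, 2, -6]
print(d1, d2)
```

The large negative value at the end of `d2` marks the point where steady acceleration stops abruptly, which is the kind of event a slope-change measure is meant to flag.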
7. 8. 9. 10. 11. 12.
- A stock market index is a method of measuring a section of the stock market. We use 27 stock market indices.
13. 14. 15.
- Statistical tools are more accurate for CPD
- Binary points produce robust change points
- The transitional ratio and the slope change measures have very similar results
- Local change point estimation based on true and false points produces consistent measures
- Both the transitional ratio and the slope are robust to noisy or incomplete datasets
16.
- Use binary data for CPD and real data for change measure
- Use regression to predict changes in one dimension from other variables
- Incorporate our system into FP mining
- Apply our methods on other real datasets
- Make our system more efficient and automated
17.
- Questions?
- Comments?
- Feedback?