TRANSCRIPT
(1) Abstract and Contents
A new model selection criterion called Matchability, based on maximizing matching opportunity, is proposed. A given data set is decomposed into a set of reusable partial situations for powerful prediction. This technique is effective as a pre-processing step for data analysis and pattern recognition.
Contents:
1. Matchable principle for Prediction (2)-(5)
2. Formalization and Matchability (6)-(9)
3. Search algorithm (10)-(13)
4. Simulation and Results (14)-(15)
(2) Models for Prediction
(1) Memory-based reasoning
• Weak prediction ability
• No model
(2) Model-based reasoning
• Powerful prediction ability
• Needs model selection criteria
Our approach: prediction based on informational scrap-and-build. A set of small situations is one kind of model.
(3) Origin of Model Selection Criteria
It is a preconception that the model is selected from the hypothesis space that can explain the data.
Three origins:
・ Simplicity of model
・ Consistency with data: accuracy, minimizing error
・ Coverage of data: increasing covered cases/features
Ockham's razor → MDL, AIC: "The simplest model is selected while increasing consistency with data."
Matchable principle (maximizing matching opportunity): "The simplest model is selected while increasing coverage of data."
(4) Criteria based on Trade-off of factors
[Diagram: a trade-off among three factors: Simplicity of model (Ockham's razor), Consistency with data (accuracy, minimize error), and Coverage of data (case-increasing, feature-increasing).]
A general criterion should include all three factors.
(5) Deriving the Matchable Principle
Empirical processing is based on matching, so a model that has a large matching opportunity can predict powerfully.
Matchable principle: maximizing matching opportunity.
"The simplest model is selected while increasing coverage of data."
(6) Situation Decomposition
Partial situations, which are combinations of selected features and cases, are extracted from the spreadsheet; the extracted matchable situations are labeled MS1 to MS4 in the slide's figure.
Matchability: this criterion evaluates matching opportunity.
Matchable situation: a local maximum of Matchability.
(7) Whole situation and Partial situations
Whole situation J = (D, N): contains D features and N cases.
Feature selection vector: d = (d1, d2, …, dD)
Case selection vector: n = (n1, n2, …, nN)
Each element di, ni is a binary indicator of selection/unselection.
Number of selected features: d
Number of selected cases: n
Selecting all features: D
Selecting all cases: N
Situation decomposition extracts matchable situations from the whole situation J = (D, N), which potentially contains 2^(D+N) partial situations.
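The selection-vector bookkeeping above can be sketched as follows; the spreadsheet values and the particular vectors are made-up illustrations:

```python
# Sketch: a whole situation J = (D, N) as a spreadsheet (list of case rows),
# with binary selection vectors d and n. All values here are illustrative.
D, N = 3, 4                       # D features, N cases
J = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0],
     [0.5, 1.5, 2.5]]             # one row per case, one column per feature

d = [1, 0, 1]                     # feature selection vector (length D)
n = [1, 1, 0, 1]                  # case selection vector (length N)

# A partial situation keeps only the selected rows (cases) and columns (features).
partial = [[value for value, keep in zip(row, d) if keep]
           for row, keep_row in zip(J, n) if keep_row]
print(partial)        # [[1.0, 3.0], [4.0, 6.0], [0.5, 2.5]]

# The whole situation potentially contains 2**(D + N) partial situations.
print(2 ** (D + N))   # 128
```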
(8) Case selection using Segment space
The segment space is the product of the separations of each selected feature.
・ n: number of selected cases → make larger
・ Sd: total number of segments → make larger
・ rd: number of selected segments → make smaller
※ Every case inside a selected segment is selected.
Example with two selected features: Sd = s1 × s2.
(9) Matchability in spreadsheet form
Three factors enlarge the matching opportunity:
・ [Case-increasing in the situation] n → make larger
・ [Feature-increasing in the situation] Sd → make larger
・ [Simplicity of the situation] rd → make smaller

M(n, rd, Sd, N) = C1 log(n/N) + C2 log Sd − C3 log rd

where N is the total number of cases and C1, C2, C3 are positive constants.
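A minimal sketch of this score; the signs of the terms follow the "make larger / make smaller" directions on the slide, and the constants and example counts are illustrative:

```python
import math

def matchability(n, r_d, S_d, N, C1=1.0, C2=1.0, C3=1.0):
    """Matchability score: grows with the number of selected cases n and the
    total number of segments S_d, shrinks with the number of selected
    segments r_d (C1, C2, C3 > 0)."""
    return C1 * math.log(n / N) + C2 * math.log(S_d) - C3 * math.log(r_d)

base = matchability(n=50, r_d=4, S_d=16, N=100)
print(round(base, 3))   # 0.693
```

Increasing n or Sd raises the score, while lowering rd raises it as well, matching the three factors above.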
(10) Algorithm Overview
for each subset d of the D features:
    search local maximums (procedure 2)
    reject saddle points (procedure 3)
end
Time complexity ∝ 2^D
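The overview loop can be sketched as below; `local_max_search` and `reject_saddle` are stand-ins for procedures 2 and 3, which this slide does not spell out:

```python
from itertools import combinations

def search_matchable(features, local_max_search, reject_saddle):
    # Enumerate every non-empty feature subset d (hence the 2**D time
    # complexity), search local maximums within each, drop saddle points.
    results = []
    for k in range(1, len(features) + 1):
        for d in combinations(features, k):
            for candidate in local_max_search(d):
                if not reject_saddle(candidate):
                    results.append(candidate)
    return results

# Dummy procedures: every subset yields itself as a "local maximum".
found = search_matchable(["x", "y", "z"],
                         local_max_search=lambda d: [d],
                         reject_saddle=lambda c: False)
print(len(found))   # 7 = 2**3 - 1 non-empty subsets
```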
(11) Segment selection space without empty segments
① By the nature of Matchability, segments that contain no cases can be ignored. The size of the search space is then 2^(Rd), where Rd is the number of segments that contain one or more cases.
(12) Searching on Sorted segments
② By the nature of Matchability, segments containing many cases are selected before segments containing fewer cases. Consequently, for a given number of selected segments rd, only one set of segments can be a local maximum.
So we sort the segments by case count and search for local maximums along the sorted order.
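A sketch of the sorted-segment search, reusing the score form from slide (9); the constants and the segment counts are illustrative:

```python
import math

def prefix_scores(segment_counts, S_d, N, C1=1.0, C2=1.0, C3=1.0):
    """Drop empty segments, sort the rest by case count (descending), and
    score only prefixes: for each r_d, the r_d fullest segments are the
    only candidate local maximum."""
    occupied = sorted((c for c in segment_counts if c > 0), reverse=True)
    scores, n = [], 0
    for r_d, count in enumerate(occupied, start=1):
        n += count   # every case inside a chosen segment is chosen
        scores.append((r_d,
                       C1 * math.log(n / N) + C2 * math.log(S_d)
                       - C3 * math.log(r_d)))
    return scores

scores = prefix_scores([5, 0, 3, 1, 0, 2], S_d=6, N=11)
best_r_d = max(scores, key=lambda t: t[1])[0]
print(best_r_d)   # 1
```

Only Rd prefixes are scored instead of 2^(Sd) subsets, which is the point of observations ① and ②.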
(13) Reject Saddle point
A local maximum for the feature selection vector d is tested by changing the selected features.
If it is superior to every neighboring selection, it is not a saddle point and is therefore a local maximum → a matchable situation.
(14) Simulation and Result
Input situations: 11 × 11 cases arranged at regular intervals of 0.1 on a plane.
• Situation A: plane x + z = 1
• Situation B: plane y + z = 1
Extracted situations:
• MS1 = input situation A
• MS2 = input situation B
A new situation:
• MS3: line x = y, x + z = 1
(15) Powerful prediction using Matchable situations
Multi-valued function φ: (x, y) → z
1. Generalization ability
• Even if input situation A (x + z = 1) lacks half of its parts, such that no data exist in the range y > 0.5, our method outputs φMS1(0, 1) = 1.0.
2. An output for every situation
• An output is generated for each situation separately. Without decomposed situations, the output could only be an average value (φ(0, 1) = 0.5).
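This example can be reproduced from the plane equations on slide (14); the function names phi_MS1 and phi_MS2 follow the slides' φMS1, φMS2:

```python
# Each extracted situation yields its own output for the multi-valued
# function (x, y) -> z, rather than one averaged value.
def phi_MS1(x, y):          # situation A: plane x + z = 1
    return 1.0 - x

def phi_MS2(x, y):          # situation B: plane y + z = 1
    return 1.0 - y

x, y = 0.0, 1.0
outputs = [phi_MS1(x, y), phi_MS2(x, y)]
print(outputs)                          # [1.0, 0.0]
average = sum(outputs) / len(outputs)
print(average)                          # 0.5, the undecomposed answer
```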
(16) Conclusions & Future work
Matchability is a new model selection criterion that maximizes matching opportunity and thereby emphasizes coverage of data; in contrast, Ockham's razor emphasizes consistency with data. Situations decomposed by the Matchability criterion have powerful prediction ability. The situation decomposition method can be applied to pre-processing for data analysis, self-organization, pattern recognition, and so on.
Future work:
• Theoretical studies on the Matchability criterion are needed; the criterion was derived intuitively.
• Speed-up for large-scale problems is needed; the time complexity is exponential in the number of features.
• Combining this method with other data analysis methods: it could serve as pre-processing for neural networks, linear regression, etc.