Identifying Critical Factors in Case-Based Prediction
R. Weber
College of Information Science & Technology
Drexel University
Identifying Critical Factors in Case-Based Prediction, R. Weber, 18-May, FLAIRS 2004
Outline
1. Case-Based Prediction, Critical Factors
2. Motivation
3. Background: Use of Domain Knowledge
4. Methods to Identify Critical Factors
  – Gradient descent, Logistic regression, Feature-oriented
  – Case-based, Knowledge-based, Union
5. Comparative Study
  – Dataset, Methodology, Results
6. Conclusions
7. Future Work
Case-Based Prediction
• The predicted outcome can be:
  – Irreversible
    • Path of natural disasters, e.g., hurricanes, tornadoes
  – Reversible
    • Ongoing project outcome, project effort, cost; health conditions
• Critical Factors:
  – features (feature-value pairs) that support the outcome
  – significant changes in their values can potentially reverse the prediction, either alone or in conjunction with changes in values of other critical factors
  – Critical Success and Critical Failure Factors
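To make the definition of a critical factor concrete, here is a minimal sketch of a case-based predictor over symbolic features. It checks which single feature-value changes flip the prediction. The feature names, the toy case base, the yes/no values, and the k-nearest-neighbor majority vote are all illustrative assumptions, not the system or data used in the study.

```python
from collections import Counter

def features(scope, users, schedule):
    # Hypothetical symbolic features with yes/no values.
    return {"scope_defined": scope, "users_available": users, "schedule_realistic": schedule}

# Toy case base of (features, outcome) pairs; not the study's dataset.
CASE_BASE = [
    (features("yes", "yes", "yes"), "success"),
    (features("yes", "yes", "no"), "success"),
    (features("yes", "no", "yes"), "success"),
    (features("no", "no", "no"), "failure"),
    (features("no", "no", "yes"), "failure"),
    (features("no", "yes", "no"), "failure"),
]

def similarity(a, b):
    """Fraction of features on which two cases share the same value."""
    return sum(a[f] == b[f] for f in a) / len(a)

def predict(case_base, target, k=3):
    """Majority outcome among the k most similar stored cases."""
    ranked = sorted(case_base, key=lambda case: similarity(case[0], target), reverse=True)
    votes = Counter(outcome for _, outcome in ranked[:k])
    return votes.most_common(1)[0][0]

def single_feature_reversals(case_base, target, k=3):
    """Return the baseline prediction and the features whose change alone flips it."""
    baseline = predict(case_base, target, k)
    flips = []
    for feature, value in target.items():
        # Assumption: changing a value means switching between "yes" and "no".
        changed = dict(target, **{feature: "no" if value == "yes" else "yes"})
        if predict(case_base, changed, k) != baseline:
            flips.append(feature)
    return baseline, flips

TARGET = features("no", "no", "no")
BASELINE, FLIPS = single_feature_reversals(CASE_BASE, TARGET)
print(BASELINE, FLIPS)
```

Only single-feature changes are tested here. The definition above also allows factors that reverse the prediction only in conjunction with others, which would require testing combinations of features.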
Motivation
• Assumption:
  – Users are interested in prediction of reversible outcomes so they can reverse unwanted predictions
    • Health conditions, project/system failure
• Aamodt and Nygaard (1995):
  – Consider the entire application context (including user’s perspective) to maximize usefulness of CBR systems
• Motivation:
  – Case-based prediction systems that do not indicate effective and efficient ways to reverse unwanted outcomes do not take into account the user’s perspective.
  – Find a minimal set of critical factors that maximize the chances of reversing unwanted outcomes
Background on Case-Based Prediction
• ICCBR 2001: Kadoda et al. stated that design decisions depend on the dataset
• FLAIRS 2002: Watson et al. evaluated different design decisions because of such bias
• CBRW91: Cain, Pazzani, and Silverstein proposed EBL+CBR to improve accuracy of case-based prediction when features outnumber cases
• ICCBR03: Weber et al. confirmed the improvement in accuracy (scarce data, bias) against other CBR techniques and logistic regression
Methods to Identify Critical Factors
Scope
• Personalized
  – Methods that identify failure and success factors that are specific to the case under assessment and to its actual values
• Collective
  – They only identify the features
  – Provide trends based upon a community of cases. When this community consists of real world experiences, they represent evidence of the importance of these factors
Collective Methods
• Gradient descent
  – Critical factors are those features whose resulting importance values are above the overall average.
• Logistic regression
  – Critical factors are those features with the strongest correlations to the outcome; these features are then used for prediction purposes
• Feature-oriented
  – Using LOOCV, submit a project description for prediction and observe the resulting accuracy; then submit each feature separately. The success factors are the features that produce accuracy closest to the overall accuracy of true positives, and the failure factors are the ones with accuracy closest to the overall accuracy of true negatives
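The selection rule stated for the gradient descent method (keep the features whose importance is above the overall average) can be sketched as follows. The importance values and feature names below are hypothetical placeholders standing in for weights learned by gradient descent. The learning step itself is not shown, since the slides do not specify it.

```python
def critical_by_average(importance):
    """Return the features whose importance exceeds the mean importance."""
    mean = sum(importance.values()) / len(importance)
    return [feature for feature, value in importance.items() if value > mean]

# Hypothetical importance values; in practice these would come from
# a gradient descent feature-weighting run over the case base.
IMPORTANCE = {
    "scope_defined": 0.9,
    "users_available": 0.6,
    "schedule_realistic": 0.2,
    "team_experienced": 0.1,
}

CRITICAL = critical_by_average(IMPORTANCE)
print(CRITICAL)
```

Because the threshold is the mean, the number of selected features depends on how the importance values are distributed. The rule therefore places no cap on the size of the factor set.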
Personalized Methods
• Case-based
  – Failure factors are feature-value pairs that co-occur in both the target case and the similar case(s) used to predict failure in the target
• Knowledge-based
  – Submit new case to the EBL method to identify relevance factors with the resulting prediction
  – In predictions of failure, the feature-values assigned relevance factors are critical failure factors
  – For the remaining features, we replaced the predicted outcome to assign relevance factors for the alternate outcome
• Union
  – We combined the knowledge-based and the case-based methods by taking the union of the factors each individually identifies.
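The case-based and union methods reduce to set operations over feature-value pairs, as in this minimal sketch. Two points are assumptions: when several similar cases are retrieved, a pair must appear in all of them (the slides leave this open), and the knowledge-based factors are supplied as a hand-written placeholder because the EBL method is not reproduced here. Feature names and values are hypothetical.

```python
def case_based_factors(target, similar_cases):
    """Feature-value pairs shared by the target and every retrieved similar case."""
    shared = set(target.items())
    for case in similar_cases:
        # Assumption: a pair must co-occur in all retrieved cases.
        shared &= set(case.items())
    return shared

def union_factors(case_based, knowledge_based):
    """Union of the factors identified by the two personalized methods."""
    return case_based | knowledge_based

TARGET = {"scope_defined": "no", "users_available": "no", "schedule_realistic": "no"}

# Hypothetical cases retrieved to predict failure for TARGET.
SIMILAR = [
    {"scope_defined": "no", "users_available": "no", "schedule_realistic": "yes"},
    {"scope_defined": "no", "users_available": "yes", "schedule_realistic": "no"},
]

# Placeholder for the output of the knowledge-based (EBL) method.
KNOWLEDGE_BASED = {("users_available", "no")}

CASE_BASED = case_based_factors(TARGET, SIMILAR)
UNION = union_factors(CASE_BASED, KNOWLEDGE_BASED)
print(sorted(UNION))
```

Both methods return feature-value pairs rather than bare feature names, so the factors stay specific to the case under assessment.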
Comparative Study: Dataset
• Dataset
  – 20 out of 88 real cases of software development projects
  – 23 symbolic features
  – 12 out of 21 projects all originally failed and, when submitted to the EBL+CBR prediction, were predicted to fail.
Comparative Study: Methodology
• Methodology consists of 3 stages:
  – 1) Identification of critical factors
  – 2) Overturn
  – 3) Prediction
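The three stages can be sketched as a small evaluation loop. Everything concrete below is an assumption made for illustration: the stand-in predictor (a simple count of "yes" values, not EBL+CBR), the stand-in identification method, the feature names, and the reading of "overturn" as switching each identified factor to its opposite value.

```python
def evaluate(projects, identify, overturn, predict):
    """Run identification, overturn, and prediction on each project."""
    results = []
    for project in projects:
        before = predict(project)
        factors = identify(project)            # stage 1: identify critical factors
        changed = overturn(project, factors)   # stage 2: overturn their values
        after = predict(changed)               # stage 3: predict again
        results.append({"factors": factors, "reversed": after != before})
    return results

def toy_predict(project):
    # Stand-in predictor: success when at least two features are "yes".
    yes_count = sum(value == "yes" for value in project.values())
    return "success" if yes_count >= 2 else "failure"

def toy_identify(project):
    # Stand-in method: flags a single hypothetical factor when it is "no".
    return [f for f in ("scope_defined",) if project[f] == "no"]

def toy_overturn(project, factors):
    # Assumption: overturning a failure factor sets its value to "yes".
    return dict(project, **{f: "yes" for f in factors})

PROJECTS = [
    {"scope_defined": "no", "users_available": "yes", "schedule_realistic": "no"},
    {"scope_defined": "no", "users_available": "no", "schedule_realistic": "no"},
]

RESULTS = evaluate(PROJECTS, toy_identify, toy_overturn, toy_predict)
for result in RESULTS:
    print(result)
```

Passing the identification method in as a function lets the same loop compare several methods on two counts: how many predictions each one reverses, and how many factors it had to change to do so.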
Results for Collective Methods
• Gradient descent (GD) maximizes reversal but does not minimize the set of factors
• Feature-oriented is the most efficient
• Methods currently used performed most poorly
Results for Personalized Methods
Results for Knowledge-Based Overturning
Knowledge-Based Overturning
• Personalized
  – Different methods are able to reverse a project’s prediction using different sets of factors, and one method reversed a prediction contrary to domain knowledge.
• Collective
  – GD failed to reverse one project. When we performed knowledge-based overturning, we found that it still could not reverse that project. More interestingly, some projects were no longer reversed.
Conclusions and Recommendations
• Domain specific conclusion
  – 2 factors were identified by all methods
    • a well defined scope
    • end users having time for requirements gathering
• Domain knowledge combined with contextual experiential knowledge may uncover knowledge
• Define the level of reversibility of factors, e.g., using measures of efficiency of factors throughout the dataset and by project. Factors that are easy to reverse should receive priority.
Future Work
• Case-based framework to learn:
  – Weights for EBL rules
  – Dependencies between rules
  – Dependencies between factors
• How to use contextual knowledge embedded in cases to reverse unwanted outcomes?
  – Use collective methods to identify critical factors and then use cases to assess their potential to reverse unwanted outcomes
Acknowledgements
• Co-authors
  – William Evanco, Michael Waller, June Verner
• Colleagues
  – This and previous work
• Anonymous reviewers
• National Institute for Systems Test and Productivity
Questions? Ideas? Comments?