Induction Systems Nima Hazar Amin Dehesh

Upload: loraine-goodwin

Post on 28-Dec-2015

Page 1: Nima Hazar Amin Dehesh.  Several induction algorithm have been developed that vary in the methods employed to build the decision tree or set of rules

Induction Systems

Nima Hazar

Amin Dehesh

Several induction algorithms have been developed that vary in the methods employed to build the decision tree or set of rules.

ID3 is a general-purpose rule induction algorithm developed by Quinlan (1979), and today it is found in most induction expert system shells.

ID3

Predicting weather: temperature, wind direction, condition of sky, barometric pressure.

ID3 (cont.)

Choose most important issue first. No-data result. Excludes irrelevant factors.

ID3 Features

ID3 Algorithm

The central choice in the ID3 is selecting which attribute to test at each node in the tree.

Define a statistical property, called information gain, that measures how well a given attribute separates the training examples according to their target classification.

ID3 Algorithm (cont.)

Characterizes the purity of an arbitrary collection of examples.

Suppose S is a collection of 14 examples: 9 positive and 5 negative ([9+,5-]).
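The entropy formula on this slide was an image and is not in the transcript. Assuming the usual two-class definition, with p+ and p- the proportions of positive and negative examples in S, the value for [9+,5-] works out as follows:

```latex
\begin{aligned}
\mathrm{Entropy}(S) &= -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-} \\
\mathrm{Entropy}([9+,5-]) &= -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940
\end{aligned}
```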

Entropy

By convention, 0 log 0 = 0.

Entropy is 0 if all members of S belong to the same class.

Entropy is 1 when the collection contains an equal number of positive and negative examples.

If the target attribute can take on C different values:
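The multi-class formula itself did not survive in the transcript. A minimal sketch, assuming the standard definition (the sum over the C classes of -p_i log2 p_i, where p_i is the proportion of S in class i); the function name `entropy` and its list-of-counts argument are illustrative choices, not part of the slides:

```python
from math import log2


def entropy(class_counts):
    """Entropy of a collection, given the number of examples in each class."""
    total = sum(class_counts)
    # Empty classes are skipped, which applies the convention 0 log 0 = 0.
    return sum(-(n / total) * log2(n / total) for n in class_counts if n > 0)


print(round(entropy([9, 5]), 3))   # the [9+,5-] collection from the slides
print(entropy([14, 0]))            # all members in one class
print(entropy([7, 7]))             # equal positive and negative counts
```

With more than two classes the maximum is log2 C rather than 1, so four equally likely classes give an entropy of 2.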

Entropy (cont.)

The information gain, Gain(S,A), of an attribute A relative to a collection of examples S is defined as:

Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
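The defining equation was an image on the original slide and is missing from the transcript. Assuming the standard definition, written with the symbols defined above, it reads:

```latex
\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)
```

The second term is the expected entropy after S is partitioned by A, so the gain is the expected reduction in entropy from knowing the value of A.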

Information Gain

An example :

Information Gain (cont.)

Playing Tennis:

An ID3 Example

The best is “outlook”
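The example table was an image and is not in the transcript. The sketch below assumes it is the widely used 14-example PlayTennis table (9 positive, 5 negative, matching the [9+,5-] collection above); the rows, attribute names, and helper names are assumptions, not recovered slide content. It computes the information gain of each attribute and picks the largest:

```python
from collections import Counter
from math import log2

# Assumed data: the standard PlayTennis table; the last column is the target.
ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
EXAMPLES = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain(examples, attr_index):
    """Information gain from splitting the examples on one attribute column."""
    labels = [row[-1] for row in examples]
    remainder = 0.0
    for value in {row[attr_index] for row in examples}:
        subset = [row[-1] for row in examples if row[attr_index] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: gain(EXAMPLES, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
for name, value in gains.items():
    print(f"{name}: {value:.3f}")
print("best:", best)
```

Under this assumed table, Outlook has the largest gain (about 0.247), which agrees with the slide's conclusion.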

An ID3 Example (cont.)

An ID3 Example (cont.)

This process continues for each new leaf until either of two conditions is met:

1. Every attribute has already been included along this path through the tree.

2. The training examples associated with this leaf node all have the same target attribute value.
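The recursion and its two stopping conditions can be sketched as below. This is an illustrative outline, not the slides' own pseudocode: the function names, the dictionary tree representation, the majority vote used when attributes run out, and the tiny data set are all assumptions.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    counts = Counter(row[target] for row in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())


def gain(rows, attr, target):
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [row for row in rows if row[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attributes, target):
    labels = [row[target] for row in rows]
    # Condition 2: all examples at this node share one target value.
    if len(set(labels)) == 1:
        return labels[0]
    # Condition 1: every attribute is already used on this path.
    # Falling back to the majority label is an assumption of this sketch.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Otherwise split on the attribute with the highest information gain.
    best = max(attributes, key=lambda a: gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining, target)
    return {best: branches}


# Hypothetical toy examples, invented for illustration only.
rows = [
    {"sky": "clear", "wind": "calm", "play": "yes"},
    {"sky": "clear", "wind": "gusty", "play": "yes"},
    {"sky": "clear", "wind": "gusty", "play": "yes"},
    {"sky": "cloudy", "wind": "calm", "play": "yes"},
    {"sky": "cloudy", "wind": "gusty", "play": "no"},
]
tree = id3(rows, ["sky", "wind"], "play")
print(tree)
```

On this toy data the root test is "sky": the "clear" branch is already pure and becomes a leaf, while the "cloudy" branch needs a further test on "wind".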

An ID3 Example (cont.)

An ID3 Example (cont.)

Determine objective

Determine decision factors: attribute nodes of the decision tree

Determine decision factor values: attribute values of the decision tree

Determine solutions: list of final decisions that the system can make

Developing an induction expert system

Form example set: problem knowledge

Create decision tree

Test the system

Revise the system

Developing an induction expert system (cont.)

Predict the outcome of a football game.

Objective: a football game prediction system that can predict if our team will win or lose its next game

Decision factors: location of the game, weather, our own team’s record, and the opponent’s record

Football game prediction system

Decision factor values :

Solution: a simple binary decision

Football game prediction system (cont.)

Examples: go back over the first 8 games.

Football game prediction system (cont.)

Decision Tree : ID3 Algorithm

Football game prediction system (cont.)

Testing : Next 8 games

Football game prediction system (cont.)

Revising: after discussion, the team’s health is added as another decision factor, with the values {poor, average, good}.

Football game prediction system (cont.)

Football game prediction system (cont.)

Attempts to determine the influence of various system elements on the system’s behavior.

Some shells, like 1STCLASS, permit you to deactivate decision factors or examples without removing them.

Advantages:
◦ Impact of decision factors.
◦ Examples from different sources.

Sensitivity Study

Sensitivity Study

Advantages of Induction

Discovers rules from examples.

Avoids knowledge elicitation problem.

Can produce new knowledge.

Can uncover critical decision factors.

Can eliminate irrelevant decision factors.

Can uncover contradictions.

Contradictions

Example: Pump diagnosis system:

Sometimes contradiction is acceptable.

Indicates a problem:
◦ A bad example.
◦ Inadequate decision factors or values.
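A contradiction here means two examples with identical decision factor values but different outcomes. One way to surface them before induction is sketched below; the function name, the factor names, and the pump examples are hypothetical, since the slide's own example was not transcribed.

```python
from collections import defaultdict


def find_contradictions(examples, target):
    """Return factor-value combinations that appear with more than one outcome."""
    outcomes = defaultdict(set)
    for example in examples:
        # Key each example by its decision factor values, ignoring the outcome.
        key = tuple(sorted((k, v) for k, v in example.items() if k != target))
        outcomes[key].add(example[target])
    return {key: results for key, results in outcomes.items() if len(results) > 1}


# Hypothetical pump diagnosis examples, invented for illustration.
pump_examples = [
    {"pressure": "low", "noise": "high", "fault": "bearing"},
    {"pressure": "low", "noise": "high", "fault": "seal"},
    {"pressure": "normal", "noise": "high", "fault": "bearing"},
]
conflicts = find_contradictions(pump_examples, "fault")
for factors, faults in conflicts.items():
    print(dict(factors), "->", sorted(faults))
```

Each reported combination either flags a bad example or suggests that another decision factor or value is needed to tell the outcomes apart.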

Disadvantages of Induction

Often difficult to choose good decision factors.

Difficult to understand rules.

Applicable only for classification problems.

Expert Systems Developed Through Induction

AQ11

WILLARD

Transformer Fault Diagnosis

Customer Support

VAX-VMS Operating System Tuning

Predicting Stock Market Behavior

AQ11

Developed in 1980 for diagnosing soybean diseases.

Capable of identifying 15 diseases.

630 examples of diseased soybean plants.

Uses 35 decision factors.

A special example selection program, ESEL, was used to select 290 considerably different examples.

AQ11 (cont.)

AQ11 formulated a set of rules for classifying a new example into one of the 15 categories.

The developers also built a rule-based system from the knowledge of a plant pathologist.

The remaining 340 examples were used to compare the two systems.

The rule-based system classified 71.8% correctly.

AQ11 scored 97.6%.

WILLARD

Developed to aid thunderstorm forecasting at the United States NSSFC (1984).

Uses 140 examples of thunderstorm weather data.

Contains 30 modules, each with a single decision tree.

Characterizes the thunderstorm forecast as none, approaching, slight, moderate, or high, with a probability range.

Was developed using RULEMASTER inductive tool.

Was tested for one week in spring of 1984, in Texas, Oklahoma, and Colorado.

5 thunderstorms passed through this region.

WILLARD's forecasts were found to compare favorably with those made by an expert meteorologist.

WILLARD (cont.)

Transformer Fault Diagnosis

Designed to detect early signs of transformer faults in order to avoid potential failure.

An expert system for evaluating gas-in-oil test results.

Built from knowledge contained in past examples, using the RULEMASTER rule induction tool.

Contains 27 modules, each with its own induced rule.

Tested using data from 900 tests.

90% of its results agreed with the expert’s analysis.

Customer Support

SCREENIO, a software product by NORCOM, allows the user to design IBM PC screens for Realia COBOL programs.

NORCOM developed an expert system to aid its support personnel.

Based on 9 months of data on typical customer problems.

Uses 9 decision factors and was developed in 1 day using 1STCLASS.

Is used by support personnel who ask the customer for values for each of the decision factors.

Made a major improvement in customer support responsiveness and efficiency.

Customer Support (cont.)

An expert system to help tune the VAX-VMS operating system (1984).

Tuning is a complex and dynamic task. Over 150 parameters must be set by the system manager.

Adjustments and modifications are required in response to changes in configuration and loading.

The developed system collects data on present system performance and generates a summary report.

VAX-VMS Operating System Tuning

Interacts with the system manager by asking questions that lead to a recommended action.

Recommendations:
◦ Adjusting system parameters
◦ Adjusting user authorization values
◦ Redistributing or reducing user demand
◦ Purchasing new hardware
◦ …

VAX-VMS Operating System Tuning (cont.)

Was developed using the induction tool TIMM.

The effectiveness is measured by comparing the performance before and after the recommended actions are taken.

Has made the management of the operating system a more efficient task.

VAX-VMS Operating System Tuning (cont.)

Predicting stock market behavior is a difficult challenge. Analysts use historical data analysis techniques, which carry a degree of uncertainty.

Braun (1987) developed an expert system based on ID3 to improve the reliability of stock market prediction.

Uses 20 decision factors, whose values were determined for the period from March 1981 to April 1983 from the Wall Street Journal.

Predicting Stock Market Behavior

3 different results:
◦ Bullish: upward trend
◦ Bearish: downward trend
◦ Neutral: either call is too risky

The system predicted correctly 64.4% of the time.

The expert analyst predicted correctly 60.2% of the time.

The analyst was impressed with the general structure of the decision tree and with the system's predictions.

Predicting Stock Market Behavior (cont.)