classic case studies


Post on 05-Jan-2016


Page 1: Classic Case Studies

COM362 Knowledge Engineering

Classic Case Studies

John MacIntyre

0191 515 3778

[email protected]

Page 2: Classic Case Studies

The Classics

DENDRAL: determines the molecular structure of an unknown compound; started in 1965

MYCIN: medical diagnosis system; started in 1972

Page 3: Classic Case Studies

DENDRAL

Developed at Stanford University in 1965

Possibly the first computer program EVER to rival human experts in a specialized field

Determines the molecular structure of an unknown compound

Used a modified form of “generate and test” methodology

Page 4: Classic Case Studies

The DENDRAL Problem

Chemist is presented with an unknown chemical compound

Chemist must determine the molecular structure

Therefore needs to find out which atoms are in the structure

Needs to know how the atoms are connected to form molecules

Page 5: Classic Case Studies

The DENDRAL Problem

Data from mass spectrometer

Not straightforward!

Molecules can fragment in different ways

Need to make some predictions about how molecules are LIKELY to break

Sub-components of the molecule may be found in many different compounds

Chemists therefore determine compound sub-components, and apply constraints that other sub-components must satisfy

Page 6: Classic Case Studies

The DENDRAL Problem

Not a trivial problem!

Consider the formula: C6H13NO2

There are 10,000 isomers of this compound!

Each permutation can be uniquely identified

Could simply generate each of the 10,000 permutations in turn and test

Very expensive in computing time!

Therefore would like to constrain the generation of candidate permutations to save time

Page 7: Classic Case Studies

Constrained Generation

CONGEN: DENDRAL program for constrained generation of complete chemical structures

Manipulates symbols representing atoms and molecules

Uses a set of constraints on how atoms can be inter-connected

Chemist can specify and vary the initial constraints (e.g. based on experimental evidence)
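
As a rough illustration of why constraining the generator helps, the sketch below enumerates orderings of a small bag of atom symbols and abandons any partial candidate as soon as it violates a constraint. It is a toy in Python, not CONGEN: real structures are graphs rather than flat chains, and the names `generate` and `chemist_constraint`, along with the constraint itself, are assumptions made for the example.

```python
from collections import Counter

def generate(atoms, is_allowed, prefix=()):
    """Yield every ordering of `atoms` in which each prefix passes `is_allowed`."""
    if not atoms:
        yield prefix
        return
    for atom in sorted(atoms):            # try each distinct remaining symbol once
        candidate = prefix + (atom,)
        if not is_allowed(candidate):     # prune: no extension of this prefix is built
            continue
        remaining = atoms.copy()
        remaining[atom] -= 1
        if remaining[atom] == 0:
            del remaining[atom]
        yield from generate(remaining, is_allowed, candidate)

def chemist_constraint(chain):
    """Hypothetical constraint: start with carbon, never place two oxygens together."""
    starts_with_carbon = chain[0] == "C"
    no_adjacent_oxygen = all(a != "O" or b != "O" for a, b in zip(chain, chain[1:]))
    return starts_with_carbon and no_adjacent_oxygen

# Heavy atoms of C6H13NO2, treated as a flat bag of symbols (a simplification).
atoms = Counter({"C": 6, "N": 1, "O": 2})

unconstrained = list(generate(atoms, lambda chain: True))
constrained = list(generate(atoms, chemist_constraint))
print(len(unconstrained), len(constrained))
```

With this invented constraint the generator builds 126 complete candidates instead of 252, and the rejected ones are never expanded at all. Pruning during generation only works this way when a constraint that fails on a partial candidate must also fail on every completion of it, which holds for the two conditions used here.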

Page 8: Classic Case Studies

Specifying Constraints

Defining “constraining structures”:

Specify “superatoms” that compound must contain

Typically in organic compounds, rings or chains of carbon atoms linked to hydrogens

Defining other constraints:

Open for the chemist to hypothesize

E.g. “compound must contain a carbon ring of 6 carbon atoms” etc.

Page 9: Classic Case Studies

Assessing Candidates

CONGEN may produce hundreds or thousands of candidate structures

First pass at assessing the candidates:

Use basic rules of mass spectrometry to test candidates and remove most unlikely ones

MSPRUNE: another DENDRAL program which does this

MSRANK: ranks remaining structures according to how their graphs match expected graphs for known compounds

Page 10: Classic Case Studies

Scoring Candidates

Peaks (features) in the spectral graphs are weighted to represent their importance

Weighted scores are produced to give the rank ordering for each candidate structure

Essentially this is a “hypothesize-and-test” strategy
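
The weighted scoring idea can be sketched in a few lines of Python. This is only an illustration of ranking by weighted peak matches, not the MSRANK algorithm: the peak positions, the weights, the candidate names and the functions `score` and `rank` are all invented for the example.

```python
def score(predicted_peaks, observed_peaks, weights):
    """Sum the weights of predicted peaks that also appear in the observed spectrum."""
    return sum(weights.get(peak, 0.0) for peak in predicted_peaks if peak in observed_peaks)

def rank(candidates, observed_peaks, weights):
    """Return (score, name) pairs, best-scoring candidate first."""
    scored = [(score(peaks, observed_peaks, weights), name)
              for name, peaks in candidates.items()]
    return sorted(scored, reverse=True)

# Hypothetical data: peak positions and weights are invented for illustration.
observed = {43, 57, 71, 86}
weights = {43: 1.0, 57: 2.0, 71: 0.5, 86: 3.0}
candidates = {
    "structure-A": {43, 57, 86},
    "structure-B": {43, 71},
    "structure-C": {57, 99},
}

ranking = rank(candidates, observed, weights)
print(ranking)
```

Here structure-A ranks first because all three of its predicted peaks are observed and two of them carry heavy weights, while structure-C gains nothing from its unobserved peak at 99.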

Page 11: Classic Case Studies

Evaluating DENDRAL

Available on the network of Stanford University, California

Used by hundreds of people around the world every day

Has been used successfully to challenge long-published chemical literature

The first stepping-stone between “traditional” problem solving and modern expert systems

Page 12: Classic Case Studies

Features of DENDRAL

Uses information from domain experts to help limit the search space for candidate structures

Uses an explicit representation of knowledge - fragmentation rules

No real inference mechanism - iterative passes through the rules controlled by user

Page 13: Classic Case Studies

The Keys to Success?

DENDRAL was successful because:

It did not set out to replace the expert, only to assist the expert

The search technique is based on a proven model of knowledge with known mathematical properties

There is a language which can be used to represent the structures easily and is well specified

Page 14: Classic Case Studies

MYCIN

Developed at Stanford University in 1972

Regarded as the first true “expert system”

Assists physicians in the treatment of blood infections

Many revisions and extensions to MYCIN over the years

Page 15: Classic Case Studies

The MYCIN Problem

Physician wishes to specify an “antimicrobial agent” - basically an antibiotic - to kill bacteria or arrest their growth

Some agents are poisonous!

No agent is effective against all bacteria

Most physicians are not expert in the field of antibiotics

Page 16: Classic Case Studies

The Decision Process

There are four questions in the process of deciding on treatment:

Does the patient have a significant infection?

What are the organism(s) involved?

What set of drugs might be appropriate to treat the infection?

What is the best choice of drug or combination of drugs to treat the infection?

Page 17: Classic Case Studies

MYCIN Components

KNOWLEDGE BASE: facts and knowledge about the domain

DYNAMIC PATIENT DATABASE: information about a particular case

CONSULTATION PROGRAM: asks questions, gives advice on a particular case

EXPLANATION PROGRAM: answers questions and justifies advice

KNOWLEDGE ACQUISITION PROGRAM: adds new rules and changes existing rules

Page 18: Classic Case Studies

Basic MYCIN Structure

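[Diagram: boxes for the Explanation Program, Consultation Program, Knowledge Acquisition Program, Static Knowledge Base and Dynamic Patient Data, with the Physician User and the Infectious Disease Expert shown as the people who interact with the system]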

Page 19: Classic Case Studies

The MYCIN Knowledge Base

Where the rules are held

Basic rule structure in MYCIN is:

if condition_1 and ... and condition_m hold

then draw conclusion_1 and ... and conclusion_n

Rules written in the LISP programming language

Rules can include certainty factors to help weight the conclusions drawn

Page 20: Classic Case Studies

An Example Rule

IF: (1) The stain of the organism is Gram negative, and

(2) The morphology of the organism is rod, and

(3) The aerobicity of the organism is aerobic

THEN:

There is strongly suggestive evidence (0.8) that the class of the organism is Enterobacteriaceae
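
One way to picture such a rule as data is sketched below in Python. MYCIN itself held its rules in LISP, so this is only an analogy: the `Rule` class, the `apply_rule` helper and the dictionary used to describe an organism are assumptions for illustration, and the matching is deliberately simplified to exact, fully certain findings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    premises: tuple    # (parameter, value) pairs that must all hold
    conclusion: tuple  # (parameter, value) asserted when they do
    cf: float          # certainty factor attached to the conclusion

ENTEROBACTERIACEAE_RULE = Rule(
    premises=(
        ("stain", "gram-negative"),
        ("morphology", "rod"),
        ("aerobicity", "aerobic"),
    ),
    conclusion=("class", "enterobacteriaceae"),
    cf=0.8,
)

def apply_rule(rule, organism):
    """Return (conclusion, cf) if every premise matches the organism, else None."""
    if all(organism.get(param) == value for param, value in rule.premises):
        return rule.conclusion, rule.cf
    return None

# Hypothetical findings for one organism.
organism = {"stain": "gram-negative", "morphology": "rod", "aerobicity": "aerobic"}
print(apply_rule(ENTEROBACTERIACEAE_RULE, organism))
```

Keeping the premises and conclusion as plain data rather than code is what lets separate programs inspect the same rule, for example to explain advice or to edit the knowledge base.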

Page 21: Classic Case Studies

Calculating Certainty

Rule certainties (certainty factors) are treated much like probabilities

Therefore they are multiplied, as probabilities would be, when rules are chained

Multiplying certainties which are less than 1 results in lower and lower certainty!

E.g. 0.8 x 0.6 = 0.48
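
The arithmetic can be sketched in Python as follows. The `chain` function reproduces the multiplication shown in the example. The `conjunction` and `combine` functions are not from these notes: they follow the certainty-factor model as published for MYCIN, restricted here to positive values, and the function names are assumptions.

```python
def conjunction(*premise_cfs):
    """Certainty of an AND of premises: the weakest premise."""
    return min(premise_cfs)

def chain(rule_cf, premise_cf):
    """Certainty of a conclusion drawn by a rule from uncertain premises."""
    return rule_cf * premise_cf

def combine(cf_a, cf_b):
    """Merge two positive certainties that support the same conclusion."""
    return cf_a + cf_b * (1.0 - cf_a)

print(round(chain(0.8, 0.6), 2))    # a 0.8 rule fired on 0.6 evidence
print(round(combine(0.8, 0.6), 2))  # two independent rules agreeing
```

Chaining lowers certainty (0.8 and 0.6 give 0.48), whereas two rules that independently support the same conclusion raise it (0.8 and 0.6 combine to 0.92), so long inference chains weaken a conclusion while corroborating evidence strengthens it.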

Page 22: Classic Case Studies

Other Types of Knowledge

Facts and definitions such as:

Lists of all organisms known to the system

“Knowledge tables” of clinical parameters and the values they can take (e.g. morphology)

Classification system for clinical parameters and the context in which they are applied (e.g. referring to patient or organism)

Much of MYCIN’s knowledge refers to 65 clinical parameters

Page 23: Classic Case Studies

MYCIN’s Context Trees

Used to organise case data

Helps to visualise how information within the case is related

Easily extended and adapted as more clinical evidence becomes available

Page 24: Classic Case Studies

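Example Context Tree

[Figure: context tree with PATIENT-1 at the root; below it CULTURE-1, CULTURE-2, CULTURE-3 and OPERATION; below those ORGANISM-1, ORGANISM-2 and ORGANISM-3; and at the lowest level DRUG-1 and DRUG-2]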
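
A context tree of this kind can be held as a nested mapping. The Python sketch below is illustrative only: the node names come from the example tree, but which node hangs under which parent is an assumption made for the example, as is the helper `path_to`.

```python
# Assumed parent-child links; only the node names are taken from the example.
CONTEXT_TREE = {
    "PATIENT-1": {
        "CULTURE-1": {
            "ORGANISM-1": {"DRUG-1": {}, "DRUG-2": {}},
        },
        "CULTURE-2": {
            "ORGANISM-2": {},
            "ORGANISM-3": {},
        },
        "CULTURE-3": {},
        "OPERATION": {},
    }
}

def path_to(tree, target, trail=()):
    """Return the chain of contexts from the root down to `target`, or None."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(children, target, here)
        if found:
            return found
    return None

print(path_to(CONTEXT_TREE, "ORGANISM-3"))
```

Looking up the path for a node shows which culture and which patient an organism belongs to. Adding new evidence means adding another entry under the relevant parent.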

Page 25: Classic Case Studies

MYCIN Control Structure

Uses a goal-based strategy to attempt to solve, in the first instance, a TOP LEVEL GOAL RULE

Establishes sub-goals required to satisfy the top level goal

Therefore establishes the concept of backward chaining
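
A minimal backward chainer makes the goal and sub-goal idea concrete. The Python sketch below is not MYCIN's control structure: it is boolean, with no certainty factors, it stops at the first rule that succeeds, it assumes the rules contain no cycles, and it treats a finding that is neither known nor derivable as false, where a consultation program would ask the user. The rule contents, the fact names and the function `prove` are hypothetical.

```python
# (premises, conclusion) pairs; contents are illustrative, not MYCIN's rules.
RULES = [
    (["gram_negative", "rod_shaped", "aerobic"], "enterobacteriaceae"),
    (["enterobacteriaceae", "significant_infection"], "requires_therapy"),
]

def prove(goal, known, rules, trace):
    """Try to establish `goal`, working backwards from it through the rules."""
    trace.append(goal)                       # record the order goals are visited
    if goal in known:
        return known[goal]
    supporting = [premises for premises, conclusion in rules if conclusion == goal]
    result = any(
        all(prove(premise, known, rules, trace) for premise in premises)
        for premises in supporting           # each premise becomes a sub-goal
    )
    known[goal] = result                     # remember the answer for later goals
    return result

findings = {
    "gram_negative": True,
    "rod_shaped": True,
    "aerobic": True,
    "significant_infection": True,
}
trace = []
answer = prove("requires_therapy", dict(findings), RULES, trace)
print(answer, trace)
```

The trace starts at the top-level goal and descends to the findings needed to support it, which is the reverse of the order in which the evidence was gathered.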

Page 26: Classic Case Studies

Top Level Goal

IF: (1) There is an organism which requires therapy, and

(2) Consideration has been given to any other organism requiring therapy

THEN:

Compile a list of possible therapies, and determine the best one in this list

Page 27: Classic Case Studies

MYCIN Subgoals

Sub-goals are a generalised form of the top-level goal

Hence sub-goals consider the proposition that there is a particular organism

Exhaustive search on all relevant rules to test this proposition (until or unless one succeeds with total certainty)

More like exhaustive search than backward chaining

Page 28: Classic Case Studies

Selection of Therapy

Done after the diagnostic phase is complete

Two phases:

Selection of a list of candidate drugs

Choice of preferred drugs or combinations of drugs from the list

Therapy rules use information on:

Sensitivity of organism to drug

Contraindications on the drug

Page 29: Classic Case Studies

Example Recommendation

IF: The identity of the organism is Pseudomonas

THEN:

I recommend therapy from the following drugs:

1 - COLISTIN (0.98)

2 - POLYMYXIN (0.96)

3 - GENTAMICIN (0.96)

4 - CARBENICILLIN (0.65)

5 - SULFISOXAZOLE (0.64)
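
The two phases of therapy selection can be sketched against this list in Python. The drug names and figures are the ones in the recommendation above. The contraindication set, the `limit` parameter and the function `select_therapy` are hypothetical, and the sketch is not medical guidance.

```python
# Candidate list for Pseudomonas, as given in the example recommendation.
PSEUDOMONAS_DRUGS = [
    ("COLISTIN", 0.98),
    ("POLYMYXIN", 0.96),
    ("GENTAMICIN", 0.96),
    ("CARBENICILLIN", 0.65),
    ("SULFISOXAZOLE", 0.64),
]

def select_therapy(candidates, contraindicated, limit=2):
    """Drop contraindicated drugs, then keep the `limit` highest-rated ones."""
    allowed = [(drug, rating) for drug, rating in candidates
               if drug not in contraindicated]
    allowed.sort(key=lambda pair: pair[1], reverse=True)  # stable: ties keep list order
    return [drug for drug, _ in allowed[:limit]]

# Hypothetical patient for whom the first-choice drug is ruled out.
print(select_therapy(PSEUDOMONAS_DRUGS, contraindicated={"COLISTIN"}))
```

The first phase is the candidate list itself, and the second phase is the filtering and ranking. For the hypothetical patient the result is POLYMYXIN followed by GENTAMICIN.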

Page 30: Classic Case Studies

Evaluating MYCIN

Many studies show that MYCIN’s recommendations compare favourably with experts for diseases like meningitis

A study on real patients compared MYCIN with expert and non-expert physicians:

MYCIN matched experts

MYCIN was better than non-experts

Page 31: Classic Case Studies

MYCIN Limitations

A research tool - never intended for practical application

Limited knowledge base - only covers a small number of infectious diseases

Needed more computing power than most hospitals had at the time!

Doctors reluctant to use it

Poor interface

Page 32: Classic Case Studies

Conclusions

DENDRAL was a ground-breaking program as it showed that computers could match experts in a specific domain

DENDRAL was always intended as an “expert assistant”

MYCIN was the first “expert system” which included an inference control structure

MYCIN is limited for practical use

Page 33: Classic Case Studies

Further Reading

Introduction to Expert Systems, P. Jackson, Addison Wesley, 1990

Expert Systems: Principles and Programming, J. Giarratano, G. Riley, PWS Publishing, 1994

Artificial Intelligence: Tools, Techniques and Applications, T. O’Shea, M. Eisenstadt, Open University, 1984