Artificial Intelligence 10. Machine Learning Overview. Course V231, Department of Computing, Imperial College. © Simon Colton


  • 1. Artificial Intelligence 10. Machine Learning Overview. Course V231, Department of Computing, Imperial College. Simon Colton

2. Inductive Reasoning

  • Learning in humans consists of (at least):
    • memorisation, comprehension, learning from examples
  • Learning from examples
    • Square numbers: 1, 4, 9, 16
    • 1 = 1 * 1; 4 = 2 * 2; 9 = 3 * 3; 16 = 4 * 4
    • What is next in the series?
    • We can learn this by example quite easily
  • Machine learning is largely dominated by
    • Learning from examples
  • Inductive reasoning
    • Induce a pattern (hypothesis) from a set of examples
      • This is an unsound procedure (unlike deduction)
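
To make the induction step concrete, here is a minimal Python sketch (the candidate rules are invented for illustration): test each hypothesis against every example and keep the consistent ones.

    # Induce a rule for the series 1, 4, 9, 16 by testing candidate
    # hypotheses against all (position, value) pairs.  Induction is
    # unsound: a consistent rule may still fail on unseen data.
    examples = [(1, 1), (2, 4), (3, 9), (4, 16)]

    candidates = {
        "n * n":   lambda n: n * n,
        "2*n - 1": lambda n: 2 * n - 1,
        "n + 3":   lambda n: n + 3,
    }

    consistent = [name for name, h in candidates.items()
                  if all(h(n) == v for n, v in examples)]
    print(consistent)   # ['n * n'] -- so the next term is 5 * 5 = 25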

3. Machine Learning Tasks

  • Categorisation
    • Learn why certain objects are categorised a certain way
    • E.g., why are dogs, cats and humans mammals, but trout, mackerel and tuna fish?
      • Learn attributes of members of each category from background information, in this case: skin covering, eggs, homeothermic
  • Prediction
    • Learn how to predict the category of unseen objects
    • E.g., given examples of financial stocks and a categorisation of them into safe and unsafe stocks
      • Learn how to predict whether a new stock will be safe
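
A toy Python sketch of the prediction task (the volatility attribute and the threshold rule are assumptions for illustration, not from the slides): learn a simple rule from categorised stocks, then apply it to an unseen one.

    # Learn a one-attribute threshold rule ("safe" iff volatility is
    # below a learned cut-off) from categorised examples.
    train = [(0.1, "safe"), (0.2, "safe"), (0.7, "unsafe"), (0.9, "unsafe")]

    safe_max   = max(v for v, c in train if c == "safe")
    unsafe_min = min(v for v, c in train if c == "unsafe")
    threshold  = (safe_max + unsafe_min) / 2   # midpoint between the classes

    def predict(volatility):
        return "safe" if volatility < threshold else "unsafe"

    print(predict(0.3))   # unseen stock -> 'safe'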

4. Potential for Machine Learning

  • Agents can learn these from examples:
    • which chemicals are toxic (biochemistry)
    • which patients have a disease (medicine)
    • which substructures proteins have (bioinformatics)
    • what the grammar of a language is (natural language)
    • which stocks and shares are about to drop (finance)
    • which vehicles are tanks (military)
    • which style a composition belongs to (music)

5. Performing Machine Learning

  • Specify your problem as a learning task
  • Choose the representation scheme
  • Choose the learning method
  • Apply the learning method
  • Assess the results and the method

6. Constituents of Learning Problems

  • The example set
  • The background concepts
  • The background axioms
  • The errors in the data

7. Problem Constituents: 1. The Example Set

  • Learning from examples
    • Express as a concept learning problem
    • Whereby the concept solves the categorisation problem
  • Usually need to supply pairs (E, C)
    • Where E is an example, C is a category
    • Positives: (E, C) where C is the correct category for E
    • Negatives: (E, C) where C is an incorrect category for E
  • Techniques which don't need negatives
    • Can learn from positives only
  • Questions about examples:
    • How many does the technique need to perform the task?
    • Do we need both positive and negative examples?

8. Example: Positives and Negatives

  • Problem: learn reasons for animal taxonomy
    • Into mammals, fish, reptiles and birds
  • Positives:
    • (cat=mammal); (dog=mammal); (trout=fish);
    • (eagle=bird); (crocodile=reptile);
  • Negatives:
    • (condor=fish); (mouse=bird); (trout=mammal);
    • (platypus=bird); (human=reptile)
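
These pairs translate directly into data; a minimal Python rendering of the slide's examples:

    # (example, category) pairs for the animal taxonomy problem.
    positives = [("cat", "mammal"), ("dog", "mammal"), ("trout", "fish"),
                 ("eagle", "bird"), ("crocodile", "reptile")]
    negatives = [("condor", "fish"), ("mouse", "bird"), ("trout", "mammal"),
                 ("platypus", "bird"), ("human", "reptile")]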

9. Problem Constituents: 2. Background Concepts

  • Concepts which describe the examples
    • (Some of) which will be found in the solution to the problem
  • Some concepts are required to specify examples
    • Example: pixel data for handwriting recognition (later)
    • Cannot say what the example is without this
  • Some concepts are attributes of examples (functions)
    • number_of_legs(human) = 2; covering(trout) = scales
  • Some concepts specify binary categorisations:
    • is_homeothermic(human); lays_eggs(trout);
  • Questions about background concepts
    • Which will be most useful in the solution?
      • Which can be discarded without worry?
    • Which are binary, which are functions?
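
One plausible Python encoding of the two kinds of background concept (attribute values beyond the slide's own examples are assumptions):

    # Function-style concepts map an example to a value; binary
    # concepts are predicates that either hold or do not.
    number_of_legs = {"human": 2, "dog": 4, "trout": 0}
    covering       = {"human": "skin", "dog": "hair", "trout": "scales"}

    is_homeothermic = {"human", "dog"}     # set membership acts as a predicate
    lays_eggs       = {"trout", "eagle"}

    print(covering["trout"])               # scales
    print("human" in is_homeothermic)      # True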

10. Problem Constituents: 3. Background Axioms

  • Similar to axioms in automated reasoning
  • Specify relationships between
    • Pairs of background concepts
  • Example:
    • has_legs(X) = 4 → covering(X) = hair or scales
  • Can be used in the search mechanism
    • To speed up the search
  • Questions about background axioms:
    • Are they correct?
    • Are they useful for the search, or surplus?
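
A sketch of using such an axiom as a pruning test (Python; the dictionary encoding of examples is an assumption):

    # Axiom: has_legs(X) = 4  ->  covering(X) is hair or scales.
    # Anything violating the axiom can be pruned from the search.
    def satisfies_axiom(example):
        if example.get("has_legs") == 4:
            return example.get("covering") in ("hair", "scales")
        return True   # axiom's premise does not apply

    print(satisfies_axiom({"has_legs": 4, "covering": "hair"}))      # True
    print(satisfies_axiom({"has_legs": 4, "covering": "feathers"}))  # False -> prune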

11. Problem Constituents: 4. Errors in the Data

  • In real-world examples
    • Errors are many and varied, including:
  • Incorrect categorisations:
    • E.g., (platypus=bird) given as a positive example
  • Missing data
    • E.g., no skin covering attribute for falcon
  • Incorrect background information
    • E.g., is_homeothermic(lizard)
  • Repeated data
    • E.g., two different values for the same function and input
    • covering(platypus)=feathers & covering(platypus)=fur
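
This last kind of error can be detected mechanically; a minimal Python sketch over (function, input, value) records:

    # Flag two different values recorded for the same function and input.
    records = [("covering", "platypus", "feathers"),
               ("covering", "platypus", "fur")]

    seen = {}
    for fn, arg, value in records:
        if (fn, arg) in seen and seen[(fn, arg)] != value:
            print(f"conflict: {fn}({arg}) = {seen[(fn, arg)]} and {value}")
        seen[(fn, arg)] = value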

12. Example (Toy) Problem: Michalski Train Spotting

  • Question: Why are the left-hand (LH) trains going eastwards? (the trains picture itself is not reproduced in this transcript)
    • What are the positives/negatives?
    • What are the background concepts?
    • What is the solution?
  • Toy problem (IQ test problem)
    • No errors & a single perfect solution

13. Another Example: Handwriting Recognition

  • Background concepts:
    • Pixel information
  • Categorisations:
    • (Matrix, Letter) pairs
    • Both positive & negative
  • Task
    • Correctly categorise
      • An unseen example
    • Into 1 of 26 categories
  • Positive:
    • This is a letter S: (pixel-grid image not reproduced)
  • Negative:
    • This is a letter Z: (pixel-grid image not reproduced)
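
The pixel images themselves do not survive in this transcript, but each example is just a (matrix, letter) pair; a Python sketch with an invented 5x5 grid:

    # A (pixel matrix, letter) pair: 1 = ink, 0 = blank.  The grid is
    # invented for illustration (a crude 'S' shape).
    s_matrix = [[0, 1, 1, 1, 0],
                [1, 0, 0, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 0, 0, 1],
                [0, 1, 1, 1, 0]]
    positive_example = (s_matrix, "S")   # the same matrix paired with "Z"
                                         # would be a negative example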

14. Constituents of Methods

  • The representation scheme
  • The search method
  • The method for choosing from rival solutions

15. Method Constituents 1. Representation

  • Must choose how to represent the solution
    • Very important decision
  • Three assessment methods for solutions
    • Predictive accuracy (how good it is at the task)
      • Can use black box methods (accurate but incomprehensible)
    • Comprehensibility (how well we understand it)
      • May trade off some accuracy for comprehensibility
    • Utility (problem-specific measures of worth)
      • Might override both accuracy and comprehensibility
      • Example: drug design (must be able to synthesise the drugs)
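
Predictive accuracy has a direct definition, the fraction of examples categorised correctly; a minimal Python sketch:

    # Accuracy = positives the hypothesis covers plus negatives it
    # excludes, divided by the total number of examples.
    def accuracy(hypothesis, positives, negatives):
        correct  = sum(1 for e in positives if hypothesis(e))
        correct += sum(1 for e in negatives if not hypothesis(e))
        return correct / (len(positives) + len(negatives))

    print(accuracy(lambda e: e <= 3, [1, 2, 3], [4, 5]))   # 1.0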

16. Examples of Representations: The name is in the title

  • Inductive logic programming
    • Representation scheme is logic programs
  • Decision tree learning
    • Representation scheme is decision trees
  • Neural network learning
    • Representation scheme is neural networks
  • Other representation schemes
    • Hidden Markov Models
    • Bayesian Networks
    • Support Vector Machines

17. Method Constituents 2. Search

  • Some techniques don't really search
    • Example: neural networks
  • Other techniques do perform search
    • Example: inductive logic programming
  • Can specify search as before
    • Search states, initial states, operators, goal test
  • Important consideration
    • General-to-specific or specific-to-general search
    • Both have their advantages

18. Method Constituents 3. Choosing a Hypothesis

  • Some learning techniques return one solution
  • Others produce many solutions
    • May differ in accuracy, comprehensibility & utility
  • Question: how to choose just one from the rivals?
    • Need to do this in order to
      • (i) give the users just one answer
      • (ii) assess the effectiveness of the technique
  • Usual answer: Occam's razor
    • All else being equal, choose the simplest solution
  • When everything is equal
    • May have to resort to choosing randomly
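
A Python sketch of this tie-breaking scheme (the score and size measures are passed in as parameters, since both are problem-specific):

    import random

    # Keep the most accurate hypotheses, then the simplest of those
    # (Occam's razor); choose randomly only if still tied.
    def choose(hypotheses, score, size):
        best = max(score(h) for h in hypotheses)
        top  = [h for h in hypotheses if score(h) == best]
        simplest = min(size(h) for h in top)
        return random.choice([h for h in top if size(h) == simplest])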

19. Example Method: FIND-S

  • This is a specific-to-general search
    • Guaranteed to find the most specific of the best solutions
  • Idea:
    • At the start:
      • Generate a set of most specific hypotheses
      • From the positives (solution must be true of at least 1 pos)
    • Repeatedly generalise hypotheses
      • So that they become true of more and more positives
    • At the end:
      • Work out which hypotheses are true of
        • The most positives and the fewest negatives (pred. accuracy)
        • Take the most specific one out of the most accurate ones
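
A minimal Python sketch of the core loop, in the single-hypothesis textbook form of FIND-S (examples are attribute tuples and '?' is the wildcard; the set-of-rival-hypotheses bookkeeping described above is omitted):

    # Start with the most specific hypothesis (the first positive) and
    # minimally generalise it to cover each further positive.
    def find_s(positives):
        h = list(positives[0])
        for example in positives[1:]:
            for i, value in enumerate(example):
                if h[i] != value:
                    h[i] = "?"   # least general generalisation
        return h

    print(find_s([("H", "O", "C"), ("H", "N", "C")]))   # ['H', '?', 'C']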

20. Generalisation Method in Detail

  • Use a representation which consists of:
    • A set of conjoined attributes of the examples
  • Look at positive P1
    • Find all the most specific solutions which are true of P1
      • Call this set H = {H1, …, Hn}
  • Look at positive P2
    • Look at each Hi
      • Generalise Hi so that it is true of P2 (if necessary)
      • If generalised, call the generalised version Hn+1 and add to H
      • Generalise by making ground instances into variables
        • i.e., find the least general generalisation
  • Look at positive P3, and so on
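
For attribute tuples, the least general generalisation simply replaces disagreeing ground values with the '?' wildcard; a minimal Python sketch:

    # Keep the values two tuples agree on; generalise the rest to '?'.
    def lgg(h1, h2):
        return tuple(a if a == b else "?" for a, b in zip(h1, h2))

    print(lgg(("H", "O", "C"), ("H", "N", "C")))   # ('H', '?', 'C')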

21. Worked Example: Predictive Toxicology

  • Template: each example is a tuple of three attribute values (diagram not reproduced)
  • Examples:
    • (the example tuples were lost in transcription)
  • Three attributes
    • 5 possible values (H, O, C, N, ?)

Positives (toxic) and negatives (non-toxic): the table of example tuples was lost in transcription (four positives P1-P4 and three negatives N1-N3, per slide 23).

22. Worked Example: First Round

  • Look at P1:
    • Possible hypotheses are: ⟨…⟩ & ⟨…⟩ (the tuples were lost in transcription)
    • (not counting their reversals)
    • Hence H = {⟨…⟩, ⟨…⟩}
  • Look at P2:
    • ⟨…⟩ is not true of P2
    • But we can generalise this to: ⟨…⟩
      • And this is now true of P2 (don't generalise any further)
    • ⟨…⟩ is not true of P2
    • But we can generalise this to: ⟨…⟩
    • Hence H becomes {⟨…⟩, ⟨…⟩, ⟨…⟩, ⟨…⟩}
  • Now look at P3 and continue until the end
  • Then must start the whole process again with P2 first
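
The concrete tuples for this round were lost in transcription, so here is a hypothetical re-enactment in Python (P1 and P2 are invented examples, not the originals):

    # A hypothesis matches an example if every non-'?' attribute agrees.
    def matches(h, e):
        return all(a == "?" or a == b for a, b in zip(h, e))

    def lgg(h, e):
        return tuple(a if a == b else "?" for a, b in zip(h, e))

    P1, P2 = ("H", "O", "C"), ("H", "N", "C")
    H = [P1]                                            # most specific, from P1
    H += [lgg(h, P2) for h in H if not matches(h, P2)]  # add generalised versions
    print(H)   # [('H', 'O', 'C'), ('H', '?', 'C')]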

23. Worked Example: Possible Solutions

  • Generalisation process gives 9 answers; only three rows of the results table survive in this transcript:

    Hypothesis   Positives covered   Negatives covered   Predictive accuracy
    7            P1, P2, P3          N1, N3              4/7 = 57%
    6            P1, P2, P3          N1, N2, N3          3/7 = 43%
    (lost)       P1, P3, P4          N1, N2              4/7 = 57%
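
The accuracy figures check out as (positives covered + negatives excluded) / 7, since there are 4 positives and 3 negatives; a quick Python check:

    # correct = positives covered + negatives excluded, out of 7.
    def acc(pos_covered, neg_covered, n_pos=4, n_neg=3):
        return (pos_covered + (n_neg - neg_covered)) / (n_pos + n_neg)

    print(round(acc(3, 2), 2))   # hypothesis 7: 4/7 ~ 0.57
    print(round(acc(3, 3), 2))   # hypothesis 6: 3/7 ~ 0.43
    print(round(acc(3, 2), 2))   # final row:    4/7 ~ 0.57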