using qualitative knowledge in numerical learning
TRANSCRIPT
USING
QUALITATIVE KNOWLEDGE
IN NUMERICAL LEARNING
Ivan Bratko
Faculty of Computer and Info. Sc.
University of Ljubljana
Slovenia
THIS TALK IS ABOUT:
AUTOMATED MODELLING FROM DATA
WITH MACHINE LEARNING
COMBINING NUMERICAL AND QUALITATIVE
REPRESENTATIONS
BUILDING MODELS FROM DATA
[Diagram: Data from the Observed system → Machine learning / numerical regression → Model of system]
EXAMPLE: POPULATION DYNAMICS
A lake with zooplankton, phytoplankton and nutrient nitrogen
Variables in system:
Nut
Phyto
Zoo
POPULATION DYNAMICS
Observed behaviour in time
[Figure: time courses of Nut, Phyto and Zoo over 0–80 time steps, values roughly 0.0–1.6]
Data provided by Todorovski & Džeroski
PRIOR KNOWLEDGE
We would like our modelling methods to make use of expert’s prior knowledge (possibly qualitative)
Phytoplankton feeds on Nutrient,
Zooplankton feeds on Phytoplankton
Nutrient → Phyto → Zoo
QUALITATIVE DIFFICULTIES OF
NUMERICAL LEARNING
Learn time behavior of water level:
h = f(t, initial_outflow)
[Figure: water tank with level h and an outflow; water level h plotted at times t = 1, …, 19]
TIME BEHAVIOUR OF WATER LEVEL
Initial_outflow = 12.5
[Figure: water level h decreasing over time, t = 1, …, 19]
VARYING INITIAL OUTFLOW
Initial_outflow = 12.5, 11.25, 10.0, 8.75, 6.25
[Figure: water level h over time for each initial outflow, t = 1, …, 19]
PREDICTING WATER LEVEL WITH M5
Qualitatively incorrect – water level cannot increase
[Figure: M5 predictions of h over time for initial outflows 12.5, 11.25, 10.0, 8.75, 7.5 and 6.25]
QUALITATIVE ERRORS OF
NUMERICAL LEARNERS
Experiments with regression (model) trees (M5; Quinlan 92), LWR (Atkeson et al. 97) in Weka (Witten & Frank 2000), neural nets, ...
Qualitative errors:
water level should never increase
water level should not be negative
An expert might accept numerical errors, but such qualitative errors are particularly disturbing
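Such checks are easy to automate. A minimal sketch (the helper below is illustrative, not part of any tool named in this talk) that flags predictions violating the two constraints above:

```python
def qualitative_errors(levels):
    """Return indices where a predicted water-level sequence violates
    the qualitative constraints: the level must never increase and
    must never be negative."""
    errors = []
    for i, h in enumerate(levels):
        if h < 0:
            errors.append((i, "negative level"))
        if i > 0 and h > levels[i - 1]:
            errors.append((i, "level increased"))
    return errors

# A prediction that dips below zero and then bounces back up:
print(qualitative_errors([5.0, 3.0, -0.5, 1.0]))
# [(2, 'negative level'), (3, 'level increased')]
```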
Q2 LEARNING
AIMS AT OVERCOMING THESE
DIFFICULTIES
Q2 LEARNING
Šuc, Vladušič, Bratko; IJCAI’03, AIJ 2004, IJCAI’05
Aims at overcoming these difficulties of numerical learning
Q2 = Qualitatively faithful Quantitative learning
Q2 makes use of qualitative constraints
QUALITATIVE CONSTRAINTS
FOR WATER LEVEL
For any initial outflow:
Level is always decreasing with time
For any time point:
The greater the initial outflow, the greater the level
SUMMARY OF Q2 LEARNING
Standard numerical learning approaches make qualitative errors.
As a result, numerical predictions are qualitatively inconsistent with expectations
Q2 learning (Qualitatively faithful Quantitative prediction);
A method that enforces qualitative consistency
Resulting numerical models enable clearer interpretation, and also significantly improve quantitative prediction
IDEA OF Q2
First find qualitative laws in data
Respect these qualitative laws in numerical learning
CONTENTS OF REST OF TALK
Building blocks of Q2 learning:
Ideas from Qualitative Reasoning,
Algorithms QUIN, QFILTER, QCGRID
Experimental analysis
Applications:
Car modelling, ecological modelling, behavioural cloning (operating a crane, flying an aircraft)
HOW CAN WE DESCRIBE QUALITATIVE
PROPERTIES?
We can use concepts from the field of qualitative reasoning in AI
Related terms:
Qualitative physics, Naive physics, Qualitative
modelling
ESSENCE OF NAIVE PHYSICS
Describe physical processes qualitatively, without numbers or exact numerical relations
“Naive physics”, as opposed to “proper physics”
Close to common sense descriptions
EXAMPLE: BATH TUB
What will happen?
Amount of water will keep increasing, so will level, until the level reaches the top.
EXAMPLE: U-TUBE
What will happen?
[Figure: U-tube with levels La and Lb]
Level La will be decreasing, and Lb increasing, until La = Lb.
QUALITATIVE REASONING ABOUT U-TUBE
Total amount of water in system constant
If La > Lb then flow from A to B
Flow causes amount in A to decrease
Flow causes amount in B to increase
All changes in time happen continuously and smoothly
[Figure: U-tube with containers A and B, levels La and Lb]
QUALITATIVE REASONING ABOUT U-TUBE
In any container: the greater the amount, the greater the
level
So, La will keep decreasing, Lb increasing
QUALITATIVE REASONING ABOUT U-TUBE
La will keep decreasing, Lb increasing, until they equalise
[Figure: La decreasing and Lb increasing over time until they equalise]
THIS REASONING IS VALID FOR ALL
CONTAINERS
OF ANY SHAPE AND SIZE,
REGARDLESS OF ACTUAL NUMBERS!
WHY REASON QUALITATIVELY?
Because it is easier than quantitatively
Because it is easy to understand -
facilitates explanation
We want to exploit these advantages in ML
RELATION BETWEEN
AMOUNT AND LEVEL
The greater the amount, the greater the level
A = M+(L)
A is a monotonically increasing function of L
MONOTONIC FUNCTIONS
Y = M+(X) specifies a family of functions
[Figure: several monotonically increasing curves Y(X)]
MONOTONIC QUALITATIVE CONSTRAINTS,
MQCs
Generalisation of monotonically increasing functions to several arguments
Example: Z = M+,−(X, Y)
Z increases with X, and decreases with Y
More precisely: if X increases and Y stays unchanged then Z increases
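This definition can be read directly as a pairwise test on data. An illustrative sketch (not QUIN's actual algorithm) that checks whether a data set is consistent with an MQC such as Z = M+,−(X, Y):

```python
def consistent_with_mqc(points, signs):
    """Check data against an MQC such as Z = M+,-(X, Y).

    points: list of (argument_tuple, z) pairs; signs: one '+' or '-'
    per argument. Two points constrain each other only when every
    argument moves in the direction that should raise Z (strictly for
    at least one argument); then Z must rise too.
    """
    for (xa, za) in points:
        for (xb, zb) in points:
            deltas = [(b - a) * (1 if s == '+' else -1)
                      for a, b, s in zip(xa, xb, signs)]
            if all(d >= 0 for d in deltas) and any(d > 0 for d in deltas):
                if zb <= za:
                    return False
    return True

# Data from Z = X - Y is consistent with Z = M+,-(X, Y):
data = [((0, 0), 0), ((1, 0), 1), ((1, 2), -1), ((3, 1), 2)]
print(consistent_with_mqc(data, '+-'))   # True
print(consistent_with_mqc(data, '-+'))   # False
```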
EXAMPLE: BEHAVIOUR OF GAS
Pressure = M+,- (Temperature, Volume)
Pressure increases with Temperature
Pressure decreases with Volume
Q2 LEARNING
Numerical data → Induce qualitative constraints (QUIN) → Qualitative to Quantitative transformation (Q2Q; one possibility: QFILTER) → Numerical predictor:
• respects qualitative constraints
• fits data numerically
PROGRAM QUIN
INDUCING QUALITATIVE CONSTRAINTS
FROM NUMERICAL DATA
Šuc 2001 (PhD Thesis, also as book 2003)
Šuc and Bratko, ECML’01
QUIN
QUIN = Qualitative Induction
Numerical examples → QUIN → Qualitative tree
Qualitative tree: similar to decision tree,
qualitative constraints in leaves
EXAMPLE PROBLEM FOR QUIN
Noisy examples:
z = x² − y² + noise (st. dev. 50)
EXAMPLE PROBLEM FOR QUIN
In this region: z = M+,+(x, y)
INDUCED QUALITATIVE TREE FOR
z = x² − y² + noise

x > 0?
  no:  y > 0?  no → z = M−,+(x, y);  yes → z = M−,−(x, y)
  yes: y > 0?  no → z = M+,+(x, y);  yes → z = M+,−(x, y)
QUIN ALGORITHM: OUTLINE
Top-down greedy algorithm (similar to induction of decision trees)
For every possible split, find the “most consistent” MQC (min. error-cost) for each subset of examples
Select the best split according to MDL
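The "most consistent MQC" step can be illustrated by exhaustively scoring every sign vector by its number of pairwise violations; QUIN itself uses an error-cost measure and MDL-based split selection, so this is only a sketch of the idea:

```python
from itertools import product

def count_violations(points, signs):
    """Number of ordered example pairs violating an MQC with the
    given signs ('+'/'-' per argument)."""
    bad = 0
    for (xa, za) in points:
        for (xb, zb) in points:
            deltas = [(b - a) * (1 if s == '+' else -1)
                      for a, b, s in zip(xa, xb, signs)]
            if all(d >= 0 for d in deltas) and any(d > 0 for d in deltas):
                if zb <= za:
                    bad += 1
    return bad

def best_mqc(points, n_args):
    """Pick the sign vector with the fewest pairwise violations."""
    return min((''.join(s) for s in product('+-', repeat=n_args)),
               key=lambda s: count_violations(points, s))

# z is roughly x - y, so M+,- should win:
data = [((0, 0), 0.1), ((1, 0), 0.9), ((1, 2), -1.2), ((3, 1), 2.0)]
print(best_mqc(data, 2))   # prints +-
```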
Q2Q
Qualitative to Quantitative Transformation
Q2Q EXAMPLE
X < 5?
  yes → Y = M+(X)
  no  → Y = M−(X)
[Figure: Y increasing with X for X < 5, decreasing for X > 5]
QUALITATIVE TREES IMPOSE
NUMERICAL CONSTRAINTS
MQCs impose numerical constraints on class
values, between pairs of examples
y = M+(x) requires:
If x1 > x2 then y1 > y2
RESPECTING MQCs NUMERICALLY
z = M+,+(x,y) requires:
If x1 < x2 and y1 < y2 then z1 < z2
[Figure: points (x1, y1) and (x2, y2) in the x–y plane]
QFILTER
AN APPROACH TO Q2Q
TRANSFORMATION
Šuc and Bratko, ECML’03
TASK OF QFILTER
Given:
a qualitative tree,
points with class predictions by an arbitrary numerical learner,
learning examples (optionally)
Modify the class predictions to achieve consistency with the qualitative tree
QFILTER IDEA
Force numerical predictions to respect
qualitative constraints:
find minimal changes of predicted values so that qualitative constraints become satisfied
“minimal” = min. sum of squared changes
a quadratic programming problem
RESPECTING MQCs NUMERICALLY
Y = M+(X)
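For a single constraint chain Y = M+(X), the minimal sum-of-squared-changes correction reduces to isotonic regression, which the pool-adjacent-violators algorithm solves exactly; the general case with several interacting constraints still needs a quadratic-programming solver. A sketch under that single-chain assumption:

```python
def pava(y):
    """Pool-adjacent-violators: the non-decreasing sequence closest
    to y in the least-squares sense (points sorted by increasing X)."""
    blocks = []                      # each block: [mean, weight]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while adjacent blocks violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    out = []
    for m, w in blocks:
        out.extend([m] * w)
    return out

# Predictions that dip where the constraint says Y must not decrease:
print(pava([1.0, 3.0, 2.0, 4.0]))   # [1.0, 2.5, 2.5, 4.0]
```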
QFILTER APPLIED TO WATER OUTFLOW
Qualitative constraint that applies to
water outflow:
h = M−,+(time, InitialOutflow)
This could be supplied by domain expert,
or induced from data by QUIN
PREDICTING WATER LEVEL WITH M5
[Figure: M5 prediction of water level for initial outflow 7.5]
QFILTER’S PREDICTION
[Figure: QFILTER predictions compared with true values]
POPULATION DYNAMICS
Aquatic ecosystem with zooplankton, phytoplankton and nutrient nitrogen
Phyto feeds on Nutrient,
Zoo feeds on Phyto
Nutrient → Phyto → Zoo
POPULATION DYNAMICS WITH Q2
Behaviour in time
PREDICTION PROBLEM
Predict the change in zooplankton population:
ZooChange(t) = Zoo(t + 1) - Zoo(t)
Biologist’s rough idea:
ZooChange = Growth − Mortality, where Growth = M+,+(Zoo, Phyto) and Mortality = M+(Zoo)
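The target variable is just a first difference of the zooplankton series; a minimal sketch:

```python
def zoo_change(zoo):
    """ZooChange(t) = Zoo(t+1) - Zoo(t) for a sampled time series."""
    return [zoo[t + 1] - zoo[t] for t in range(len(zoo) - 1)]

# Toy series: growth while phytoplankton is abundant, then decline
print(zoo_change([0.25, 0.5, 1.0, 0.75]))   # [0.25, 0.5, -0.25]
```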
APPROXIMATE QUALITATIVE MODEL
OF ZOO CHANGE
Induced from data by QUIN
EXPERIMENT WITH NOISY DATA
Domain    | no noise (LWR; Q2) | 5% noise (LWR; Q2) | 20% noise (LWR; Q2)
ZooChange | 0.015; 0.008       | 0.112; 0.102       | 2.269; 1.889

All results as MSE (Mean Squared Error)
APPLICATIONS OF Q2
FROM REAL ECOLOGICAL DATA
Growth of algae Lagoon of Venice
Plankton in Lake Glumsoe
Lake Glumsø
Location and properties:
Lake Glumsø is located in a sub-glacial valley in Denmark
Average depth 2 m
Surface area 266,000 m²
Pollution
Receives waste water from a community of 3000 inhabitants (mainly agricultural)
High nitrogen and phosphorus concentration in waste water caused hypereutrophication
No submerged vegetation
low transparency of water
oxygen deficit at the bottom of the lake
Lake Glumsø – data
Relevant variables for modelling are:
phytoplankton phyto
zooplankton zoo
soluble nitrogen ns
soluble phosphorus ps
water temperature temp
PREDICTION ACCURACY
Over all (40) experiments, Q2 better than LWR in 75% (M5: 83%) of the test cases
The differences were found significant (t-test) at the 0.02 significance level
OTHER ECOLOGICAL MODELLING
APPLICATIONS
Predicting ozone concentrations in Ljubljana and Nova Gorica
Predicting flooding of Savinja river
Q2 model by far superior to any predictor so far used in practice
CASE STUDY
INTEC’S CAR SIMULATION MODELS
Goal: simplify INTEC’s car models to speed up simulation
Context: Clockwork European project (engineering design)
Intec’s wheel model
WHEEL MODEL: PREDICTING TOE ANGLE
[Figures: toe angle alpha over time (steps of dt = 0.7 s); true alpha together with LWR predicted alpha and M5 predicted alpha]
[Figure: Q2 predicted alpha, with the qualitative errors of the earlier predictions marked]
BEHAVIOURAL CLONING
Given a skilled operator, reconstruct the human’s subcognitive skill
EXAMPLE: GANTRY CRANE
[Figure: gantry crane with carriage, load and control force]
USE MACHINE LEARNING:
BASIC IDEA
[Diagram: Controller sends Actions to System, System returns States; the observed Execution trace feeds a Learning program, yielding a Reconstructed controller (“clone”)]
CRITERIA OF SUCCESS
Induced controller description has to:
Be comprehensible
Work as a controller
WHY COMPREHENSIBILITY?
To help the user’s intuition about the essential mechanism and causalities that enable the controller to achieve the goal
SKILL RECONSTRUCTION IN CRANE
[Figure: trolley at position X carrying a load on a rope of length L; start X0 = 0, L0 = 20; goal Xg = 60, Lg = 32]
Control forces: Fx, FL
State: X, dX, Φ, dΦ, L, dL
CARRIAGE CONTROL
QUIN: dXdes = f(X, Φ, dΦ)

X < 20.7?
  yes → dXdes = M+(X)
  no → X < 60.1?
    yes → dXdes = M−(X)
    no  → dXdes = M+(Φ)

First the trolley velocity is increasing
From about the middle of the distance until the goal, the trolley velocity is decreasing
At the goal, reduce the swing of the rope (by accelerating the trolley when the rope angle increases)
CARRIAGE CONTROL: dXdes = f(X, Φ, dΦ)
[Qualitative tree with splits on X < 20.7, X < 29.3, X < 60.1 and dΦ < −0.02; leaves include M+(X), M−(X), M+(Φ), M−,+(X, Φ) and M+,+,−(X, Φ, dΦ)]
Enables reconstruction of
individual differences in control styles
Operator S vs. Operator L
CASE STUDY IN REVERSE
ENGINEERING:
ANTI-SWAY CRANE
ANTI-SWAY CRANE
Industrial crane controller minimising load swing, “anti-sway crane”
Developed by M. Valasek (Czech Technical University, CTU)
Reverse engineering of anti-sway crane: a case study in the Clockwork European project
ANTI-SWAY CRANE OF CTU
Crane parameters:
travel distance 100m
height 15m, width 30m
80-120 tons
In daily use at Nova Hut metallurgical factory, Ostrava
EXPLAINING
HOW CONTROLLER WORKS
Load swinging to right;
Accelerate cart to right to reduce swing
EMPIRICAL EVALUATION
Compare errors of base-learners and corresponding Q2 learners
differences between a base-learner and its Q2 learner are due only to the induced qualitative constraints
Experiments with three base-learners:
Locally Weighted Regression (LWR)
Model trees
Regression trees
Robot Arm Domain
Two-link, two-joint robot arm
Link 1 extendible: L1 ∈ [2, 10]
Y1 = L1 sin(θ1)
Y2 = L1 sin(θ1) + 5 sin(θ1 + θ2)
[Figure: arm with joint angles θ1, θ2 and endpoint heights Y1, Y2]
Four learning problems:
A: Y1 = f(L1, θ1)
B: Y2 = f(L1, θ1, θ2, θsum, Y1)
C: Y2 = f(L1, θ1, θ2, θsum)
D: Y2 = f(L1, θ1, θ2)
Derived attribute: θsum = θ1 + θ2
Difficulty for Q2 increases from A to D
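The four learning problems can be regenerated from the two equations above. A sketch (the function name and the angle symbols theta1, theta2 are reconstructions, not the original experimental code) producing one example per problem:

```python
import math
import random

def make_example():
    """One (attributes, target) pair for each robot-arm problem A-D."""
    L1 = random.uniform(2, 10)                 # link 1 is extendible
    th1 = random.uniform(-math.pi, math.pi)
    th2 = random.uniform(-math.pi, math.pi)
    y1 = L1 * math.sin(th1)
    y2 = L1 * math.sin(th1) + 5 * math.sin(th1 + th2)
    s = th1 + th2                              # derived attribute
    return {
        'A': ((L1, th1), y1),
        'B': ((L1, th1, th2, s, y1), y2),
        'C': ((L1, th1, th2, s), y2),
        'D': ((L1, th1, th2), y2),
    }

ex = make_example()
# Problem D hides both Y1 and the derived sum, so it is hardest for Q2
print(len(ex['B'][0]), len(ex['D'][0]))   # 5 3
```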
Robot Arm: LWR and Q2 at different noise levels
[Chart: RRE of LWR vs. Q2+LWR for problems A–D at 0%, 5% and 10% noise]
Q2 outperforms LWR with all four learning problems (at all three noise levels)
UCI and Dynamic Domains
Five smallest regression data sets from UCI
Dynamic domains: typical domains where QUIN has been applied to explain control skill or to control the system; until now it was not possible to measure the accuracy of the learned concepts (qualitative trees)
AntiSway: logged data from an anti-sway crane controller
CraneSkill1, CraneSkill2: logged data of experienced human operators controlling a crane
UCI and Dynamic Domains: LWR compared to Q2
[Chart: RRE of LWR vs. Q2+LWR on AutoMpg, AutoPrice, Housing, Mach.CPU, Servo, CraneSkill1, CraneSkill2 and AntiSway]
Similar results with the other two base-learners. Q2 significantly better than the base-learners in 18 out of 24 comparisons (24 = 8 datasets × 3 base-learners)
Q2 - CONCLUSIONS
A novel approach to numerical learning
Can take into account qualitative prior knowledge
Advantages:
qualitative consistency of induced models and data – important for interpretation of induced models
improved numerical accuracy of predictions
Q2 TEAM + ACKNOWLEDGEMENTS
Q2 learning, QUIN, Qfilter, QCGRID (AI Lab, Ljubljana):
Dorian Šuc
Daniel Vladušič
Car modelling data
Wolfgang Rulka (INTEC, Munich)
Zbyněk Šika (Czech Technical Univ.)
Population dynamics data
Sašo Džeroski, Ljupčo Todorovski (J. Stefan Institute, Ljubljana)
Lake Glumsoe
Sven Joergensen
Boris Kompare, Jure Žabkar, D. Vladušič
RELEVANT PAPERS
Clark and Matwin 93: also used qualitative constraints in numerical predictions
Šuc, Vladušič and Bratko; IJCAI’03
Šuc, Vladušič and Bratko; Artificial Intelligence Journal, 2004
Šuc and Bratko; ECML’03
Šuc and Bratko; IJCAI’05