using qualitative knowledge in numerical learning
TRANSCRIPT
USING
QUALITATIVE KNOWLEDGE
IN NUMERICAL LEARNING
Ivan Bratko
Faculty of Computer and Info. Sc.
University of Ljubljana
Slovenia
THIS TALK IS ABOUT:
AUTOMATED MODELLING FROM DATA
WITH MACHINE LEARNING
COMBINING NUMERICAL AND QUALITATIVE
REPRESENTATIONS
BUILDING MODELS FROM DATA
[Diagram: Data from the Observed system → Machine learning / numerical regression → Model of system]
EXAMPLE: POPULATION DYNAMICS
A lake with zooplankton, phytoplankton and nutrient nitrogen
Variables in system:
Nut
Phyto
Zoo
POPULATION DYNAMICS
Observed behaviour in time
[Figure: time courses of Nut, Phyto and Zoo over 0–80 time steps, values roughly 0.0–1.6]
Data provided by Todorovski & Džeroski
PRIOR KNOWLEDGE
We would like our modelling methods to make use of expert’s prior knowledge (possibly qualitative)
Phytoplankton feeds on Nutrient,
Zooplankton feeds on Phytoplankton
Nutrient → Phyto → Zoo
QUALITATIVE DIFFICULTIES OF
NUMERICAL LEARNING
Learn time behavior of water level:
h = f(t, initial_outflow)
[Figure: water tank with level h and an outflow; water level h plotted at times t = 1, …, 19]
TIME BEHAVIOUR OF WATER LEVEL
Initial_outflow = 12.5
[Figure: water level h decreasing over time, t = 1, …, 19]
VARYING INITIAL OUTFLOW
Initial_outflow = 12.5, 11.25, 10.0, 8.75, 6.25
[Figure: water level h over time for each initial outflow, t = 1, …, 19]
PREDICTING WATER LEVEL WITH M5
Qualitatively incorrect – water level cannot increase
[Figure: M5 predictions of h over time for initial outflows 12.5, 11.25, 10.0, 8.75, 7.5 and 6.25]
QUALITATIVE ERRORS OF
NUMERICAL LEARNERS
Experiments with regression (model) trees (M5; Quinlan 92), LWR (Atkeson et al. 97) in Weka (Witten & Frank 2000), neural nets, ...
Qualitative errors:
water level should never increase
water level should not be negative
An expert might accept numerical errors, but such qualitative errors are particularly disturbing
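Such checks are easy to automate. A minimal sketch (the helper below is illustrative, not part of any tool named in this talk) that flags predictions violating the two constraints above:

```python
def qualitative_errors(levels):
    """Return indices where a predicted water-level sequence violates
    the qualitative constraints: the level must never increase and
    must never be negative."""
    errors = []
    for i, h in enumerate(levels):
        if h < 0:
            errors.append((i, "negative level"))
        if i > 0 and h > levels[i - 1]:
            errors.append((i, "level increased"))
    return errors

# A prediction that dips below zero and then bounces back up:
print(qualitative_errors([5.0, 3.0, -0.5, 1.0]))
# [(2, 'negative level'), (3, 'level increased')]
```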
Q2 LEARNING
AIMS AT OVERCOMING THESE
DIFFICULTIES
Q2 LEARNING
Šuc, Vladušič, Bratko; IJCAI’03, AIJ 2004, IJCAI’05
Aims at overcoming these difficulties of numerical learning
Q2 = Qualitatively faithful Quantitative learning
Q2 makes use of qualitative constraints
QUALITATIVE CONSTRAINTS
FOR WATER LEVEL
For any initial outflow:
Level is always decreasing with time
For any time point:
The greater the initial outflow, the greater the level
SUMMARY OF Q2 LEARNING
Standard numerical learning approaches make qualitative errors.
As a result, numerical predictions are qualitatively inconsistent with expectations
Q2 learning (Qualitatively faithful Quantitative prediction);
A method that enforces qualitative consistency
Resulting numerical models enable clearer interpretation, and also significantly improve quantitative prediction
IDEA OF Q2
First find qualitative laws in data
Respect these qualitative laws in numerical learning
CONTENTS OF REST OF TALK
Building blocks of Q2 learning:
Ideas from Qualitative Reasoning,
Algorithms QUIN, QFILTER, QCGRID
Experimental analysis
Applications:
Car modelling, ecological modelling, behavioural cloning (operating a crane, flying an aircraft)
HOW CAN WE DESCRIBE QUALITATIVE
PROPERTIES?
We can use concepts from the field of qualitative reasoning in AI
Related terms:
Qualitative physics, Naive physics, Qualitative
modelling
ESSENCE OF NAIVE PHYSICS
Describe physical processes qualitatively, without numbers or exact numerical relations
“Naive physics”, as opposed to “proper physics”
Close to common sense descriptions
EXAMPLE: BATH TUB
What will happen?
Amount of water will keep increasing, so will level, until the level reaches the top.
EXAMPLE: U-TUBE
What will happen?
[Figure: U-tube with levels La and Lb]
Level La will be decreasing, and Lb increasing, until La = Lb.
QUALITATIVE REASONING ABOUT U-TUBE
Total amount of water in system constant
If La > Lb then flow from A to B
Flow causes amount in A to decrease
Flow causes amount in B to increase
All changes in time happen continuously and smoothly
[Figure: U-tube with containers A and B, levels La and Lb]
QUALITATIVE REASONING ABOUT U-TUBE
In any container: the greater the amount, the greater the
level
So, La will keep decreasing, Lb increasing
QUALITATIVE REASONING ABOUT U-TUBE
La will keep decreasing, Lb increasing, until they equalise
[Figure: La decreasing and Lb increasing over time until they equalise]
THIS REASONING IS VALID FOR ALL
CONTAINERS
OF ANY SHAPE AND SIZE,
REGARDLESS OF ACTUAL NUMBERS!
WHY REASON QUALITATIVELY?
Because it is easier than quantitatively
Because it is easy to understand -
facilitates explanation
We want to exploit these advantages in ML
RELATION BETWEEN
AMOUNT AND LEVEL
The greater the amount, the greater the level
A = M+(L)
A is a monotonically increasing function of L
MONOTONIC FUNCTIONS
Y = M+(X) specifies a family of functions
[Figure: several monotonically increasing curves Y(X)]
MONOTONIC QUALITATIVE CONSTRAINTS,
MQCs
Generalisation of monotonically increasing functions to several arguments
Example: Z = M+,−(X, Y)
Z increases with X, and decreases with Y
More precisely: if X increases and Y stays unchanged then Z increases
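This definition can be read directly as a pairwise test on data. An illustrative sketch (not QUIN's actual algorithm) that checks whether a data set is consistent with an MQC such as Z = M+,−(X, Y):

```python
def consistent_with_mqc(points, signs):
    """Check data against an MQC such as Z = M+,-(X, Y).

    points: list of (argument_tuple, z) pairs; signs: one '+' or '-'
    per argument. Two points constrain each other only when every
    argument moves in the direction that should raise Z (strictly for
    at least one argument); then Z must rise too.
    """
    for (xa, za) in points:
        for (xb, zb) in points:
            deltas = [(b - a) * (1 if s == '+' else -1)
                      for a, b, s in zip(xa, xb, signs)]
            if all(d >= 0 for d in deltas) and any(d > 0 for d in deltas):
                if zb <= za:
                    return False
    return True

# Data from Z = X - Y is consistent with Z = M+,-(X, Y):
data = [((0, 0), 0), ((1, 0), 1), ((1, 2), -1), ((3, 1), 2)]
print(consistent_with_mqc(data, '+-'))   # True
print(consistent_with_mqc(data, '-+'))   # False
```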
EXAMPLE: BEHAVIOUR OF GAS
Pressure = M+,- (Temperature, Volume)
Pressure increases with Temperature
Pressure decreases with Volume
Q2 LEARNING
Numerical data → Induce qualitative constraints (QUIN) → Qualitative to Quantitative transformation (Q2Q; one possibility: QFILTER) → Numerical predictor:
• respects qualitative constraints
• fits data numerically
PROGRAM QUIN
INDUCING QUALITATIVE CONSTRAINTS
FROM NUMERICAL DATA
Šuc 2001 (PhD Thesis, also as book 2003)
Šuc and Bratko, ECML’01
QUIN
QUIN = Qualitative Induction
Numerical examples → QUIN → Qualitative tree
Qualitative tree: similar to decision tree,
qualitative constraints in leaves
EXAMPLE PROBLEM FOR QUIN
Noisy examples:
z = x² − y² + noise (st. dev. 50)
EXAMPLE PROBLEM FOR QUIN
In this region: z = M+,+(x, y)
INDUCED QUALITATIVE TREE FOR
z = x² − y² + noise

x > 0?
  no:  y > 0?  no → z = M−,+(x, y);  yes → z = M−,−(x, y)
  yes: y > 0?  no → z = M+,+(x, y);  yes → z = M+,−(x, y)
QUIN ALGORITHM: OUTLINE
Top-down greedy algorithm (similar to induction of decision trees)
For every possible split, find the “most consistent” MQC (min. error-cost) for each subset of examples
Select the best split according to MDL
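The "most consistent MQC" step can be illustrated by exhaustively scoring every sign vector by its number of pairwise violations; QUIN itself uses an error-cost measure and MDL-based split selection, so this is only a sketch of the idea:

```python
from itertools import product

def count_violations(points, signs):
    """Number of ordered example pairs violating an MQC with the
    given signs ('+'/'-' per argument)."""
    bad = 0
    for (xa, za) in points:
        for (xb, zb) in points:
            deltas = [(b - a) * (1 if s == '+' else -1)
                      for a, b, s in zip(xa, xb, signs)]
            if all(d >= 0 for d in deltas) and any(d > 0 for d in deltas):
                if zb <= za:
                    bad += 1
    return bad

def best_mqc(points, n_args):
    """Pick the sign vector with the fewest pairwise violations."""
    return min((''.join(s) for s in product('+-', repeat=n_args)),
               key=lambda s: count_violations(points, s))

# z is roughly x - y, so M+,- should win:
data = [((0, 0), 0.1), ((1, 0), 0.9), ((1, 2), -1.2), ((3, 1), 2.0)]
print(best_mqc(data, 2))   # prints +-
```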
Q2Q
Qualitative to Quantitative Transformation
Q2Q EXAMPLE
X < 5?
  yes → Y = M+(X)
  no  → Y = M−(X)
[Figure: Y increasing with X for X < 5, decreasing for X > 5]
QUALITATIVE TREES IMPOSE
NUMERICAL CONSTRAINTS
MQCs impose numerical constraints on class
values, between pairs of examples
y = M+(x) requires:
If x1 > x2 then y1 > y2
RESPECTING MQCs NUMERICALLY
z = M+,+(x,y) requires:
If x1 < x2 and y1 < y2 then z1 < z2
[Figure: points (x1, y1) and (x2, y2) in the x–y plane]
QFILTER
AN APPROACH TO Q2Q
TRANSFORMATION
Šuc and Bratko, ECML’03
TASK OF QFILTER
Given:
a qualitative tree,
points with class predictions by an arbitrary numerical learner,
learning examples (optionally)
Modify the class predictions to achieve consistency with the qualitative tree
QFILTER IDEA
Force numerical predictions to respect
qualitative constraints:
find minimal changes of predicted values so that qualitative constraints become satisfied
“minimal” = min. sum of squared changes
a quadratic programming problem
RESPECTING MQCs NUMERICALLY
Y = M+(X)
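For a single constraint chain Y = M+(X), the minimal sum-of-squared-changes correction reduces to isotonic regression, which the pool-adjacent-violators algorithm solves exactly; the general case with several interacting constraints still needs a quadratic-programming solver. A sketch under that single-chain assumption:

```python
def pava(y):
    """Pool-adjacent-violators: the non-decreasing sequence closest
    to y in the least-squares sense (points sorted by increasing X)."""
    blocks = []                      # each block: [mean, weight]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while adjacent blocks violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    out = []
    for m, w in blocks:
        out.extend([m] * w)
    return out

# Predictions that dip where the constraint says Y must not decrease:
print(pava([1.0, 3.0, 2.0, 4.0]))   # [1.0, 2.5, 2.5, 4.0]
```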
QFILTER APPLIED TO WATER OUTFLOW
Qualitative constraint that applies to
water outflow:
h = M−,+(time, InitialOutflow)
This could be supplied by domain expert,
or induced from data by QUIN
PREDICTING WATER LEVEL WITH M5
[Figure: M5 prediction of water level for initial outflow 7.5]
QFILTER’S PREDICTION
[Figure: QFILTER predictions compared with true values]
POPULATION DYNAMICS
Aquatic ecosystem with zooplankton, phytoplankton and nutrient nitrogen
Phyto feeds on Nutrient,
Zoo feeds on Phyto
Nutrient → Phyto → Zoo
POPULATION DYNAMICS WITH Q2
Behaviour in time
PREDICTION PROBLEM
Predict the change in zooplankton population:
ZooChange(t) = Zoo(t + 1) - Zoo(t)
Biologist’s rough idea:
ZooChange = Growth − Mortality, where Growth = M+,+(Zoo, Phyto) and Mortality = M+(Zoo)
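The target variable is just a first difference of the zooplankton series; a minimal sketch:

```python
def zoo_change(zoo):
    """ZooChange(t) = Zoo(t+1) - Zoo(t) for a sampled time series."""
    return [zoo[t + 1] - zoo[t] for t in range(len(zoo) - 1)]

# Toy series: growth while phytoplankton is abundant, then decline
print(zoo_change([0.25, 0.5, 1.0, 0.75]))   # [0.25, 0.5, -0.25]
```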
APPROXIMATE QUALITATIVE MODEL
OF ZOO CHANGE
Induced from data by QUIN
EXPERIMENT WITH NOISY DATA
Domain    | no noise (LWR; Q2) | 5% noise (LWR; Q2) | 20% noise (LWR; Q2)
ZooChange | 0.015; 0.008       | 0.112; 0.102       | 2.269; 1.889

All results as MSE (Mean Squared Error)
APPLICATIONS OF Q2
FROM REAL ECOLOGICAL DATA
Growth of algae Lagoon of Venice
Plankton in Lake Glumsoe
Lake Glumsø
Location and properties:
Lake Glumsø is located in a sub-glacial valley in Denmark
Average depth 2 m
Surface area 266,000 m²
Pollution
Receives waste water from a community of 3000 inhabitants (mainly agricultural)
High nitrogen and phosphorus concentration in waste water caused hypereutrophication
No submerged vegetation
low transparency of water
oxygen deficit at the bottom of the lake
Lake Glumsø – data
Relevant variables for modelling are:
phytoplankton phyto
zooplankton zoo
soluble nitrogen ns
soluble phosphorus ps
water temperature temp
PREDICTION ACCURACY
Over all (40) experiments, Q2 better than LWR in 75% (M5: 83%) of the test cases
The differences were found significant (t-test) at the 0.02 significance level
OTHER ECOLOGICAL MODELLING
APPLICATIONS
Predicting ozone concentrations in Ljubljana and Nova Gorica
Predicting flooding of Savinja river
Q2 model by far superior to any predictor so far used in practice
CASE STUDY
INTEC’S CAR SIMULATION MODELS
Goal: simplify INTEC’s car models to speed up simulation
Context: Clockwork European project (engineering design)
Intec’s wheel model
WHEEL MODEL: PREDICTING TOE ANGLE
[Figures: toe angle alpha over time (steps of dt = 0.7 s); true alpha together with LWR predicted alpha and M5 predicted alpha]
[Figure: Q2 predicted alpha, with the qualitative errors of the earlier predictions marked]
BEHAVIOURAL CLONING
Given a skilled operator, reconstruct the human’s subcognitive skill
EXAMPLE: GANTRY CRANE
[Figure: gantry crane with carriage, load and control force]
USE MACHINE LEARNING:
BASIC IDEA
[Diagram: Controller sends Actions to System, System returns States; the observed Execution trace feeds a Learning program, yielding a Reconstructed controller (“clone”)]
CRITERIA OF SUCCESS
Induced controller description has to:
Be comprehensible
Work as a controller
WHY COMPREHENSIBILITY?
To help the user’s intuition about the essential mechanism and causalities that enable the controller to achieve the goal
SKILL RECONSTRUCTION IN CRANE
[Figure: trolley at position X carrying a load on a rope of length L; start X0 = 0, L0 = 20; goal Xg = 60, Lg = 32]
Control forces: Fx, FL
State: X, dX, Φ, dΦ, L, dL
CARRIAGE CONTROL
QUIN: dXdes = f(X, Φ, dΦ)

X < 20.7?
  yes → dXdes = M+(X)
  no → X < 60.1?
    yes → dXdes = M−(X)
    no  → dXdes = M+(Φ)

First the trolley velocity is increasing
From about the middle of the distance until the goal, the trolley velocity is decreasing
At the goal, reduce the swing of the rope (by accelerating the trolley when the rope angle increases)
CARRIAGE CONTROL: dXdes = f(X, Φ, dΦ)
[Qualitative tree with splits on X < 20.7, X < 29.3, X < 60.1 and dΦ < −0.02; leaves include M+(X), M−(X), M+(Φ), M−,+(X, Φ) and M+,+,−(X, Φ, dΦ)]
Enables reconstruction of
individual differences in control styles
Operator S vs. Operator L
CASE STUDY IN REVERSE
ENGINEERING:
ANTI-SWAY CRANE
ANTI-SWAY CRANE
Industrial crane controller minimising load swing, “anti-sway crane”
Developed by M. Valasek (Czech Technical University, CTU)
Reverse engineering of anti-sway crane: a case study in the Clockwork European project
ANTI-SWAY CRANE OF CTU
Crane parameters:
travel distance 100m
height 15m, width 30m
80-120 tons
In daily use at Nova Hut metallurgical factory, Ostrava
EXPLAINING
HOW CONTROLLER WORKS
Load swinging to right;
Accelerate cart to right to reduce swing
EMPIRICAL EVALUATION
Compare errors of base-learners and corresponding Q2 learners
differences between a base-learner and its Q2 learner are due only to the induced qualitative constraints
Experiments with three base-learners:
Locally Weighted Regression (LWR)
Model trees
Regression trees
Robot Arm Domain
Two-link, two-joint robot arm
Link 1 extendible: L1 ∈ [2, 10]
Y1 = L1 sin(θ1)
Y2 = L1 sin(θ1) + 5 sin(θ1 + θ2)
[Figure: arm with joint angles θ1, θ2 and endpoint heights Y1, Y2]
Four learning problems:
A: Y1 = f(L1, θ1)
B: Y2 = f(L1, θ1, θ2, θsum, Y1)
C: Y2 = f(L1, θ1, θ2, θsum)
D: Y2 = f(L1, θ1, θ2)
Derived attribute: θsum = θ1 + θ2
Difficulty for Q2 increases from A to D
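The four learning problems can be regenerated from the two equations above. A sketch (the function name and the angle symbols theta1, theta2 are reconstructions, not the original experimental code) producing one example per problem:

```python
import math
import random

def make_example():
    """One (attributes, target) pair for each robot-arm problem A-D."""
    L1 = random.uniform(2, 10)                 # link 1 is extendible
    th1 = random.uniform(-math.pi, math.pi)
    th2 = random.uniform(-math.pi, math.pi)
    y1 = L1 * math.sin(th1)
    y2 = L1 * math.sin(th1) + 5 * math.sin(th1 + th2)
    s = th1 + th2                              # derived attribute
    return {
        'A': ((L1, th1), y1),
        'B': ((L1, th1, th2, s, y1), y2),
        'C': ((L1, th1, th2, s), y2),
        'D': ((L1, th1, th2), y2),
    }

ex = make_example()
# Problem D hides both Y1 and the derived sum, so it is hardest for Q2
print(len(ex['B'][0]), len(ex['D'][0]))   # 5 3
```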
Robot Arm: LWR and Q2 at different noise levels
[Chart: RRE of LWR vs. Q2+LWR for problems A–D at 0%, 5% and 10% noise]
Q2 outperforms LWR with all four learning problems (at all three noise levels)
UCI and Dynamic Domains
Five smallest regression data sets from UCI
Dynamic domains: typical domains where QUIN has been applied to explain control skill or to control the system; until now it was not possible to measure the accuracy of the learned concepts (qualitative trees)
AntiSway: logged data from an anti-sway crane controller
CraneSkill1, CraneSkill2: logged data of experienced human operators controlling a crane
UCI and Dynamic Domains: LWR compared to Q2
[Chart: RRE of LWR vs. Q2+LWR on AutoMpg, AutoPrice, Housing, Mach.CPU, Servo, CraneSkill1, CraneSkill2 and AntiSway]
Similar results with the other two base-learners. Q2 significantly better than the base-learners in 18 out of 24 comparisons (24 = 8 datasets × 3 base-learners)
Q2 - CONCLUSIONS
A novel approach to numerical learning
Can take into account qualitative prior knowledge
Advantages:
qualitative consistency of induced models and data – important for interpretation of induced models
improved numerical accuracy of predictions
Q2 TEAM + ACKNOWLEDGEMENTS
Q2 learning, QUIN, Qfilter, QCGRID (AI Lab, Ljubljana):
Dorian Šuc
Daniel Vladušič
Car modelling data
Wolfgang Rulka (INTEC, Munich)
Zbyněk Šika (Czech Technical Univ.)
Population dynamics data
Sašo Džeroski, Ljupčo Todorovski (J. Stefan Institute, Ljubljana)
Lake Glumsoe
Sven Joergensen
Boris Kompare, Jure Žabkar, D. Vladušič
RELEVANT PAPERS
Clark and Matwin 93: also used qualitative constraints in numerical predictions
Šuc, Vladušič and Bratko; IJCAI’03
Šuc, Vladušič and Bratko; Artificial Intelligence Journal, 2004
Šuc and Bratko; ECML’03
Šuc and Bratko; IJCAI’05