Deep Neural Decision Forests (ICCV 2015)
Peter Kontschieder1, Samuel Rota Bulò1,3, Antonio Criminisi1, Madalina Fiterau2
MODEL

Components (see the equations below):
- Raw data
- Parametrized feature map
- Probabilistic decision function
- Routing function: the probability of reaching a leaf
- Class distribution in each leaf
- Decision tree prediction
- Decision forest prediction, over the trees in the forest
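The labels above map onto the following quantities; this is a sketch reconstructed in the paper's notation (here \(\sigma\) is the sigmoid, \(f_n\) the deep network output for decision node \(n\), \(\pi_{\ell y}\) the class distribution stored at leaf \(\ell\), and \(\ell \swarrow n\) / \(n \searrow \ell\) indicate that leaf \(\ell\) lies in the left / right subtree of node \(n\)):

\[ d_n(x;\Theta) = \sigma\big(f_n(x;\Theta)\big) \qquad \text{(probabilistic decision function)} \]
\[ \mu_\ell(x \mid \Theta) = \prod_n d_n(x;\Theta)^{\mathbf{1}[\ell \swarrow n]}\,\big(1 - d_n(x;\Theta)\big)^{\mathbf{1}[n \searrow \ell]} \qquad \text{(routing function)} \]
\[ P_T[y \mid x, \Theta, \pi] = \sum_\ell \pi_{\ell y}\, \mu_\ell(x \mid \Theta) \qquad \text{(decision tree prediction)} \]
\[ P_F[y \mid x] = \frac{1}{k} \sum_{h=1}^{k} P_{T_h}[y \mid x] \qquad \text{(decision forest prediction over } k \text{ trees)} \]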
A deep network generates the node feature maps: instead of a prefixed feature map, the deep network (with its own parameters) supplies the feature representation on which the decision forest operates.
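A minimal NumPy sketch of this prediction model, with a single linear layer standing in for the deep network; the names (routing_probabilities, tree_predict, forest_predict), the sizes, and the heap-style node indexing are illustrative assumptions, not from the poster:

    import numpy as np

    rng = np.random.default_rng(0)

    DEPTH = 3                  # tree depth: 2**DEPTH leaves, 2**DEPTH - 1 decision nodes
    N_INNER = 2**DEPTH - 1
    N_LEAVES = 2**DEPTH
    N_CLASSES = 4
    DIM = 16                   # dimensionality of the feature representation

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def routing_probabilities(x, Theta):
        """mu_l(x): probability of reaching each leaf, as the product of the
        probabilistic decisions d_n(x) = sigmoid(Theta_n . x) along the path."""
        d = sigmoid(Theta @ x)          # one decision probability per inner node
        mu = np.empty(N_LEAVES)
        for leaf in range(N_LEAVES):
            node, prob = 0, 1.0         # heap indexing: children of n are 2n+1, 2n+2
            for bit in reversed(range(DEPTH)):
                if (leaf >> bit) & 1:   # step right with probability 1 - d[node]
                    prob *= 1.0 - d[node]
                    node = 2 * node + 2
                else:                   # step left with probability d[node]
                    prob *= d[node]
                    node = 2 * node + 1
            mu[leaf] = prob
        return mu                       # non-negative, sums to 1 over the leaves

    def tree_predict(x, Theta, pi):
        """Decision tree prediction: P_T[y|x] = sum_l pi_{l,y} * mu_l(x)."""
        return routing_probabilities(x, Theta) @ pi

    def forest_predict(x, trees):
        """Decision forest prediction: average of the per-tree predictions."""
        return np.mean([tree_predict(x, Theta, pi) for Theta, pi in trees], axis=0)

    # Toy usage: a forest of 5 trees, each with its own parameters and leaf
    # class distributions (rows of pi sum to 1).
    trees = [(rng.normal(size=(N_INNER, DIM)),
              rng.dirichlet(np.ones(N_CLASSES), size=N_LEAVES))
             for _ in range(5)]
    x = rng.normal(size=DIM)
    print(forest_predict(x, trees))     # a distribution over N_CLASSES classes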
TRAINING

Empirical risk minimization: log-loss over the training set (written out below).
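Written out, this is the log-loss over the training set \(\mathcal{T}\) (a reconstruction consistent with the model equations above):

\[ R(\Theta, \pi; \mathcal{T}) = \frac{1}{|\mathcal{T}|} \sum_{(x,y) \in \mathcal{T}} -\log P_T[y \mid x, \Theta, \pi] \]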
Decision nodes update: stochastic gradient descent over mini-batches, back-propagating through the deep network. A forward pass over the tree and a backward pass over the tree compute the node gradients; the node responses themselves are computed with the deep network forward pass.
Leaf nodes update: a convex optimization problem, solved with a step-size-free update rule (involving a normalization factor) that converges to the global optimum; a sketch follows.
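A sketch of the step-size-free update in the same notation, reconstructed from the labels above:

\[ \pi_{\ell y}^{(t+1)} = \frac{1}{Z_\ell^{(t)}} \sum_{(x, y') \in \mathcal{T}} \mathbf{1}_{[y' = y]}\, \frac{\pi_{\ell y}^{(t)}\, \mu_\ell(x \mid \Theta)}{P_T[y \mid x, \Theta, \pi^{(t)}]} \]

with \(Z_\ell^{(t)}\) the normalization factor, chosen so that \(\sum_y \pi_{\ell y}^{(t+1)} = 1\).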
Jointly trained by alternating optimization: interleave leaf node updates with one epoch of decision node updates, or alternatively perform the leaf node updates per mini-batch. Leaf node updates are independent per tree; the tree to update is selected at random for each mini-batch (sketched below).
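A runnable sketch of this alternating scheme for a single tree, reusing routing_probabilities, tree_predict, and the toy setup from the model sketch above; the finite-difference gradient is only a stand-in for back-propagation through a real deep network, and all helper names are assumptions:

    def leaf_update(X, Y, Theta, pi):
        """Step-size-free leaf update: accumulate pi_{l,y} * mu_l(x) / P_T[y|x]
        over the samples of each class y, then renormalize per leaf (the
        normalization factor Z_l). Each leaf's sub-problem is convex."""
        new_pi = np.zeros_like(pi)
        for x, y in zip(X, Y):
            mu = routing_probabilities(x, Theta)
            p = tree_predict(x, Theta, pi)
            new_pi[:, y] += pi[:, y] * mu / p[y]
        return new_pi / new_pi.sum(axis=1, keepdims=True)

    def decision_epoch(X, Y, Theta, pi, lr=0.1, eps=1e-5):
        """One epoch of decision node updates: SGD on the log-loss, with a
        finite-difference gradient standing in for back-propagation."""
        for x, y in zip(X, Y):
            base = -np.log(tree_predict(x, Theta, pi)[y])
            grad = np.zeros_like(Theta)
            for idx in np.ndindex(*Theta.shape):
                Theta[idx] += eps
                grad[idx] = (-np.log(tree_predict(x, Theta, pi)[y]) - base) / eps
                Theta[idx] -= eps
            Theta = Theta - lr * grad
        return Theta

    # Interleave the two updates; with a full forest one would instead pick a
    # tree at random per mini-batch and update it independently of the others.
    X = rng.normal(size=(32, DIM))
    Y = rng.integers(0, N_CLASSES, size=32)
    Theta, pi = trees[0]
    for _ in range(3):
        pi = leaf_update(X, Y, Theta, pi)
        Theta = decision_epoch(X, Y, Theta, pi)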
MOTIVATION

Standard decision forests cannot build new feature representations but rely on predefined ones.

Proposed solution: we allow decision forests to build ad-hoc feature representations through a deep network, which is jointly trained with the forest.
[Figure: histograms of split node outputs after 100 epochs (counts ×10^8) and 1000 epochs (counts ×10^9); x-axis: split node output in [0, 1], y-axis: histogram counts.]
[Figure: ImageNet Top5-error [%] vs. # training epochs (500-1000).]
[Figure: Average Leaf Entropy during Training: average leaf entropy [bits] vs. # training epochs (100-1000).]
MACHINE LEARNING DATASETS

[Figure: ImageNet Top5-error [%] vs. # training epochs (0-1000), with curves for dNDF0, dNDF1, and dNDF2 on validation, dNDF.NET on validation, and dNDF.NET on training.]
EXPERIMENTS: MNIST AND IMAGENET

deep Neural Decision Forest (dNDF), compared against LeNet-5 on MNIST.
shallow Neural Decision Forest (sNDF).
Convergence to deterministic routing (see the split-response histograms above).
MNIST error: dNDF 0.7% vs. LeNet-5 0.9%.
dNDF.NET = dNDF0 + dNDF1 + dNDF2 (an ensemble of three dNDFs).
ImageNet Top5-error:

Model        # models   # crops   Top5-err
GoogLeNet        1          1      10.07%
GoogLeNet        1         10       9.15%
GoogLeNet        1        144       7.89%
GoogLeNet        7          1       8.09%
GoogLeNet        7         10       7.62%
GoogLeNet        7        144       6.67%
dNDF.NET         1          1       7.84%
dNDF.NET         1         10       7.08%
dNDF.NET         7          1       6.38%