Cost-aware Pre-training for Multiclass Cost-sensitive Deep Learning
Yu-An Chung 1   Hsuan-Tien Lin 1   Shao-Wen Yang 2
1 Dept. of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
2 Intel Labs, Intel Corporation, USA
Cost-sensitive Classification
What is the status of the patient?
H1N1-infected Cold-infected Healthy
• Cost of each kind of mis-prediction, given as a cost matrix C (rows = actual class, columns = predicted class):

                 Predicted
               H1N1    Cold    Healthy
  H1N1          0      1000    100000
  Cold          100    0       3000
  Healthy       100    30      0
  (Actual)

• Predict H1N1-infected as Healthy: very high cost!
• Predict Cold-infected as Healthy: high cost
• Predict correctly: no cost
• Input: a training set 𝑆 = {(x_n, y_n)}_{n=1}^{N} and a cost matrix C, where x_n ∈ 𝒳, y_n ∈ 𝒴 = {1, 2, …, K}, and C(i, j) is the cost of classifying a class-i example as class j
• Goal: use 𝑆 and C to train a classifier g: 𝒳 → 𝒴 such that the expected cost C(y, g(x)) on a test example (x, y) is minimal
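To make the setting concrete, here is a minimal NumPy sketch of cost-sensitive evaluation with the poster's example cost matrix; the function name `average_cost` and the class-index convention (0 = H1N1, 1 = Cold, 2 = Healthy) are illustrative choices, not from the poster.

```python
import numpy as np

# Cost matrix from the poster: rows = actual class, columns = predicted class.
# Class indices (0, 1, 2) = (H1N1, Cold, Healthy) -- an assumed ordering.
C = np.array([
    [0,   1000, 100000],  # actual H1N1
    [100, 0,    3000],    # actual Cold
    [100, 30,   0],       # actual Healthy
])

def average_cost(y_true, y_pred, C):
    """Mean of C(y_n, g(x_n)) over a test set."""
    return C[np.array(y_true), np.array(y_pred)].mean()

# Predicting "Healthy" for an H1N1 patient incurs the 100000 entry:
print(average_cost([0, 1, 2], [2, 1, 2], C))  # mean of (100000, 0, 0)
```

Note that ordinary accuracy would score the prediction above as 2/3 correct, while the cost-sensitive criterion correctly flags the single H1N1-as-Healthy error as catastrophic.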
Our Goal & Contributions

                                          Shallow Models (e.g., SVM)    Deep Learning
  Regular (cost-insensitive) classification   well-studied              popular and under active development
  Cost-sensitive classification               well-studied              our work lies here!
• First work to thoroughly study cost-sensitive deep learning:
  1) a novel cost-sensitive loss function for any deep model
  2) a Cost-sensitive Autoencoder (CAE) equipped with the loss function for pre-training a fully-connected deep model
  3) a combination of 1) and 2) as a complete cost-sensitive deep learning (CSDNN) solution
The Input-to-Cost Regression Network
• Regression network: estimate the per-class costs r_k(x) directly from the input
• Training the regression network:
  • any end-to-end loss function for regression (e.g., MSE as in linear regression) could be applied
  • a loss function built on top of one-sided regression [Tu and Lin, 2010] is derived in this work: given a training set 𝑆 = {(x_n, y_n)}_{n=1}^{N} and C, we define

      δ_{n,k} ≡ ln(1 + exp(z_{n,k} · (r_k(x_n) − C(y_n, k)))),

    where z_{n,k} ≡ 2⟦C(y_n, k) = 0⟧ − 1, i.e., z_{n,k} = +1 for zero-cost entries (whose costs should not be over-estimated) and −1 otherwise (whose costs should not be under-estimated)
  • train the regression network by minimizing the derived Cost-Sensitive Loss (CSL) over the training set 𝑆:

      L_CSL = Σ_{n=1}^{N} Σ_{k=1}^{K} δ_{n,k}

• Prediction: g(x) ≡ argmin_{1≤k≤K} r_k(x)
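The loss and prediction rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the function names `csl_loss` and `predict` are my own, and in practice δ would be minimized by backpropagation through the network producing r.

```python
import numpy as np

def csl_loss(r, y, C):
    """Smoothed one-sided regression loss L_CSL.

    r : (N, K) array of estimated costs r_k(x_n) from the regression network
    y : (N,) integer array of true labels
    C : (K, K) cost matrix, C[i, j] = cost of predicting class i as class j
    """
    costs = C[y]                          # (N, K): target costs C(y_n, k)
    # z = +1 where the target cost is 0 (penalize over-estimation there),
    # z = -1 elsewhere (penalize under-estimation).
    z = np.where(costs == 0, 1.0, -1.0)
    delta = np.log1p(np.exp(z * (r - costs)))
    return delta.sum()

def predict(r):
    """g(x) = argmin_k r_k(x): pick the class with the lowest estimated cost."""
    return np.argmin(r, axis=1)
```

Note δ never reaches zero: even when r matches the target costs exactly, each term equals ln 2, so L_CSL acts as a smooth surrogate for the one-sided hinge of [Tu and Lin, 2010] rather than a loss that vanishes at the optimum.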
Cost-sensitive Autoencoder (CAE)

• Autoencoder (AE): pre-training a fully-connected neural network (FCNN) for regular classification
  • Goal: reconstruct the original input x
  • Reconstruction errors measured by the cross-entropy loss L_CE
• Cost-sensitive Autoencoder (CAE): pre-training the deep network for cost-sensitive classification
  • Goal: reconstruct both the original input x and the cost information C(y, ·)
  • Mixture of reconstruction errors:

      L_CAE(𝑆) = (1 − α) · L_CE + α · L_CSL
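The mixture objective can be sketched as follows, reusing the CSL term from the regression-network section. This is a simplified illustration under stated assumptions: `cae_loss` is a hypothetical name, inputs are taken to lie in [0, 1] with sigmoid reconstructions (so the per-element cross-entropy applies), and the α default is arbitrary.

```python
import numpy as np

def cae_loss(x, x_hat, r, y, C, alpha=0.3):
    """Mixture pre-training loss L_CAE = (1 - alpha) * L_CE + alpha * L_CSL.

    x, x_hat : (N, D) inputs in [0, 1] and their sigmoid reconstructions
    r        : (N, K) reconstructed cost information for C(y_n, .)
    alpha    : trade-off weight; alpha = 0 recovers the plain autoencoder
    """
    eps = 1e-12  # numerical guard for the logarithms
    # Cross-entropy reconstruction error of the input.
    l_ce = -(x * np.log(x_hat + eps)
             + (1 - x) * np.log(1 - x_hat + eps)).sum()
    # Cost-sensitive loss on the reconstructed cost information.
    costs = C[y]
    z = np.where(costs == 0, 1.0, -1.0)
    l_csl = np.log1p(np.exp(z * (r - costs))).sum()
    return (1 - alpha) * l_ce + alpha * l_csl
```

Setting α = 0 degrades CAE to the ordinary AE, which makes the pre-training comparison in the experiments a clean ablation of the cost term.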
Conclusions

• CSL: makes any deep model cost-sensitive (see paper for CNN with CSL)
• CSDNN = CAE pre-training + CSL training: both techniques lead to significant improvements
Experiments

Compared models:
• FCNN: traditional fully-connected neural network for regular classification
• FCNN_CSL: the fully-connected regression network trained with the loss function L_CSL
• CSDNN: the proposed Cost-sensitive Deep Neural Network (CAE pre-training + CSL training)