multi-task machine learning - yunsheng...
![Page 1: Multi-Task Machine Learning - Yunsheng Baiyunshengb.com/wp-content/uploads/2017/11/Multi-Task...Dirty Approach g(U) and h(V) defined in different ways in the following works Learning](https://reader034.vdocuments.us/reader034/viewer/2022050600/5fa7a79eedd9275f2c4740e8/html5/thumbnails/1.jpg)
Multi-Task Machine Learning
Wasi Ahmad
Overview
● What is Multi-Task Learning?
● Multi-Task Learning: Motivation
● Multi-Task Learning Methods
● Recent Works on MTL for Deep Learning
What is Multi-Task Learning?
Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time, while exploiting commonalities and
differences across tasks.
- Wikipedia
What is Multi-Task Learning?
Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training
signals of related tasks as an inductive bias.
- Rich Caruana, 1997
Motivation
● Learning multiple tasks jointly with the aim of mutual benefit
● Inductive transfer helps to improve a model by introducing inductive bias
  ○ Common form of inductive bias: L1 regularization
  ○ L1 regularization leads to a preference for sparse solutions
● Improves generalization on other tasks
  ○ Caused by the inductive bias provided by the auxiliary task
Web Pages Categorization
● Classify documents into categories
● The classification of each category is a task
● The tasks of predicting different categories may be latently related
Courtesy: Multi-Task Learning: Theory, Algorithms, and Applications, SDM 2012
Collaborative Ordinal Regression
● The preference prediction of each user can be modeled using ordinal regression
● Some users have similar tastes and their predictions may also have similarities
● Perform multiple predictions simultaneously to exploit this similarity information
Courtesy: Multi-Task Learning: Theory, Algorithms, and Applications, SDM 2012
MTL for HIV Therapy Screening
● Hundreds of possible combinations of drugs, some of which use similar biochemical mechanisms
● The sample available for each combination is limited
● For a patient, predicting the outcome of using one combination is a task
● Exploit the similarity information by inferring multiple tasks simultaneously
Courtesy: Multi-Task Learning: Theory, Algorithms, and Applications, SDM 2012
Single Task Learning vs. Multi-Task Learning
Image courtesy: Multi-Task Learning: Theory, Algorithms, and Applications, SDM 2012
Learning Methods
Source: Multi-Task Learning: Theory, Algorithms, and Applications SDM 2012
Key Question
What to Share? How to Share?
MTL Methods (based on what to share?)
● Feature-based MTL
  ○ Aims to learn common features among different tasks
● Parameter-based MTL
  ○ Uses the model parameters learned for one task to help learn the parameters of other tasks
● Instance-based MTL
  ○ Identifies data instances in one task that are useful for other tasks
MTL Methods (based on how to share?)
● Feature-based MTL
  ○ Feature learning approach
  ○ Deep learning approach
● Parameter-based MTL
  ○ Low-rank approach
  ○ Task clustering approach
  ○ Task relation learning approach
  ○ Dirty approach
  ○ Multi-level approach
Feature Learning Approach
● Why do we need to learn common feature representations?
  ○ Original features may not have enough expressive power
● Two sub-categories of the feature learning approach
  ○ Feature transformation approach
  ○ Feature selection approach
Feature Learning Approach
● Feature transformation approach
  ○ The learned features are a linear or nonlinear transformation of the original feature representations
● Feature selection approach
  ○ Selects a subset of the original features as the learned representations
  ○ Eliminates useless features based on different criteria
Feature Transformation Approach
● Multi-task feedforward NN
A Survey on Multi-Task Learning, 2017
Feature Transformation Approach
● Context-sensitive multi-task feedforward NN
Inductive transfer with context-sensitive neural networks, 2008
Feature Transformation Approach
● Regularization Framework
● First term measures the empirical loss on the training sets of all the tasks
● Second term enforces the parameter matrix to be row-sparse
  ○ Equivalent to selecting features after transformation
A Survey on Multi-Task Learning, 2017
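The two terms described above can be sketched in NumPy. This is a minimal illustration, not the slide's own formula: it assumes a squared loss, an orthogonal transformation `U` shared by all tasks, a weight matrix `A` with one column per task, and a squared l2,1 penalty weighted by `lam`. All names and sizes are hypothetical.

```python
import numpy as np

def l21_norm(A):
    """Sum of the Euclidean norms of the rows of A."""
    return float(np.sum(np.linalg.norm(A, axis=1)))

def mtl_feature_objective(Xs, ys, U, A, lam):
    """Average squared loss per task plus a squared l2,1 penalty on A.

    Xs[i]: (n_i, d) inputs of task i; ys[i]: (n_i,) targets of task i.
    U: (d, d) orthogonal feature transformation shared by all tasks.
    A: (d, m) weights on the transformed features, column i belongs to task i.
    """
    loss = 0.0
    for i, (X, y) in enumerate(zip(Xs, ys)):
        pred = (X @ U) @ A[:, i]          # transform the features, then apply task weights
        loss += np.mean((pred - y) ** 2)  # first term: empirical loss of task i
    return loss + lam * l21_norm(A) ** 2  # second term: pushes whole rows of A to zero

rng = np.random.default_rng(0)
d, m = 4, 3
U, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal matrix
A = np.zeros((d, m))
A[0] = [1.0, -2.0, 2.0]                       # only transformed feature 0 is used
Xs = [rng.normal(size=(5, d)) for _ in range(m)]
ys = [(X @ U) @ A[:, i] for i, X in enumerate(Xs)]  # noiseless targets
print(l21_norm(A), mtl_feature_objective(Xs, ys, U, A, lam=0.1))
```

Because only one row of `A` is nonzero, the penalty counts a single row norm, which is the sense in which row sparsity selects features after the transformation.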
Feature Selection Approach
● Regularization Framework
● The regularizer on W enforces W to be row-sparse, which helps to select important features
A Survey on Multi-Task Learning, 2017
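One way to see how a row-sparse regularizer selects features is through the proximal operator of the l2,1 norm, which shrinks each row of W and zeroes rows whose norm falls below a threshold. The sketch below assumes the regularizer is the l2,1 norm (a common choice, not confirmed by the slide); `prox_l21` and `tau` are illustrative names.

```python
import numpy as np

def prox_l21(W, tau):
    """Row-wise group soft-thresholding: proximal operator of tau * ||W||_{2,1}."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zeros.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale

# Rows are features, columns are tasks.
W = np.array([[3.0, 4.0],    # row norm 5: kept and shrunk
              [0.3, 0.4],    # row norm 0.5: below the threshold, zeroed for all tasks
              [0.0, 2.0]])   # row norm 2: kept and shrunk
W_sparse = prox_l21(W, tau=1.0)
selected = np.flatnonzero(np.linalg.norm(W_sparse, axis=1) > 0)
print(W_sparse)
print("selected features:", selected)
```

A zeroed row removes that feature from every task at once, which is what distinguishes row sparsity from entrywise sparsity.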
Feature Transformation vs. Selection
● Feature transformation fits data better than the selection approach
● Feature transformation can generalize well
  ○ If there is no overfitting
● Feature selection has better interpretability
● Feature transformation is preferred
  ○ If an application needs better performance
● Feature selection is preferred
  ○ If the application needs decision support
A Survey on Multi-Task Learning, 2017
Two MTL Methods for Deep Learning
● Hard Parameter Sharing
  ○ Generally applied by sharing the hidden layers between all tasks.
  ○ Keeps several task-specific output layers.
● Soft Parameter Sharing
  ○ Each task has its own model with its own parameters.
  ○ The distance between the parameters of the models is regularized in order to encourage the parameters to be similar.
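The difference between the two schemes can be sketched in a few lines of NumPy. This is a toy illustration under assumptions not in the slides: a single tanh hidden layer, linear output heads, and a squared Euclidean distance as the soft-sharing penalty. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, hidden, n_tasks = 8, 16, 3

# Hard sharing: one hidden layer used by every task, one output head per task.
W_shared = rng.normal(scale=0.1, size=(d, hidden))
heads = [rng.normal(scale=0.1, size=(hidden, 1)) for _ in range(n_tasks)]

def hard_sharing_forward(X):
    h = np.tanh(X @ W_shared)            # shared representation
    return [h @ head for head in heads]  # task-specific outputs

# Soft sharing: each task owns a full copy of the hidden-layer weights,
# and a penalty pulls the copies toward each other.
task_weights = [rng.normal(scale=0.1, size=(d, hidden)) for _ in range(n_tasks)]

def soft_sharing_penalty(weights, lam):
    total = 0.0
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            total += np.sum((weights[i] - weights[j]) ** 2)
    return lam * total

X = rng.normal(size=(5, d))
outputs = hard_sharing_forward(X)
penalty = soft_sharing_penalty(task_weights, lam=0.01)
print([o.shape for o in outputs], penalty)
```

In training, the soft-sharing penalty would be added to the sum of the per-task losses, while hard sharing needs no extra term because the hidden layer is literally the same array for every task.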
Two MTL Methods for Deep Learning
Hard parameter sharing
Soft parameter sharing
Cross-stitch Networks
Courtesy: http://ruder.io/multi-task/
Low-Rank Approach
● Assumes the model parameters of different tasks share a low-rank subspace*.
● The objective function can be formulated as:
* A framework for learning predictive structures from multiple tasks and unlabeled data, 2005
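The low-rank assumption can be illustrated numerically. The sketch below assumes the decomposition w_i = u_i + Θᵀ v_i with ΘΘᵀ = I, which is how the cited framework is commonly written; it does not reproduce the slide's objective, and all sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, m = 10, 2, 6   # feature dimension, subspace dimension, number of tasks

# Theta has orthonormal rows (Theta @ Theta.T = I) and spans the shared subspace.
Q, _ = np.linalg.qr(rng.normal(size=(d, h)))
Theta = Q.T                          # shape (h, d)

V = rng.normal(size=(h, m))          # task coordinates inside the shared subspace
U = 0.01 * rng.normal(size=(d, m))   # small task-specific parts
W = U + Theta.T @ V                  # column i is w_i = u_i + Theta^T v_i

shared_rank = np.linalg.matrix_rank(Theta.T @ V)
singular_values = np.linalg.svd(W, compute_uv=False)
print(shared_rank)
print(np.round(singular_values, 3))
```

Only the first `h` singular values of `W` are large, so the task parameters lie close to an `h`-dimensional subspace even though each task has `d` parameters.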
Task Clustering Approach
● First, cluster the tasks into groups
  ○ Learn a task transfer matrix
  ○ Minimize pairwise within-class distances
  ○ Maximize pairwise between-class distances
● Second, learn a classifier on the training data of the tasks in a cluster
  ○ A weighted nearest neighbor classifier is proposed*
* Discovering Structure in Multiple Learning Tasks: The TC Algorithm, ICML 1996
Task Clustering Approach
● Learn task clusters under a regularization framework
  ○ Considering three orthogonal aspects
● Aspect 1: A global penalty to measure, on average, how large the parameters are:
* Clustered Multi-task Learning: A Convex Formulation, NIPS 2008
Task Clustering Approach
● Learn task clusters under a regularization framework
  ○ Considering three orthogonal aspects
● Aspect 2: A measure of between-cluster variance to quantify the distance among different clusters.
* Clustered Multi-task Learning: A Convex Formulation, NIPS 2008
Task Clustering Approach
● Learn task clusters under a regularization framework
  ○ Considering three orthogonal aspects
● Aspect 3: A measure of within-cluster variance to quantify the compactness of task clusters.
● Final regularizer:
* Clustered Multi-task Learning: A Convex Formulation, NIPS 2008
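The three aspects can be sketched for a fixed cluster assignment. This assumes the penalties take the usual sum-of-squares form (mean, between-cluster, within-cluster) and that the final regularizer is a weighted sum of the three; the weights `eps_mean`, `eps_between`, `eps_within` and the function names are hypothetical, and the cited paper additionally learns the clustering, which this sketch does not do.

```python
import numpy as np

def cluster_penalties(W, labels):
    """W: (d, m) with one column per task; labels[i]: cluster index of task i."""
    labels = np.asarray(labels)
    m = W.shape[1]
    w_bar = W.mean(axis=1)                       # mean parameter vector over all tasks
    omega_mean = m * np.sum(w_bar ** 2)          # aspect 1: overall size of the parameters
    omega_between = 0.0
    omega_within = 0.0
    for c in np.unique(labels):
        Wc = W[:, labels == c]
        wc_bar = Wc.mean(axis=1)                 # mean of cluster c
        omega_between += Wc.shape[1] * np.sum((wc_bar - w_bar) ** 2)  # aspect 2
        omega_within += np.sum((Wc - wc_bar[:, None]) ** 2)           # aspect 3
    return omega_mean, omega_between, omega_within

def clustered_regularizer(W, labels, eps_mean, eps_between, eps_within):
    om, ob, ow = cluster_penalties(W, labels)
    return eps_mean * om + eps_between * ob + eps_within * ow

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 6))
labels = [0, 0, 0, 1, 1, 2]
om, ob, ow = cluster_penalties(W, labels)
print(om, ob, ow)
```

With all three weights equal to 1 the regularizer reduces to the squared Frobenius norm of W, so the weights are what express a preference for compact, well-separated clusters.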
Task Clustering Approach
● Cluster tasks by identifying representative tasks
  ○ A subset of the given tasks
* Flexible clustered multi-task learning by learning representative tasks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016
Task Relation Learning Approach
● Two types of studies
  ○ Task relations are assumed to be known as a priori information
  ○ Learn task relations automatically from data
● Type 1: Task relations are given
  ○ Similar task parameters are expected to be close
  ○ Utilize task similarities to design regularizers
A Survey on Multi-Task Learning, 2017
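A regularizer of this kind can be sketched as a similarity-weighted sum of squared distances between task parameters. The form below is one common choice and is an assumption here, as are the names `similarity_regularizer`, `W`, and `S`; the slide does not state a specific formula.

```python
import numpy as np

def similarity_regularizer(W, S):
    """Sum over task pairs of S[i, j] * ||w_i - w_j||^2 (columns of W are tasks)."""
    m = W.shape[1]
    total = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            total += S[i, j] * np.sum((W[:, i] - W[:, j]) ** 2)
    return total

# Tasks 0 and 1 are declared similar; task 2 is unrelated to both.
W = np.array([[1.0, 1.1, -2.0],
              [0.5, 0.4,  3.0]])
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
reg = similarity_regularizer(W, S)
print(reg)
```

Only the pair marked as similar contributes, so task 2 is free to take very different parameters. For a symmetric `S` the same quantity equals tr(W L Wᵀ), where L = D − S is the graph Laplacian of the similarity matrix.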
Task Relation Learning Approach
● Two types of studies
  ○ Task relations are assumed to be known as a priori information
  ○ Learn task relations automatically from data
● Type 2: Learn task relations from data
  ○ Global learning model
    ■ Multi-task Gaussian process (defined as a prior on functional values for training data)
    ■ Keep the task covariance matrix positive definite
  ○ Local learning model
    ■ E.g., kNN classifier (learning function as a weighted voting of neighbors)
A Survey on Multi-Task Learning, 2017
Dirty Approach
● Assumption: the parameter matrix W can be decomposed into two component matrices U and V
● Objective function can be defined as:
● g(U) and h(V) can be defined as*:
* A dirty model for multi-task learning, NIPS 2010
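The decomposition can be sketched as follows. The code assumes W = U + V with g(U) = λ1‖U‖∞,1 (sum of row-wise maximum magnitudes, favoring features shared across tasks) and h(V) = λ2‖V‖1 (entrywise sparsity, absorbing task-specific deviations), which is how the cited dirty model is usually summarized; the function name and the example values are illustrative.

```python
import numpy as np

def dirty_penalty(U, V, lam1, lam2):
    """g(U) + h(V), with rows as features and columns as tasks."""
    g = lam1 * np.sum(np.max(np.abs(U), axis=1))  # l_inf,1: sum of row-wise max magnitudes
    h = lam2 * np.sum(np.abs(V))                  # l_1: entrywise sparsity
    return g + h

U = np.array([[2.0, -1.5,  1.0],    # feature 0 is shared by all tasks
              [0.0,  0.0,  0.0],    # feature 1 is not shared
              [0.5,  0.5, -0.5]])   # feature 2 is shared by all tasks
V = np.array([[0.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],      # only task 1 uses feature 1
              [0.0, 0.0, 0.0]])
W = U + V                           # full parameter matrix
penalty = dirty_penalty(U, V, lam1=1.0, lam2=0.5)
print(W)
print(penalty)
```

The full objective would add this penalty to the training loss evaluated at U + V, so the shared structure and the task-specific exceptions are learned jointly.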
Dirty Approach
● g(U) and h(V) are defined in different ways in the following works
  ○ Learning incoherent sparse and low-rank patterns from multiple tasks, SIGKDD 2010
  ○ Integrating low-rank and group-sparse structures for robust multi-task learning, SIGKDD 2011
  ○ Robust multi-task feature learning, SIGKDD 2012
  ○ Convex multi-task learning with flexible task clusters, ICML 2012
A Survey on Multi-Task Learning, 2017
Multi-Level Approach
● An extension of the dirty approach
● Assumption: the parameter matrix W can be decomposed into h component matrices
● The multi-level approach has more expressive power than the dirty approach
● Represents task clusters as a tree and learns task relations from that structure
A Survey on Multi-Task Learning, 2017
Deep Relationship Networks
Courtesy: http://ruder.io/multi-task/
Fully Adaptive Feature Sharing
Courtesy: http://ruder.io/multi-task/
Cross-stitch Networks
Courtesy: http://ruder.io/multi-task/
A Joint Many Task Model
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks, EMNLP 2017
Sluice Networks
Courtesy: http://ruder.io/multi-task/
Multi-Task Sequence to Sequence Learning
Multi-Task Sequence to Sequence Learning, ICLR 2016
Multi-Task Learning for IR tasks
Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval, NAACL 2015
Multi-Task Domain Adaptation
Multi-Task Domain Adaptation for Sequence Tagging, Rep4NLP, 2017
Adversarial Multi-Task Learning
Adversarial Multi-task Learning for Text Classification, ACL 2017
One Model to Learn Them All
Reference
● A Survey on Multi-Task Learning
● An Overview of Multi-Task Learning in Deep Neural Networks
Thank You