TRANSCRIPT
[Slide 1]
Meta-Learning with Implicit Gradients
Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine (2019)
Mayank Mittal
[Slide 2]
Model-Agnostic Meta-Learning
Images from: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Finn et al. (2017)
▪ Learn a general-purpose internal representation that is transferable across different tasks
Meta-learning objective:
\[
\min_{\theta} \; \mathbb{E}_{\mathcal{T}_i \sim p(\mathcal{T})} \big[ \mathcal{L}\big(\phi_i, \mathcal{D}_i^{test}\big) \big], \qquad \phi_i = \mathcal{A}lg\big(\theta, \mathcal{D}_i^{tr}\big)
\]
where \(p(\mathcal{T})\) is the distribution over tasks, \(\mathcal{D}_i^{tr}, \mathcal{D}_i^{test}\) are data drawn from task \(\mathcal{T}_i\), \(\mathcal{L}\) is the loss function, and \(\phi_i\) are the task-specific parameters.
[Slide 3]
Model-Agnostic Meta-Learning
Images from: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Finn et al. (2017)
Inner level / adaptation step:
\[
\phi_i = \mathcal{A}lg\big(\theta, \mathcal{D}_i^{tr}\big) = \theta - \alpha \, \nabla_{\theta} \mathcal{L}\big(\theta, \mathcal{D}_i^{tr}\big)
\]
Outer level / meta-optimization step:
\[
\theta \leftarrow \theta - \eta \, \nabla_{\theta} \sum_i \mathcal{L}\big(\mathcal{A}lg(\theta, \mathcal{D}_i^{tr}), \mathcal{D}_i^{test}\big)
\]
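The two levels can be sketched in a few lines of Python on a hypothetical toy problem: scalar parameters and per-task loss \((w - a_i)^2\). The task setup is illustrative, not the paper's few-shot benchmarks, and the inner level is a single gradient step as in one-step MAML.

```python
# Minimal MAML sketch on a toy problem (hypothetical setup): each task i
# wants parameters near a_i, with per-task loss L_i(w) = (w - a_i)^2.

def task_grad(w, a):
    # Gradient of L_i(w) = (w - a)^2
    return 2.0 * (w - a)

def maml_step(theta, tasks, alpha=0.1, eta=0.05):
    """One outer (meta) update; the inner level is a single gradient step."""
    meta_grad = 0.0
    for a in tasks:
        # Inner level / adaptation step: phi = theta - alpha * dL_i/dtheta
        phi = theta - alpha * task_grad(theta, a)
        # Outer gradient flows THROUGH the inner update:
        # dL_i(phi)/dtheta = dL_i/dphi * dphi/dtheta,
        # with dphi/dtheta = 1 - alpha * d2L_i/dtheta2 = 1 - 2*alpha here.
        meta_grad += task_grad(phi, a) * (1.0 - 2.0 * alpha)
    # Outer level / meta-optimization step
    return theta - eta * meta_grad

tasks = [1.0, 2.0, 3.0]
theta = 0.0
for _ in range(200):
    theta = maml_step(theta, tasks)
# theta converges toward the task mean (2.0), a good shared initialization
print(round(theta, 3))  # 2.0
```

Note how the outer gradient multiplies by dphi/dtheta: with many inner steps this factor becomes a product over the whole optimization path, which is exactly the cost iMAML avoids.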
[Slide 4]
Model-Agnostic Meta-Learning
Images from: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Finn et al. (2017)
[Slide 5]
Model-Agnostic Meta-Learning
Images from: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Finn et al. (2017)
Source of the problem: gradients flow through the iterative updates of the inner level!
▪ Need to store the entire optimization path
▪ Computationally expensive
Hence, heuristics such as first-order MAML and Reptile are used.
[Slide 6]
Motivation
Can we make the meta-gradient independent of the inner-level optimization path?
[Slide 7]
Implicit MAML (iMAML)
▪ In MAML, \(\mathcal{A}lg(\theta, \mathcal{D}_i^{tr})\) is an iterative gradient-descent procedure
▪ The dependence of the model parameters on the meta-parameters shrinks and vanishes as the number of inner gradient steps increases
▪ Instead, iMAML uses the optimal solution of a regularized loss:
\[
\mathcal{A}lg^{\star}\big(\theta, \mathcal{D}_i^{tr}\big) = \operatorname*{argmin}_{\phi} \; \mathcal{L}\big(\phi, \mathcal{D}_i^{tr}\big) + \frac{\lambda}{2} \lVert \phi - \theta \rVert^2
\]
The proximal regularizer \(\frac{\lambda}{2}\lVert \phi - \theta \rVert^2\) encourages dependence between the model parameters and the meta-parameters.
[Slide 8]
Implicit MAML (iMAML)
Issue: the meta-gradient still requires computing the Jacobian \(\frac{d\,\mathcal{A}lg^{\star}(\theta,\,\mathcal{D}_i^{tr})}{d\theta}\), i.e.
\[
\nabla_{\theta} F(\theta) = \sum_i \frac{d\,\mathcal{A}lg^{\star}\big(\theta, \mathcal{D}_i^{tr}\big)}{d\theta} \, \nabla_{\phi} \mathcal{L}\big(\phi_i, \mathcal{D}_i^{test}\big)
\]
Solution: use the implicit Jacobian!
[Slide 9]
Implicit Jacobian Lemma
Let \(\phi_i = \mathcal{A}lg^{\star}(\theta, \mathcal{D}_i^{tr})\) and write \(\hat{\mathcal{L}}_i(\phi) = \mathcal{L}(\phi, \mathcal{D}_i^{tr})\). If \(\big(I + \frac{1}{\lambda}\nabla^2_{\phi}\hat{\mathcal{L}}_i(\phi_i)\big)\) is invertible, then
\[
\frac{d\,\mathcal{A}lg^{\star}\big(\theta, \mathcal{D}_i^{tr}\big)}{d\theta} = \Big(I + \frac{1}{\lambda}\nabla^2_{\phi}\hat{\mathcal{L}}_i(\phi_i)\Big)^{-1}
\]
Proof. Apply the stationary-point condition of the regularized inner problem,
\[
\nabla_{\phi}\hat{\mathcal{L}}_i(\phi_i) + \lambda\,(\phi_i - \theta) = 0,
\]
and differentiate both sides with respect to \(\theta\):
\[
\nabla^2_{\phi}\hat{\mathcal{L}}_i(\phi_i)\,\frac{d\phi_i}{d\theta} + \lambda\Big(\frac{d\phi_i}{d\theta} - I\Big) = 0
\;\;\Rightarrow\;\;
\frac{d\phi_i}{d\theta} = \Big(I + \frac{1}{\lambda}\nabla^2_{\phi}\hat{\mathcal{L}}_i(\phi_i)\Big)^{-1}.
\]
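The lemma can be checked numerically on an assumed toy quadratic inner loss, where the Hessian is an explicit matrix and the regularized minimizer has a closed form; the finite-difference Jacobian of the inner solution should match the lemma's expression.

```python
import numpy as np

# Numeric check of the implicit Jacobian lemma on a toy quadratic inner
# loss (assumed example): L_hat(phi) = 0.5 phi^T A phi - b^T phi, Hessian A.
rng = np.random.default_rng(0)
n, lam = 4, 2.0
A = rng.normal(size=(n, n))
A = A @ A.T + np.eye(n)          # symmetric positive-definite Hessian
b = rng.normal(size=n)

def inner_solution(theta):
    # argmin_phi L_hat(phi) + (lam/2)||phi - theta||^2
    # Stationarity: A phi - b + lam (phi - theta) = 0
    return np.linalg.solve(A + lam * np.eye(n), b + lam * theta)

theta = rng.normal(size=n)

# Jacobian d phi / d theta by central finite differences, column by column
eps = 1e-6
J_fd = np.stack([
    (inner_solution(theta + eps * e) - inner_solution(theta - eps * e)) / (2 * eps)
    for e in np.eye(n)
], axis=1)

# Lemma: d phi / d theta = (I + (1/lam) * Hessian)^{-1}
J_lemma = np.linalg.inv(np.eye(n) + A / lam)

print(np.allclose(J_fd, J_lemma, atol=1e-5))  # True
```

Because the inner solution here is linear in theta, the finite-difference Jacobian is exact up to floating-point error, so the agreement is tight.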
[Slide 10]
Practical Algorithm
Issues:
▪ obtaining an exact solution to the inner-level optimization problem may not be feasible in practice
▪ computing the Jacobian by explicitly forming and inverting the matrix may be intractable for large networks
Meta-gradient:
\[
g_i = \Big(I + \frac{1}{\lambda}\nabla^2_{\phi}\hat{\mathcal{L}}_i(\phi_i)\Big)^{-1} \nabla_{\phi}\mathcal{L}\big(\phi_i, \mathcal{D}_i^{test}\big)
\]
Solution: use approximations!
[Slide 11]
Practical Algorithm
Solution: use approximations!
1. Approximate algorithm solution: run an iterative inner-level solver until \(\lVert \phi_i - \mathcal{A}lg^{\star}(\theta, \mathcal{D}_i^{tr}) \rVert \le \delta\)
2. Approximate matrix-inversion product: find \(g_i\) with \(\big\lVert \big(I + \frac{1}{\lambda}\nabla^2_{\phi}\hat{\mathcal{L}}_i(\phi_i)\big)\, g_i - \nabla_{\phi}\mathcal{L}(\phi_i, \mathcal{D}_i^{test}) \big\rVert \le \delta'\), solved using the conjugate gradient (CG) method, which only needs Hessian-vector products
Meta-gradient: \(\nabla_{\theta} F(\theta) \approx \sum_i g_i\)
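Approximation 2 can be sketched as follows, assuming a small explicit toy Hessian in place of a network's; the point is that CG touches the matrix only through Hessian-vector products, never forming or inverting it.

```python
import numpy as np

# Sketch of approximation 2 (assumed toy setting): solve
# (I + (1/lam) H) g = v for g with conjugate gradients, using only
# Hessian-vector products -- no explicit matrix inverse.
rng = np.random.default_rng(1)
n, lam = 6, 2.0
H = rng.normal(size=(n, n))
H = H @ H.T + np.eye(n)      # stand-in for the inner-loss Hessian (SPD)
v = rng.normal(size=n)       # stand-in for the test-loss gradient at phi_i

def hvp(w):
    # In a real network this would be a double-backward Hessian-vector
    # product; here we just multiply by the explicit toy Hessian.
    return H @ w

def matvec(w):
    return w + hvp(w) / lam   # (I + H/lam) w

def conjugate_gradient(matvec, v, iters=50, tol=1e-10):
    g = np.zeros_like(v)
    r = v - matvec(g)         # residual
    p = r.copy()              # search direction
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        g += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return g

g = conjugate_gradient(matvec, v)
exact = np.linalg.solve(np.eye(n) + H / lam, v)
print(np.allclose(g, exact, atol=1e-6))  # True
```

In exact arithmetic CG solves an n-dimensional SPD system in at most n iterations; in practice a handful of iterations give a meta-gradient accurate enough for the outer update.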
[Slide 12]
Practical Algorithm
Implicit-gradient accuracy: with approximation errors \(\delta\) and \(\delta'\) as above, the computed meta-gradient is within a small, controllable error of the exact one (valid under additional regularity assumptions).
1. Approximate algorithm solution (iterative inner-level solver)
2. Approximate matrix-inversion product, solved using the conjugate gradient (CG) method
[Slide 13]
Implicit MAML Algorithm
Inner level / adaptation step
Outer level / meta-optimization step
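Putting the pieces together, here is a minimal iMAML loop on a hypothetical toy task family (per-task loss \(\lVert \phi - a_i \rVert^2\), whose Hessian is \(2I\), so the matrix solve is closed-form; a real network would use CG with Hessian-vector products at that step).

```python
import numpy as np

# End-to-end iMAML sketch on a toy problem (hypothetical tasks, not the
# paper's benchmarks): task i has loss L_i(phi) = ||phi - a_i||^2.

rng = np.random.default_rng(2)
n, lam = 3, 1.0
tasks = [rng.normal(size=n) for _ in range(5)]   # the task targets a_i

def inner_solve(theta, a, steps=100, lr=0.1):
    # Approximate Alg*(theta): gradient descent on
    # L_i(phi) + (lam/2)||phi - theta||^2 (approximation 1)
    phi = theta.copy()
    for _ in range(steps):
        grad = 2.0 * (phi - a) + lam * (phi - theta)
        phi -= lr * grad
    return phi

def meta_gradient(theta, a):
    phi = inner_solve(theta, a)
    v = 2.0 * (phi - a)          # test-loss gradient at phi (toy: same loss)
    # Solve (I + H/lam) g = v; here H = 2I, so the solve is closed-form.
    # For a general network this is approximation 2, done with CG.
    return v / (1.0 + 2.0 / lam)

theta = np.zeros(n)
for _ in range(300):
    g = sum(meta_gradient(theta, a) for a in tasks)
    theta -= 0.05 * g            # outer level / meta-optimization step

# The meta-initialization drifts toward the task mean
print(np.allclose(theta, np.mean(tasks, axis=0), atol=1e-2))  # True
```

Note that no gradients ever flow through the 100 inner gradient-descent steps: the meta-gradient depends only on the solution phi, which is the memory saving the implicit approach buys.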
[Slide 14]
Computation Bounds
▪ Theory: iMAML computes the meta-gradient to within a small error in a memory-efficient way
▪ Experiment: empirical comparison of compute and memory usage
[Slide 15]
Results on few-shot learning
[Slide 16]
Future Work
▪ Broader classes of inner-loop optimization
  ▪ dynamic programming
  ▪ energy-based models
  ▪ actor-critic RL
▪ More flexible regularizers
  ▪ learn a vector- or matrix-valued \(\lambda\) (regularization coefficient)
[Slide 17]
References
▪ Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Finn et al. (2017)
▪ Meta-Learning with Implicit Gradients, Rajeswaran et al. (2019)
▪ Notes on iMAML, Ferenc Huszár (2019)
[Slide 18]
Thank you!