Fine-Tuning with Regularized Optimization
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization [Jiang et al., 2020]
Aggressive fine-tuning often leads to over-fitting, so the model fails to generalize to unseen data. To combat this in a principled manner, [Jiang et al., 2020] propose (1) smoothness-inducing regularization, which controls model complexity by penalizing large changes in the model's output under small perturbations of the input embeddings, and (2) Bregman proximal point optimization, an instance of trust-region methods that prevents aggressive updating by keeping each iterate's predictions close to those of the previous one. See [Jiang et al., 2020] and their repository for additional details.
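To make the two ingredients concrete, below is a minimal PyTorch sketch, not the authors' implementation: it assumes a classifier model that maps input embeddings directly to logits via model(inputs_embeds=...), and all names (smoothness_regularizer, bregman_proximal_penalty, eps, step_size, prox_model) are illustrative. The paper's per-example projection of the perturbation is simplified here to a single normalized gradient step.

import torch
import torch.nn.functional as F

def _sym_kl(p_logits, q_logits):
    # Symmetric KL divergence between two categorical distributions,
    # given as logits; used by both regularizers below.
    p = F.log_softmax(p_logits, dim=-1)
    q = F.log_softmax(q_logits, dim=-1)
    return (F.kl_div(p, q.exp(), reduction="batchmean")
            + F.kl_div(q, p.exp(), reduction="batchmean"))

def smoothness_regularizer(model, embeddings, clean_logits,
                           eps=1e-5, step_size=1e-3):
    # (1) Smoothness-inducing regularization: penalize how much the model's
    # output changes under a small, approximately worst-case perturbation
    # of the input embeddings.
    noise = (torch.randn_like(embeddings) * eps).requires_grad_()
    adv_loss = _sym_kl(model(inputs_embeds=embeddings + noise),
                       clean_logits.detach())
    # One gradient-ascent step on the noise (simplified projection).
    grad, = torch.autograd.grad(adv_loss, noise)
    noise = (noise + step_size * grad / (grad.norm() + 1e-8)).detach()
    return _sym_kl(model(inputs_embeds=embeddings + noise),
                   clean_logits.detach())

def bregman_proximal_penalty(model, prox_model, embeddings):
    # (2) Bregman proximal point: keep the current model's predictions
    # close to those of a slowly updated (e.g., exponential-moving-average)
    # copy of itself, which discourages aggressive parameter updates.
    with torch.no_grad():
        prox_logits = prox_model(inputs_embeds=embeddings)
    return _sym_kl(model(inputs_embeds=embeddings), prox_logits)

In this sketch, the per-step training objective would be task_loss plus lambda_s * smoothness_regularizer(...) plus mu * bregman_proximal_penalty(...), with prox_model's parameters updated as a moving average of model's after each optimizer step.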
Edited by Christopher Lee Lübbers