
Feat/sophia

Lars Benedikt Kaesberg requested to merge feat/sophia into main
  • Implemented the Sophia optimizer.
  • Added SophiaH (Sophia with the Hutchinson estimator) as an option in the train loop.
  • Added gradient clipping to the train loop.
  • Added automatic mixed precision (AMP) autocast to bfloat16.
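
For reviewers unfamiliar with two of the pieces listed above, here is a minimal, dependency-free sketch of a Hutchinson-style diagonal Hessian estimate and of gradient clipping by global norm. It is not the code from this merge request: the names `hutchinson_diag`, `clip_grad_norm`, and `make_hvp` are hypothetical, the Hessian is given as an explicit small matrix for illustration only, and a real train loop would presumably obtain Hessian-vector products from autograd instead.

```python
import math
import random


def make_hvp(matrix):
    """Return a function v -> H @ v for an explicit matrix (illustration only)."""
    def hvp(v):
        return [sum(a * b for a, b in zip(row, v)) for row in matrix]
    return hvp


def hutchinson_diag(hvp, dim, num_samples=1, rng=None):
    """Estimate diag(H) as the mean of u * (H @ u) over Rademacher vectors u.

    Each entry of u is +1 or -1 with equal probability, so the off-diagonal
    contributions u_i * H_ij * u_j average out to zero in expectation.
    """
    rng = rng or random.Random(0)
    acc = [0.0] * dim
    for _ in range(num_samples):
        u = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hu = hvp(u)
        for i in range(dim):
            acc[i] += u[i] * hu[i]
    return [a / num_samples for a in acc]


def clip_grad_norm(grads, max_norm):
    """Rescale grads in place so their global L2 norm is at most max_norm.

    Returns the norm measured before clipping.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm:
        scale = max_norm / total
        for i in range(len(grads)):
            grads[i] *= scale
    return total


if __name__ == "__main__":
    # For a diagonal Hessian the estimate is exact, whatever signs are drawn.
    hvp = make_hvp([[2.0, 0.0], [0.0, 5.0]])
    print(hutchinson_diag(hvp, dim=2, num_samples=3))

    grads = [3.0, 4.0]
    print(clip_grad_norm(grads, max_norm=1.0), grads)
```

With off-diagonal entries present, a single sample is noisy, so the estimate is usually averaged over several samples or smoothed over training steps.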
Edited by Niklas Bauer
