Trying a new order of batches
We noticed that our model learns mostly from the last dataset (para_data_set), since it contributes a much higher number of batches than the other two.
Idea: Split the batches of all datasets in such a way that our model learns from the three datasets at almost the same rate.
Do:
- size_language_training = dataloaders.para_train_dataloader_size - dataloaders.sst_train_dataloader_size
- size_language_pretrain = dataloaders.sst_train_dataloader_size - dataloaders.sts_train_dataloader_size
- size_language_finetune = dataloaders.sts_train_dataloader_size
Train:
1. Train on para_data_set for the first size_language_training batches.
2. Train on sst_data_set and para_data_set for the next size_language_pretrain batches.
3. Train on all three datasets for the last size_language_finetune batches (see the sketch below).
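A minimal sketch of this three-stage schedule, assuming each dataloader can be iterated batch by batch and that a generic `train_step` helper and the `*_dataloader` / `*_dataloader_size` attribute names exist (these names are assumptions for illustration, not the actual project API):

```python
# Sketch of the staged batch order; train_step and the dataloader
# attribute names are hypothetical placeholders.
from itertools import islice

def train_one_epoch(model, dataloaders, optimizer, train_step):
    para_iter = iter(dataloaders.para_train_dataloader)
    sst_iter = iter(dataloaders.sst_train_dataloader)
    sts_iter = iter(dataloaders.sts_train_dataloader)

    size_language_training = (dataloaders.para_train_dataloader_size
                              - dataloaders.sst_train_dataloader_size)
    size_language_pretrain = (dataloaders.sst_train_dataloader_size
                              - dataloaders.sts_train_dataloader_size)
    size_language_finetune = dataloaders.sts_train_dataloader_size

    # Stage 1: para only, until the remaining para batches match the SST count.
    for batch in islice(para_iter, size_language_training):
        train_step(model, optimizer, batch, task="para")

    # Stage 2: SST and para together, until the remaining counts match STS.
    for _ in range(size_language_pretrain):
        train_step(model, optimizer, next(sst_iter), task="sst")
        train_step(model, optimizer, next(para_iter), task="para")

    # Stage 3: all three datasets for the last batches.
    for _ in range(size_language_finetune):
        train_step(model, optimizer, next(sst_iter), task="sst")
        train_step(model, optimizer, next(para_iter), task="para")
        train_step(model, optimizer, next(sts_iter), task="sts")
```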
The implementation can be found in the branch train_with_different_batch_order.