I am training a multi-label BERT model, and the per-batch training and validation loss curves look as shown in the attached PDFs. I am trying to understand why the validation loss oscillates so much. Could you please give me some pointers to explore?
Training loss curve (per batch): Training.pdf (27.8 KB)
Validation loss curve (per batch): Validation.pdf (33.9 KB)
Train vs. validation loss (per epoch): Train vs Val.pdf (30.7 KB)
The training configuration is:

```python
from transformers import (AutoConfig, AutoModelForSequenceClassification,
                          AutoTokenizer, get_linear_schedule_with_warmup)
from torch.optim import AdamW

model_checkpoint = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint,
                                          do_lower_case=True,
                                          force_download=True)
bert_config = AutoConfig.from_pretrained(model_checkpoint)
bert_config.num_labels = 20
bert_config.problem_type = "multi_label_classification"
# Note: from_config builds the architecture with randomly initialized
# weights; from_pretrained would load the checkpoint weights.
model = AutoModelForSequenceClassification.from_config(bert_config)
model = model.to(device)

epochs = 50
batch_size = 16
lr = 2e-5
optimizer = AdamW(model.parameters(), lr=lr, eps=1e-8)
scheduler = get_linear_schedule_with_warmup(optimizer,
                                            num_warmup_steps=0,
                                            num_training_steps=total_steps)
```
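For context, `total_steps` is not defined in the snippet above; for a linear schedule it is normally the total number of optimizer steps over the whole run (one step per batch times the number of epochs). A minimal sketch of how I compute it, with a hypothetical dataset size for illustration:

```python
import math

# Hypothetical dataset size for illustration; in the real run this
# comes from the training set.
num_train_examples = 8000
batch_size = 16
epochs = 50

# One optimizer step per batch, so:
steps_per_epoch = math.ceil(num_train_examples / batch_size)
total_steps = steps_per_epoch * epochs
```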
Please let me know if you need any additional details.