I am using a custom dual-branch architecture based on DenseNet121 that takes as input a frontal and a lateral X-ray of the same patient. I am using a Learner object and a custom ImageList.
This is how my architecture looks. The frontal and lateral branches are pretrained on ImageNet.
This is how I’ve tried to split the model after passing it to the learner:
learn = learn.split([[learn.model.frontal_cnn, learn.model.lateral_cnn], learn.model.joined_cnn])
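My understanding (an assumption on my part, not verified against the fastai source) is that .split() just defines layer groups so that freeze/unfreeze and discriminative learning rates act per group — in plain PyTorch the analogue would be optimizer parameter groups. A toy sketch with stand-in modules, not fastai internals:

```python
import torch
from torch import nn

# Hypothetical stand-ins for the three submodules of the dual-branch model.
frontal_cnn = nn.Linear(8, 4)
lateral_cnn = nn.Linear(8, 4)
joined_cnn = nn.Linear(8, 2)

# Group 1: the two pretrained branches (low lr);
# group 2: the newly initialized head (higher lr).
opt = torch.optim.SGD([
    {"params": list(frontal_cnn.parameters()) + list(lateral_cnn.parameters()),
     "lr": 1e-4},
    {"params": list(joined_cnn.parameters()), "lr": 1e-2},
])
```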
Afterwards I ran the following code to find a good learning rate for my model:
learn.freeze()
learn.lr_find(num_it=300, wd=1e-4, end_lr=1000)
learn.recorder.plot()
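If I understand freeze() correctly (again an assumption, not checked against the fastai source), after calling it only the last layer group still receives gradients, so lr_find here is effectively probing the head alone. In plain PyTorch terms:

```python
import torch.nn as nn

# Stand-in for a split model: the "branch" groups plus a head group.
branches = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 4))
head = nn.Linear(4, 2)

# freeze(): disable gradients for everything except the last group.
for p in branches.parameters():
    p.requires_grad = False

trainable = [p for p in list(branches.parameters()) + list(head.parameters())
             if p.requires_grad]
# Only the head's weight and bias remain trainable.
```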
However, this produces the following weird-looking plot. As much as I would like it to be true, I don't think 1e+2 is a good learning rate for my model.
If I unfreeze the whole network (using learn.unfreeze()) and run lr_find again, the plot looks normal.
Am I using the .split() function correctly? And if so, what happened with the first plot?