I am currently trying to solve a similar bug which somehow prevents my small custom NN from training properly on sequence data transformed to images, i.e. train and valid loss and accuracy do not improve and stay almost the same over several epochs.
So far I have tried these strategies, but without success yet:
- Verified that the y values are handed over as int values and that the data is stored as a categorical variable (print the DataBunch object to check). Otherwise fastai will misinterpret your application and will not choose the right loss function. (In my case with two labels the loss function should be `torch.nn.functional.cross_entropy`, afaik.)
- Be sure that your NN's end stage is compatible with your loss function (cross-entropy loss has a LogSoftmax included, so you don't need one in your NN; see the sketch after this list).
- The `create_cnn` function uses `apply_init(model[1], nn.init.kaiming_normal_)` for the new head, i.e. the new and untrained part. (However, so far I could not see a huge difference when I use it for my NN, maybe a little more change in the parameters, which should be a good thing when visualized with TensorBoard.)
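To illustrate the last two points, here is a minimal sketch of what I mean (assuming fastai v1 and a 2-class problem; the layer sizes and the 512 in-features are made up and need to match your actual network body):

```python
import torch.nn as nn
from fastai.torch_core import apply_init

# Hypothetical custom head for a 2-class problem. Note that it ends in a
# plain Linear layer: no LogSoftmax, because torch.nn.functional.cross_entropy
# already applies log-softmax + NLL internally.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),          # needs PyTorch >= 1.2; fastai.layers.Flatten works too
    nn.Linear(512, 256),   # 512 in-features is an assumption, adjust to your body
    nn.ReLU(inplace=True),
    nn.Linear(256, 2),     # raw logits, one per class
)

# The same initialization create_cnn applies to its new, untrained head:
apply_init(head, nn.init.kaiming_normal_)
```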
I also tried it with an adapted pretrained ResNet18 on my image data and got the same error. Because of that I am currently double-checking my data object setup, my loss function, and my simple NN setup.
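This is roughly how I check the data object and the loss function fastai picked (a sketch assuming fastai v1; `path` and the folder layout are placeholders for your actual data):

```python
from fastai.vision import *

# Hypothetical setup: `path` points to a folder with one subfolder per class.
path = Path('data/images')
data = ImageDataBunch.from_folder(path, valid_pct=0.2, size=64)

print(data)             # y should show up as a CategoryList in train/valid
print(data.train_ds.y)  # labels should print as categories backed by int codes
print(data.c)           # number of classes; 2 in my case

learn = create_cnn(data, models.resnet18, metrics=accuracy)
print(learn.loss_func)  # should be cross-entropy for a categorical target
```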
Maybe this helps you, or maybe you have already tried other approaches to solve this problem?
Kind regards
Michael