I get a working learner… well, almost working. It runs forward fine, but when calculating the loss it turns out that the train_dl labels are floats, while PyTorch expects them to be longs. image_data_from_csv uses ImageMultiDataset.from_folder, which explicitly creates the target self.y with dtype=np.int64. DataBunch.create() seems to only take the datasets and pass them to DataLoader.
I tried the same approach for multi-class image classification and it works fine.
So any ideas where to look?
Nope, PyTorch expects the target to be float… as long as you use the right loss.
Since you have a multi-label classification problem, you shouldn’t use F.cross_entropy as a loss function (which relies on a softmax) but F.binary_cross_entropy_with_logits (which relies on a sigmoid). This one expects floats, which is why we set up the dataloader to send your targets as floats.
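To make the dtype difference concrete, here is a minimal sketch (the logits and targets are made up for illustration): F.cross_entropy takes class-index targets of dtype long, while F.binary_cross_entropy_with_logits takes float targets with the same shape as the logits.

```python
import torch
import torch.nn.functional as F

# A fake batch of raw model outputs: 2 samples, 3 classes.
logits = torch.tensor([[2.0, -1.0, 0.5],
                       [0.0, 1.0, -0.5]])

# Single-label case: cross_entropy (softmax-based) wants
# class indices of dtype long, shape (batch,).
targets_long = torch.tensor([0, 2])
loss_ce = F.cross_entropy(logits, targets_long)

# Multi-label case: binary_cross_entropy_with_logits (sigmoid-based)
# wants a float multi-hot tensor with the same shape as the logits.
targets_float = torch.tensor([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 1.0]])
loss_bce = F.binary_cross_entropy_with_logits(logits, targets_float)

print(loss_ce.item(), loss_bce.item())
```

So the float targets coming out of the dataloader are exactly what the multi-label loss needs.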
We’re not sure - if anyone would like to help us figure that out, it would be great! A first step would be to import each thing in fastai/vision/__init__.py separately to see which module is doing this. Then try running the code in that module a bit at a time to see which function is causing it.
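The first step above can be sketched as a small bisection script. The module names below are placeholders; substitute the actual imports listed in fastai/vision/__init__.py.

```python
import importlib
import traceback

# Placeholder names - replace with the modules imported
# by fastai/vision/__init__.py.
candidates = ["json", "csv", "definitely_missing_module_xyz"]

failures = []
for name in candidates:
    try:
        # Import each module in isolation so the first failure
        # doesn't mask the others.
        importlib.import_module(name)
    except Exception:
        failures.append((name, traceback.format_exc()))

for name, tb in failures:
    print(f"{name} failed:\n{tb}")
```

Once you know which import triggers the problem, you can repeat the same bisection inside that module, running it a bit at a time.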
All modules in vision cause this bug, and when I moved all the .py files up a folder, adjusting the relative paths, they all imported with no problems - maddening bug!