Thanks for that. I managed to figure it out after a lot of trial and error. I also wanted to check whether this was a Windows issue, so I tested on AWS and got a similar error:
```
RuntimeError: "lerp_cuda" not implemented for 'Long'
```
The error on Colab is maybe a bit more to the point:
```
RuntimeError: expected dtype long for weights but got dtype long
```
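For context, both errors trace back to `torch.lerp`, which CutMix uses to blend inputs: it is only implemented for floating-point tensors, so integer (`Long`) tensors reaching the blend step raise exactly this kind of error. A minimal sketch in plain PyTorch (outside fastai) showing the behaviour:

```python
import torch

# lerp interpolates: out = start + weight * (end - start).
# It is only implemented for floating-point dtypes.
a = torch.zeros(3)  # float32 by default
b = torch.ones(3)
print(torch.lerp(a, b, 0.5))  # blends fine on float tensors

# The same call on Long tensors raises a RuntimeError like
# '"lerp_cpu" not implemented for 'Long'' (lerp_cuda on GPU).
try:
    torch.lerp(a.long(), b.long(), 0.5)
except RuntimeError as e:
    print("RuntimeError:", e)
```

This is only an illustration of the underlying dtype restriction, not the exact code path fastai runs.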
And it all came down to 1 line!
It seems that you have to create the learner twice (not sure why). The first is when you want to view the batch (as in the example):
```python
learn = cnn_learner(dls, resnet50, loss_func=CrossEntropyLossFlat(), cbs=cutmix,
                    metrics=[accuracy, error_rate])
learn._do_begin_fit(1)
learn.epoch, learn.training = 0, True
learn.dl = dls.train
b = dls.one_batch()
learn._split(b)
learn('begin_batch')
_, axs = plt.subplots(3, 3, figsize=(9, 9))
dls.show_batch(b=(cutmix.x, cutmix.y), ctxs=axs.flatten())
```
Because that learner had already been put through those manual fit steps, calling `learn.fit_one_cycle(1)` on it afterwards resulted in the errors above. It works if you recreate the learner and do this instead:
```python
learn = cnn_learner(dls, resnet50, loss_func=CrossEntropyLossFlat(), cbs=cutmix,
                    metrics=[accuracy, error_rate])
learn.fit_one_cycle(1)
```