I am using CrossEntropyLoss with class weights for an imbalanced dataset. My problem is that I cannot reproduce in PyTorch the validation loss that fastai reports.
Here is a simple code example:
from fastai.callbacks import MixUpCallback
from fastai.vision import *
path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)
weight = torch.Tensor([0.3, 0.7]).cuda()
learn = cnn_learner(data, models.resnet18, metrics=accuracy, loss_func=FlattenedLoss(torch.nn.CrossEntropyLoss, weight))
learn.fit(1, lr=0.01)
_, _, losses = learn.get_preds(ds_type=DatasetType.Valid, with_loss=True)
print(torch.mean(losses))
learn = cnn_learner(data, models.resnet18, metrics=accuracy, loss_func=FlattenedLoss(torch.nn.CrossEntropyLoss, weight), callback_fns=[MixUpCallback])
learn.fit(1, lr=0.01)
_, _, losses = learn.get_preds(ds_type=DatasetType.Valid, with_loss=True)
print(torch.mean(losses))
which outputs
epoch train_loss valid_loss accuracy time
0 0.053328 0.018560 0.995584 00:04
tensor(0.0057)
epoch train_loss valid_loss accuracy time
0 0.131673 0.020873 0.991168 00:04
tensor(0.0209)
Why is the first loss not equal to the valid_loss reported during training? (This is the same mismatch I run into when I evaluate my fastai model in plain PyTorch.) Oddly enough, the loss does match valid_loss when I use MixUpCallback.
Should the mean loss from get_preds always equal the valid_loss of the last epoch?
With weight = torch.Tensor([1, 1]).cuda() the two values always match:
epoch train_loss valid_loss accuracy time
0 0.048890 0.021840 0.994112 00:03
tensor(0.0218)
epoch train_loss valid_loss accuracy time
0 0.283747 0.039491 0.996565 00:03
tensor(0.0395)