Accuracy drop after restarting kernel and loading model

When I save and load a model during the same session, I get the same accuracy as before saving, as expected. However, when I restart the kernel after saving and then load the model, the accuracy is much lower. I suspect this has to do with the randomized train/validation split, since I don’t have this issue with datasets that come pre-split into training and validation data (e.g., CIFAR-10).

Below is a minimal example. I’m using version 1.0.28.

from fastai import *
from fastai.vision import *
DATA_DIR = 'path/to/planet/dataset/'
size = 224
bs = 32
np.random.seed(42)
data = ImageDataBunch.from_csv(path=Path(DATA_DIR), folder='train-jpg',
                               csv_labels='train_v2.csv', suffix='.jpg',
                               sep=' ', size=size, ds_tfms=get_transforms(),
                               bs=bs)
data = data.normalize(imagenet_stats)
learn = create_cnn(data, models.resnet34, metrics=accuracy_thresh)
learn.fit_one_cycle(1)
learn.validate()  # get [0.09976922, tensor(0.9611)]
learn.save('test')

Successfully load the saved model in the same session:

del data
del learn
np.random.seed(42)
data = ImageDataBunch.from_csv(path=Path(DATA_DIR), folder='train-jpg',
                               csv_labels='train_v2.csv', suffix='.jpg',
                               sep=' ', size=size, ds_tfms=get_transforms(),
                               bs=bs)
data = data.normalize(imagenet_stats)
learn = create_cnn(data, models.resnet34, metrics=accuracy_thresh).load('test')
learn.validate()  # get same values as above

Now restart the kernel and try loading again:

from fastai import *
from fastai.vision import *
DATA_DIR = 'path/to/planet/dataset/'
size = 224
bs = 32
np.random.seed(42)
data = ImageDataBunch.from_csv(path=Path(DATA_DIR), folder='train-jpg',
                               csv_labels='train_v2.csv', suffix='.jpg',
                               sep=' ', size=size, ds_tfms=get_transforms(),
                               bs=bs)
data = data.normalize(imagenet_stats)
learn = create_cnn(data, models.resnet34, metrics=accuracy_thresh).load('test')
learn.validate()  # get [1.3317215, tensor(0.7516)]

There are now methods to save the filenames of your validation set; you should use them so you can reload the same split in a new kernel.

Thanks!

For posterity, I think sgugger was referring to data.export() and data.load_empty(data.path). These are new in version 1.0.29, which is currently a work in progress.
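If I understand the new API correctly, the intended calls look roughly like this (a minimal sketch, assuming the in-progress 1.0.29 names, which may change before release):

# Before restarting: persist the DataBunch configuration to disk
# (assumption: export() writes an export file under data.path).
data.export()

# After restarting: recreate a data-less DataBunch from that file,
# then attach the saved weights for inference
# (assumption: load_empty() reads back what export() wrote).
data = ImageDataBunch.load_empty(Path(DATA_DIR))
learn = create_cnn(data, models.resnet34, metrics=accuracy_thresh).load('test')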

FWIW, I don’t understand:

  • why np.random.seed does not let you reproduce the same train/validation split multiple times (see the sketch after this list).
  • why using a different train/validation split reduces your validation accuracy. If anything, having images that you trained on end up in your validation set should increase your validation accuracy.
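On the first point, NumPy seeding by itself is deterministic across processes, so if the split were driven purely by np.random immediately after the seed call, it should come out identical in a fresh kernel. A minimal check, plain NumPy with no fastai involved:

import numpy as np

np.random.seed(42)
a = np.random.permutation(10)
np.random.seed(42)
b = np.random.permutation(10)
assert (a == b).all()  # identical draws, in this session or any other

If the split still differs after a restart, either something between the seed call and the split is consuming random state, or the accuracy drop comes from somewhere else entirely.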

You also have data.valid_ds.to_csv() to get a csv with the filenames and labels.
I’m not sure why the seed doesn’t behave as it should.
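A minimal sketch of how that could pin down the split across restarts. The second half is left as a comment because the exact fastai 1.0.x call may vary by version, and valid_ds.x.items holding the image paths is an assumption:

# Before restarting: record which files ended up in the validation set
# (assumption: data.valid_ds.x.items holds the image paths).
valid_fnames = [Path(p).name for p in data.valid_ds.x.items]
Path('valid_fnames.txt').write_text('\n'.join(valid_fnames))

# After restarting: feed the saved names back in instead of re-splitting
# randomly; the data block API's split_by_files / split_by_fname_file
# (fastai 1.0.x) accept such a list or file in place of a random split.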

The problem has also been mentioned in the thread “Loading a saved model”, but has received no answer as far as I can see.

After opening a new kernel, training the saved model for one more epoch sort of gets the model’s accuracy back, but that’s not really a good solution at all.

@sgugger - how does saving the filenames of the validation set help to get the same accuracy levels? Regardless of that, if you want to deploy your model somewhere, you don’t necessarily want to bother with any of the data that was used to create it.

@tarvoc - did you manage to solve this just with the commands suggested by sgugger?

Does anyone know if this problem lies on the PyTorch side or is it a fastai thing?

Well, sorry, I’ll just answer my own question: after going around the web and searching, it seems that simply switching the model from .train() to .eval() mode, i.e. calling learner.model.eval(), fixes the issue.
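For anyone landing here later, a minimal sketch of the fix against the example above. The behavior is standard PyTorch rather than fastai-specific: in train mode, BatchNorm layers normalize with per-batch statistics and Dropout stays active, so a freshly loaded model that was never switched to eval mode can validate much worse than it trained.

learn = create_cnn(data, models.resnet34, metrics=accuracy_thresh).load('test')
learn.model.eval()  # BatchNorm uses running stats, Dropout is disabled
learn.validate()    # metrics should now match the pre-restart values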
