Question on learn.load('<saved_model_name>')

wgpubs · November 20, 2017, 2:17am

I noticed, or at least I believe I noticed, that for the planet competition I needed to run the following in order to load the model I saved after training on 256x256 images:

img_sz = 64
data = get_data(arch, img_sz, val_idxs)
data = data.resize(int(img_sz * 1.3), 'tmp') # this creates /tmp/83

learn = ConvLearner.pretrained(arch, data, metrics=metrics)

img_sz = 256
learn.set_data(get_data(arch, img_sz, val_idxs))
learn.load(f'{arch.__name__}_{img_sz}')

After restarting my notebook, if I didn’t run all of that code, the learn.load() call at the end either wouldn’t work, or it would work but the results on my validation set would be worse. Am I right in assuming this is what needs to be run, or is there a better way to rehydrate a model?

jeremy · November 20, 2017, 2:21am

I think the problem may be that you didn’t re-run data.resize after increasing img_sz. So effectively you’ve trained a model on 64x64 images that have been upscaled to 256x256! It depends on how your get_data is defined.

wgpubs · November 20, 2017, 2:24am

Yah I was pretty much following the lesson2-image_models notebook code. data.resize() is only called once and never again for the 128 or 256 sized images.

I thought maybe there was a reason for this.

My get_data() function …

def get_data(p_model, img_sz, val_idxs=[0]):
    tfms = tfms_from_model(p_model, img_sz, aug_tfms=transforms_top_down, max_zoom=1.05)
    
    return ImageClassifierData.from_csv(f'{PATH}', 'train-jpg', labels_csv, bs=bsz, tfms=tfms, 
                                        val_idxs=val_idxs, suffix='.jpg', test_name='test-jpg')

wgpubs · November 20, 2017, 2:28am

In retrospect, the better move may have been to resize like this:

data = data.resize(int(256 * 1.3), 'tmp')

… and then use those images to build the 64 and 128 sized datasets.
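
To make the numbers concrete, here is a small sketch of the sizes the 1.3 factor produces. Note that resized_dim is a hypothetical helper for illustration, not a fastai function; it only mirrors the int(img_sz * 1.3) expression used in the code above.

```python
def resized_dim(img_sz, factor=1.3):
    """Hypothetical helper: mirrors the int(img_sz * 1.3) expression from the thread."""
    return int(img_sz * factor)

# Target size passed to data.resize() for each training image size.
sizes = {sz: resized_dim(sz) for sz in (64, 128, 256)}
print(sizes)  # {64: 83, 128: 166, 256: 332}
```

The 83 here matches the /tmp/83 folder mentioned earlier; resizing once for 256 would mean a target of 332 instead.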

jeremy · November 20, 2017, 2:43am

Since you’re calling get_data which creates a new data object from scratch, this looks fine.

jakcycsl · November 20, 2017, 4:31am

I am not sure if this will help you, but in my case I found that the models were saved under the path ~/tmp/83/models/*.
So I just moved all the models back to the directory /data/planet/models/* and re-ran learn.load.

wgpubs · November 20, 2017, 4:43am

Yes, good point.

Once you use data.resize, forever afterwards it looks under the /tmp folder (which really shouldn’t be the case for the 128 and 256 sized images, since they don’t have anything to do with the saved 83x83 images in there). So, like you said, move the models up to your root /models directory and you can forgo the data.resize code above.
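
For anyone who would rather script the move than do it by hand, here is a minimal sketch, assuming the saved models are ordinary files sitting in a models folder. The move_saved_models function and the resnet34_256.h5 file name are made up for illustration, and the demo runs in a temporary directory rather than on the real paths from this thread.

```python
import shutil
import tempfile
from pathlib import Path

def move_saved_models(src_dir, dst_dir, pattern="*"):
    """Hypothetical helper: move files matching `pattern` from src_dir to dst_dir.

    Returns the names of the files that were moved.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    moved = []
    for f in sorted(src.glob(pattern)):
        if f.is_file():
            shutil.move(str(f), str(dst / f.name))
            moved.append(f.name)
    return moved

# Demo in a throwaway directory that imitates the tmp/83/models layout.
root = Path(tempfile.mkdtemp())
tmp_models = root / "tmp" / "83" / "models"
tmp_models.mkdir(parents=True)
(tmp_models / "resnet34_256.h5").write_bytes(b"fake weights")  # made-up file name

moved = move_saved_models(tmp_models, root / "models")
print(moved)  # ['resnet34_256.h5']
```

On a real setup you would point the two arguments at your own tmp models folder and your main models folder, then call learn.load as usual.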