Recommendation to save models in root /models folder when using resized images

wgpubs · November 13, 2017, 4:13pm

def get_data(sz, bsz, val_idxs=[0], test_name='test'):
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)

    data = ImageClassifierData.from_csv(PATH, csv_fname=f'{DS_PATH}/labels.csv', 
                                        folder='train', test_name=test_name, 
                                        bs=bsz, tfms=tfms, val_idxs=val_idxs, suffix='.jpg')

    return data if sz > 300 else data.resize(340, 'tmp')

When the code above is used to build the data for a learner, any saved models end up in /tmp/340/tmp/models.

I would like to recommend that the default be to look for a root /models directory in PATH as a first option. Otherwise, if you ever have to blow out your /tmp folder to get rid of precomputed activations, you’ll blow out your saved models as well. Another option is to make the models path an optional argument.

jeremy · November 13, 2017, 4:44pm

These are good points. OTOH, I’ve found keeping different resized dataset models separate to be helpful. Rather than removing tmp, I just remove tmp/*.bc. A better approach still would be to have models/340, for instance, although I suspect that would need some significant refactoring to work cleanly. PRs welcome, of course!
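
To illustrate removing only tmp/*.bc while leaving saved models alone, here is a minimal sketch. The helper name clear_precomputed is hypothetical, not part of the library, and it assumes the precomputed activations are the entries ending in .bc directly under the tmp folder. Those entries might be directories or plain files, so both cases are handled.

```python
import shutil
from pathlib import Path


def clear_precomputed(tmp_dir):
    """Delete only the *.bc entries directly under tmp_dir.

    Hypothetical helper: anything else in tmp_dir (for example a
    models/ subfolder) is left untouched. Returns the removed names.
    """
    removed = []
    for entry in sorted(Path(tmp_dir).glob('*.bc')):
        if entry.is_dir():
            shutil.rmtree(entry)   # entry stored as a directory
        else:
            entry.unlink()         # entry stored as a single file
        removed.append(entry.name)
    return removed
```

Calling clear_precomputed(f'{PATH}/tmp') would then clear the activations without touching tmp/models, assuming that layout matches yours.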

wgpubs · November 13, 2017, 7:09pm

Will do. I’m thinking about adding a couple new optional params:

overwrite_precomputed=False: If True, will rebuild the precomputed activations instead of re-using the existing ones.
models_dir=None: Will save/load models to/from this location if specified, else use the default path.
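
A rough sketch of how those two params might behave, written as standalone helpers rather than as changes to the library. The function names resolve_models_dir and should_recompute, and the default_dir and activations_path arguments, are made up for illustration. The real change would live wherever the learner currently builds its models path and loads its activations.

```python
import os


def resolve_models_dir(default_dir, models_dir=None):
    """Hypothetical: use models_dir if the caller gave one, else the default."""
    return default_dir if models_dir is None else models_dir


def should_recompute(activations_path, overwrite_precomputed=False):
    """Hypothetical: rebuild activations if asked to, or if none exist yet."""
    return overwrite_precomputed or not os.path.exists(activations_path)
```

With models_dir left at None nothing changes from today's behaviour, so existing notebooks would keep working, and passing something like f'{PATH}/models' would keep saved weights out of the tmp folder.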