Reducing size of CNN learner from 850MB to 125MB

After training a resnet-50 model on a large training set and then calling learner.export(…), fastai produced a rather large (~850MB) .pkl file.

After inspecting the learner object further, I realized that I could reduce the size of the .pkl file from ~850MB to ~125MB by running the following:

def clear_splits(dls):
    # Drop the per-item split indices that get pickled along with the
    # learner, then swap each transformed list for an empty copy of itself.
    for loader_idx, loader in enumerate(dls.loaders):
        for tls_idx, tls in enumerate(loader.dataset.tls):
            tls.splits.clear()
            dls.loaders[loader_idx].tls[tls_idx] = dls.loaders[loader_idx].tls[tls_idx].new_empty()

def clear_loaders(dls):
    # Remove the DataLoader objects themselves.
    dls.loaders.clear()

clear_splits(learn.dls)
clear_splits(learn.__stored_args__['dls'])
# clear_loaders(learn.dls) doesn't seem to do anything
clear_loaders(learn.__stored_args__['dls'])
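For context on why clearing these lists shrinks the file so much: any large Python list held by an attribute gets serialized along with the object when it is pickled. A minimal standalone illustration of the principle (plain pickle, no fastai; the class and attribute names are made up for the example):

```python
import pickle

# Toy stand-in for a dataset object that keeps a large index list
# (analogous to tls.splits on the fastai learner).
class FakeDataset:
    def __init__(self, n):
        self.splits = list(range(n))

ds = FakeDataset(1_000_000)
before = len(pickle.dumps(ds))  # several MB: the list is serialized too

ds.splits.clear()  # same idea as tls.splits.clear() above
after = len(pickle.dumps(ds))  # now just the small object shell

print(before, after)
```

The pickled size drops by orders of magnitude once the list is cleared, which matches the ~850MB to ~125MB reduction seen on the exported learner.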

This suggests that part of the cleanup code in learner.export(…) is not cleaning up everything it should. Is this a bug in learner.export(…)? If so, should something like the code above be added to learner.export(…)?

What version of the library are you using? And can you make a minimal reproducer? (i.e. does this happen with the PETS dataset?) We worked on a similar issue a few months back.

Just tested; this is definitely an old version, as the code you have doesn’t work anymore. We already fixed many of these reference issues. What version are you using @kply?


I don’t remember now, since I’ve upgraded to the newest version of fastai. I will try again later this week (sorry, I have been very busy).