How to save Dataloaders?

pierreguillou · June 24, 2020, 2:20pm

Hello,

in fastai v1, it was possible to load and save a DataBunch.

I do not see the same methods for the DataLoaders of fastai v2. However, they would be very useful, in particular for text DataLoaders, where the training and validation tuples (x,y) are pre-processed when the DataLoaders are created.

As an example, you can see the following code in the paragraph “Preparing the data” of the Transformers tutorial:

bs,sl = 8,1024
dls = tls.dataloaders(bs=bs, seq_len=sl)

If the training and validation datasets are huge, creating the DataLoaders can take a long time… It would be great to be able to save and load them.

What do you think?

muellerzr · June 24, 2020, 2:36pm

You can simply do:

torch.save(dls, 'fname.pkl')

And then it’ll work. Just do a torch.load() to bring it back in (this is actually most of what export does anyway).
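The save-then-load advice above amounts to a small cache: build the object once, write it to disk, and read it back on later runs. A minimal sketch of that pattern follows. It uses the standard-library `pickle` as a stand-in so that it runs without fastai installed; `load_or_build`, `pickle_save`, `pickle_load` and `slow_build` are hypothetical names introduced for illustration, not fastai functions.

```python
import os
import pickle
import tempfile

def pickle_save(obj, path):
    # Same argument order as torch.save(obj, path)
    with open(path, "wb") as f:
        pickle.dump(obj, f)

def pickle_load(path):
    # Same argument order as torch.load(path)
    with open(path, "rb") as f:
        return pickle.load(f)

def load_or_build(path, build_fn, save_fn=pickle_save, load_fn=pickle_load):
    """Return the object cached at `path`; build and save it if the file is missing."""
    if os.path.exists(path):
        return load_fn(path)
    obj = build_fn()
    save_fn(obj, path)
    return obj

# Stand-in for an expensive DataLoaders build (hypothetical data)
build_calls = []
def slow_build():
    build_calls.append(1)
    return {"train": [1, 2, 3], "valid": [4, 5]}

cache_path = os.path.join(tempfile.mkdtemp(), "dls.pkl")
first = load_or_build(cache_path, slow_build)   # builds and saves
second = load_or_build(cache_path, slow_build)  # loads from disk
```

With fastai, the same helper could presumably be called with `save_fn=torch.save`, `load_fn=torch.load` and `build_fn=lambda: tls.dataloaders(bs=bs, seq_len=sl)`. Recent PyTorch releases default `torch.load` to `weights_only=True`, so loading a full pickled object may need `weights_only=False`, and only for files you trust.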

pierreguillou · June 24, 2020, 4:38pm

Thank you very much Zachary!
PS: you should apply for the free Sylvain Gugger seat cc @jeremy

pierreguillou · June 24, 2020, 8:55pm

@muellerzr: your code works like a charm in the notebook 10_nlp.ipynb of fastbook but not in the Transformers tutorial by @sgugger (see screenshot).

How to adapt your code to this case? Thanks.

Note: same problem with learn.export() but not with learn.save() which works.

muellerzr · June 24, 2020, 8:58pm

Hmmmm… that would be up to Sylvain and how he has the transformers done… I haven’t looked into that yet (pickling that much, or transformers), though it looks like that was an old issue with HF: "can't pickle Tokenizer objects" · Issue #87 · huggingface/tokenizers · GitHub

sgugger · June 25, 2020, 11:49am

Yes, not all tokenizers from Hugging Face are serializable. It should be fixed in the next release from what I’ve followed.
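When `torch.save` fails on a large object such as a DataLoaders, it can help to find out which part refuses to pickle. The sketch below is a generic standard-library check, not a fastai or Hugging Face utility: `find_unpicklable` and `FakeTokenizer` are hypothetical names, and a `threading.Lock` plays the role of the non-serializable piece.

```python
import pickle
import threading

def find_unpicklable(obj):
    """Return the names of attributes in obj.__dict__ that fail to pickle."""
    bad = []
    # Objects implemented in C or Rust may have no __dict__; treat them as empty
    for name, value in getattr(obj, "__dict__", {}).items():
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad

class FakeTokenizer:
    """Hypothetical stand-in for an object that holds one unpicklable member."""
    def __init__(self):
        self.vocab = {"hello": 0}      # pickles fine
        self.lock = threading.Lock()   # cannot be pickled

report = find_unpicklable(FakeTokenizer())
print(report)
```

The check only looks one level deep. For a nested object, it can be applied again to whichever attribute it reports.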

pierreguillou · June 25, 2020, 1:33pm

Thanks Sylvain and Zachary.

Jdemlow · March 16, 2021, 2:02am

@muellerzr A question on DataLoaders. One thing that can be nice, but is also a pain, is that I can only save the DataLoaders together with the data. Until I have a model, I don’t have the ability to take just the preprocessing steps from a recently saved TabularDataLoaders. (I think this is why I am asking.)

This might be a thing in the DataBlockAPI, but I am currently in a tabular project mode for work.

dl_test = dl_train.test_dl(X_test, with_label=False) # could be true doesn't matter

This is fine when you are going to train and do inference in the same place and have enough RAM to hold both datasets. However, when using a tabular learner I don’t believe the training data is available. As I write this, maybe it is, but I don’t think so.

learn_inf = load_learner(os.path.join(model_path, yaml.get('process_name') + yaml.get('dl_model_suffix')),
                             cpu=True)
test_dl = learn_inf.dls.test_dl(df_test, with_label=False)

Even though the fastai model is a little bigger than a typical model like an xgb model that is completely okay for the functionality it gives me.

Do you know of a way to handle the following case?

dl_train = (TabularDataLoaders.from_df(df_transform, procs=procs,
                                       cat_names=cat_vars, cont_names=cont_vars,
                                       y_names=0, y_block=y_block,
                                       valid_idx=splits[1], bs=bs))
os.makedirs(p, exist_ok=True)  # create the output directory if it is missing
logging.info(f'{fn} getting saved to {p}')
file_path = os.path.join(p, f"{process_name}_{fn}.pkl")
torch.save(dl_train, file_path)  # save the DataLoaders built above
logging.info(f'file saved to {file_path}')

Rather than saving the entire dataset inside the DataLoaders, is there a way to pop out the data and keep only the preprocessing, similar to an sklearn pipeline, so that the code above can be reused without the memory overhead and without moving large objects around?
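The idea in this question, keeping the fitted preprocessing but not the rows it was fitted on, can be sketched without any library. The example below is an illustration of the concept only, not fastai's implementation: `TabularPrep` is a hypothetical class that stores category codes and mean/std statistics, so the pickled object stays small however large the training data was.

```python
import pickle

class TabularPrep:
    """Holds fitted state only: category codes and (mean, std) per continuous column."""
    def __init__(self, cat_names, cont_names):
        self.cat_names, self.cont_names = cat_names, cont_names
        self.codes, self.stats = {}, {}

    def fit(self, rows):
        for c in self.cat_names:
            values = sorted({r[c] for r in rows})
            # Code 0 is reserved for categories not seen during fitting
            self.codes[c] = {v: i + 1 for i, v in enumerate(values)}
        for c in self.cont_names:
            xs = [r[c] for r in rows]
            mean = sum(xs) / len(xs)
            var = sum((x - mean) ** 2 for x in xs) / len(xs)
            # Fall back to 1.0 when the column is constant to avoid dividing by zero
            self.stats[c] = (mean, var ** 0.5 or 1.0)
        return self

    def transform(self, rows):
        out = []
        for r in rows:
            cats = [self.codes[c].get(r[c], 0) for c in self.cat_names]
            conts = [(r[c] - self.stats[c][0]) / self.stats[c][1] for c in self.cont_names]
            out.append((cats, conts))
        return out

# Hypothetical training rows; only the fitted state is pickled, not the rows
train = [{"color": "red", "size": 1.0}, {"color": "blue", "size": 3.0}]
prep = TabularPrep(["color"], ["size"]).fit(train)
restored = pickle.loads(pickle.dumps(prep))

unseen = restored.transform([{"color": "green", "size": 2.0}])
seen = restored.transform([{"color": "red", "size": 3.0}])
```

In fastai itself, `learn.export()` appears to do something similar by emptying the datasets before pickling, and it may be worth checking whether `dls.new_empty()` in the installed version gives the same data-free object before a model exists.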

EDCaz · March 25, 2021, 3:33am

The code for Dataloader is not working.

EDCaz · March 25, 2021, 4:30am

I get the same error.

joshiharshit5077 · September 9, 2022, 7:20am

When I try to load the learner using learn.load(.pth file), it gives me an error. I have different data this time: it has the same image dimensions, but the number of classes in the data loader has gone from 3 to 2.
Can someone tell me how I can load my model for inference on a completely new dataset?

learn.load("/content/drive/MyDrive/trained_model",with_opt=True)

This is the error that I get:
RuntimeError Traceback (most recent call last)
in
----> 1 learn.load("/content/drive/MyDrive/trained_model",with_opt=True)

1 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1496         if len(error_msgs) > 0:
   1497             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
-> 1498                                self.__class__.__name__, "\n\t".join(error_msgs)))
   1499         return _IncompatibleKeys(missing_keys, unexpected_keys)
   1500

RuntimeError: Error(s) in loading state_dict for RetinaNet:
size mismatch for classifier.3.weight: copying a param with shape torch.Size([3, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([2, 128, 3, 3]).
size mismatch for classifier.3.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
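The error above says that the checkpoint's classifier head was trained for 3 classes while the current model expects 2. One common workaround is to load only the parameters whose names and shapes still match and leave the mismatched head freshly initialised. The sketch below shows just the filtering step, using a namedtuple as a stand-in for tensors so that it runs without PyTorch; `filter_compatible`, `FakeTensor` and the `encoder.weight` key are hypothetical.

```python
from collections import namedtuple

def filter_compatible(checkpoint, current):
    """Split checkpoint entries into those matching `current` by name and shape, and the rest."""
    keep, skipped = {}, []
    for name, value in checkpoint.items():
        if name in current and tuple(value.shape) == tuple(current[name].shape):
            keep[name] = value
        else:
            skipped.append(name)
    return keep, skipped

# Stand-in for torch tensors: anything with a .shape attribute works here
FakeTensor = namedtuple("FakeTensor", "shape")

checkpoint = {
    "encoder.weight": FakeTensor((128, 64)),            # hypothetical shared layer
    "classifier.3.weight": FakeTensor((3, 128, 3, 3)),  # 3-class head from the error
    "classifier.3.bias": FakeTensor((3,)),
}
current = {
    "encoder.weight": FakeTensor((128, 64)),
    "classifier.3.weight": FakeTensor((2, 128, 3, 3)),  # 2-class head in the new model
    "classifier.3.bias": FakeTensor((2,)),
}

keep, skipped = filter_compatible(checkpoint, current)
print(skipped)
```

With PyTorch, `current` would be `learn.model.state_dict()`, and `keep` would be passed to `learn.model.load_state_dict(keep, strict=False)`. Note that `strict=False` alone still raises on size mismatches, which is why the filtering is needed. A file written by `learn.save(..., with_opt=True)` probably nests the weights under a `'model'` key, so that is worth checking first. The skipped layers then need fine-tuning on the new 2-class data.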