What is the recommended way to save a TextClasDataBunch to disk and then load it later?
I tried `obj.save`, which resolves up the inheritance chain to `DataBunch.save`, followed by various loaders, but all of them failed for pretty trivial reasons.
A few questions:
Should I be using the
I see the warning “Serializing the DataBunch only works when you created it using the data block API”, so maybe I need to create my
In general, how do I know if my bunch was created with the data block API?
Yes, that is the function you should use.
The factory methods use the data block API behind the scenes; they are just shortcuts, so you’re safe.
Your bunch was created with the data block API if you used the data block API itself or a factory method from fastai. That is in contrast to people who build their DataBunch by passing in PyTorch objects directly.
Awesome! Is there a reason load_data isn’t a factory method? Sort of asymmetric.
If it were a factory method, you would have to type the same class you used when you created the data. Since the class is saved with the rest of the data object, it’s just easier to use a regular function to load everything.
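To illustrate the point about the class being saved with the object, here is a minimal sketch using only the standard library's `pickle`. It is an analogy, not fastai's implementation: `load_any` is a hypothetical helper, and `Counter` and `OrderedDict` merely stand in for two different kinds of data bunch.

```python
import pickle
import tempfile
from collections import Counter, OrderedDict
from pathlib import Path


def load_any(path, file):
    """Plain loader: the caller never names the class of the saved object."""
    with open(Path(path) / file, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    # Two different classes stand in for two kinds of data bunch (assumption).
    for name, obj in [("a.pkl", Counter("aab")), ("b.pkl", OrderedDict(x=1))]:
        with open(Path(tmp) / name, "wb") as f:
            pickle.dump(obj, f)

    # The same function restores both, each with its original class.
    first = load_any(tmp, "a.pkl")
    second = load_any(tmp, "b.pkl")

restored_types = (type(first).__name__, type(second).__name__)
print(restored_types)
```

A factory-style classmethod would force the caller to pick the right class up front, even though the file already records that information.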
For posterity, there is one more tiny trick needed: passing ‘.’ to
data_clas = TextClasDataBunch.from_df(...)
data_clas = load_data('.', 'text_clas_data_bunch')
Thanks for your help!
`'.'` doesn’t always work. The path you load from should be the same value you passed as `path` when you defined your TextLMDataBunch and your TextClasDataBunch.
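The path caveat can be sketched without fastai at all. In the sketch below, `save_obj` and `load_obj` are hypothetical helpers that, by assumption, mimic a save that writes to `path/file` and a load that reads from `path/file`. The dictionary is placeholder data, not a real bunch.

```python
import pickle
import tempfile
from pathlib import Path


def save_obj(obj, path, file):
    """Write obj to path/file (hypothetical helper)."""
    with open(Path(path) / file, "wb") as f:
        pickle.dump(obj, f)


def load_obj(path, file):
    """Read an object back from path/file (hypothetical helper)."""
    with open(Path(path) / file, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as data_dir, \
        tempfile.TemporaryDirectory() as other_dir:
    save_obj({"labels": ["neg", "pos"]}, data_dir, "text_clas_data_bunch")

    # Loading with the same path that was used for saving works.
    same_path_result = load_obj(data_dir, "text_clas_data_bunch")

    # Loading with any other path fails, because the file is not there.
    try:
        load_obj(other_dir, "text_clas_data_bunch")
        wrong_path_failed = False
    except FileNotFoundError:
        wrong_path_failed = True

print(same_path_result, wrong_path_failed)
```

So `'.'` only works when the bunch was itself created with `'.'` as its path; otherwise pass whatever path was used at creation time.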