TabularLearner export.pkl from learn.export() is Very Large

I’m not sure if this is intended, but the export.pkl is about 471 MB, which is somewhat prohibitive for deployment in certain applications.

The model itself from SaveModelCallback is only 131 KB, and I’m only looking to use the Learner to apply the same transforms/processing (Normalize, FillMissing, Categorify).

Is there a reason this is so large? I’ve also confirmed that the Learner’s batch references are empty:

learn.xb
(None, )

learn.yb
(None, )

Hey Jason,

When you export a model (without the optimizer state), you basically need to save all the weights to disk. You can get a quick ballpark estimate of the expected file size from the number of model parameters (assuming float32, i.e. 4 bytes per parameter), but it’s likely to be several hundred megabytes.
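For example, a quick sketch of that estimate (learn here is any fastai Learner):

n_params = sum(p.numel() for p in learn.model.parameters())  # total parameter count
approx_mb = n_params * 4 / 2**20                             # float32 = 4 bytes per param
print(f'{n_params:,} params ≈ {approx_mb:.1f} MB of raw weights')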

Saying that it’s “prohibitive” for deployment in certain applications may be true for your use case, but that would mean you likely cannot use neural networks at all (or you have to use specific architectures designed to be as lightweight as possible, which usually also hurts accuracy). Another option is to work out what exactly prevents you from deploying this model and try to solve that problem.

There are options available right now, @orendar and @jasonho28 :slight_smile: What I would recommend is doing a torch.save() of the weights and exporting the TabularPandas object instead. I would expect this could reduce the size. As a result, during inference you’d go TabularPandas -> DataLoader rather than just a plain DataLoader:

See the very bottom for a usage example; the library is wwf :slight_smile:


Hi.

This topic looks like this earlier one, which was never solved:

The size of the pkl file created by learn.export() depends on the batch size (at least in my tests, it grew with larger batch sizes).
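If you want to test that yourself, a rough sketch (my own, not from the original post; assumes an existing fitted TabularPandas named to):

import os
from fastai.tabular.all import *

for bs in (64, 4096):
    dls = to.dataloaders(bs=bs)                      # build loaders at each batch size
    learn = tabular_learner(dls, layers=[200, 100])
    learn.export(f'export_bs{bs}.pkl')
    print(bs, os.path.getsize(f'export_bs{bs}.pkl'), 'bytes')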

@muellerzr

The solution worked great. For reference:

We manually save the model from the Learner:
torch.save(learn.model, f'{model_dir}/2_{REF}_LEARNER_MODEL.pt')

We export the Tabular Object as well:
to.export(f'{model_dir}/3_{REF}_TABULAR_OBJECT.pkl')

We load the Tabular Object:

to_new = load_pandas(f'{model_dir}/3_{REF}_TABULAR_OBJECT.pkl')
to_new = to_new.train.new(df[:20])   # new TabularPandas over fresh rows
to_new.process()                     # apply the stored procs (Normalize, FillMissing, Categorify)

We load the Model:

model_2 = torch.load(f'{model_dir}/2_{REF}_LEARNER_MODEL.pt')
learn_new = TabularLearner(dls_new, model_2)   # dls_new: DataLoaders built from to_new

We do inference:

row, clas, probs = learn_new.predict(df.iloc[0])
row.show()
probs

The savings are substantial:

Model: 135 KB
Tabular Object: 6 KB

vs.

learn.export(): 417 MB


@orendar @jasonho28 @pierreguillou Over the weekend Jeremy and I solved this issue; it was due to log_args plus some extraneous references inside of ReadTabBatch. Happy to report that my export.pkl is a calm, cool 142.5 KB :slight_smile:


I’m curious as to how this is even possible, as I don’t use tabular much: how many parameters are in the model? I don’t think I’ve ever seen a model weigh less than 100 MB in any of the libraries I’ve used.

The tabular model is only 3(ish) fully connected layers and some embeddings. It’s a very, very tiny model :slight_smile:

The param count is almost 30k.

The exact model from the adult sample example is 128 KB :slight_smile: as you can see here:

ResNets aren’t bad either: the 34 is still under 100 MB (84 MB) and the 50 is 94 MB.
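As a quick sanity check on those figures: roughly 30,000 params × 4 bytes (float32) ≈ 120 KB, which lines up with the 128 KB file above, and the ResNet sizes follow from their parameter counts by the same arithmetic.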


That’s amazing indeed!

How can I use to.export()?

!pip install wwf

gave me

Successfully installed wwf-0.0.5.

But then,

from wwf.tabular.export import *

resulted in

ModuleNotFoundError: No module named 'wwf.tabular'

The cited page shows

Site last generated: Oct 22, 2020

and

wwf: 0.0.4.

That’s a typo. Should be from wwf.tab.export import *
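Putting the corrected pieces together (a sketch, assuming wwf’s tab.export module provides both the to.export() patch and load_pandas(), as used earlier in the thread):

from wwf.tab.export import *   # patches TabularPandas with export() and provides load_pandas()
to.export('to.pkl')            # to: a fitted TabularPandas
to_new = load_pandas('to.pkl')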

Thanks, @muellerzr. Now I can use to.export().

What is dls_new in this example?

More specifically, how can dls_new be obtained from to_new to be used as a test dataset?
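One way that should work (my assumption, not confirmed in the thread, using the standard TabularPandas.dataloaders()):

dls_new = to_new.dataloaders(bs=64)            # bs is arbitrary for inference
learn_new = TabularLearner(dls_new, model_2)   # as in the example above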

Hi @muellerzr

fastai 2.2.0 still has the data leakage issue.
My naive screening showed patterns similar to the ones you previously mentioned in ReadTabBatch (https://github.com/fastai/fastai/pull/2948). Many tabular transforms store intermediate data that is accidentally exported.

I reproduced the issue in:

You can see that we can access the data by doing

learn_loaded.dls.loaders[0].procs.categorify.to.items

or

learn_loaded.dls.loaders[0].procs.normalize.to.items
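A quick way to check whether an export still carries the data (my own sketch; 'export.pkl' stands in for whatever filename was passed to learn.export()):

import os
print(os.path.getsize('export.pkl') / 1e6, 'MB')  # leaked items inflate this
# after a fix, the transform should no longer hold a `to` reference:
print(hasattr(learn_loaded.dls.loaders[0].procs.categorify, 'to'))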

Thanks! Looks like this is a very recent bug, so thank you for flagging! I’ll look into this :slight_smile:


A fix has now been pushed to master. Thanks again :slight_smile:


Thanks for the fix @muellerzr! The “to” attribute no longer appears in the exported learner.

Sorry, I forgot to point out that the “dsets” attribute of the FillMissing transform also stores a copy of the dataset:

learn_loaded.dls.loaders[0].procs.fill_missing.dsets.items

Can you look into that also?

Ah, I see why that’s a thing. I have an interim PR that actually fixes it; that’s an inconsistency oversight on my part. Will let you know when that gets merged.

In the interim, this can fix it for folks:

from fastai.tabular.all import *   # brings in patch, store_attr, FillMissing, pd

@patch
def setups(self: FillMissing, dsets):
    # store only the computed na_dict; deliberately skip keeping a reference to dsets
    missing = pd.isnull(dsets.conts).any()
    store_attr(but='dsets', na_dict={n: self.fill_strategy(dsets[n], self.fill_vals[n])
                                     for n in missing[missing].keys()})
    self.fill_strategy = self.fill_strategy.__name__  # keep the strategy's name, not the function itself
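With this applied, a TabularPandas whose procs are set up after patching shouldn’t hold the dataset anymore; a quick check (my own sketch, using the same accessor style as above):

assert not hasattr(to_new.procs.fill_missing, 'dsets')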

(cc @Haotong)


Thanks for the solution!