It’s not deprecated; you just need to pass them via config=tabular_config(). All model customization is done this way in fastai v2, to avoid mixing the model’s kwargs with those of the Learner.


Hi, I’ve watched Tabular lesson 1 and I have a couple of questions.
Zach shows how to plot using matplotlib, but the .plot() function only plots a column against the row index. How do I plot one column against another (for example, a scatter plot of age against working hours)?

Also, towards the end, he creates a tabular model called ‘net’. Why can’t we call lr_find or .fit on net like we do on a tabular_learner?


Short answer: it’s just a model, not a Learner instance. To call lr_find or fit, we first need to wrap our net in a Learner.

And how do I plot one column against another?

I’d recommend reading up here: https://seaborn.pydata.org/tutorial/categorical.html
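For a quick scatter of one column against another, pandas can also do this directly. Below is a minimal sketch, assuming a DataFrame with columns named age and hours-per-week (as in the Adult dataset); the rows here are made up purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Toy rows standing in for the Adult dataset (values are made up)
df = pd.DataFrame({
    "age": [25, 38, 47, 52, 29],
    "hours-per-week": [40, 50, 45, 60, 35],
})

# Scatter plot of one column against another, instead of against the row index
ax = df.plot.scatter(x="age", y="hours-per-week")
ax.figure.savefig("age_vs_hours.png")
```

In a notebook the savefig call is not needed, since the plot is displayed inline.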

Hi @muellerzr, I passed the model into a Learner and used ‘CrossEntropyLossFlat’ as the loss function, since the loss used in tabular_learner is a flattened form of nn.CrossEntropyLoss.
But when I run .lr_find I get the following error:

bool value of Tensor with more than one value is ambiguous

How do I solve this?


You need to pass an instance of it, not the class itself: CrossEntropyLossFlat()
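The class-versus-instance difference can be illustrated in plain PyTorch. This is only a sketch, using nn.CrossEntropyLoss as a stand-in for CrossEntropyLossFlat and made-up predictions and targets; the assumption is that the training loop calls loss_func(preds, targs), so passing the class means the tensors end up as constructor arguments.

```python
import torch
from torch import nn

torch.manual_seed(0)
preds = torch.randn(4, 3)            # 4 samples, 3 classes (made-up data)
targs = torch.tensor([0, 2, 1, 2])   # class indices

# Correct: create an instance, then call it on predictions and targets
loss_func = nn.CrossEntropyLoss()
loss = loss_func(preds, targs)
print(loss.item())

# Incorrect: "calling" the class tries to construct a loss object from the tensors
wrong_loss_func = nn.CrossEntropyLoss
try:
    wrong_loss_func(preds, targs)
    failed = False
except (RuntimeError, TypeError, ValueError) as err:
    failed = True
    print(type(err).__name__, err)
```

The second call should fail inside the constructor rather than compute a loss, which is the same kind of mistake as passing CrossEntropyLossFlat without the parentheses.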

Has anyone created a LIME library for fastdotai yet?

Thanks everyone who joined! I’ve added the notebook about looking at research, and the important ideas you should be keeping in mind, to the top post. Next week we’ll look at a few new architectures for tabular, and then that will be it! We’ll move on to NLP. If there’s anything that people want me to cover specifically for tabular in this last lecture, please let me know and we’ll include it if possible! Thanks! :slight_smile:

More on SHAP values: https://towardsdatascience.com/explain-your-model-with-the-shap-values-bc36aac4de3d

Falling really far behind :frowning: @muellerzr, back to the Adults notebook: one thing that is still confusing is the Normalize section. Since the data is scaled between 0 and 1, do you know why there are negative age/fnlwgt values?

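If I mentioned that in the video then I was wrong. It isn’t scaled between 0 and 1: each continuous column is standardized to zero mean and unit standard deviation, so values fall on both sides of 0. We can see this in the encodes function: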

def setups(self, dsets): self.means,self.stds = dsets.conts.mean(),dsets.conts.std(ddof=0)+1e-7
def encodes(self, to): to.conts = (to.conts-self.means) / self.stds  # <- HERE
def decodes(self, to): to.conts = (to.conts*self.stds ) + self.means

So if we have a particular x that is less than the mean, we can very easily get a negative value.
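The encodes/decodes arithmetic above can be checked on a few numbers. This is a minimal NumPy sketch with made-up ages, not the fastai implementation itself; it only mirrors the subtract-the-mean, divide-by-std step.

```python
import numpy as np

age = np.array([20.0, 30.0, 40.0, 50.0])  # made-up values

mean = age.mean()
std = age.std(ddof=0) + 1e-7  # population std plus a small epsilon, as in setups

encoded = (age - mean) / std   # same arithmetic as encodes
decoded = encoded * std + mean  # same arithmetic as decodes

print(encoded)  # entries for ages below the mean are negative
print(decoded)  # recovers the original ages
```

Ages below the mean of 35 come out negative and ages above it come out positive, which is why negative age/fnlwgt values show up after Normalize.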

I’ve made this adjustment in the notebook. Thanks for pointing this out!


Is there a way to extract the learned embedding vectors after training a tabular_learner? I would like to use them for training sklearn models. The ensembling demo by @muellerzr helps greatly, but the xs extracted from the dataloaders are still just the label-encoded version of the categorical columns…

I’ll try to do a demo of that next week. Great question!
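In the meantime, here is a rough sketch of the general idea in plain PyTorch. It assumes the trained model keeps one torch.nn.Embedding per categorical column in a list (in fastai’s TabularModel this is, as far as I know, the embeds attribute); the embeds list, the sizes, and the embed_features helper below are hypothetical stand-ins, not fastai API.

```python
import torch
from torch import nn

# Stand-in for a trained model's embedding layers: one nn.Embedding per categorical column
embeds = nn.ModuleList([nn.Embedding(5, 3), nn.Embedding(4, 2)])

# Label-encoded categorical columns, shape (n_rows, n_cat_columns); made-up codes
x_cat = torch.tensor([[0, 1], [4, 3], [2, 0]])

def embed_features(embeds, x_cat):
    """Look up each column's codes in its embedding and concatenate the vectors."""
    with torch.no_grad():
        parts = [emb(x_cat[:, i]) for i, emb in enumerate(embeds)]
    return torch.cat(parts, dim=1).numpy()

features = embed_features(embeds, x_cat)
print(features.shape)  # (3, 5): 3 rows, 3 + 2 embedding dimensions
```

The resulting array could then be stacked next to the continuous columns and passed to an sklearn estimator in place of the label-encoded codes.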

from fastai2.metrics import *
to = TabularPandas(df, procs, cat_names, dep_var, y_block=RegressionBlock())
dls = to.dataloaders(bs=32)
learn = tabular_learner(dls, layers=[10,10], metrics=[msle],
                        loss_func=MSELossFlat(), n_out=1)

When no cont_names is passed while creating the dataloaders, I get the error below. All of my columns are categorical. Also, passing msle as above results in an error.

RuntimeError: The size of tensor a (32) must match the size of tensor b (0) at non-singleton dimension 0

First try specifying the continuous variables as empty, i.e. cont_names = []

Edit: I think the issue is that you need to specify the number of outputs, so you were on the right track with n_out.

