A walk with fastai2 - Tabular - Study Group and Online Lectures Megathread

It’s not deprecated, you just need to pass them in config=tabular_config(). All customization of models is done this way in fastai v2, to avoid mixing the kwargs of the models with those of the Learner.

4 Likes

Hi, I’ve watched Tabular lesson-1 and I have a couple of doubts.
Zach shows how to plot using matplotlib, but the .plot() function only plots each column against its index. So how do I plot one column against another (like a scatter plot of age vs. working hours)?

Also towards the end, he creates a tabular model called ‘net’. Why can’t we do a lr_find or .fit on net like we do on a tabular_learner?

Thanks,

Short answer is it’s just a model, not a Learner instance. We need to wrap our net in a Learner to do so.

1 Like

Yeahh… Just now did that… Works great!
And how do I plot between two columns?

I’d recommend reading up here: https://seaborn.pydata.org/tutorial/categorical.html
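For a plain matplotlib scatter of two columns, something like the sketch below works. This is a hedged example with made-up toy values standing in for the Adult dataset’s age and hours columns (the column names are assumptions, not from the notebook):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs headless
import matplotlib.pyplot as plt

# Hypothetical toy data standing in for the "age" and
# "hours-per-week" columns of the Adult dataset.
ages = [25, 38, 28, 44, 18, 34, 52, 46]
hours = [40, 50, 40, 45, 30, 40, 60, 38]

fig, ax = plt.subplots()
ax.scatter(ages, hours)           # one column on x, the other on y
ax.set_xlabel("age")
ax.set_ylabel("hours-per-week")
fig.savefig("age_vs_hours.png")
```

With a pandas DataFrame you can get the same plot via `df.plot.scatter(x="age", y="hours-per-week")`.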

1 Like

Sure, I’ll look into it.
Thank you! :slight_smile:
And thank you so much for the videos, they really helped me understand things way better.

Thanks got it !

Hi @muellerzr I passed the model into a Learner and used ‘CrossEntropyLossFlat’ as the loss function, since the loss used in tabular_learner is a flattened form of nn.CrossEntropyLoss.
But when I do .lr_find I get the following error:

bool value of Tensor with more than one value is ambiguous

How do I solve this?

Thanks,

You need to use an instance of it. CrossEntropyLossFlat()
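That is, something like Learner(dls, net, loss_func=CrossEntropyLossFlat()) rather than passing the bare class. As a plain-Python analogue (not fastai code) of why the class itself fails: the Learner calls loss_func(preds, targs), so loss_func must be a callable instance, and passing the class sends those arguments to __init__ instead. ToyLoss below is a hypothetical stand-in:

```python
class ToyLoss:
    """Hypothetical loss object; stands in for CrossEntropyLossFlat."""
    def __call__(self, preds, targs):
        # sum of squared differences, just to have something callable
        return sum((p - t) ** 2 for p, t in zip(preds, targs))

loss_fn = ToyLoss()          # an *instance* -- calling it computes a loss
print(loss_fn([1.0, 2.0], [1.0, 0.0]))  # 4.0

# Passing the bare class (loss_func=ToyLoss) would instead try to
# construct ToyLoss(preds, targs), which is not what the Learner expects.
```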

1 Like

Alright everyone we’ll be live streaming today! Here’s the link: https://youtu.be/XoWX_YOrtPg

We’ll be covering two different methods of model interpretation, ClassConfusion and SHAP, along with some general guidelines and pitfalls I’ve found when doing research in this field.

  • live stream up at 4:45pm CST
2 Likes

Has anyone created a LIME library for fastdotai yet?

Thanks to everyone who joined! I’ve added the notebook about looking at research, and the important ideas you should keep in mind, to the top post. Next week we’ll look at a few new architectures for tabular, and then that will be it! We’ll move on to NLP. If there’s anything people want me to cover specifically for tabular in this last lecture, please let me know and we’ll include it if possible! Thanks! :slight_smile:

More on SHAP values: https://towardsdatascience.com/explain-your-model-with-the-shap-values-bc36aac4de3d

1 Like

falling really far behind :frowning: @muellerzr back to the Adults notebook, one thing that is still confusing is in the Normalize section: since the data is scaled between 0 and 1, why are there negative age/fnlwgt values?

1 Like

If I mentioned that in the video then I’m wrong. It’s not scaled into a fixed range at all: it’s standardized to mean 0 and standard deviation 1. We can see this in the encodes function:

def setups(self, dsets): self.means,self.stds = dsets.conts.mean(),dsets.conts.std(ddof=0)+1e-7
def encodes(self, to): to.conts = (to.conts - self.means) / self.stds   # <- HERE
def decodes(self, to): to.conts = (to.conts * self.stds) + self.means

So if we have a particular x that is less than the mean, then we can very easily get a negative value.
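A tiny numeric sketch of the same point, with hypothetical ages (the values are made up; the formula mirrors the setups/encodes pair above, including ddof=0):

```python
# Why Normalize can yield negative values: it standardizes to
# mean 0 / std 1, it does not rescale into [0, 1].
ages = [22.0, 30.0, 45.0, 63.0]                               # toy column
mean = sum(ages) / len(ages)                                   # 40.0
std = (sum((a - mean) ** 2 for a in ages) / len(ages)) ** 0.5  # ddof=0, like setups()
normed = [(a - mean) / std for a in ages]
print(normed)  # first two entries are negative: 22 and 30 sit below the mean
```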

1 Like

I’ve made this adjustment in the notebook. Thanks for pointing this out!

2 Likes

Is there a way to extract the learned embedding vectors after training a tabular_learner? I would like to use them for training models from sklearn. The demo of ensembling by @muellerzr helps greatly, but the xs extracted from the dataloaders are still just the label-encoded versions of the categorical columns…

1 Like

Thanks, understood :slight_smile:

I’ll try to do a demo of that next week. Great question!
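In the meantime, a rough sketch: fastai v2’s TabularModel keeps one embedding per categorical column in model.embeds, so the learned vectors for column i are model.embeds[i].weight. Below is a plain-Python analogue (toy weights, no fastai) of swapping label-encoded codes for their embedding rows before handing features to sklearn:

```python
# Pretend learned 2-d embedding table for one categorical column
# with 3 categories (in fastai this would be model.embeds[i].weight).
embed_weight = [
    [0.1, -0.2],
    [0.5,  0.3],
    [-0.4, 0.0],
]

codes = [2, 0, 1]   # label-encoded column, as it comes out of the dataloader

# Replace each code with its embedding row -> dense features for sklearn
dense = [embed_weight[c] for c in codes]
print(dense)  # [[-0.4, 0.0], [0.1, -0.2], [0.5, 0.3]]
```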

1 Like

from fastai2.metrics import *
to = TabularPandas(df, procs, cat_names, dep_var, y_block=RegressionBlock(),
                   splits=splits)
dls = to.dataloaders(bs=32)
learn = tabular_learner(dls, layers=[10,10], metrics=[msle],
                        loss_func=MSELossFlat(), n_out=1)

When no cont_names is passed while creating the dataloaders I get the error below (all of my columns are categorical). Also, passing msle as above results in an error.

RuntimeError: The size of tensor a (32) must match the size of tensor b (0) at non-singleton dimension 0

First try specifying the continuous variables as an empty list, i.e. cont_names = [].

Edit: I think the issue is that you need to specify the number of outputs. That was the right track.
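For reference, a plain-Python sketch of what an msle metric conventionally computes, mean squared logarithmic error (the fastai implementation may differ in details; this is just the standard formula, using log1p):

```python
import math

def msle(preds, targs):
    """Mean squared log error: mean((log1p(pred) - log1p(targ))**2)."""
    return sum((math.log1p(p) - math.log1p(t)) ** 2
               for p, t in zip(preds, targs)) / len(preds)

# e.g. msle([3.0], [1.0]) = (ln 4 - ln 2)^2 = (ln 2)^2
print(msle([3.0], [1.0]))
```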

1 Like