A walk with fastai2 - Tabular - Study Group and Online Lectures Megathread

@faib awesome work! (Reading through now) IIRC @Pak looked into this and found that it didn’t really make that much of a difference at the end of the day (back in v1) so the results aren’t too surprising.

Thank you!
I just realized I could write a processor like Categorify and append it to the procs argument in TabularPandas, right?

I still have problems understanding where TabularPandas fits in. It’s not a high-level API like TabularDataLoaders, nor does it belong to the DataBlock category. Does it logically sit below that or somewhere in between?

Somewhere in between. The role of TabularPandas is to prepare the data for being made into a DataLoader. It’s a high-level API, but it’s not a DataBlock (this is in development).

Got it! I’m looking forward to using the DataBlock API for tabular and being able to rely on using a unified interface :slight_smile:

When I try to add an additional metric to my learner, I get the error below during learn.fit():

TypeError: unsupported operand type(s) for *: ‘AccumMetric’ and ‘int’

The Learner is defined as below:

from fastai2.callback.all import *
learn = tabular_learner(dls,
                        layers=[1000,500],
                        metrics=[accuracy,RocAuc])

Need some help on this …

Hi Haroon!

I think this can help you:

I also get the error below when passing the dropouts (ps, embed_p). I remember this working fine earlier.

TypeError: __init__() got an unexpected keyword argument ‘ps’

learn = tabular_learner(dls,
                        layers=[1000,500],
                        ps=[0.001, 0.01],
                        embed_p=0.04,
                        metrics=[accuracy])

Thanks, this fixed the issue.

I think the more recent version deprecates dropouts from the tabular_learner thus there will be no ps nor embed_p…

It’s not deprecated; you just need to pass them in config=tabular_config(). All customization of models is done this way in fastai v2, to avoid mixing the kwargs of the models with those of the Learner.

Hi, I’ve watched Tabular lesson-1 and I’ve a couple of doubts.
Zach shows how to plot using matplotlib, but the .plot() function only plots a column against the row index. So how do I plot one column against another (like a scatter plot of age against working hours)?

Also towards the end, he creates a tabular model called ‘net’. Why can’t we do a lr_find or .fit on net like we do on a tabular_learner?

Thanks,

Short answer: it’s just a model, not a Learner instance. To use lr_find or fit, we need to wrap our net in a Learner.

Yeahh… Just now did that… Works great!
And how do I plot one column against another?

I’d recommend reading up here: https://seaborn.pydata.org/tutorial/categorical.html
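
For the specific case asked about, a plain matplotlib scatter plot is one option. This is only a sketch: the DataFrame and its column names are invented for illustration, so substitute the names from your own data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 62],
    "hours-per-week": [40, 50, 45, 38, 20],
})

# One column on each axis, one point per row.
fig, ax = plt.subplots()
ax.scatter(df["age"], df["hours-per-week"])
ax.set_xlabel("age")
ax.set_ylabel("hours-per-week")
fig.savefig("age_vs_hours.png")
```

pandas also offers `df.plot.scatter(x=..., y=...)` as a shortcut for the same kind of plot.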

Sure, I’ll look into it.
Thank you! :slight_smile:
And thank you so much for the videos; they really helped me understand things a lot better.

Thanks, got it!

Hi @muellerzr, I passed the model into a Learner and used ‘CrossEntropyLossFlat’, since the loss function used in tabular_learner is a flattened form of nn.CrossEntropyLoss.
But when I do .lr_find I get the following error:

bool value of Tensor with more than one value is ambiguous

How do I solve this?

Thanks,

You need to use an instance of it: CrossEntropyLossFlat()
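
The class-versus-instance difference can be shown without fastai at all. In the sketch below, `ToyLoss` and `training_step` are hypothetical stand-ins, not fastai code: a training loop calls the loss like a function, so if you hand it the class, that call runs the constructor with the predictions and targets as its arguments.

```python
class ToyLoss:
    """Hypothetical stand-in for a loss class such as CrossEntropyLossFlat."""

    def __init__(self, reduction="mean"):
        self.reduction = reduction

    def __call__(self, preds, targets):
        # Squared error, just to have something to compute.
        errors = [(p - t) ** 2 for p, t in zip(preds, targets)]
        total = sum(errors)
        return total / len(errors) if self.reduction == "mean" else total


def training_step(loss_func, preds, targets):
    # A training loop calls the loss like a function.
    return loss_func(preds, targets)


preds, targets = [1.0, 2.0], [1.0, 4.0]

# Instance: the call goes to __call__ and returns a loss value.
good = training_step(ToyLoss(), preds, targets)

# Class: the same call goes to __init__, with the data as constructor arguments.
try:
    training_step(ToyLoss, preds, targets)
    failed = False
except TypeError:
    failed = True
```

The toy class fails with a TypeError because its constructor takes too few arguments. With the real loss, the tensors are presumably accepted as constructor arguments and then trip the ambiguous bool check instead, but the cause is the same missing pair of parentheses.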

Alright everyone we’ll be live streaming today! Here’s the link: https://youtu.be/XoWX_YOrtPg

We’ll be covering two different methods of model interpretation, ClassConfusion and SHAP, along with some general guidelines and pitfalls I’ve found in doing research in this field.

  • live stream up at 4:45pm CST

Has anyone created a LIME library for fastdotai yet?