TabularDataBunch gives 'ignore_empty' error

Hi, I’m new to fastai and I’ve been having a hard time using TabularDataBunch.from_df to fill missing data and categorify my data. I’ve read through the documentation several times, but the call still gives me AttributeError: ‘series’ object has no attribute ‘ignore_empty’.

Kindly assist, please. Also, what do the arguments “path” and “valid_idx” stand for? Here is my code:
data = TabularDataBunch.from_df("PATH", train_raw, 'SalePrice', valid_idx=20, procs=procs, cat_names=cat)

Here PATH is the system path to my original training CSV file, train_raw is my DataFrame (df), and cat holds my categorical variables.

Thank you for your support

Hi David,
First of all, is PATH a variable in your notebook, in the following line of code?
data = TabularDataBunch.from_df("PATH", train_raw, 'SalePrice', valid_idx=20, procs=procs, cat_names=cat)
If so, it shouldn’t be in double quotes.
path stands for the system path where your data (CSV) is located, and valid_idx stands for the row indexes that will be used for validation. For example, if your CSV has 10,000 rows, you might use the last 2,000 rows as validation data, and the remaining rows will be used for training.
valid_idx is a range of numbers rather than a single number, so you’d write range(8000, 10000) or whatever fits your dataframe.
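To make the 10,000-row example above concrete, here is a minimal sketch in plain Python. The names n_rows, n_valid and train_idx are only illustrative, and the row count is the hypothetical one from the example, not the size of your actual dataframe.

```python
# Hypothetical numbers from the example above: 10,000 rows in total,
# with the last 2,000 rows held out for validation.
n_rows = 10_000
n_valid = 2_000

# Row positions used for validation: 8000, 8001, ..., 9999.
valid_idx = range(n_rows - n_valid, n_rows)

# Everything before that is left for training: 0, 1, ..., 7999.
train_idx = range(0, n_rows - n_valid)

print(len(train_idx), len(valid_idx))
```

With a real dataframe you would probably use len(train_raw) in place of the hard-coded n_rows.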
Do tell me if the problem still exists!
Cheers, stay safe!

Thank you very much Palaash.

I am still getting NotImplementedError: “Function applied to ‘df’ if it’s the test set.”

Here is my code:

valid_idx = range(288898, 412698)
data = TabularDataBunch.from_df('PATH', train_raw, dep_var='SalePrice', valid_idx=valid_idx, procs=procs, cat_names=cat)

PATH is my system path to the train CSV file.
Thank you very much for your support.

please see image

It works now.
My procs variable was wrong.
Formerly: procs = TabularProc(cat, cont)

Now: procs = [FillMissing, Categorify, Normalize]
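For anyone who lands here with the same error, this is roughly how the corrected call fits together. It is a sketch that assumes fastai v1 (the fastai.tabular module); build_databunch is a hypothetical helper name, and path, df, cat_names and valid_idx stand in for the PATH, train_raw, cat and valid_idx variables from the posts above. As far as I understand, procs should be a list of the processor classes themselves, not an instance of the TabularProc base class.

```python
def build_databunch(path, df, dep_var, cat_names, valid_idx):
    """Hypothetical helper: build a TabularDataBunch with the fixed procs list."""
    # Imported inside the function so the sketch can be read (and defined)
    # without fastai installed; assumes the fastai v1 API.
    from fastai.tabular import TabularDataBunch, FillMissing, Categorify, Normalize

    # The fix from this thread: pass the processor classes in a list
    # instead of procs = TabularProc(cat, cont).
    procs = [FillMissing, Categorify, Normalize]

    return TabularDataBunch.from_df(
        path,                 # a path variable, not the literal string "PATH"
        df,
        dep_var=dep_var,      # 'SalePrice' in this thread
        valid_idx=valid_idx,  # a collection of row indexes, e.g. a range
        procs=procs,
        cat_names=cat_names,
    )
```

Usage would look something like data = build_databunch(PATH, train_raw, 'SalePrice', cat, valid_idx), with PATH passed without quotes.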

Thank you very much.
