I am getting this error while doing inference with my tabular learner using the predict function.
At first I thought it could be a conflict between libraries, but after installing older versions I’m still getting the same result. My pandas version is 1.1.0 and my fastai version is 2.3.1. I then ran the same code in one of the Colab notebooks and got the same error, which makes me think the problem is in the way I handle my dataframe. Here are the dtypes (not all of the variables are used in the model):
Date datetime64[ns]
pmc int8
growth float32
credit float32
revenue_last_m float32
revenue_last_2_m float32
revenue_last_3_m float32
credit_last_year float32
Year int16
Month int8
Week int8
Day int8
Dayofweek int8
Dayofyear int16
Is_month_end bool
Is_month_start bool
Is_quarter_end bool
Is_quarter_start bool
Is_year_end bool
Is_year_start bool
Elapsed float32
WeekOfMonth int8
is_holiday bool
next_holiday int8
previous_holiday int8
dtype: object
Can anyone help me with this? I’m at a loss and don’t really know what else to try.
Thanks in advance!
There seems to be a bug in fastai.tabular.all.TabularDataLoaders.from_df() where bool columns are treated as object instead of being accepted as-is. Try converting all bool columns to 'uint8'. The bug still exists in fastai 2.5.2 and pytorch 3.9.
import pandas as pd

# Workaround for the fastai/pytorch bug where bool columns are treated as
# object and cause an error: cast every bool column to uint8.
for n in df:
    if pd.api.types.is_bool_dtype(df[n]):
        df[n] = df[n].astype('uint8')
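For anyone who wants to try the cast in isolation, here is a minimal sketch of the same idea on a small made-up dataframe. The column names are borrowed from the dtypes listing above, the values are invented, and select_dtypes is just one other way to find the bool columns; it is not something fastai requires.

```python
import pandas as pd

# Toy dataframe with invented values; column names taken from the listing above.
df = pd.DataFrame({
    "growth": pd.Series([0.1, 0.2, 0.3], dtype="float32"),
    "Is_month_end": [True, False, False],
    "is_holiday": [False, False, True],
})

# Find every bool column and cast all of them to uint8 in one step.
bool_cols = df.select_dtypes(include="bool").columns
df = df.astype({col: "uint8" for col in bool_cols})

print(df.dtypes)
```

The non-bool columns keep their original dtypes, so this should be safe to run before building the dataloaders.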
You hit the mark! That was exactly the issue. Either removing the bool variables from the dataframe when they are not used in the model, or casting them to another type as you suggested, solves the problem.
Hmm… I’m getting this error despite removing any unused columns, and not having any bool columns, and even running the above type conversion code.
Specifically, the problematic line is this one from the tutorial (run with my own data):
row, clas, probs = learn.predict(df.iloc[0])
(Here df is the dataframe after I removed the unused columns and did the conversion; it is the same dataframe used to define the dls and the learner.)
I don’t understand why this error is occurring, given that when I run df.dtypes I get a listing of things like int64, float64 and object, the same as when I run df.dtypes in the original tabular data tutorial.
So eventually I figured out what it was. I was giving learn.predict() one row (as a pandas Series) from my test df. A Series holds a single dtype, so a row extracted from a DataFrame loses the per-column type information: unless all columns have the same type, the row just has the object dtype. Normally this is fine, since fastai converts the categorical and continuous columns to the right types, but in my case the row also included the y column. fastai didn’t know how to convert it, which produced the above error. The solution was to drop the y column from the Series.
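In case it helps someone else, here is a small pandas-only sketch of what was going on. The dataframe is made up, and the names "segment" (a categorical column) and "target" (the y column) are hypothetical stand-ins for my own columns; the learn.predict call is only mentioned in a comment because it needs a trained learner.

```python
import pandas as pd

# Made-up dataframe; "segment" and "target" are hypothetical column names.
df = pd.DataFrame({
    "pmc": pd.Series([1, 2], dtype="int8"),
    "growth": pd.Series([0.5, 1.5], dtype="float32"),
    "credit": pd.Series([100.0, 200.0], dtype="float32"),
    "segment": ["a", "b"],
    "target": [10.0, 20.0],
})

# A row taken from mixed-type columns comes back as an object Series.
row = df.iloc[0]
print(row.dtype)  # object

# If every selected column shares one dtype, the row keeps that dtype.
uniform_row = df[["growth", "credit"]].iloc[0]
print(uniform_row.dtype)  # float32

# Drop the y column before handing the row to the learner.
row_for_predict = row.drop("target")
# row_for_predict is what would then be passed to learn.predict(...)
```

Dropping the target from the Series leaves only the columns that fastai knows how to convert.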