I am getting this error while doing inference with my tabular learner using the predict function.
At first I thought it could be a conflict between libraries, but after installing older versions I’m still getting the same result. My pandas version is 1.1.0 and my fastai version is 2.3.1. I then tried running the same code in one of the Colab notebooks and got the same error, which makes me think the problem is in how I handle my dataframe. Here are the dtypes (not all the variables are used in the model):
Can anyone help me with this? I’m at a loss and don’t really know what else to try.
Thanks in advance!
There seems to be a bug in fastai.tabular.all.TabularDataLoaders.from_df() where bool is seen as an object instead of being accepted as-is. Try converting all bool columns to 'uint8'. Bug still exists in fastai 2.5.2 and pytorch 3.9.
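# workaround for fastai/pytorch bug where bool is treated as object and thus erroring out.
# convert only the bool columns; other dtypes are left untouched.
for n in df.select_dtypes(include='bool').columns:
    df[n] = df[n].astype('uint8')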
You hit the mark! That was actually the issue. Removing the variables from the dataframe when they are not used in the model, or casting them to another type as you suggested, solves the bug.
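For anyone landing here later, the two fixes described above can be sketched with pandas alone. This is only an illustration: the dataframe, the column names and the used_cols list are made up, and fastai itself is not called.

```python
import pandas as pd

# Hypothetical data: 'is_member' is a bool column, 'notes' is not used by the model.
df = pd.DataFrame({
    "age": [25, 40, 31],
    "city": ["Lima", "Oslo", "Pune"],
    "is_member": [True, False, True],
    "notes": ["a", "b", "c"],
    "target": [0, 1, 0],
})

# Fix 1: keep only the columns the model actually uses (hypothetical list).
used_cols = ["age", "city", "is_member", "target"]
df = df[used_cols].copy()

# Fix 2: cast the remaining bool columns to uint8, leaving other dtypes alone.
bool_cols = df.select_dtypes(include="bool").columns
df = df.astype({col: "uint8" for col in bool_cols})

print(df.dtypes)
```

Either fix on its own may be enough; which one fits depends on whether the bool column is actually a feature in the model.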
Hmm… I’m getting this error despite removing any unused columns, not having any bool columns, and even running the above type conversion code.
Specifically, the problematic line is the one from the tutorial (with my own data):
row, clas, probs = learn.predict(df.iloc[0])
(df is the dataframe after I removed unused columns and did the conversion; it is the same dataframe used to define the dls and the learner.)
I don’t understand why this error is occurring, given that when I run df.dtypes I get a listing of things like object, the same as when I run df.dtypes in the original tabular data tutorial.
My colab link: Google Colab
Any other suggestions?
@drscotthawley I’m running into the exact same issue – did you ever figure out a way to solve it?
So eventually I figured out what it was. I was giving learn.predict() one row (as a Pandas Series) from my test df. A Series has a single dtype, so a row extracted from a DataFrame loses the per-column type information (unless all columns have the same type) and just has the object dtype. Normally this is fine since fastai converts categorical and continuous columns to the right types, but in my case my row also included the y column. fastai didn’t know how to convert it, which produced the above error. The solution was to drop it from the series.
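A rough sketch of what that looks like in pandas, in case it helps. The dataframe and the y_name variable are made up for illustration, and the learn.predict() call is left as a comment because the learner from this thread is not defined here.

```python
import pandas as pd

# Hypothetical test dataframe with mixed column types and a 'target' y column.
df = pd.DataFrame({
    "age": [25, 40],
    "city": ["Lima", "Oslo"],
    "target": ["yes", "no"],
})
y_name = "target"  # assumed name of the dependent variable

# A single row comes back as a Series with one dtype for all values.
row = df.iloc[0]
print(row.dtype)  # object, because the columns have different types

# Drop the y column from the row before passing it to predict.
row_for_predict = row.drop(y_name)
print(list(row_for_predict.index))

# row, clas, probs = learn.predict(row_for_predict)  # `learn` is the tabular learner, not defined in this sketch
```

Dropping the column from the Series leaves df itself unchanged, so the same dataframe can still be used for evaluation afterwards.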