NaN Loss with Tabular Learner

I’m getting NaN values for both train loss and valid loss when training with a tabular learner, with a training time of 0:00, and an accuracy of about 0.55. The main difference between my notebook and the lesson4-tabular notebook is that my dataframe only has continuous variables (with the exception of dep_var, which is a boolean).

If anyone has any idea as to how I could go about troubleshooting this problem, any help would be appreciated.

Here’s some of my code:

cont_names = ['A','B','C','D'] # (where A, B,C, D are columns of the dataframe)
cat_names = []
procs = [FillMissing, Categorify, Normalize]


# Percent of original dataframe
test_pct = 0.1
valid_pct = 0.2

# Masks for separating dataframe sets
cut_test = int(test_pct * len(df)) + 1
cut_valid = int(valid_pct * len(df)) + cut_test

valid_indx = range(cut_test, cut_valid) # range of validation indices, used for fastai
dep_var = 'result'

test = TabularList.from_df(df.iloc[cut_test:cut_valid].copy(), cat_names=cat_names, cont_names=cont_names)

data = (TabularList.from_df(df=df, path=path, cat_names=cat_names, cont_names=cont_names)
#                            .split_by_idx(valid_indx)
                           .split_by_rand_pct(0.2)
                           .label_from_df(cols=dep_var)
                           .add_test(test)
                           .databunch())

data.show_batch()

learn = tabular_learner(data, layers=[200,100], metrics=accuracy)
learn.fit(1, 1e-2)
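One generic way to narrow down NaN losses is to inspect the raw dataframe for NaN and infinite values before building the DataBunch; Normalize will happily propagate them into the model. This is a plain pandas/NumPy sketch (the toy dataframe here is illustrative, standing in for the real df above):

```python
import numpy as np
import pandas as pd

# Toy dataframe standing in for the real one; 'A'..'D' mirror cont_names above
df = pd.DataFrame({'A': [1.0, np.nan, 3.0],
                   'B': [0.5, np.inf, 1.5],
                   'C': [2.0, 2.0, 2.0],
                   'D': [1.0, 1.0, 1.0],
                   'result': [True, False, True]})

cont_names = ['A', 'B', 'C', 'D']

# Count NaNs per continuous column (what FillMissing is supposed to handle)
print(df[cont_names].isna().sum())

# Count infinite values, which FillMissing does NOT touch
print(np.isinf(df[cont_names]).sum())
```

If either count is nonzero after your procs should have run, that column is a likely source of the NaN loss.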

Try manually filling in the missing values instead of relying on the FillMissing proc, to double-check that it’s actually being applied. I’ve run into that issue a few times.
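For example, a minimal pandas sketch of filling the continuous columns by hand before building the TabularList (the toy dataframe and median strategy are illustrative; median is what FillMissing uses by default):

```python
import numpy as np
import pandas as pd

# Toy dataframe; in the post above, df comes from the user's own data
df = pd.DataFrame({'A': [1.0, np.nan, 3.0, 4.0],
                   'B': [0.5, 1.0, np.nan, 2.0]})

# Fill each continuous column with its median, mimicking FillMissing
for col in ['A', 'B']:
    df[col] = df[col].fillna(df[col].median())

print(df.isna().sum())
```

After doing this, drop FillMissing from the procs list so the fill isn’t applied twice.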


Thank you, that worked