I am trying to use fastai.tabular for a regression problem. I have done the feature engineering and that part looks fine.
Here is what I have:
learn.recorder.plot() shows nothing, and training returns NaN for all the losses for the first few epochs.
I am able to use xgboost/lgbm to train the data.
Any ideas? What should I test?
Try setting the skip_start and skip_end parameters in recorder.plot(), e.g.:
I've found this useful sometimes after unfreeze; it may also work for your issue above.
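One possible reason the plot comes out empty: as far as I recall, fastai v1's recorder.plot() drops the first 10 and last 5 recorded points by default (skip_start=10, skip_end=5), so a very short run can leave nothing to draw. The sketch below only mimics that trimming in plain Python. `trim_for_plot` and the loss values are made up for illustration; this is not fastai's own code.

```python
def trim_for_plot(values, skip_start=10, skip_end=5):
    """Drop the first `skip_start` and last `skip_end` recorded points."""
    if skip_end > 0:
        return values[skip_start:-skip_end]
    return values[skip_start:]

# Twelve recorded losses, e.g. from a very short run (made-up numbers).
losses = [1.0 - 0.05 * i for i in range(12)]

default_view = trim_for_plot(losses)      # default trimming leaves nothing to plot
wider_view = trim_for_plot(losses, 0, 1)  # keep everything except the last point

print(len(default_view), len(wider_view))
```

With a real learner, the equivalent would presumably be something like learn.recorder.plot(skip_start=0, skip_end=1). None of this helps while the losses themselves are NaN.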
I just read that your losses are NaN, so nothing is going to work until you fix that. I recall a post with some code by sgugger on how to debug NaN losses; I’ll see if I can find it.
Here it is, may help you debug:
from fastai.torch_core import *
from fastai.data import DataBunch
from fastai.callback import *
from fastai.basic_train import Learner, LearnerCallback
As your data is tabular, check your data types. For example, see tabular.ipynb in the fastai examples, where the dtypes are int64, object or float64.
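A quick way to do that check with pandas is sketched below. It goes one step further than dtypes and also counts NaN and infinite values in the numeric columns, since either could plausibly end up as a NaN loss. The frame and its column names are made up; swap in your own training dataframe.

```python
import numpy as np
import pandas as pd

# Made-up frame standing in for the real training data.
df = pd.DataFrame({
    "store": ["a", "b", "a", "c"],        # categorical (string) column
    "visits": [3, 5, 2, 8],               # integer column
    "price": [1.5, np.nan, 2.0, np.inf],  # float column with bad values
})

# Show the dtype of every column.
print(df.dtypes)

# Count NaN and +/-inf per numeric column.
numeric = df.select_dtypes(include="number")
bad_counts = (~np.isfinite(numeric)).sum()
print(bad_counts[bad_counts > 0])
```

It is worth running the same check on the target column too.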
Try splitting by indexes instead of by percentage.
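For the index split, a fixed list of validation row indices is all that is needed. Here is a minimal sketch that takes the last 20% of rows; `tail_valid_idx` is a hypothetical helper, and the row count is made up.

```python
def tail_valid_idx(n_rows, valid_frac=0.2):
    """Indices of the last `valid_frac` of rows, to use as a fixed validation set."""
    n_valid = max(1, int(n_rows * valid_frac))
    return list(range(n_rows - n_valid, n_rows))

valid_idx = tail_valid_idx(1000)  # rows 800..999
print(valid_idx[0], valid_idx[-1], len(valid_idx))

# With fastai v1 the list would then be passed along the lines of:
#   TabularList.from_df(df, ...).split_by_idx(valid_idx)
```

A fixed split also makes runs comparable, because the validation rows no longer change between attempts.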
I still can’t resolve the issue. I followed the example and changed the dtypes accordingly (float64 and int64).
Using the callback function confirmed the NaN loss issue. However, I have no idea how to debug it. Any suggestions? What should I try?
I would try cutting down the training dataframe to a small subset of columns, e.g. 4 categorical and 2 continuous. See if that runs. Then keep expanding the columns until you work out where the problem is.
Also, do a git pull to update fastai to the latest version.
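The column-expansion idea can be sketched as a small loop. `first_failing_column` and `fake_runs_ok` are hypothetical names: in practice `runs_ok` would build the data from the given columns, train briefly, and return False if the loss goes NaN.

```python
def first_failing_column(columns, runs_ok, start=2):
    """Grow the column subset one column at a time and return the first
    column whose addition makes `runs_ok` fail, or None if all subsets pass."""
    if not runs_ok(columns[:start]):
        raise ValueError("starting subset already fails; shrink it further")
    for end in range(start + 1, len(columns) + 1):
        if not runs_ok(columns[:end]):
            return columns[end - 1]
    return None

# Stand-in for "build the data, fit briefly, check the loss is not NaN".
def fake_runs_ok(subset):
    return "bad_col" not in subset

cols = ["a", "b", "c", "bad_col", "d"]
print(first_failing_column(cols, fake_runs_ok))
```

Adding one column at a time is slow for wide frames; adding columns in larger batches and then narrowing down inside the failing batch would be the faster variant.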
I had the same issue. Try reducing the batch size; it worked for me!