TypeError on learn.fit() after one cycle

Luckymator · November 30, 2018, 11:12am

Hello everybody,
I just started out with fastai v1 after following the courses with 0.7. Now I want to try the House Prices competition from Kaggle.
I tried to follow the tabular example from the fastai git repository.
I read the data into pandas, declare the dependent, categorical, and continuous variables, create a tabular learner, and run fit(). (Code below.)

fit() runs until the progress bar reaches 100%, but it crashes before completing with one of the following messages:

TypeError: batch must contain tensors, numbers, dicts or lists; found <class 'NoneType'>

TypeError: an integer is required (got type NoneType)

Does this mean there are incompatible values in the data, or is something going wrong inside the library?

Thanks for your help!

path='data/house/'
df = pd.read_csv(path + 'train.csv')

dep_var = 'SalePrice'
cat_names = ['MSSubClass', 'MSZoning', 'Street', 'Alley', 'LotShape', 'LandContour',
        'Utilities', 'LotConfig', 'LandSlope', 'Neighborhood', 'Condition1', 'Condition2',
        'BldgType', 'HouseStyle', 'OverallQual', 'OverallCond', 'YearBuilt', 'YearRemodAdd',
        'RoofStyle', 'RoofMatl', 'Exterior1st', 'Exterior2nd', 'MasVnrType', 'ExterQual', 'ExterCond',
        'Foundation', 'BsmtQual', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinType2',
        'Heating', 'HeatingQC', 'CentralAir', 'Electrical', 'KitchenQual', 'Functional', 'FireplaceQu',
        'GarageType', 'GarageYrBlt', 'GarageFinish', 'GarageQual', 'GarageCond', 'PavedDrive',
        'PoolQC', 'Fence', 'MiscFeature', 'MoSold', 'YrSold', 'SaleType', 'SaleCondition']
cont_names = ['LotFrontage', 'LotArea', 'MasVnrArea', 'BsmtFinSF1', 'BsmtFinSF2', 'BsmtUnfSF', 'TotalBsmtSF',
         '1stFlrSF', '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'BsmtFullBath', 'BsmtHalfBath', 
         'FullBath', 'HalfBath', 'BedroomAbvGr', 'KitchenAbvGr', 'TotRmsAbvGrd', 'Fireplaces', 'GarageCars', 
          'GarageArea', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch', '3SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal']
procs = [FillMissing, Categorify, Normalize]

train_df = df
valid_df = df.sample(frac=0.2)
valid_idx = valid_df.index

data = TabularDataBunch.from_df(
    path, train_df, dep_var, valid_idx=valid_idx, procs=procs, cat_names=cat_names, cont_names=cont_names
)

learn = tabular_learner(data, layers=[100,100], metrics=accuracy)
learn.fit(1, 1e-2)

Stack Trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-d0518fa38950> in <module>
      1 learn = tabular_learner(data, layers=[100,100], metrics=accuracy)
----> 2 learn.fit(1, 1e-2)

~/fastai/lib/python3.6/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    160         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    161         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162             callbacks=self.callbacks+callbacks)
    163 
    164     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/fastai/lib/python3.6/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
     95     finally: cb_handler.on_train_end(exception)
     96 

~/fastai/lib/python3.6/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87             if hasattr(data,'valid_dl') and data.valid_dl is not None and data.valid_ds is not None:
     88                 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89                                        cb_handler=cb_handler, pbar=pbar)
     90             else: val_loss=None
     91             if cb_handler.on_epoch_end(val_loss): break

~/fastai/lib/python3.6/site-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
     47     with torch.no_grad():
     48         val_losses,nums = [],[]
---> 49         for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
     50             if cb_handler: xb, yb = cb_handler.on_batch_begin(xb, yb, train=False)
     51             val_losses.append(loss_batch(model, xb, yb, loss_func, cb_handler=cb_handler))

~/fastai/lib/python3.6/site-packages/fastprogress/fastprogress.py in __iter__(self)
     63         self.update(0)
     64         try:
---> 65             for i,o in enumerate(self._gen):
     66                 yield o
     67                 if self.auto_update: self.update(i+1)

~/fastai/lib/python3.6/site-packages/fastai/basic_data.py in __iter__(self)
     67         "Process and returns items from `DataLoader`."
     68         assert not self.skip_size1 or self.batch_size > 1, "Batch size cannot be one if skip_size1 is set to True"
---> 69         for b in self.dl:
     70             y = b[1][0] if is_listy(b[1]) else b[1]
     71             if not self.skip_size1 or y.size(0) != 1: yield self.proc_batch(b)

~/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
    636                 self.reorder_dict[idx] = batch
    637                 continue
--> 638             return self._process_next_batch(batch)
    639 
    640     next = __next__  # Python 2 compatibility

~/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
    657         self._put_indices()
    658         if isinstance(batch, ExceptionWrapper):
--> 659             raise batch.exc_type(batch.exc_msg)
    660         return batch
    661

wangyun · December 4, 2018, 10:36pm

Did you figure it out? I ran into a similar error.

Romandovega · December 5, 2018, 2:27am

Me as well

steffenix · December 5, 2018, 5:13am

I have opened an issue on GitHub.

Luckymator · December 5, 2018, 7:49am

No, I could not figure it out.
I was surprised that this has been ignored so far.
Let's see what the GitHub issue brings up!
(The first comment there was also mine.)

quantotto · December 5, 2018, 11:26am

I commented on @steffenix's GitHub issue and am posting here too. I got a bit closer to the root cause, but I'm not there yet.

The exception is raised in the default_collate function of PyTorch's dataloader.py: it checks the type of the first item in the batch and decides what to do next. It cannot determine the right type if the first entry is None.

If the first element is not None, it fails instead in the torch.LongTensor() constructor when one of the other values in the batch is None.
So that is why there are two different error messages, but they share the same root cause.
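
To illustrate the two failure paths, here is a minimal pure-Python sketch. It is not PyTorch's code: toy_collate and try_collate are hypothetical stand-ins that only mimic the idea of dispatching on the type of the first batch element, and the int() conversion stands in for the tensor constructor.

```python
def toy_collate(batch):
    """Simplified stand-in for a type-dispatching collate function (NOT PyTorch's code)."""
    first = batch[0]
    if isinstance(first, int):
        # Stand-in for the tensor constructor: int(None) raises TypeError.
        return [int(v) for v in batch]
    if isinstance(first, float):
        return [float(v) for v in batch]
    if isinstance(first, (list, tuple)):
        # Collate each position of the samples separately.
        return [toy_collate(list(col)) for col in zip(*batch)]
    # The first element has an unsupported type (for example None).
    raise TypeError(f"batch must contain numbers or lists; found {type(first)}")


def try_collate(batch):
    """Return the collated batch, or the error text if collation fails."""
    try:
        return toy_collate(batch)
    except TypeError as e:
        return f"TypeError: {e}"


none_first = try_collate([None, 3, 5])  # fails at the type dispatch
none_later = try_collate([3, None, 5])  # fails inside the conversion
clean = try_collate([3, 4, 5])          # collates normally
print(none_first)
print(none_later)
print(clean)
```

Whether None lands first in a batch depends only on ordering, which would explain why the same data can produce either message.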

Just as a test, I replaced all None values with zero and also located the first non-None element in the batch. The epoch then completes successfully, but it probably produces a completely wrong model (the reported accuracy was ~0.01).

I'm not sure where those None values come from. Is it because many values in the original data are NaN, or is there some mismatch with the categorical buckets? I didn't have more time to investigate…
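
Both hypotheses can be checked directly in pandas before building the data bunch. The sketch below is only a diagnostic under assumptions: nan_counts and unseen_in_validation are hypothetical helper names, and the toy DataFrame stands in for train.csv with made-up values. It does not establish which of the two, if either, is the actual cause.

```python
import pandas as pd


def nan_counts(df, columns):
    """Count missing values per column, keeping only columns that have any."""
    counts = df[columns].isna().sum()
    return {name: int(n) for name, n in counts.items() if n > 0}


def unseen_in_validation(df, column, valid_idx):
    """Values of `column` found in validation rows but in no training row."""
    valid_values = set(df.loc[valid_idx, column].dropna().tolist())
    train_values = set(df.drop(index=valid_idx)[column].dropna().tolist())
    return valid_values - train_values


# Toy data standing in for train.csv (values are made up for illustration).
toy = pd.DataFrame({
    "Alley": ["Grvl", None, "Pave", None],
    "SalePrice": [100, 200, 100, 300],
})
valid_idx = [3]  # pretend row 3 is the validation set

missing = nan_counts(toy, ["Alley", "SalePrice"])
unseen = unseen_in_validation(toy, "SalePrice", valid_idx)
print(missing)
print(unseen)
```

On the real data, the same calls could be run with df, cat_names + cont_names + [dep_var], and the valid_idx from the original post.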