`show_batch` error when using tabular data with no continuous values

Hi,

I have a dataset with no continuous values and I just pass an empty list while creating my databunch:

data_str = (TabularList.from_df(train_df, path=path, cat_names=cat_vars, cont_names=[], procs=[Categorify])
           .split_by_idx(get_rdm_idx(train_df))
           .label_from_df(cols=dep_var, label_cls=FloatList, log=True)
           .add_test(TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=[])) 
           .databunch(bs=8))

When I call show_batch, I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-f80c2c9af08f> in <module>
----> 1 data_str.show_batch()

~/fastai/fastai/basic_data.py in show_batch(self, rows, ds_type, **kwargs)
    166             ys = [self.train_ds.y.reconstruct(grab_idx(y, i), x=x) for i,x in enumerate(xs)]
    167         else : ys = [self.train_ds.y.reconstruct(grab_idx(y, i)) for i in range(rows)]
--> 168         self.train_ds.x.show_xys(xs, ys, **kwargs)
    169 
    170     def export(self, fname:str='export.pkl'):

~/fastai/fastai/tabular/data.py in show_xys(self, xs, ys)
    137             for c, n in zip(x.cats, x.names[:len(x.cats)]):
    138                 res.append(str(x.classes[n][c]))
--> 139             res += [f'{c:.4f}' for c in x.conts] + [str(y)]
    140             items.append(res)
    141         display(HTML(text2html_table(items, [10] * len(items[0]))))

/net/vaosl01/opt/NFS/sw/anaconda3/envs/mer/lib/python3.7/site-packages/torch/tensor.py in __iter__(self)
    420         # map will interleave them.)
    421         if self.dim() == 0:
--> 422             raise TypeError('iteration over a 0-d tensor')
    423         if torch._C._get_tracing_state():
    424             warnings.warn('Iterating over a tensor might cause the trace to be incorrect. '

TypeError: iteration over a 0-d tensor

My investigation led me to this line in the function show_xys in tabular/data.py:

res += [f'{c:.4f}' for c in x.conts] + [str(y)]

The reason is that x.conts here is just tensor(0), which I’m assuming is because the dataset has no continuous variables, so iterating over it raises an error. How can I fix this? Also, is this error confined to show_batch, or will it come up during training as well?
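To make the failure mode concrete, here is a minimal sketch of a guard around the formatting step. It is only an illustration, not the library's code or the fix that was later pushed: `format_conts` and `ZeroDim` are hypothetical names, and `ZeroDim` is a stand-in that mimics a 0-d tensor (`dim()` returns 0, iteration raises) so the snippet runs without PyTorch installed.

```python
def format_conts(conts):
    """Return 4-decimal strings for each continuous value, or [] if there are none."""
    # A 0-d tensor such as tensor(0) reports dim() == 0 and raises
    # "TypeError: iteration over a 0-d tensor" when looped over, so skip it.
    if hasattr(conts, "dim") and conts.dim() == 0:
        return []
    return [f"{float(c):.4f}" for c in conts]


class ZeroDim:
    """Hypothetical stand-in mimicking how a 0-d torch tensor behaves."""

    def dim(self):
        return 0

    def __iter__(self):
        raise TypeError("iteration over a 0-d tensor")


print(format_conts(ZeroDim()))    # prints []
print(format_conts([1.5, 2.0]))   # prints ['1.5000', '2.0000']
```

With PyTorch available, passing `torch.tensor(0)` should take the same early-return branch, since a 0-d tensor also reports `dim() == 0`.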

Thanks.

You are trying to predict a continuous target from categorical inputs. Is your dep_var actually a float?

Yes. I am trying to predict a float (price of a product) from categorical variables.

That makes sense.
Maybe add a dummy continuous variable - a random float that the model will learn to ignore.

Wouldn’t that be corrupting the data? But you say that the model will learn to ignore it, which is interesting.

I thought that the library automatically adds a column of zeros for an empty list of continuous variables. When I manually retrieve a batch I can see a tensor of zeros for the continuous variables.

Indeed that’s a bug. Pushed a fix in master.

Sylvain swoops in and saves the day once again.
