TabularPandas: TypeError: unhashable type: 'L' (fastai v2)

robertritz · February 17, 2021, 1:25pm

I’m experiencing this same error.

Data loaders code:

dls_lm = DataBlock(blocks=TextBlock.from_df('text', is_lm=True), 
                   splitter=RandomSplitter(0.1))

dls_lm.dataloaders(df, bs=128, seq_len=80)

Here is my stack trace:

/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  return array(a, dtype, copy=False, order=order)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-40-7a565d05bc8f> in <module>()
----> 1 dls_lm.dataloaders(df, bs=128, seq_len=80)

12 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in <listcomp>(.0)
     46             self.o2i = defaultdict(int, {v:k for k,v in enumerate(self.vocab) if v != 'xxfake'})
     47 
---> 48     def encodes(self, o): return TensorText(tensor([self.o2i  [o_] for o_ in o]))
     49     def decodes(self, o): return L(self.vocab[o_] for o_ in o)
     50 

TypeError: unhashable type: 'L'
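
The failing line does a dictionary lookup per token (self.o2i[o_]), so the message implies that at least one o_ is itself an L (fastcore's list type) rather than a token string. A minimal sketch of that failure mode in plain Python follows. The vocab, the tokens, and the shape of the bad item are made up for illustration, and a built-in list stands in for L; this is not fastai's code.

```python
from collections import defaultdict

# Made-up vocab; mirrors the o2i construction shown in the trace.
vocab = ['xxunk', 'hello', 'world']
o2i = defaultdict(int, {v: k for k, v in enumerate(vocab)})

# Expected case: the item is a flat sequence of token strings.
tokens = ['hello', 'world', 'unseen']
ids = [o2i[t] for t in tokens]  # unknown tokens fall back to index 0

# Hypothetical bad case: the item wraps the token list instead of being it,
# so one of the "tokens" being looked up is a list, which cannot be hashed.
item = [tokens, len(tokens)]
try:
    [o2i[o_] for o_ in item]
    error_message = None
except TypeError as e:
    error_message = str(e)

print(ids)
print(error_message)  # mentions: unhashable type: 'list'
```

If that is what is happening here, the fix is to make sure the pipeline hands Numericalize the tokenized text itself rather than a container around it.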

Here are my versions of fastai, fastcore, and pandas:

fastai version: 2.2.5
Fastcore version: 1.3.19
Pandas version: 1.1.5

Any ideas on the issue?

EDIT: I got this working with the code below. I think I was using the API in an unintended way, which may point to a documentation gap. Note: I had to manually add a boolean is_valid column to the DataFrame so that ColSplitter knows which rows belong to the validation set.

dls_lm = DataBlock(blocks=TextBlock.from_df('text', is_lm=True),
                   get_x=ColReader('text'),
                   splitter=ColSplitter()).dataloaders(df, bs=128, seq_len=80)
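
For reference, one way to build the is_valid column is sketched below. It assumes a 10% validation split to match the original RandomSplitter(0.1); the toy DataFrame, the seed, and the variable names are placeholders. As far as I know, ColSplitter() with no arguments reads a column named is_valid, so the column name matters.

```python
import numpy as np
import pandas as pd

# Placeholder data; substitute the real DataFrame with its 'text' column.
df = pd.DataFrame({'text': [f'sample document {i}' for i in range(20)]})

# Pick 10% of the row labels at random, with a fixed seed for repeatability.
rng = np.random.default_rng(42)
n_valid = int(len(df) * 0.1)
valid_idx = rng.permutation(df.index.to_numpy())[:n_valid]

# Boolean flag: True marks rows that go to the validation set.
df['is_valid'] = df.index.isin(valid_idx)

print(df['is_valid'].value_counts())
```

A seeded split like this also keeps the validation set stable across notebook restarts, which RandomSplitter only does if you pass it a seed.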