I’m experiencing this same error.
Data loaders code:
dls_lm = DataBlock(blocks=TextBlock.from_df('text', is_lm=True),
                   splitter=RandomSplitter(0.1))
dls_lm.dataloaders(df, bs=128, seq_len=80)
Here is my stack trace:
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
return array(a, dtype, copy=False, order=order)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-40-7a565d05bc8f> in <module>()
----> 1 dls_lm.dataloaders(df, bs=128, seq_len=80)
12 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in <listcomp>(.0)
46 self.o2i = defaultdict(int, {v:k for k,v in enumerate(self.vocab) if v != 'xxfake'})
47
---> 48 def encodes(self, o): return TensorText(tensor([self.o2i [o_] for o_ in o]))
49 def decodes(self, o): return L(self.vocab[o_] for o_ in o)
50
TypeError: unhashable type: 'L'
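For what it's worth, the message itself only says that the lookup on line 48 of the trace received a list-like object (a fastcore L) where it expected a single string token. Below is a minimal sketch of that failure in plain Python. It assumes an ordinary list as a stand-in for L and a made-up four-word vocabulary; encode is a hypothetical helper that mirrors the list comprehension in the trace, not fastai's own code.

```python
from collections import defaultdict

# Hypothetical vocabulary; o2i is built the same way as in the trace above.
vocab = ["xxunk", "xxpad", "hello", "world"]
o2i = defaultdict(int, {v: k for k, v in enumerate(vocab)})

def encode(tokens):
    # One dictionary lookup per element, like the failing list comprehension.
    return [o2i[tok] for tok in tokens]

# A flat list of string tokens works; unknown tokens fall back to index 0.
good = encode(["hello", "world", "unseen"])

# An item that contains a nested token list fails, because a list cannot
# be used as a dictionary key.
try:
    encode([["hello", "world"]])
    nested_error = None
except TypeError as e:
    nested_error = str(e)

print(good)
print(nested_error)
```

So each item reaching the numericalization step seems to contain whole token lists rather than being one flat list of tokens.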
Here are my versions of fastai, fastcore, and pandas:
fastai version: 2.2.5
fastcore version: 1.3.19
Pandas version: 1.1.5
Any ideas on the issue?
EDIT: I was able to get this working using the following code. I think I was using it wrong or in an unintended way; perhaps this is a documentation issue. Note: I had to manually create a boolean is_valid column to tell ColSplitter how to build the validation set.
dls_lm = DataBlock(blocks=TextBlock.from_df('text', is_lm=True),
                   get_x=ColReader('text'),
                   splitter=ColSplitter()).dataloaders(df, bs=128, seq_len=80)
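For anyone wondering about the is_valid column: this is a rough sketch of one way to build it, not necessarily the exact code I ran. make_is_valid is a hypothetical helper, and the 10% split and the seed are assumptions chosen to match the RandomSplitter(0.1) from my first attempt.

```python
import random

def make_is_valid(n_rows, valid_pct=0.1, seed=42):
    """Return one boolean per row; True marks a validation row."""
    rng = random.Random(seed)                 # fixed seed for a repeatable split
    n_valid = int(n_rows * valid_pct)
    valid_idx = set(rng.sample(range(n_rows), n_valid))
    return [i in valid_idx for i in range(n_rows)]

flags = make_is_valid(1000)
print(sum(flags), "of", len(flags), "rows marked as validation")

# With the DataFrame from this post (assumed to be named df):
# df['is_valid'] = make_is_valid(len(df))
```

ColSplitter reads a column named is_valid by default, so no extra argument is needed once the column exists.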