Hey there!
I hope I am posting this in the right subsection, since it's my first time actually writing to this forum and not just reading it. I trained a text classifier and saved the result afterwards like so:
learn_cl.save("learned_cl_wb")
Then I re-init the model and load the trained instance:
model = learn_cl.load('learned_cl_wb')
I can now predict single items with the predict method, which works very well already:
model.predict("bla bla")
But now I would like to predict 30,000 sentences, and calling predict on each one individually would take a very long time. If I understand correctly, I am supposed to use the get_preds() method for this case. But when I do, I get an error. My code looks like this:
df = pd.read_csv('test_data.csv', low_memory=False)
dl = DataBlock(
blocks=(TextBlock.from_df('text', vocab=dls.vocab)),
get_x=ColReader('text')
).dataloaders(df, bs=128, seq_len=72)
preds = model.get_preds(dl=dl)
The dataloader is exactly the same as the one I used to train the classifier in the first place, just without the get_y. I can also see that the predictions are actually made (the progress bar runs through all my batches), but then an error occurs and I can't figure out why. Maybe one of you can help me? Here is the error log:
TypeError Traceback (most recent call last)
/tmp/ipykernel_874/1175197425.py in <module>
1 #dl = model.dls.test_dl(df)
----> 2 preds = model.get_preds(dl=dl)
3 preds
/opt/conda/lib/python3.7/site-packages/fastai/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, cbs, **kwargs)
295 res[pred_i] = act(res[pred_i])
296 if with_decoded: res.insert(pred_i+2, getcallable(self.loss_func, 'decodes')(res[pred_i]))
--> 297 if reorder and hasattr(dl, 'get_idxs'): res = nested_reorder(res, tensor(idxs).argsort())
298 return tuple(res)
299 self._end_cleanup()
/opt/conda/lib/python3.7/site-packages/fastai/torch_core.py in tensor(x, *rest, **kwargs)
150 else as_tensor(x.values, **kwargs) if isinstance(x, (pd.Series, pd.DataFrame))
151 # else as_tensor(array(x, **kwargs)) if hasattr(x, '__array__') or is_iter(x)
--> 152 else _array2tensor(array(x), **kwargs))
153 if res.dtype is torch.float64: return res.float()
154 return res
/opt/conda/lib/python3.7/site-packages/fastai/torch_core.py in _array2tensor(x)
136 if sys.platform == "win32":
137 if x.dtype==np.int: x = x.astype(np.int64)
--> 138 return torch.from_numpy(x)
139
140 # %% ../nbs/00_torch_core.ipynb 40
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
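My reading of the traceback, which may well be wrong: the predictions themselves finish, and the failure happens in the final reorder step (line 297), where idxs is turned into a tensor and argsort is used to put the results back into the original dataset order. The error message suggests that idxs ends up as an object array instead of a plain list of integers. To check that I understand what that step is meant to do, here is a small pure-Python sketch of it. The function names and the example values are my own and hypothetical; this is not fastai's actual implementation.

```python
# Pure-Python sketch of the reorder step; NOT fastai's actual code.
# Assumption: idxs[k] is the dataset index of the k-th prediction.

def argsort(idxs):
    """Return the positions that would sort idxs (like tensor(idxs).argsort())."""
    return sorted(range(len(idxs)), key=lambda i: idxs[i])


def reorder(preds, idxs):
    """Put predictions back into the original dataset order."""
    return [preds[i] for i in argsort(idxs)]


# Hypothetical example: the loader visited items 2, 0, 1 (e.g. sorted by length).
idxs = [2, 0, 1]
preds = ["pred_for_2", "pred_for_0", "pred_for_1"]
restored = reorder(preds, idxs)
print(restored)
```

If that is right, the step only works when idxs is a flat list of integers, which does not seem to be the case with the object I am passing in as dl.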
Thank you in advance!