IndexError: list index out of range for `get_preds`

I loaded two learners that had been saved with the export function:

learn_fwd = load_learner(path + '/fwd_learn_c')
learn_bwd = load_learner(path + '/bwd_learn_c')

When I try to get predictions from them with get_preds:

pred_fwd,lbl_fwd = learn_fwd.get_preds(ordered=True)

I receive the following error:


IndexError Traceback (most recent call last)
in
----> 1 pred_fwd,lbl_fwd = learn_fwd.get_preds(ordered=True)

~/miniconda3/lib/python3.7/site-packages/fastai/text/learner.py in get_preds(self, ds_type, activ, with_loss, n_batch, pbar, ordered)
88 self.model.reset()
89 if ordered: np.random.seed(42)
---> 90 preds = super().get_preds(ds_type=ds_type, activ=activ, with_loss=with_loss, n_batch=n_batch, pbar=pbar)
91 if ordered and hasattr(self.dl(ds_type), 'sampler'):
92 np.random.seed(42)

~/miniconda3/lib/python3.7/site-packages/fastai/basic_train.py in get_preds(self, ds_type, activ, with_loss, n_batch, pbar)
339 callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(self.callbacks)
340 return get_preds(self.model, self.dl(ds_type), cb_handler=CallbackHandler(callbacks),
--> 341 activ=activ, loss_func=lf, n_batch=n_batch, pbar=pbar)
342
343 def pred_batch(self, ds_type:DatasetType=DatasetType.Valid, batch:Tuple=None, reconstruct:bool=False,

~/miniconda3/lib/python3.7/site-packages/fastai/basic_train.py in get_preds(model, dl, pbar, cb_handler, activ, loss_func, n_batch)
42 "Tuple of predictions and targets, and optional losses (if loss_func) using dl, max batches n_batch."
43 res = [to_float(torch.cat(o).cpu()) for o in
---> 44 zip(*validate(model, dl, cb_handler=cb_handler, pbar=pbar, average=False, n_batch=n_batch))]
45 if loss_func is not None:
46 with NoneReduceOnCPU(loss_func) as lf: res.append(lf(res[0], res[1]))

~/miniconda3/lib/python3.7/site-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
55 val_losses,nums = [],[]
56 if cb_handler: cb_handler.set_dl(dl)
---> 57 for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
58 if cb_handler: xb, yb = cb_handler.on_batch_begin(xb, yb, train=False)
59 val_loss = loss_batch(model, xb, yb, loss_func, cb_handler=cb_handler)

~/miniconda3/lib/python3.7/site-packages/fastprogress/fastprogress.py in __iter__(self)
45 except Exception as e:
46 self.on_interrupt()
---> 47 raise e
48
49 def update(self, val):

~/miniconda3/lib/python3.7/site-packages/fastprogress/fastprogress.py in __iter__(self)
39 if self.total != 0: self.update(0)
40 try:
---> 41 for i,o in enumerate(self.gen):
42 if i >= self.total: break
43 yield o

~/miniconda3/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self)
73 def __iter__(self):
74 "Process and returns items from DataLoader."
---> 75 for b in self.dl: yield self.proc_batch(b)
76
77 @classmethod

~/miniconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __iter__(self)
277 return _SingleProcessDataLoaderIter(self)
278 else:
--> 279 return _MultiProcessingDataLoaderIter(self)
280
281 @property

~/miniconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
744 # prime the prefetch loop
745 for _ in range(2 * self._num_workers):
--> 746 self._try_put_index()
747
748 def _try_get_data(self, timeout=_utils.MP_STATUS_CHECK_INTERVAL):

~/miniconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _try_put_index(self)
859 assert self._tasks_outstanding < 2 * self._num_workers
860 try:
--> 861 index = self._next_index()
862 except StopIteration:
863 return

~/miniconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_index(self)
337
338 def _next_index(self):
--> 339 return next(self._sampler_iter) # may raise StopIteration
340
341 def _next_data(self):

~/miniconda3/lib/python3.7/site-packages/torch/utils/data/sampler.py in __iter__(self)
198 def __iter__(self):
199 batch = []
--> 200 for idx in self.sampler:
201 batch.append(idx)
202 if len(batch) == self.batch_size:

~/miniconda3/lib/python3.7/site-packages/fastai/text/data.py in __iter__(self)
103 def __len__(self) -> int: return len(self.data_source)
104 def __iter__(self):
--> 105 return iter(sorted(range_of(self.data_source), key=self.key, reverse=True))
106
107 class SortishSampler(Sampler):

IndexError: list index out of range

learn.export does not save any of your DataLoaders. It saves only your model and the instructions for building the DataLoaders.
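One plausible reading of the traceback, sketched in plain Python rather than with fastai itself: the last frame sorts the dataset indices with a key function, so if the restored validation set holds no items but still reports a nonzero length, the key function ends up indexing into an empty list. The classes `EmptyDataset` and `SortSampler` below are hypothetical stand-ins, and the "empty set reports length 1" behaviour is an assumption made for illustration, not something confirmed in this thread.

```python
class EmptyDataset:
    """Hypothetical stand-in for a dataset restored without any items."""

    def __init__(self, items=None):
        self.items = list(items or [])

    def __len__(self):
        # Assumption for this sketch: an empty dataset still reports length 1.
        return len(self.items) or 1

    def __getitem__(self, i):
        return self.items[i]


class SortSampler:
    """Yields indices sorted by `key`, largest first, like the failing frame."""

    def __init__(self, data_source, key):
        self.data_source, self.key = data_source, key

    def __len__(self):
        return len(self.data_source)

    def __iter__(self):
        indices = range(len(self.data_source))
        return iter(sorted(indices, key=self.key, reverse=True))


def sample_order(dataset):
    """Return the index order the sampler would produce for `dataset`."""
    sampler = SortSampler(dataset, key=lambda i: len(dataset[i]))
    return list(sampler)


def try_sample(dataset):
    """Same as sample_order, but report an IndexError instead of raising it."""
    try:
        return sample_order(dataset)
    except IndexError as e:
        return f"IndexError: {e}"


print(try_sample(EmptyDataset()))                      # no items to index
print(try_sample(EmptyDataset(["aa", "bbbb", "c"])))   # real items sort fine
```

If this is indeed what happens, it would explain why the error goes away once get_preds is pointed at a dataset that actually contains items.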

@muellerzr Could you please elaborate on what I should do before I call get_preds? I thought load_learner retrieved everything. What else should I call?

@muellerzr

Okay, I searched some more and changed the code as follows to load a data set:

df = pd.read_csv(path + '/digi.csv')

test = (TextList.from_df(df, path, cols='texts'))

learn_fwd = load_learner(path + '/fwd_learn_c', test=test)
learn_bwd = load_learner(path + '/bwd_learn_c', test=test)

but I still get the same error from get_preds.

Could you please advise what I should do?

@muellerzr

Okay, it seems I had to pass ds_type=DatasetType.Test to get_preds:

test = (TextList.from_df(df, path, cols='texts'))

learn_fwd = load_learner(path + '/fwd_learn_c', test=test)

pred_fwd,lbl_fwd = learn_fwd.get_preds(ds_type=DatasetType.Test)

For future readers!

@muellerzr

My new problem is that I can’t load the labels: lbl_fwd is a zero vector.

Could you point me to a complete example of how to run a classifier against a test set?

A quick search of the forums or the documentation would have brought these up.

The inference tutorial is in the documentation:

https://docs.fast.ai/tutorial.inference.html

For labeled test sets there is a hack. It’s not needed in fastai2.

I’ve been struggling with a similar issue of getting predictions out of my simple collaborative filtering model with fastai2.

Using fastbook/08_collab.ipynb as the primary resource I’ve been able to train my model and successfully interpret it and call get_preds.

I’m curious why loading training data is needed after exporting the model to make a prediction in production.

For example, in the answer to this forum question Load data and model:

If you want to train further with the exported model, I’d suggest something like…

Is my mental model wrong? Should calling predict on an exported model be thought of as further training, which would explain why I need to load training data? I see that the first line in the code is dl = self.dls.test_dl([item], rm_type_tfms=rm_type_tfms, num_workers=0), which implies that test data is necessary even before calling get_preds.

I was under the impression predict was intended to be called live in production.

I appreciate any guidance and/or pointer to further documentation.

It’s not. What is needed instead is the blueprint for how to build the data.

Yes. No data is stored on export, only the blueprint, so you must pre-process new data with that blueprint before passing it to the model, so that the model receives data in the form it expects. To avoid piggybacking on this thread any further, if you have more questions please post in the #fastai-users:fastai-v2 subforum, specifically the v2 chat thread.
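As a rough illustration of the "blueprint, not data" idea, here is a minimal plain-Python sketch. It is not how fastai serialises a learner; `export_blueprint`, `load_blueprint`, and the `<unk>` token are hypothetical names chosen for this example. The point is only that the exported object carries the rules learned from the training data (here, a vocabulary), and that new items are pushed through those rules at inference time.

```python
import json


def export_blueprint(training_texts):
    """Learn a vocabulary from the training texts and serialise only that."""
    tokens = sorted({tok for text in training_texts for tok in text.lower().split()})
    # The training texts themselves are deliberately not stored.
    return json.dumps({"vocab": ["<unk>"] + tokens})


def load_blueprint(blob):
    """Rebuild the preprocessing function from the serialised blueprint."""
    vocab = json.loads(blob)["vocab"]
    stoi = {tok: i for i, tok in enumerate(vocab)}

    def process(text):
        # Tokenise and numericalise new text the same way as at training time;
        # unknown tokens map to the "<unk>" id.
        return [stoi.get(tok, stoi["<unk>"]) for tok in text.lower().split()]

    return process


blob = export_blueprint(["good movie", "bad movie"])
process = load_blueprint(blob)
ids = process("Good new movie")
print(ids)
```

In this toy version, a single new item can be processed on its own at prediction time, and no training data has to be reloaded.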


For future readers: I solved the problem with the labels returned by get_preds using the following code:

# Create your test set:
data_test = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
             .split_none()
             .label_from_df(cols=dep_var))
# split_none() leaves the validation set empty, so reuse the labeled data as the validation set
data_test.valid = data_test.train
data_test = data_test.databunch()

# Replace the learner's validation DataLoader with the one built from the test data
learn.data.valid_dl = data_test.valid_dl

# Now y refers to the actual labels in the data set
preds, y = learn.get_preds(ds_type=DatasetType.Valid)
accuracy(preds, y)
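To see why the real labels matter for that last line, here is a small plain-Python sketch. It is not fastai's own `accuracy`; `toy_accuracy` and all the numbers are made up for illustration. It scores the same predictions against true labels and against an all-zero placeholder vector like the one described earlier in the thread.

```python
def toy_accuracy(preds, targets):
    """Fraction of rows whose highest-scoring class equals the target."""
    predicted = [max(range(len(row)), key=row.__getitem__) for row in preds]
    correct = sum(p == t for p, t in zip(predicted, targets))
    return correct / len(targets)


# Made-up class probabilities for four items and two classes.
preds = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]]

true_labels = [0, 1, 0, 0]          # labels that actually come from the data
placeholder_labels = [0, 0, 0, 0]   # what an unlabeled test set would yield

acc_true = toy_accuracy(preds, true_labels)
acc_placeholder = toy_accuracy(preds, placeholder_labels)
print(acc_true, acc_placeholder)
```

Scoring against the placeholder only measures how often class 0 was predicted, which is why the targets have to come from a labeled set before the metric means anything.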