Error in calculating test predictions while using cv

I’m running k-fold CV as follows, and I’m getting a FileNotFoundError and an IndexError while calculating test set predictions per fold:

final_preds=np.zeros((sub.shape[0],sub.shape[1]))

fold = 0

for fold in range(3):
    fold += 1
    
    print('In fold:',fold)
    dls = get_dls(fold)
    learn = Learner(dls, model=net, loss_func=CrossEntropyLossFlat(), metrics=metrics, opt_func=opt_func).to_fp16()
    learn.fine_tune(3)
    
    test_dl = learn.dls.test_dl(test)
    preds, _ = learn.tta(dl=test_dl)
    
    print(f'Prediction completed in fold: {fold}')
    final_preds += preds.numpy()
    

final_preds = final_preds/3

Can’t figure out what the error is…

Error trace as follows:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in _do_epoch_validate(self, ds_idx, dl)
    182             self.dl = dl;                                    self('begin_validate')
--> 183             with torch.no_grad(): self.all_batches()
    184         except CancelValidException:                         self('after_cancel_validate')

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in all_batches(self)
    152         self.n_iter = len(self.dl)
--> 153         for o in enumerate(self.dl): self.one_batch(*o)
    154 

~/anaconda3/lib/python3.7/site-packages/fastai2/data/load.py in __iter__(self)
     97         self.before_iter()
---> 98         for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
     99             if self.device is not None: b = to_device(b, self.device)

~/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    344     def __next__(self):
--> 345         data = self._next_data()
    346         self._num_yielded += 1

~/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_data(self)
    855                 del self._task_info[idx]
--> 856                 return self._process_data(data)
    857 

~/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
    880         if isinstance(data, ExceptionWrapper):
--> 881             data.reraise()
    882         return data

~/anaconda3/lib/python3.7/site-packages/torch/_utils.py in reraise(self)
    394             msg = KeyErrorMessage(msg)
--> 395         raise self.exc_type(msg)

FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 34, in fetch
    data = next(self.dataset_iter)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/load.py", line 107, in create_batches
    yield from map(self.do_batch, self.chunkify(res))
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/utils.py", line 278, in chunked
    res = list(itertools.islice(it, cs))
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/load.py", line 120, in do_item
    try: return self.after_item(self.create_item(s))
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/load.py", line 126, in create_item
    def create_item(self, s):  return next(self.it) if s is None else self.dataset[s]
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/core.py", line 289, in __getitem__
    res = tuple([tl[it] for tl in self.tls])
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/core.py", line 289, in <listcomp>
    res = tuple([tl[it] for tl in self.tls])
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/core.py", line 266, in __getitem__
    return self._after_item(res) if is_indexer(idx) else res.map(self._after_item)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/data/core.py", line 229, in _after_item
    def _after_item(self, o): return self.tfms(o)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/transform.py", line 187, in __call__
    def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/transform.py", line 140, in compose_tfms
    x = f(x, **kwargs)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/transform.py", line 72, in __call__
    def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/transform.py", line 82, in _call
    return self._do_call(getattr(self, fn), x, **kwargs)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/transform.py", line 86, in _do_call
    return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastcore/dispatch.py", line 98, in __call__
    return f(*args, **kwargs)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/vision/core.py", line 98, in create
    return cls(load_image(fn, **merge(cls._open_args, kwargs)))
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/fastai2/vision/core.py", line 74, in load_image
    im = Image.open(fn, **kwargs)
  File "/home/harish3110/anaconda3/lib/python3.7/site-packages/PIL/Image.py", line 2766, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'data/jpeg-melanoma-512/train/ISIC_0052060.jpg'


During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
<ipython-input-24-71888c2b2688> in <module>
     12 
     13     test_dl=learn.dls.test_dl(test)
---> 14     preds, _ = learn.tta(dl=test_dl)
     15 
     16     print('Prediction completed in fold: {}'.format(str(fold)))

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in tta(self, ds_idx, dl, n, item_tfms, batch_tfms, beta, use_max)
    536             for i in self.progress.mbar if hasattr(self,'progress') else range(n):
    537                 self.epoch = i #To keep track of progress on mbar since the progress callback will use self.epoch
--> 538                 aug_preds.append(self.get_preds(dl=dl, inner=True)[0][None])
    539         aug_preds = torch.cat(aug_preds)
    540         aug_preds = aug_preds.max(0)[0] if use_max else aug_preds.mean(0)

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, **kwargs)
    227             for mgr in ctx_mgrs: stack.enter_context(mgr)
    228             self(event.begin_epoch if inner else _before_epoch)
--> 229             self._do_epoch_validate(dl=dl)
    230             self(event.after_epoch if inner else _after_epoch)
    231             if act is None: act = getattr(self.loss_func, 'activation', noop)

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in _do_epoch_validate(self, ds_idx, dl)
    183             with torch.no_grad(): self.all_batches()
    184         except CancelValidException:                         self('after_cancel_validate')
--> 185         finally:                                             self('after_validate')
    186 
    187     @log_args(but='cbs')

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in __call__(self, event_name)
    132     def ordered_cbs(self, event): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, event)]
    133 
--> 134     def __call__(self, event_name): L(event_name).map(self._call_one)
    135     def _call_one(self, event_name):
    136         assert hasattr(event, event_name)

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in map(self, f, *args, **kwargs)
    375              else f.format if isinstance(f,str)
    376              else f.__getitem__)
--> 377         return self._new(map(g, self))
    378 
    379     def filter(self, f, negate=False, **kwargs):

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in _new(self, items, *args, **kwargs)
    325     @property
    326     def _xtra(self): return None
--> 327     def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
    328     def __getitem__(self, idx): return self._get(idx) if is_indexer(idx) else L(self._get(idx), use_list=None)
    329     def copy(self): return self._new(self.items.copy())

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in __call__(cls, x, *args, **kwargs)
     45             return x
     46 
---> 47         res = super().__call__(*((x,) + args), **kwargs)
     48         res._newchk = 0
     49         return res

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in __init__(self, items, use_list, match, *rest)
    316         if items is None: items = []
    317         if (use_list is not None) or not _is_array(items):
--> 318             items = list(items) if use_list else _listify(items)
    319         if match is not None:
    320             if is_coll(match): match = len(match)

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in _listify(o)
    252     if isinstance(o, list): return o
    253     if isinstance(o, str) or _is_array(o): return [o]
--> 254     if is_iter(o): return list(o)
    255     return [o]
    256 

~/anaconda3/lib/python3.7/site-packages/fastcore/foundation.py in __call__(self, *args, **kwargs)
    218             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    219         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 220         return self.fn(*fargs, **kwargs)
    221 
    222 # Cell

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in _call_one(self, event_name)
    135     def _call_one(self, event_name):
    136         assert hasattr(event, event_name)
--> 137         [cb(event_name) for cb in sort_by_run(self.cbs)]
    138 
    139     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/anaconda3/lib/python3.7/site-packages/fastai2/learner.py in <listcomp>(.0)
    135     def _call_one(self, event_name):
    136         assert hasattr(event, event_name)
--> 137         [cb(event_name) for cb in sort_by_run(self.cbs)]
    138 
    139     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)

~/anaconda3/lib/python3.7/site-packages/fastai2/callback/core.py in __call__(self, event_name)
     22         _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
     23                (self.run_valid and not getattr(self, 'training', False)))
---> 24         if self.run and _run: getattr(self, event_name, noop)()
     25         if event_name=='after_fit': self.run=True #Reset self.run to True at each end of fit
     26 

~/anaconda3/lib/python3.7/site-packages/fastai2/callback/core.py in after_validate(self)
     94         "Concatenate all recorded tensors"
     95         if self.with_input:     self.inputs  = detuplify(to_concat(self.inputs, dim=self.concat_dim))
---> 96         if not self.save_preds: self.preds   = detuplify(to_concat(self.preds, dim=self.concat_dim))
     97         if not self.save_targs: self.targets = detuplify(to_concat(self.targets, dim=self.concat_dim))
     98         if self.with_loss:      self.losses  = to_concat(self.losses)

~/anaconda3/lib/python3.7/site-packages/fastai2/torch_core.py in to_concat(xs, dim)
    211 def to_concat(xs, dim=0):
    212     "Concat the element in `xs` (recursively if they are tuples/lists of tensors)"
--> 213     if is_listy(xs[0]): return type(xs[0])([to_concat([x[i] for x in xs], dim=dim) for i in range_of(xs[0])])
    214     if isinstance(xs[0],dict):  return {k: to_concat([x[k] for x in xs], dim=dim) for k in xs[0].keys()}
    215     #We may receives xs that are not concatenatable (inputs of a text classifier for instance),

IndexError: list index out of range

If you check the first error in the trace, it’s a FileNotFoundError. Make sure your getters are set up so that, if you rely on different directories, those directories can be passed in.

@muellerzr I didn’t quite get what you meant there. Each fold trains correctly, but getting the test predictions is where the error is, I guess.

My get_dls function is defined below, which is where I guess the problem lies (most likely in the way I’m getting the fold data: I added a kfold column generated with sklearn’s stratified k-fold…)

def get_dls(fold, bs=32, size=224):
    # Getting specific fold data
    df_dls = df.copy()
    df_dls = df_dls[df_dls['kfold'] == fold].copy()
    
    dblock = DataBlock(
        blocks = (ImageBlock, CategoryBlock),
        get_x = ColReader('image_name', pref=path/'train', suff='.jpg'),
        get_y = ColReader('benign_malignant'),
        #splitting data based on triple stratified kernel provided here https://www.kaggle.com/c/siim-isic-melanoma-classification/discussion/165526
        splitter = splitter,
        # Implementing albumentations transforms
        item_tfms = [Resize(460), train_tfms], 
        batch_tfms=[*aug_transforms(size=size, min_scale=0.75),
                               Normalize.from_stats(*imagenet_stats)])
    dls = dblock.dataloaders(df_dls, bs=bs) 
    return dls

Yes, notice this is the real error in your trace. Is that a real file?
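
A quick way to answer that question is to check the paths on disk before predicting. This is only a minimal sketch: missing_files is a made-up helper (not part of fastai), and the candidate paths simply reuse the file name from the trace under an assumed train and test folder.

```python
from pathlib import Path

def missing_files(paths):
    "Return the subset of `paths` that do not exist on disk (hypothetical helper)."
    return [p for p in map(Path, paths) if not p.is_file()]

# Assumed layout: the same image name looked up under train/ and test/.
candidates = [
    'data/jpeg-melanoma-512/train/ISIC_0052060.jpg',
    'data/jpeg-melanoma-512/test/ISIC_0052060.jpg',
]

# Any path printed here would trigger the same FileNotFoundError at load time.
print(missing_files(candidates))
```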

Yeah. That’s wrong. That’s an image in the test set.

So my test_dl is using my train set getter for the file path:

get_x = ColReader('image_name', pref=path/'train', suff='.jpg')

How do I define a specific getter for my test_dl?

I figured out a way to do this, let me go see what I did… (will edit with an answer)

BTW @muellerzr

Does this seem like the right way to get my fold data

And passing this to the dataloader as follows:

In my head it seems alright… :sweat_smile:

@harish3110 the answer is proper data preparation :wink: Since they share a base path but have different parent folders, we’ll modify our internal DataFrames to do 90% of the path for us:

df['image_name'] = 'jpeg/train/' + df['image_name'] + '.jpg'

So now the get_x is: get_x = lambda x:base_dir/x['image_name'] (you can obviously re-write this with ColReader, I just like lambdas)

Now, to get our test set we need to do the same but point to jpeg/test:

test_df['image_name'] = 'jpeg/test/' + test_df['image_name'] + '.jpg'

And we’re good to go.
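
Put together, the idea might look like the sketch below. It uses plain pandas and pathlib so it runs without fastai; the base_dir value and the toy image names are assumptions for illustration, not the real competition data.

```python
from pathlib import Path
import pandas as pd

# Assumed shared base path; replace with wherever the images actually live.
base_dir = Path('data')

# Toy stand-ins for the train and test DataFrames.
df = pd.DataFrame({'image_name': ['ISIC_0000001', 'ISIC_0000002']})
test_df = pd.DataFrame({'image_name': ['ISIC_0052060']})

# Bake the split-specific folder and the extension into the column itself.
df['image_name'] = 'jpeg/train/' + df['image_name'] + '.jpg'
test_df['image_name'] = 'jpeg/test/' + test_df['image_name'] + '.jpg'

# The getter only prepends the shared base path, so it works for both frames.
get_x = lambda row: base_dir/row['image_name']

train_path = get_x(df.iloc[0])
test_path = get_x(test_df.iloc[0])
print(train_path.as_posix())
print(test_path.as_posix())
```

Because the folder now lives in the data rather than in the getter, the test DataLoader built from test_df no longer inherits a train-only prefix.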

Wouldn’t it be easier to use an IndexSplitter instead and just pass in the indices? What is your splitter here? (It’s global, so I don’t see it there.)

So if it’s 3-fold, I store the three sets of indices and just give one to the splitter while building the DataBlock?

Quite literally that. Or you can modify the splitter and not have to re-declare it in the for loop. Something like:

# Assume `melanoma` is already defined
melanoma.splitter = IndexSplitter(val_idxs)

Where val_idxs comes from your folds.
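
As a rough sketch of where those indices could come from, assuming the full DataFrame carries the kfold column mentioned earlier (the toy data and the fold_val_idxs name are made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the full training DataFrame; `kfold` holds each row's fold id.
df = pd.DataFrame({
    'image_name': ['a', 'b', 'c', 'd', 'e', 'f'],
    'kfold':      [0, 1, 2, 0, 1, 2],
}).reset_index(drop=True)  # positional indices must match row order

# For each fold, the validation indices are the positions of the rows in that fold.
folds = df['kfold'].to_numpy()
fold_val_idxs = {int(f): np.flatnonzero(folds == f).tolist() for f in np.unique(folds)}

for fold, val_idxs in fold_val_idxs.items():
    # In the training loop these would be handed to the splitter, e.g.
    # melanoma.splitter = IndexSplitter(val_idxs), before building the DataLoaders.
    print(fold, val_idxs)
```

Note this assumes the whole DataFrame is passed to the DataLoaders each time, rather than a per-fold subset, so that the rows outside val_idxs form the training set.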
