Error while attempting to one_batch: "Could not infer dtype of PILImage" (SOLVED)

grcarmenaty · July 3, 2021, 11:43pm

Hi,
I’m trying to simply show a batch of a dataset I have created. My code is as follows (I’m removing a lot of fluff for simplicity’s sake, but I am confident the code running before this is not related to the error, I have checked):

from fastai.vision.all import *
from pathlib import Path

def random_seed(seed_value):
    import random 
    random.seed(seed_value) # Python
    import numpy as np
    np.random.seed(seed_value) # cpu vars
    import torch
    torch.manual_seed(seed_value) # cpu  vars
        
    if torch.cuda.is_available(): 
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value) # gpu vars
        torch.backends.cudnn.deterministic = True  #needed
        torch.backends.cudnn.benchmark = False

path = Path.cwd() / "data"
seed = np.random.randint(0, 1000)
random_seed(seed)
data = ImageDataLoaders.from_folder(data_path.parent, train="train", valid="valid", bs=32, num_workers=0)
data.one_batch()

Inside my data folder three folders exist: train, valid and test, each with three sub-folders corresponding to the categories for a classifying CNN I want to build. A sample of the kind of image that can be found in these folders can be seen here:

pristine_1_0_0.0491304347826087_0.0

This is, in fact, the first image one_batch is trying to show. I am using fastai freshly cloned from the GitHub repo and have been trying to debug the problem on my own, but despite my best efforts, all I can get is:

Could not do one pass in your dataloader, there is something wrong in it
Traceback (most recent call last):
  File "/home/grcarmenaty/paper-1/doe.py", line 206, in <module>
    model_training()
  File "/home/grcarmenaty/paper-1/doe.py", line 198, in model_training
    doe.run(func, ["Accuracy", "Damage threshold", "Improvement threshold"], repetitions=25)
  File "/home/grcarmenaty/paper-1/taguchi.py", line 241, in run
    result = func(list(experimental_run.iloc[row, :]) + [repetition])
  File "/home/grcarmenaty/paper-1/doe.py", line 138, in func
    seed, epoch = train_model(path, model, model_name, repetition)
  File "/home/grcarmenaty/paper-1/doe.py", line 27, in train_model
    data.show_batch()
  File "/home/grcarmenaty/fastai/fastai/data/core.py", line 100, in show_batch
    if b is None: b = self.one_batch()
  File "/home/grcarmenaty/fastai/fastai/data/load.py", line 148, in one_batch
    with self.fake_l.no_multiproc(): res = first(self)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/basics.py", line 547, in first
    return next(x, None)
  File "/home/grcarmenaty/fastai/fastai/data/load.py", line 109, in __iter__
    for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 34, in fetch
    data = next(self.dataset_iter)
  File "/home/grcarmenaty/fastai/fastai/data/load.py", line 118, in create_batches
    yield from map(self.do_batch, self.chunkify(res))
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/basics.py", line 216, in chunked
    res = list(itertools.islice(it, chunk_sz))
  File "/home/grcarmenaty/fastai/fastai/data/load.py", line 133, in do_item
    try: return self.after_item(self.create_item(s))
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 200, in __call__
    def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 150, in compose_tfms
    x = f(x, **kwargs)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 73, in __call__
    def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 83, in _call
    return self._do_call(getattr(self, fn), x, **kwargs)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 90, in _do_call
    res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 90, in <genexpr>
    res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/transform.py", line 89, in _do_call
    return retain_type(f(x, **kwargs), x, ret)
  File "/home/grcarmenaty/paper-1-env/lib/python3.8/site-packages/fastcore/dispatch.py", line 118, in __call__
    return f(*args, **kwargs)
  File "/home/grcarmenaty/fastai/fastai/vision/core.py", line 223, in encodes
    def encodes(self, o:PILBase): return o._tensor_cls(image2tensor(o))
  File "/home/grcarmenaty/fastai/fastai/vision/core.py", line 93, in image2tensor
    res = tensor(img)
  File "/home/grcarmenaty/fastai/fastai/torch_core.py", line 134, in tensor
    else as_tensor(x, **kwargs) if hasattr(x, '__array__') or is_iter(x)
RuntimeError: Could not infer dtype of PILImage

I don’t know how to proceed further; I have checked everything I know to check. My most elaborate attempt was converting the PIL image to a numpy array, forcing it to have dtype np.uint8, and converting it back to a PIL image just before fastai/fastai/torch_core.py line 134. Nothing has worked so far.

I must add that this exact same code with the exact same input data works on Windows 10 in a conda environment with fastai 2.3.1 and python 3.8.10, but not on my Ubuntu 20.04 machine in an environment created through python venv, running fastai from the GitHub repo with the same python 3.8.10. I have also tried fastai 2.3.1 on Ubuntu, to no avail.

If someone can help I will be grateful, and I hope someone can benefit from this, I haven’t found any helpful information regarding this situation. If additional data is needed I remain at your disposal.

idraja · July 4, 2021, 1:17am

It sounds like you have your folder structure set up the same way as Imagenette.

Look at the Imagenette tutorial and you’ll see the data loader is set up like this:

dls = ImageDataLoaders.from_folder(path, valid='val', 
    item_tfms=RandomResizedCrop(128, min_scale=0.35), batch_tfms=Normalize.from_stats(*imagenet_stats))

In your case, path = Path.cwd() / "data". But make sure you have the right path, as Path.cwd() will change based on where you have your notebook. Also note that you use data_path.parent in ImageDataLoaders.from_folder, but you have not included the assignment of that variable in your code, so I don’t know if that’s correct.

Hope this helps.

szantamano · July 4, 2021, 12:41pm

I’m getting a very similar error. Can you please write the solution here when you have fixed this?

grcarmenaty · July 4, 2021, 8:07pm

@idraja Thank you for your reply. I have indeed attempted to replicate that folder structure.
I have checked, and in the step just before the error the code is actually working with the image I posted. I was able to open it with the PIL module and save a copy of it while debugging, so I must conclude the path is not the problem. Furthermore, if I put in a wrong path I get a different error.

@szantamano Today I will try to create a DataLoader from torch and see if sidestepping what ImageDataLoaders is doing solves it. A more elegant solution eludes me. If it works, I’ll let you know.

grcarmenaty · July 4, 2021, 9:56pm

Hey, so, I solved it. In my attempt to sidestep ImageDataLoaders somehow I started trying to convert my PIL image to tensor manually before fastai converted it (before the code got to fastai/fastai/torch_core.py line 134). In doing so, I used:

transforms.ToTensor()(img).unsqueeze_(0)

as found in

I then got the following error:

TypeError: __array__() takes 1 positional argument but 2 were given

So I looked around and found this thread:

There, it suggests using Pillow (PIL) 8.2.0 instead of Pillow 8.3.0. I don’t know why, but it works. Hope this helps anyone else having this problem. @szantamano tell me if it works for you.
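
For anyone who wants to check quickly whether their environment has the release discussed here, a minimal sketch follows. The helper name is_affected_pillow is hypothetical (it is not part of fastai or Pillow), and it only tests for the exact version 8.3.0 that this thread reports as problematic:

```python
def is_affected_pillow(version: str) -> bool:
    """Return True only for Pillow 8.3.0, the release reported as broken in this thread.

    Hypothetical helper for illustration; not part of fastai or Pillow.
    """
    # Compare only the numeric major.minor.patch prefix of the version string.
    nums = tuple(int(p) for p in version.split(".")[:3] if p.isdigit())
    return nums == (8, 3, 0)


try:
    import PIL
    installed = PIL.__version__
except ImportError:
    installed = None  # Pillow is not available in this environment

if installed is None:
    print("Pillow is not installed")
elif is_affected_pillow(installed):
    print(f"Pillow {installed}: this is the release reported as problematic here")
else:
    print(f"Pillow {installed}: not the release discussed in this thread")
```

If it reports 8.3.0, pinning Pillow to 8.2.0 as described above is the fix that worked for me.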

szantamano · July 4, 2021, 11:16pm

I have tried reinstalling everything with conda instead of pip, and that seems to have fixed the problem. It looks like my issue was caused by package versions too.

grongrilla · July 5, 2021, 2:38pm

I just started working on fastbook on a fresh Azure VM created with the script provided in the course and the very first example did not work with the error message mentioned here.

I can confirm that downgrading with

conda install Pillow=8.2.0

did the trick for me. I pinned Pillow to 8.2.0 in the fastai2 environment.

grongrilla · July 7, 2021, 7:47am

FYI: Pillow 8.3.1 was just released, and it seems to include a hotfix for a related issue that may also fix this behaviour:
Pillow Issue #5571
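
A small smoke test along these lines might help confirm whether a given install still shows the behaviour. This is only a sketch: array_conversion_works is a hypothetical helper, and it assumes that asking numpy for an explicit dtype exercises the same __array__ call that produced the TypeError quoted earlier in the thread.

```python
def array_conversion_works():
    """Try converting a tiny PIL image to a numpy array with an explicit dtype.

    Returns True if the conversion succeeds, False if it raises TypeError
    (the symptom quoted earlier in this thread), and None if numpy or
    Pillow is not installed. Hypothetical helper for illustration only.
    """
    try:
        import numpy as np
        from PIL import Image
    except ImportError:
        return None

    img = Image.new("RGB", (4, 4))  # blank 4x4 RGB image, no file needed
    try:
        # Assumption: passing dtype makes numpy call img.__array__(dtype)
        # on releases where that method exists.
        arr = np.asarray(img, dtype=np.uint8)
    except TypeError:
        return False
    return arr.shape == (4, 4, 3)


print(array_conversion_works())
```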

scottab · July 9, 2021, 5:47pm

+1 - I have just started working on fastbook, but with AWS. I was facing the same error, and downgrading with mamba has resolved the issue for me as well:
mamba install Pillow=8.2.0