RuntimeError: number of dims don't match in permute

yang-zhang · September 11, 2019, 8:46pm

I ran into this error with notebook 08_pets_tutorial:

img = resized_image(items[0])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-58-d76e911b214c> in <module>
----> 1 img = resized_image(items[0])

<ipython-input-13-4b2ef1648cbb> in resized_image(fn, sz)
      2     x = Image.open(fn).resize((sz,sz))
      3     # Convert image to tensor for modeling
----> 4     return tensor(array(x)).permute(2,0,1).float()/255.

RuntimeError: number of dims don't match in permute

It looks like this happens because, for some images (9 of them) in the URLs.PETS dataset, array(Image.open(fn)) returns an array with ndim=2 instead of 3. More details in this notebook: https://gist.github.com/yang-zhang/bea20a7e9791c341507e934e53a107fe
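A minimal sketch of how such images could be located, assuming Pillow and NumPy are installed. The helper name `find_non_3d_images` is hypothetical and not part of fastai; it reports every file whose array does not have three dimensions, along with the mode PIL assigns to it.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def find_non_3d_images(paths):
    """Return (path, mode, ndim) for every image whose array is not 3-D."""
    odd = []
    for p in paths:
        with Image.open(p) as im:
            ndim = np.array(im).ndim
            if ndim != 3:
                odd.append((Path(p), im.mode, ndim))
    return odd
```

In the notebook this could be called as `find_non_3d_images(items)`, where `items` is the list of image paths.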

What’s the best way to handle these images? Thank you.

sgugger · September 11, 2019, 8:54pm

There is probably a convert('RGB') missing after opening the image with PIL.

KevinB · September 11, 2019, 9:04pm

I’m looking at this too. This is one of the 2-dim images:

It seems really strange that it could have color. Here is the file I’m seeing this on:

PosixPath('/home/kbird/.fastai/data/oxford-iiit-pet/images/Egyptian_Mau_139.jpg')

The .shape of the image is (250, 350), so I’m not sure how it can have color if it doesn’t have that RGB dimension.

After reading more about this and debugging it, I think the reason this is a weird one is that it is of type PIL.GifImagePlugin.GifImageFile. Now I’m researching how that works, but it looks like PIL may store the colors in some other format for GIF images.
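A small sketch of what is going on, assuming Pillow and NumPy are available: a color GIF is typically opened in palette mode ('P'), where each pixel is an index into a color table, so the array is 2-D even though the image displays in color. The in-memory GIF below is only a stand-in for the dataset file. Note that `Image.open` identifies the format from the file contents, not from the `.jpg` extension.

```python
import io

import numpy as np
from PIL import Image

# Build a small color GIF in memory (stand-in for Egyptian_Mau_139.jpg).
buf = io.BytesIO()
Image.new("RGB", (350, 250), (200, 30, 30)).save(buf, format="GIF")
buf.seek(0)

img = Image.open(buf)
print(img.format, img.mode)   # file format and pixel mode
print(np.array(img).shape)    # 2-D: one palette index per pixel

rgb = img.convert("RGB")      # look each index up in the palette
print(np.array(rgb).shape)    # 3-D: height x width x 3
```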

KevinB · September 11, 2019, 9:46pm

Expanding on this:

The problem is that GIF images store everything in a single channel (somehow). To get the resize to work correctly here, you need to add a .convert('RGB'), so resized_image will look like this:

def resized_image(fn:Path, sz=128):
    x = Image.open(fn).resize((sz,sz)).convert("RGB")
    # Convert image to tensor for modeling
    return tensor(array(x)).permute(2,0,1).float()/255.

The other way to do it would be to figure out whether the image is a GIF and convert only then, which would probably save some time compared with doing the convert on every image. I haven’t been able to test this, though, because my type check doesn’t work the way I would expect.
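One possible reason for the failing type check, assuming it was applied to the resized image: PIL operations such as `resize` return a plain `Image` whose `format` attribute is `None`, so the GIF information is only available on the object returned by `Image.open`. A minimal sketch with an in-memory GIF (a stand-in for the dataset file):

```python
import io

from PIL import Image
from PIL.GifImagePlugin import GifImageFile

# Build a small color GIF in memory.
buf = io.BytesIO()
Image.new("RGB", (8, 8), (200, 30, 30)).save(buf, format="GIF")
buf.seek(0)

opened = Image.open(buf)
resized = opened.resize((4, 4))

# The opened object knows it came from a GIF; the resized copy does not.
print(isinstance(opened, GifImageFile), opened.format)
print(isinstance(resized, GifImageFile), resized.format)
```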

kdorichev · September 12, 2019, 7:03am

This conditional conversion of GIF images solves the problem:

def resized_image(fn:Path, sz=128):
    x = Image.open(fn)
    # Check the format before resizing: resize() returns a new image whose .format is None
    x = x.convert('RGB') if x.format=='GIF' else x
    x = x.resize((sz,sz))
    # Convert image to tensor for modeling
    return tensor(array(x)).permute(2,0,1).float()/255.

@sgugger, is it worth a PR?

sgugger · September 12, 2019, 1:43pm

Yes, though I think you can do the convert every time, not just when it’s a GIF. The usual PILImage.open does that (unless you use PILImageBW, which converts to 'L').
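A sketch of the unconditional version, under a few assumptions: NumPy stands in for fastai's `tensor`/`array` so the block runs on its own, `transpose(2, 0, 1)` plays the role of `permute(2,0,1)`, and the conversion is placed before the resize because Pillow falls back to nearest-neighbor resampling for palette images. The name `resized_image_np` is hypothetical.

```python
import io

import numpy as np
from PIL import Image


def resized_image_np(fn, sz=128):
    """Open an image, force RGB, resize, and return a CHW float array in [0, 1]."""
    x = Image.open(fn).convert("RGB").resize((sz, sz))
    return np.asarray(x, dtype=np.float32).transpose(2, 0, 1) / 255.0


# Usage with an in-memory color GIF and a grayscale PNG (stand-ins for dataset files).
samples = [
    ("GIF", Image.new("RGB", (350, 250), (200, 30, 30))),
    ("PNG", Image.new("L", (350, 250), 128)),
]
for fmt, im in samples:
    buf = io.BytesIO()
    im.save(buf, format=fmt)
    buf.seek(0)
    print(fmt, resized_image_np(buf, sz=64).shape)
```

Since `convert("RGB")` returns a copy of an image that is already RGB, applying it to every file keeps the function free of format-specific branches, at the cost of that extra copy.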