learn.predict works with uint8 but not with float32 or images

Hello,

I’ve been trying to make predictions with my learner but kept running into errors, even though my model was created and fit successfully and my input sizes were correct.


The images were single-channel, so I had to stack each image 3 times vertically to create an artificial 3-channel image.

I looked up the error online, then converted the data type to uint8 and multiplied each pixel by 255, even though my data was not normalized. After doing so, I ran learn.predict() and it worked.

NOTE: the function did not work when I converted the data directly to uint8; it required the multiplication by 255. Without it, I got the "list index out of range" error.

The notebook link for the full code is as follows: https://www.kaggle.com/namansood/fastai-image-channel

I’m really confused about why none of the other methods worked and this one did. Is there an explanation for it? Am I doing something wrong? Thank you.

I’m not sure I can explain why exactly this is happening, but could you try

fname = "..."
learn.predict(fname)

That way the entire data-loading process is handled the same way as when training your model.

An easier way to load such images would be to do:

from PIL import Image
fname = "..."
img = Image.open(fname).convert("RGB")

This gives you a 3-channel image and is safe to do (stacking the image yourself may lead to some unexpected behavior).
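To make the stacking caveat concrete, here is a minimal NumPy sketch. The 4x5 array is a hypothetical stand-in for a grayscale image, and whether it matches what the notebook does is an assumption on my part. It only shows that stacking vertically makes the image taller, while stacking along a new last axis produces the H x W x 3 layout.

```python
import numpy as np

# Hypothetical 4x5 single-channel image (height 4, width 5).
gray = np.arange(20, dtype=np.uint8).reshape(4, 5)

# Vertical stacking concatenates along the height axis: the result is
# a taller single-channel image, not a 3-channel one.
tall = np.vstack([gray, gray, gray])

# Stacking along a new last axis gives the H x W x 3 layout that
# image libraries such as PIL expect for RGB data.
rgb = np.stack([gray, gray, gray], axis=-1)

print(tall.shape)  # (12, 5)
print(rgb.shape)   # (4, 5, 3)
```

If the stacked array ends up with the first shape, the model would be seeing a different image from the one it was trained on.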

Wouldn’t converting them to RGB cause unexpected behavior? They are not really RGB images, just grayscale pixel values. Wouldn’t the image undergo changes? Also, what do you mean by fname='...'? Should I replace the ... with my file name?

I tried converting the image with the PIL command and using it in my predictor.

Code used:

from PIL import Image
m = Image.open('/data/train/1/1806.jpg').convert('RGB')
learn.predict(m)

Error encountered:

AssertionError: Expected an input of type in

  • <class 'pathlib.PosixPath'>
  • <class 'pathlib.Path'>
  • <class 'str'>
  • <class 'torch.Tensor'>
  • <class 'numpy.ndarray'>
  • <class 'bytes'>
  • <class 'fastai.vision.core.PILImage'>
    but got <class 'PIL.Image.Image'>

Yes, exactly that: replace the ... with your file name.


I should’ve been more specific. As the error message shows, fastai requires the input to be one of the specific types listed. Looking at the source, I see that PILImage.create already does Image.open(fname).convert('RGB') behind the scenes.
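If you already have a plain PIL image in memory, one way to get an accepted type is to turn it into a NumPy array first. This is only a sketch: the in-memory image is a hypothetical stand-in for your file, and the final predict call is commented out because it assumes your trained learner, named learn as in this thread.

```python
import numpy as np
from PIL import Image

# Hypothetical in-memory grayscale image standing in for a file on disk.
# Image.new takes (width, height), so this is 5 wide and 4 high.
m = Image.new("L", (5, 4), color=128).convert("RGB")

# A plain PIL.Image.Image is not in the accepted list, but numpy.ndarray is.
arr = np.asarray(m)
print(type(arr).__name__, arr.shape, arr.dtype)

# With a trained learner available, the call would then be:
# learn.predict(arr)
```

Passing the file path as a string, as suggested earlier, remains the simpler route.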


Technically, yes, converting to RGB does change the image, but it’s not an issue. It still carries the exact same information, and you can expect a well-trained network to perform all right on it.
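As a quick check of the "same information" point, this small sketch converts a hypothetical 2x2 grayscale image to RGB with PIL and compares the channels. The pixel values are made up for illustration.

```python
import numpy as np
from PIL import Image

# Hypothetical 2x2 grayscale pixel values.
gray_values = np.array([[0, 64], [128, 255]], dtype=np.uint8)

gray = Image.fromarray(gray_values)    # 2-D uint8 array -> mode "L"
rgb = np.asarray(gray.convert("RGB"))  # shape (2, 2, 3)

# Each of the three channels is a copy of the grayscale values.
channels_match = all(
    (rgb[..., c] == gray_values).all() for c in range(3)
)
print(gray.mode, rgb.shape, channels_match)
```

So the converted image has three channels, but each one repeats the original grayscale values.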

On a related note, if you expect to see such data after training, it’s a good idea to include such single-channel images in your training data as well, so the model has already trained on some images that were converted from 1 to 3 channels.

Yes, I’ve used PILImage.create, but unfortunately it’s not working as intended with predict. Should I re-train my model with PILImages and then use it?

Also, why does the other method work? It implies that I have to multiply each pixel by 255 again, but when I check, the values are not in that format. What is the Image.fromarray() function doing?

I’ve made and saved the updates in my notebook if you’d like to take a look at the link I mentioned in the question
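On the Image.fromarray() question, here is a minimal sketch of how PIL treats different dtypes. It assumes a hypothetical 2x2 float32 array with 3 channels and values in [0, 1]; whether the notebook's arrays look like this is an assumption, and the sketch does not try to explain the "list index out of range" error.

```python
import numpy as np
from PIL import Image

# Hypothetical 2x2 image with 3 channels, float32 values in [0, 1].
x = np.full((2, 2, 3), 0.5, dtype=np.float32)

# PIL has no image mode for 3-channel float32 data, so this raises TypeError.
try:
    Image.fromarray(x)
    float_ok = True
except TypeError as err:
    float_ok = False
    print("float32 HxWx3 rejected:", err)

# Casting directly truncates toward zero, so 0.5 becomes 0.
direct = x.astype(np.uint8)

# Scaling first maps [0, 1] onto the 0..255 range that uint8 images use.
scaled = (x * 255).astype(np.uint8)

img = Image.fromarray(scaled)  # HxWx3 uint8 -> mode "RGB"
print(direct.max(), scaled.max(), img.mode)
```

Under that assumption, a direct cast would produce an almost black image, while scaling by 255 keeps the pixel intensities.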