Reading an alpha channel into a Data Bunch

mazchoo · April 4, 2019, 9:11am

Would anyone be able to tell me whether it is possible to read the alpha channel of a PNG image into a DataBunch? I have pasted some code below that I adapted from Lesson 3. I am using the default installation of fastai in conda on a GPU server.

from fastai.vision import *
from fastai.callbacks.hooks import *
from fastai.utils.mem import *
from fastai.metrics import *
from pathlib import PosixPath

cwd = PosixPath('/home/deeplearning')
path_img = cwd/'HSV Infa'
print(path_img)

tfms = get_transforms(do_flip=True, flip_vert=True, max_rotate=30.0, max_zoom=1.1, max_lighting=0.1, max_warp=0.2, p_affine=0.75, p_lighting=0.75)

size = 25

free = gpu_mem_get_free_no_cache()
if free > 8200: bs=16
else: bs=8
print(f"using bs={bs}, have {free}MB of GPU RAM free")

source = (ImageList.from_folder(path_img)
.split_by_fname_file(path_img/'validation1.csv')
.label_from_folder())
data = (source.transform(tfms, size=size)
.databunch(bs=bs)
.normalize(imagenet_stats))

data.show_batch(rows=3, figsize=(21,14))

csaroff · April 5, 2019, 1:11am

There are some previous examples of creating custom classes to work with 4+ channels, such as the thread "How to work with 4+ channel images in fastai_v1", which help to clarify things. In that example, the author pulls each channel from a different file and stacks them together. Obviously you have different requirements.

If you only want to add the alpha channel, I believe that you can just add a parameter to the constructor.

ImageList.from_folder(path_img, convert_mode='RGBA')

I haven’t tested this, but give it a whirl and let me know. FWIW, I just traced the ImageList code. Internally, the open method will be called for each image file in your dataset. open calls open_image which calls PIL’s convert method which takes in the convert_mode parameter.
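To make the last step of that call chain concrete, here is a minimal sketch of what PIL's `convert` does with the mode string. It uses only Pillow and NumPy, not fastai, and the tiny in-memory image is a made-up stand-in for one of your PNG files:

```python
import numpy as np
from PIL import Image

# Hypothetical stand-in for a PNG with transparency: a 2x2 half-transparent red image.
rgba = Image.new("RGBA", (2, 2), (255, 0, 0, 128))

# convert("RGB") drops the alpha band; convert("RGBA") keeps it.
as_rgb = np.asarray(rgba.convert("RGB"))
as_rgba = np.asarray(rgba.convert("RGBA"))

print(as_rgb.shape)      # (2, 2, 3)
print(as_rgba.shape)     # (2, 2, 4)
print(as_rgba[0, 0, 3])  # 128, the alpha value
```

So if `convert_mode` reaches `convert` as described, each loaded image should come out with 4 channels instead of 3.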

The code is definitely hard to trace on GitHub, so if you're feeling overwhelmed, you aren't alone.

Hope this helps!

csaroff · April 5, 2019, 1:13am

Also when you’re pasting code into discourse, as you have above, consider adding three backticks on each side of the codeblock. e.g.

```python
<code here>
```

ste · April 5, 2019, 4:04pm

Very Interesting!

There are two steps to using an RGBA image:

1. Load it properly.
2. Adapt the first layer of your model to accept 4 channels instead of the canonical 3.
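The second step can be sketched in plain PyTorch as below. This is only one possible approach, not necessarily the one intended here: the helper name `expand_conv_to_4_channels` is hypothetical, the layer `old_conv` is a stand-in shaped like a ResNet stem rather than a real pretrained layer, and initialising the alpha filters with the mean of the RGB filters is an assumption (zeros or random values are other options). It also assumes an ungrouped convolution.

```python
import torch
import torch.nn as nn

def expand_conv_to_4_channels(conv: nn.Conv2d) -> nn.Conv2d:
    """Return a new conv that accepts 4 input channels, reusing the 3-channel weights."""
    new_conv = nn.Conv2d(
        4, conv.out_channels,
        kernel_size=conv.kernel_size, stride=conv.stride,
        padding=conv.padding, dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Keep the existing RGB filters unchanged.
        new_conv.weight[:, :3] = conv.weight
        # Assumption: start the alpha filters from the mean of the RGB filters.
        new_conv.weight[:, 3:] = conv.weight.mean(dim=1, keepdim=True)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv

# Stand-in for a pretrained first layer (same shape as a ResNet stem conv).
old_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
new_conv = expand_conv_to_4_channels(old_conv)

x = torch.randn(2, 4, 25, 25)  # a batch of two 25x25 RGBA images
out = new_conv(x)
print(out.shape)  # torch.Size([2, 64, 13, 13])
```

Where the first convolution sits inside your learner's model depends on the architecture, so you would need to locate it and swap it in yourself. Also note that `imagenet_stats` holds only three per-channel means and standard deviations, so normalization would presumably need a fourth value for the alpha channel as well.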

See