ImageDataLoaders class + Regression + how to?

Hey folks,
I’m trying to use the ImageDataLoaders class to load my dataset.

The project is: reconstructing dark images as if they had been shot under good lighting conditions.
The dataset:

For each object, the following photos were taken:

  1. One long-exposure image (“Bright image”), in which you can clearly see all the details of the object;
  2. Several more pictures of the same object, taken in short-exposure mode, so very little light came in and the images look almost black.

I need the model to be trained so that, for every object, each dark image is paired with its corresponding bright image.

How do I load the dataset with ImageDataLoaders?

Note that it’s not classification or segmentation, but regression training.

Thanks!

If I understand it correctly, the dark images will be your input into the model, and the bright images the target? ImageDataLoaders will not accept an image for the target. You could try SegmentationDataLoaders, which accepts an image as the target, but you will have problems with the datatype.
The best solution might be to dig a bit deeper into the DataLoader/DataBlock API. Let’s say you want to load three dark images and one bright image; then you can construct DataLoaders from a DataBlock as follows:

dblock = DataBlock(
    blocks=(ImageBlock, ImageBlock, ImageBlock,  # inputs
            ImageBlock),                         # target
    get_x = [lambda x: x['image_col1'], lambda x: x['image_col2'], lambda x: x['image_col3']], 
    get_y = lambda x: x['target_col'], 
    n_inp = 3, 
    item_tfms = ...,  
    batch_tfms = ... )

dls = dblock.dataloaders(dataframe, bs = 64)

Hey man, that’s right. For every target image, there are about 1-10 dark images.

  1. What datatypes does SegmentationDataLoaders accept? If not *.ARW, I could try *.DNG or *.ORF. In any case, it’s better to use RAW than JPG or PNG, because those formats lose a lot of crucial pixel data when compressed.
  2. Wow, I’m impressed. Can I use your implementation?
    How can I adjust it such that it can construct up to 10 input images? (Just configure “n_inp=10”?)
    Where do I implement these new changes in the package/module?
    Why did you choose x[‘col1’]? Why use lambda?

Many thanks, Man!

I would recommend checking this amazing tutorial to understand more.

The solution proposed by @BresNet uses the DataBlock API, which offers more flexibility. You can check the tutorial to understand it better.

Regarding the supported formats, I think you can take a look at what image_extensions contains. Try using get_image_files on the folder containing your images; if the format is supported, you will get a list of all the files.
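
For instance, a quick check could look like this (a rough sketch; the folder path is just an example from this thread):

from fastai.vision.all import get_image_files, image_extensions

# extensions fastai will pick up by default
print(sorted(image_extensions))

# files with unsupported extensions simply won't show up in this list
files = get_image_files('./drive/SID/train/short')
print(len(files), files[:3])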


I, too, highly recommend reading the tutorial, as @VishnuSubramanian suggested.

  1. What datatypes does SegmentationDataLoaders accept? If not *.ARW, I could try *.DNG or *.ORF. In any case, it’s better to use RAW than JPG or PNG, because those formats lose a lot of crucial pixel data when compressed.

I don’t know if PIL supports the image types you want to process. You may need to write your own image class and adapt some transforms. But as this can be quite a bit of work, I would recommend getting your pipeline running with PNG images first. Once you get some results on PNG images, you can go back and adapt the data types.
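
Just to illustrate what such a custom loader could look like, here is a minimal sketch assuming the third-party rawpy package (the function name is made up, and you would still need to wrap it in a fastai image class/transform):

import rawpy
from PIL import Image

def load_raw_as_pil(path):
    # decode a RAW file (.ARW/.DNG/.ORF) into an 8-bit RGB PIL image
    with rawpy.imread(str(path)) as raw:
        rgb = raw.postprocess()   # demosaiced HxWx3 uint8 array
    return Image.fromarray(rgb)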

  1. Wow, I’m impressed. Can I use your implementation?
    How can I adjust it such that it can construct up to 10 input images? (Just configure “n_inp=10”?)
    Where do I implement these new changes in the package/module?

Correct, for 10 input images you need to set n_inp to 10 and provide 10 ImageBlocks. To save some space, I think this would work too: blocks=(ImageBlock,) * 11, which gives you a tuple of 10 ImageBlocks for the input and one ImageBlock for the target. However, this will only work if every example has exactly 10 input images; dynamically changing the number of inputs would be challenging.
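
For illustration, a 10-input version could look roughly like this (just a sketch; the column names dark_0 … dark_9 and bright are made up, and df is your label DataFrame):

from fastai.vision.all import *

def col_getter(col):
    # a small factory avoids the late-binding pitfall of lambdas created in a loop
    return lambda row: row[col]

dblock = DataBlock(
    blocks=(ImageBlock,) * 11,                            # 10 inputs + 1 target
    get_x = [col_getter(f'dark_{i}') for i in range(10)],
    get_y = col_getter('bright'),
    n_inp = 10,
    item_tfms = Resize(256))

dls = dblock.dataloaders(df, bs=8)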

Why did you choose x[‘col1’]? Why use lambda?

In my code example, I assumed you have the filenames of your images stored in a data frame, where the first column (col1) has the name of the first image, the second column (col2) the name of the second image, and so on (I corrected the typos in the code where I had repeated col1). The DataLoader will iterate through the data frame rows and use the functions defined in get_x and get_y to read the filenames from each row (x).
You can use lambda functions or regular named functions; since the functions here are very short, I chose lambdas.
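
For example, a named function equivalent to the first lambda would simply be:

def get_first_image(row):
    # same as lambda x: x['image_col1'] in the snippet above
    return row['image_col1']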


Thanks man! Cool reply!

I would recommend checking this amazing tutorial to understand more.

Oh wow, really helped me! But still left me with a few questions. I’ll write them below (as a reply to BresNet).

Regarding the supported formats, I think you can take a look at what image_extensions contains. Try using get_image_files on the folder containing your images; if the format is supported, you will get a list of all the files.

Yup, it looks like PIL can’t accept *.ARW or *.DNG. But just for testing, I converted to JPG. Once I get it working, I’ll take care of adjusting the format/class.


Great tips!

I don’t know if PIL supports the image types you want to process. You may need to write your own image class and adapt some transforms. But as this can be quite a bit of work, I would recommend getting your pipeline running with PNG images first. Once you get some results on PNG images, you can go back and adapt the data types.

Yup, it looks like PIL can’t accept *.ARW or *.DNG. But just for testing, I converted to JPG. Once I get it working, I’ll take care of adjusting the format/class.

Correct, for 10 input images you need to set n_inp to 10 and provide 10 ImageBlocks. To save some space, I think this would work too: blocks=(ImageBlock,) * 11, which gives you a tuple of 10 ImageBlocks for the input and one ImageBlock for the target. However, this will only work if every example has exactly 10 input images; dynamically changing the number of inputs would be challenging.

Maybe I can solve this with a table of labels? I’ll try to explain further below.

In my code example, I assumed you have the filenames of your images stored in a data frame, where the first column (col1) has the name of the first image, the second column (col2) the name of the second image, and so on (I corrected the typos in the code where I had repeated col1). The DataLoader will iterate through the data frame rows and use the functions defined in get_x and get_y to read the filenames from each row (x).

Basically, there is supposed to be a table of labels like this:
“Short” = input
“Long” = Target

ID input target
0 ./drive/SID/train/short/0.1/1_00.jpg ./drive/SID/train/long/10/1.jpg
1 ./drive/SID/train/short/0.1/1_01.jpg ./drive/SID/train/long/10/1.jpg
2 ./drive/SID/train/short/0.1/1_02.jpg ./drive/SID/train/long/10/1.jpg
3 ./drive/SID/train/short/0.1/1_03.jpg ./drive/SID/train/long/10/1.jpg
4 ./drive/SID/train/short/0.1/1_04.jpg ./drive/SID/train/long/10/1.jpg
5 ./drive/SID/train/short/0.1/1_05.jpg ./drive/SID/train/long/10/1.jpg
6 ./drive/SID/train/short/0.1/1_06.jpg ./drive/SID/train/long/10/1.jpg
7 ./drive/SID/train/short/0.1/1_07.jpg ./drive/SID/train/long/10/1.jpg
8 ./drive/SID/train/short/0.1/1_08.jpg ./drive/SID/train/long/10/1.jpg
9 ./drive/SID/train/short/0.1/1_09.jpg ./drive/SID/train/long/10/1.jpg
10 ./drive/SID/train/short/0.04/1_00.jpg ./drive/SID/train/long/10/1.jpg
11 ./drive/SID/train/short/0.04/1_01.jpg ./drive/SID/train/long/10/1.jpg
12 ./drive/SID/train/short/0.1/5_00.jpg ./drive/SID/train/long/10/5.jpg
13 ./drive/SID/train/short/0.1/5_01.jpg ./drive/SID/train/long/10/5.jpg
14 ./drive/SID/train/short/0.1/5_02.jpg ./drive/SID/train/long/10/5.jpg
15 ./drive/SID/train/short/0.1/5_03.jpg ./drive/SID/train/long/10/5.jpg
16 ./drive/SID/train/short/0.1/5_04.jpg ./drive/SID/train/long/10/5.jpg
17 ./drive/SID/train/short/0.1/5_05.jpg ./drive/SID/train/long/10/5.jpg
18 ./drive/SID/train/short/0.1/5_06.jpg ./drive/SID/train/long/10/5.jpg
19 ./drive/SID/train/short/0.1/5_07.jpg ./drive/SID/train/long/10/5.jpg
20 ./drive/SID/train/short/0.1/5_08.jpg ./drive/SID/train/long/10/5.jpg
21 ./drive/SID/train/short/0.1/5_09.jpg ./drive/SID/train/long/10/5.jpg
22 ./drive/SID/train/short/0.04/5_00.jpg ./drive/SID/train/long/10/5.jpg
23 ./drive/SID/train/short/0.04/5_01.jpg ./drive/SID/train/long/10/5.jpg
24 ./drive/SID/train/short/0.1/7_01.jpg ./drive/SID/train/long/10/7.jpg
25 ./drive/SID/train/short/0.1/7_02.jpg ./drive/SID/train/long/10/7.jpg
26 ./drive/SID/train/short/0.1/7_03.jpg ./drive/SID/train/long/10/7.jpg
27 ./drive/SID/train/short/0.1/7_04.jpg ./drive/SID/train/long/10/7.jpg
28 ./drive/SID/train/short/0.1/7_05.jpg ./drive/SID/train/long/10/7.jpg
29 ./drive/SID/train/short/0.1/7_06.jpg ./drive/SID/train/long/10/7.jpg
30 ./drive/SID/train/short/0.1/7_07.jpg ./drive/SID/train/long/10/7.jpg
31 ./drive/SID/train/short/0.1/7_08.jpg ./drive/SID/train/long/10/7.jpg
32 ./drive/SID/train/short/0.1/7_09.jpg ./drive/SID/train/long/10/7.jpg
33 ./drive/SID/train/short/0.04/7_00.jpg ./drive/SID/train/long/10/7.jpg
34 ./drive/SID/train/short/0.04/7_01.jpg ./drive/SID/train/long/10/7.jpg

(It’s just a sample of the much bigger dataset)

0.1 folder = 10/100s of exposure time.
0.04 folder = 4/100s of exposure time.
10 folder = 10s of exposure time.

filename = 7_00.jpg where 7 is the object number and 00 stands for the burst image number.

So I don’t think there’s a need for more than two or three columns, is there?
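
Just to illustrate how I imagine such a two-column table could be generated from that folder layout (a rough sketch, not my actual notebook code; the paths follow the sample above and the helper name is made up):

import pandas as pd
from pathlib import Path

def build_table(root):
    rows = []
    for short in Path(root, 'train/short').rglob('*.jpg'):
        obj_id = short.stem.split('_')[0]                   # '7_00' -> object '7'
        long = Path(root, 'train/long/10', f'{obj_id}.jpg')
        rows.append({'input': str(short), 'target': str(long)})
    return pd.DataFrame(rows)

df = build_table('./drive/SID')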

Btw, I began building the code, and so far it looks like this: Google Colaboratory

But it gets stuck in:

dsets = dblock.datasets(fnames)
dsets.train[0]

But it gets stuck in:

dsets = dblock.datasets(fnames)
dsets.train[0]

You need to specify the blocks and getter functions when building your DataBlock. In your above example, you just initialized an empty DataBlock class.

Regarding your table structure: if you have two columns, you are not using multiple inputs. So the model would only use ./drive/SID/train/short/0.1/1_00.jpg to predict ./drive/SID/train/long/10/1.jpg, then only ./drive/SID/train/short/0.1/1_01.jpg to predict ./drive/SID/train/long/10/1.jpg, and so on. Maybe this would still work, but I don’t know if it is what you want.
Still, I would suggest first getting your pipeline running with this table and then optimizing. That way, you are troubleshooting one problem at a time rather than several at once.
I think the best approach would be to build your DataLoaders from a DataFrame rather than from a folder. This way, it is easier to match input and target.
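
A minimal sketch of that approach, assuming the table above is saved as a CSV with the input/target columns (the filename labels.csv is hypothetical):

import pandas as pd
from fastai.vision.all import *

df = pd.read_csv('labels.csv')

dblock = DataBlock(
    blocks=(ImageBlock, ImageBlock),            # dark input -> bright target
    get_x = ColReader('input'),
    get_y = ColReader('target'),
    splitter = RandomSplitter(valid_pct=0.2),
    item_tfms = Resize(256))

dls = dblock.dataloaders(df, bs=8)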


Thanks, man!

Regarding your table structure: if you have two columns, you are not using multiple inputs. So the model would only use ./drive/SID/train/short/0.1/1_00.jpg to predict ./drive/SID/train/long/10/1.jpg, then only ./drive/SID/train/short/0.1/1_01.jpg to predict ./drive/SID/train/long/10/1.jpg, and so on. Maybe this would still work, but I don’t know if it is what you want.

Wait, isn’t that the same? Either way the data is organized, for the same target:
./drive/SID/train/long/10/1.jpg
there are two or more inputs:
./drive/SID/train/short/0.1/1_00.jpg
./drive/SID/train/short/0.1/1_01.jpg
What is the difference between putting the inputs in separate rows versus separate columns, in terms of training the model?

You need to specify the blocks and getter functions when building your DataBlock. In your above example, you just initialized an empty DataBlock class.

I think that I did that though, as mentioned here:

dblock = DataBlock(
    blocks=(ImageBlock,)*12, # 11 inputs + 1 target
    get_x = [lambda x: x['fname'] for i in range(11)],
    get_y = lambda x: x['fname'],
    n_inp = 11
    )

But I honestly don’t know what the difference would be if I made this slight change here (note the i in get_x):

dblock = DataBlock(
    blocks=(ImageBlock,)*12, # 11 inputs + 1 target
    get_x = [lambda x,i: x[i]['fname'] for i in range(11)],
    get_y = lambda x: x['fname'],
    n_inp = 11
    )

Still, I would suggest first getting your pipeline running with this table and then optimizing. That way, you are troubleshooting one problem at a time rather than several at once.
I think the best approach would be to build your DataLoaders from a DataFrame rather than from a folder. This way, it is easier to match input and target.

Is this what we do with the pandas library?

I searched around a bit more and found something here:

From the link above, I found this:

what are the types of our inputs and targets? Images and segmentation masks.
where is the data? In subfolders.
how do we know if a sample is in the training or the validation set? We’ll take a random split.
how do we know the label of an image? By looking at a corresponding file in the “labels” folder.
do we want to apply a function to a batch after it’s created? Yes, we want data augmentation.

camvid = DataBlock(blocks=(ImageBlock, MaskBlock(codes = np.loadtxt(path/'codes.txt', dtype=str))),
    get_items=get_image_files,
    splitter=RandomSplitter(),
    get_y=lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    batch_tfms=aug_transforms())

But I’m trying to adapt it to my needs:

Let’s go over our usual questionnaire:

What are the types of our inputs and targets? Images and Images.
Where is the data? In subfolders.
How do we know if a sample is in the training or the validation set? We’ll take a random split.
How do we know the label of an image? By looking at a corresponding file in the “Long” folder (target).
Do we want to apply a function to a batch after it’s created? Don’t know honestly.

datab = DataBlock(blocks=(ImageBlock, ImageBlock(codes = np.loadtxt(path/'codes.txt', dtype=str))),
    get_items=get_image_files,
    splitter=RandomSplitter(),
    get_y=lambda o: path/'long'/f'{o.stem}_P{o.suffix}',
    batch_tfms=aug_transforms())

But where exactly are that “stem” and “suffix” coming from?
Do I need a ‘codes.txt’ file here too?

In the above example, each item is a python Path object. Just pass the folder path to get_image_files and explore the output. Each file path is then passed to the function/lambda function that is used in get_y. So Path objects will have a stem and suffix. A simple example would look like this.

files = get_image_files(path)
files[0]
# Path('/home/.fastai/data/oxford-iiit-pet/images/Siamese_87.jpg')
files[0].stem
# 'Siamese_87'
files[0].suffix
# '.jpg'

Codes represent which category each pixel value corresponds to, so it should be fine if you don’t have them here. As for batch_tfms, those are the transformations/functions you want to apply after the data has been batched; a common example is data augmentation.


Thank you a lot! I guessed it was that, but I wasn’t sure why stem and suffix apply only to the file name. I guess that’s just the way it was implemented. Where can I find these functions?
I didn’t find them here:

And btw it can work this way:

datab = DataBlock(blocks=(ImageBlock, ImageBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(),
    get_y=lambda o: path/'long'/f'{o.stem}{o.suffix}',
    batch_tfms=aug_transforms())

And apply get_image_files to path/‘short’ alone.

But still, I’m not sure how it knows to match the corresponding filenames to each other (input to target).

get_items returns a list of elements, and each of these elements is passed to get_x and get_y. If you do not pass a function to these methods, then the element, in this case a python Path object, is passed directly to the blocks.
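
Roughly, the flow looks like this (a sketch; the root path is a placeholder):

from pathlib import Path
from fastai.vision.all import get_image_files

path = Path('./drive/SID/train')
items = get_image_files(path/'short')          # get_items: a list of Path objects
o = items[0]                                   # one element, passed to get_x/get_y
y_path = path/'long'/f'{o.stem}{o.suffix}'     # what the get_y lambda builds
# with no get_x, o itself goes straight to the input ImageBlock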

Go through the tutorial I recommended carefully, it will help you understand how DataBlocks work. I am also working on a video, which should hopefully be ready by tomorrow.


You can also use the summary method to understand what is happening.

dblock.summary(path)


Hi, I made a video explaining how datablocks work. Thought you might be interested in it.
https://youtu.be/NzWadB_fcTE


Thanks for the video, man!
Nice one.
It would have helped to see how you loaded two ImageBlocks for input and target (instead of a CategoryBlock).

So far I tried to play a little more, and then I found out that I could use ImageDataLoaders.from_csv instead, and it helped!

You guys are welcome to see my code here:

But now I got this error. Any idea why or where I could fix that?

I guess it’s a shape-mismatch error in the loss function. Grab a batch of data using dls.one_batch, run the model on it, and then compare the shapes of targ and yb. Also, look at what shapes the loss function expects.

xb, yb = dls.one_batch()
targ = learn.model(xb)
learn.loss_func(targ, yb)  # may crash here if the shapes don't match
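
To look at the shapes directly (continuing the snippet above):

print(targ.shape, yb.shape)   # for image-to-image models both are typically (bs, 3, H, W)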

The second part of the video shows how to use ImageBlock for both input and target in a way. You may find it useful.


Hey man,
Thanks for your guidance! The video you made was just great. Loved it! Good to hear from you all around.

I’ve since managed to ask the right questions and have almost solved the whole problem. You can see my progress here:

Btw,
Any idea why I get IndexError: list index out of range?

I would guess the problem is in your getter function, which passes an invalid format to PILImage.create.
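
A quick way to check that (a sketch; df and the column names stand in for whatever your notebook actually uses):

from fastai.vision.all import PILImage

row = df.iloc[0]
print(row['input'], row['target'])     # should be valid paths to existing images
PILImage.create(row['input'])          # raises if the file/format can't be opened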
