How to customize the way my images are read?

#1

I use ImageDataBunch.from_csv and it works fine, but I want to customize how my images are read. In 1.0.22 I could modify data.train_dl.dl.dataset.ds.image_opener, but in 1.0.28 that no longer works. How can I do it now?

I previously did it this way:

# adapted from the fastai function named open_image
def my_open_imagey(fn:PathOrStr)->image.Image:
    ...  # custom image reading goes here
    return image

data.train_dl.dl.dataset.ds.image_opener = my_open_imagey

and that was all it took. What can I do now?

2 Likes

#2

You have to subclass ImageItemList now. You’ll see it has an open method, so just go

class MyImageItemList(ImageItemList):
    def open(self, fn:PathOrStr)->Image:
        ...
        return image

And when using the data block API, instead of calling ImageItemList.bla do MyImageItemList.bla.

If you aren’t using the data block API (you should!) check the source code of the factory method of ImageDataBunch you’re using. They’re written using the data block API so you can copy/paste then adapt to your new custom class.

4 Likes

(Kaspar Lund) #3

Is the image_opener assignment gone now?

0 Likes

(Thomas) #4

And how can I pass a custom sampler to the data loader constructor?
I am currently doing a mix of old and new. Is this the right way?

def get_data(sz=64, bs=64, pct=0.2, sample=5000):
#     sz, pct, bs = 64, 0.2, 64
    src = (MyImageItemList.from_df(df=seg, path=PATH, folder=TRAIN)
           .random_split_by_pct(pct)
           .label_from_df(sep=' ')
           .add_test([TEST/f for f in test_names]))
    data = (src.transform(tfms, size=sz))
    #         .databunch(bs=bs).normalize(stats))

   
    datasets = data.train, data.valid, data.test
    sampler = ImbalancedDatasetSampler(datasets[0], num_samples=sample)
    train_dl = DataLoader(datasets[0], bs, sampler=sampler, num_workers=12)
    val_dl = DataLoader(datasets[1], 2*bs, False, num_workers=8)
    test_dl = DataLoader(datasets[2], 2*bs, False, num_workers=8)

    return ImageDataBunch(train_dl, val_dl, test_dl).normalize(stats)

I am using a custom sampler that I pass to the DataLoader constructor (just an oversampler for low-count classes).
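As a rough illustration of what "oversampling low-count classes" means, here is a minimal pure-Python sketch. It is not the implementation of ImbalancedDatasetSampler; the function name `oversample_indices` and the toy labels are made up for this example, and it assumes one label per item (the multi-label case in the snippet above would need a different weighting scheme).

```python
import random
from collections import Counter

def oversample_indices(labels, num_samples, seed=0):
    """Draw `num_samples` dataset indices, with replacement, so that each
    class is picked with roughly equal probability."""
    counts = Counter(labels)
    # Weight every item by the inverse frequency of its class.
    weights = [1.0 / counts[y] for y in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)

# Toy data: 95 items of a common class, 5 of a rare one.
labels = ["common"] * 95 + ["rare"] * 5
picked = oversample_indices(labels, num_samples=1000)
picked_counts = Counter(labels[i] for i in picked)
print(picked_counts)
```

The indices returned here play the role of what a sampler hands to the data loader: rare items simply appear more often in the index stream.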

1 Like

#5

The fastai library is changing rapidly and I don’t know what the final API will look like, so I have chosen to stay on version 1.0.22 and wait for a stable release.

1 Like

(Dmytro Mishkin) #6

@tcapelle Did you succeed with this snippet?
I found it while googling the same problem (using a custom batch sampler), and now I get:

TypeError: batch must contain tensors, numbers, dicts or lists; found <class 'fastai.data_block.LabelList'>

0 Likes

(Thomas) #7

Yes it worked.

0 Likes

(Dmytro Mishkin) #8

Thanks. I finally got this working as well: I needed to set batch_sampler, not sampler.

0 Likes

(Thomas) #9

But you are passing the sampler to the PyTorch DataLoader? OK, so you built a batch sampler; that is not the same thing.
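To make the distinction concrete: in PyTorch's DataLoader, an object passed as `sampler` yields one index at a time and the loader groups those into batches, whereas an object passed as `batch_sampler` yields a whole list of indices per step (and cannot be combined with `batch_size`, `shuffle`, `sampler` or `drop_last`). The sketch below mimics the two protocols in plain Python; the function names are invented for illustration and nothing here is PyTorch code.

```python
import random

def index_sampler(n, seed=0):
    """Sampler-style: yields a single dataset index per step."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    yield from order

def batch_index_sampler(sampler, batch_size, drop_last=False):
    """Batch-sampler-style: yields a list of indices per step."""
    batch = []
    for idx in sampler:
        batch.append(idx)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, shorter batch unless asked to drop it.
    if batch and not drop_last:
        yield batch

single = list(index_sampler(10))
batches = list(batch_index_sampler(index_sampler(10), batch_size=4))
print(single)   # ten individual indices
print(batches)  # three lists of indices with sizes 4, 4 and 2
```

Handing an object of one kind to the parameter meant for the other is a likely way to end up with collate errors like the one quoted above, though the exact cause in that case is not shown in the thread.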

0 Likes

(Constantin) #10

I am on fastai v1.0.42 and run into the following:

ds = ImageItemList.from_csv(path=PATH, csv_name=LBLS, folder=TRAIN, cols=[0,1])
print(type(ds))
ds = ds.random_split_by_pct(0.2, seed=SEED)
print(type(ds))
ds = ds.label_from_list(lbls.Target.values)
print(type(ds))
<class 'fastai.vision.data.ImageItemList'>
<class 'fastai.data_block.ItemLists'>
<class 'fastai.data_block.LabelLists'>

My issue is that even though I assign ds.open = my_image_loader, the ds instance always falls back to the standard open() method, which calls open_image().

I tried assigning ds.open at each stage of building ds, i.e. on the ImageItemList, the ItemLists and the LabelLists. On the ImageItemList I get my method as desired. As soon as I move to ItemLists or further downstream, however, I get the default method.
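One plausible explanation, assuming the library rebuilds each derived list from the object's class instead of copying the instance: anything assigned on a single instance is lost at the next step, while a method overridden in a subclass carries through. The toy classes below are hypothetical stand-ins written for this example and are not fastai's implementation.

```python
class ToyItemList:
    """Hypothetical stand-in for an item list; not fastai code."""
    def __init__(self, items):
        self.items = list(items)

    def open(self, fn):
        return f"default:{fn}"

    def get(self, i):
        return self.open(self.items[i])

    def split(self, n):
        # Derived lists are created from the class, so only class-level
        # behaviour survives; attributes set on `self` are not copied.
        cls = type(self)
        return cls(self.items[:n]), cls(self.items[n:])

class MyToyItemList(ToyItemList):
    def open(self, fn):
        return f"custom:{fn}"

# Instance-level assignment: works on this object, lost after split().
patched = ToyItemList(["a.png", "b.png"])
patched.open = lambda fn: f"patched:{fn}"
train, valid = patched.split(1)
print(patched.get(0), train.get(0))  # patched:a.png default:a.png

# Subclass override: still in effect after split().
train2, valid2 = MyToyItemList(["a.png", "b.png"]).split(1)
print(train2.get(0), valid2.get(0))  # custom:a.png custom:b.png
```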

I tried subclassing as you outlined above with:

class MyImageItemList(ImageItemList):
    def open(self, fn):
        ...
        return image

How can I enforce the use of my custom method?

0 Likes

#11

You should just replace ImageItemList with MyImageItemList in your call to the data block API.
Also, be very careful when using label_from_list: it won’t work here because your inputs are no longer in the same order after the split. You should use label_from_df instead.
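The ordering problem can be shown without fastai at all. In this small sketch the file names, labels and the fixed permutation (standing in for a random split) are all invented for illustration: pairing reordered items with labels by position goes wrong, while looking labels up by file name, which is roughly what labelling from a DataFrame amounts to, stays correct.

```python
files = [f"img_{i}.png" for i in range(6)]
labels = [f"label_{i}" for i in range(6)]   # labels[i] belongs to files[i]
by_name = dict(zip(files, labels))

# A fixed permutation standing in for the reordering done by a random split.
order = [3, 0, 5, 1, 4, 2]
shuffled_files = [files[i] for i in order]

# Positional labelling: k-th reordered file gets the k-th original label.
positional = list(zip(shuffled_files, labels))
# Keyed labelling: each file's label is looked up by name.
keyed = [(f, by_name[f]) for f in shuffled_files]

wrong = [f for f, y in positional if by_name[f] != y]
print(len(wrong), "of", len(files), "positional labels are wrong")  # 5 of 6
```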

0 Likes

(Constantin) #12

I did both of these things. What I do not understand is that the LabelList that gets created falls back to the factory default open(). I presume it is a scoping problem, but I just can't find out where it goes wrong. I will report back if I find out.

0 Likes

(Constantin) #13

Seems to work now. I had (from an older version of my code subclassing fastai << 1.0.42) overwritten the methods

    def __getitem__(self,i):
        return self.open(self.items[i])
    
    def get(self, fn):
        return self.open(fn)

as well. I thought I needed that, but in fact it totally messes up everything and causes a lot of trouble. Seems to work now.

0 Likes

#14

Yeah, just overriding open in a subclass is preferable. Glad it’s working now!

0 Likes

#15

Hi, I am trying to take baby steps with training on 3D images and have been reading the posts above. I am trying to understand how to use:

class MyImageItemList(ImageItemList):
    def open(self, fn:PathOrStr)->Image:
        ...
        return image

as described above. The images are from MRNet and are stored as .npy files. I can load each file in my Jupyter notebook with:

img_array = np.load('0000.npy')

The resulting array has shape (44, 256, 256). It is basically 44 slices of (1, 256, 256) images. So, if I want to view the last slice, I run:

plt.imshow(img_array[43], cmap='gray')
plt.show()

I am trying to use the data block API, and I think a custom ItemList is the first step. I initially tried:

mri_list = ObjectItemList.from_csv(path, 'train_knee_tiny.csv', folder='sagittal',
                                   suffix='.npy')

It runs but when I type mri_list I get:

OSError: cannot identify image file '.\\sagittal\\n0000.npy'

I guess that makes sense, since the ObjectItemList is looking for 2d images.

So I tried the subclass above, with open just returning:

return np.load('image')

I get

NameError: name 'ImageItemList' is not defined

So, I think I have some conceptual errors about how this works. I am reading more (I just started lesson 7, so I am not sure of myself around the data block API) and trying to work out how to handle something that seems to be non-standard.

0 Likes

(Burak) #16

Hi,
I am trying to read images from .npy files in a similar way and get the same error.
Did you solve the problem?

0 Likes

(Kai Lichtenberg) #17

@burak and @jmstadt
I think this here should work:

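# Note: MyImage must already be defined when this function is created,
# because default argument values are evaluated at definition time.
def open_npy(fn:PathOrStr, cls:type=MyImage, after_open:Callable=None)->Image:
    x = np.load(fn)  # load the array stored in the .npy file
    if after_open: x = after_open(x)
    return cls(x)

class MyList(ImageList):
    def open(self, fn):
        return open_npy(fn, after_open=self.after_open)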

Change cls:type=MyImage to cls:type=Image if you don’t want to customize your ItemBase class. If you want to overload something in the Image class you can just:

class MyImage(Image):
    # Do whatever is necessary to represent your data :-)
    pass

1 Like

#18

Hi Burak, no, I had not found a solution. So thanks, Kai! I will give that a try!

0 Likes