So, you want to create your custom pipeline with fastai

I got this to run end-to-end with one key change to line up the input and output of the whole thing. I put a breakpoint in forward() so that I could see what x_in and x_out looked like on that pass, and made sure that the conv() output lined up with what was expected from the batch.

y = ItemList(items=target_np_array[:,None,:,:])

Full example is here. Let me know if that maps back to your case.


Thanks so much again.

Training works when all images are the same size, but it throws an exception when the images have different sizes.

Currently my training loop looks like this:

for epoch in range(0, epochs): 
    learn.fit(1)
    abs_error= validate(val_list, model, criterion)
    
    if(abs_error< best_abs_error):
        checkpoint(model, epoch%3,model_out_path+"C1Net"+"_"+repr(int(abs_error)))
    else:
        checkpoint(model, epoch%3,model_out_path+"C1Net")


    is_best = abs_error< best_abs_error
    best_abs_error = min(abs_error, best_abs_error)
 
    print(' * best abs_error {abs_error:.3f} '
          .format(abs_error=best_abs_error))

Also, is there a way to stop it from validating inside fit? I have a separate folder for test images and labels, so I want to validate separately.
Also, to save the model after each epoch I have to put fit inside a loop so that I can save the model after each fit(1) completes. Is there a better way to do that in fastai?

Your CNN model maps from the same size down to the same size. If you want it to do something different, you can change your convolutions or your padding to change the in/out dimensions. Once you set them up, they are static for the model (hence the need to add padding). When I am trying to debug these types of things, I put import pdb;pdb.set_trace() in the forward call and then inspect the input/output to figure out what the model is going to try to use.
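
To make the padding point concrete, here is a minimal sketch of the standard output-size arithmetic for a convolution along one spatial dimension. The helper name conv_out_size is made up for illustration, and the formula assumes dilation 1.

```python
def conv_out_size(size_in: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension (dilation 1)."""
    return (size_in + 2 * padding - kernel) // stride + 1


# A 3x3 kernel with padding=1 and stride=1 keeps the spatial size unchanged.
same = conv_out_size(224, kernel=3, stride=1, padding=1)

# Without padding, a 3x3 kernel trims one pixel from each border.
shrunk = conv_out_size(224, kernel=3, stride=1, padding=0)

# A stride of 2 halves the size here.
halved = conv_out_size(224, kernel=3, stride=2, padding=1)

print(same, shrunk, halved)
```

Chaining this helper over the layers of a model gives the output size to expect for any input size, which can then be checked against what the breakpoint in forward() shows.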

It appears that fit checks data.empty_val to decide whether to run validation once per epoch. You could leave the validation set empty, or you can build your validation and test data as needed and feed them into the DataBunch.

Don’t put the fit inside the loop. Go back and look at the fast.ai lectures about fitting, and then use the fit_one_cycle method with the appropriate callbacks so that you save on best or save each cycle. If you do it in a loop, you lose all the history about momentum and gradients, which is very important for getting a good fit.
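
The "save on best" behaviour that a callback provides boils down to comparing the monitored metric against the best value so far and saving only on improvement. Below is a framework-free sketch of that bookkeeping. BestTracker and save_fn are hypothetical names, not part of fastai. In fastai v1 the built-in equivalent should be SaveModelCallback in fastai.callbacks, but check the docs for your version.

```python
from typing import Callable, List


class BestTracker:
    """Track the lowest metric seen so far and save only on improvement."""

    def __init__(self, save_fn: Callable[[int, float], None]):
        self.best = float("inf")
        self.save_fn = save_fn  # hypothetical hook, e.g. a wrapper around a model-saving call

    def update(self, epoch: int, metric: float) -> bool:
        """Record the metric for this epoch; return True if it was a new best."""
        is_best = metric < self.best
        if is_best:
            self.best = metric
            self.save_fn(epoch, metric)
        return is_best


# Made-up error values, one per epoch, just to exercise the tracker.
saved: List[int] = []
tracker = BestTracker(save_fn=lambda epoch, metric: saved.append(epoch))
for epoch, abs_error in enumerate([12.0, 9.5, 10.1, 8.7]):
    tracker.update(epoch, abs_error)

print(saved, tracker.best)
```

With this logic living in a callback, a single fit_one_cycle call can run all epochs, so the optimizer state and the learning-rate schedule are not reset between epochs.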


Thanks so much Bobak.

I will modify it as you mentioned.


@bfarzin In the custom dataset it seems that you are dropping the 4th channel. I have a dataset with 7 channels and I need all 7 of them. How will data augmentation work in that case, since it is only defined for 3-channel images in fast.ai? Or is there a way to pass a dataset with my own transforms into a fastai learner?

I don’t understand your question. Do you have a 7-channel image? Maybe if you post an example it would be clear what you are trying to do, and I could help out.

I have 7 channels as input to the CNN. The dataset is similar to the one used in the DSTL competition. For one output mask there are 7 channels as input. It is basically a segmentation problem.

@bfarzin Basically, my input is (7 * size * size) and I want to pass it to a UNET to get (n_classes * sz * sz) output masks. How do I pass this to fastai? With standard PyTorch I implemented basic transformations like horizontal flips and 90-degree rotations using numpy. Will the fastai transforms work for a 7-channel input? If not, how can I pass my own transforms to fastai?

This is outside the scope of my experience, really. I would suggest breaking out as a new topic in the forums and see if there is broader help. I suspect you can do all the transforms in fastai, but I have not spent a ton of time working on that part of the library.

I think you are discussing these two lines:

        #channel 4 is all 255, drop it
        label = cv2.resize(img[:,:,:3],(int(img.shape[1]/8),int(img.shape[0]/8)),interpolation = cv2.INTER_CUBIC)*64

Is that correct? Here I am just saving space since the images are only three channel (RGB.) I don’t see why you couldn’t have it be N channel but have never tried that. Did you just try to allow all channels? Does something break?

I think transforms are only defined for channels <= 3, so this isn’t working. I have my own dataset with transforms available. All I need to know is how to pass it to fastai. I guess I just need the one-cycle fit on a custom model and a custom dataset. Could you help me with that?

@bfarzin
Here is the function that is performing transforms

def applytransform(x,y,mode):
    if mode=="val":
      return x,y
    
    if random.random()>=0.5:
        return x,y
    else:
        
        if random.random()>=0.5:
            for i in range(len(x)):x[i]=np.fliplr(x[i])
            y=np.fliplr(y)
        if random.random()>=0.5:
            for i in range(len(x)):x[i]=np.flipud(x[i])
            y=np.flipud(y)
        if random.random()>=0.5:
            k=random.randint(0,3)
            for i in range(len(x)):x[i]=np.rot90(x[i],k)
            y=np.rot90(y,k)
    return x,y 

I am going through each channel and applying transformation to both image and mask
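
For a channels-first array, the per-channel loop can be replaced by flipping along the spatial axes directly. The sketch below assumes x has shape (C, H, W), y has shape (H, W), and the images are square so that a 90-degree rotation keeps the shape. The name apply_transform and the rng argument are additions for illustration only.

```python
import random

import numpy as np


def apply_transform(x, y, mode, rng=random):
    """Apply the same random flip/rotation to image x (C, H, W) and mask y (H, W)."""
    if mode == "val" or rng.random() >= 0.5:
        return x, y
    if rng.random() >= 0.5:  # horizontal flip: reverse the width axis
        x, y = np.flip(x, axis=2), np.flip(y, axis=1)
    if rng.random() >= 0.5:  # vertical flip: reverse the height axis
        x, y = np.flip(x, axis=1), np.flip(y, axis=0)
    if rng.random() >= 0.5:  # rotate by k * 90 degrees in the H-W plane
        k = rng.randint(0, 3)
        x, y = np.rot90(x, k, axes=(1, 2)), np.rot90(y, k)
    return x, y


# Toy data: 7 channels of 4x4 pixels; the mask copies channel 0 so alignment is easy to check.
x = np.arange(7 * 4 * 4, dtype=np.float32).reshape(7, 4, 4)
y = x[0].copy()

x_val, y_val = apply_transform(x, y, mode="val")
x_aug, y_aug = apply_transform(x, y, mode="train", rng=random.Random(0))
print(x_aug.shape, y_aug.shape)
```

The flipped and rotated results are views with negative strides, so the .copy() before torch.from_numpy is still needed.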

Here is one idea. You could extend the TensorDataset with your own __getitem__ that can pull your own transforms (from your own data) rather than having the data be static. From there, it should all flow, but I have not tried this myself. Let me know if that works for you.
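
A rough sketch of that idea, kept free of torch so it runs anywhere: a map-style dataset only needs __len__ and __getitem__, so each sample can be loaded and transformed on demand rather than held in memory. LazyPairDataset, load_fn, and transform are hypothetical names. In practice the class would subclass torch.utils.data.Dataset and load_fn would read the image and mask from disk.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


class LazyPairDataset:
    """Map-style dataset that loads one (input, target) pair per access."""

    def __init__(
        self,
        keys: Sequence[Any],
        load_fn: Callable[[Any], Tuple[Any, Any]],
        transform: Optional[Callable[[Any, Any], Tuple[Any, Any]]] = None,
    ):
        self.keys = keys            # e.g. file names; only these stay in memory
        self.load_fn = load_fn      # hypothetical loader, called once per access
        self.transform = transform  # optional joint transform for input and target

    def __len__(self) -> int:
        return len(self.keys)

    def __getitem__(self, i: int) -> Tuple[Any, Any]:
        x, y = self.load_fn(self.keys[i])
        if self.transform is not None:
            x, y = self.transform(x, y)
        return x, y


# Stand-in loader that records which keys were actually read.
loads: List[int] = []


def fake_load(key: int) -> Tuple[List[int], int]:
    loads.append(key)
    return [key, key + 1, key + 2], key % 2


ds = LazyPairDataset(keys=[10, 11, 12], load_fn=fake_load,
                     transform=lambda x, y: (x[::-1], y))
sample = ds[1]
print(len(ds), sample, loads)
```

Only the item that was indexed gets loaded, so memory use is bounded by the batch size rather than the dataset size.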

@bfarzin Okay, I will try that. But does using TensorDataset mean I must have all the tensors in memory first? I have already created something that returns the final tensors after transformation.

@bfarzin

class GetTransformedSet(Dataset):
    def __init__(self,root,img_list,mask_list,transforms=None,mode="train"):
        self.root=root
        self.img_list=img_list
        self.mask_list=mask_list
        self.transformation=transforms
        self.mode=mode
    def __len__(self):
        return len(self.img_list)
    def __getitem__(self,i):
        
        mask_path=os.path.join(os.getcwd(),self.root+path+self.mask_list[i])
        img,mask=open_multiple_channel_img(self.img_list[i],mask_path,self.mode)
        img = img.type('torch.FloatTensor')
        return img,mask    
        

def open_multiple_channel_img(file_name,mask_path,mode):
    file_name=file_name.split("_")
    file_name=file_name[0]+"_"+file_name[1]+"_"+file_name[2]+"_"+file_name[3]
    path=f'{im_input}/'+file_name
    images=[]
    mask=Image.open(mask_path)
    mask=np.array(mask)
    for i in range(1,8):
        im=Image.open(path+"_"+str(i)+".tif")
        im_array=np.array(im)/255
        im_array=im_array-mean_norm[i-1]
        #im_array=im_array/std_norm[i-1]
        
        images.append(im_array)
    
    images,masks=applytransform(np.stack(images, axis=0),mask,mode)
    return  torch.from_numpy(images.copy()),torch.from_numpy(masks.copy())

Hello @at98

Does this address your question?
Classifier with multiple images as input and multiple labels as output.

I am trying to use multiple channels as inputs to a pre-trained resnet model, so I came across your post and the other one above.

CC: @bfarzin

Thank you!

Hi guys,

I want to create my own data generator from a csv file for the segmentation task. The file contains two columns: 1st column contains the location of the image and the 2nd one contains the location of the mask. I have written the following code:

df = pd.read_csv('/path/to/csv/data.csv')

X = list(df['input_img'])
y = list(df['mask_img'])

X_train, X_valid, y_train, y_valid = train_test_split(
     X, y, test_size=0.33, random_state=42)

class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, img):
        img = img.transpose((2, 0, 1))
        # return {'image': torch.from_numpy(img),
                # }
        return torch.from_numpy(img)

class NumbersDataset(Dataset):
    def __init__(self, inputs, labels, transform=None):
        classes = [0,1]
        self.X = inputs
        self.y = labels
        self.transform = transform
        self.c = 2

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        img_train = cv2.imread(self.X[idx])
        img_mask = cv2.imread(self.y[idx])
        img_train = cv2.resize(img_train, (427,240), interpolation = cv2.INTER_LANCZOS4)
        img_mask = cv2.resize(img_mask, (427,240), interpolation = cv2.INTER_LANCZOS4)
        img_mask = cv2.cvtColor(img_mask, cv2.COLOR_BGR2GRAY)
        bin_mask = np.zeros_like(img_mask)
        bin_mask[(img_mask)>0]=1
        bin_mask = bin_mask.reshape(240, 427, 1)
        if self.transform:
            img_train = self.transform(img_train)
            bin_mask = self.transform(bin_mask)

        return img_train, bin_mask

if __name__ == '__main__':
    dataset_train = NumbersDataset(X_train, y_train, transforms.Compose([ToTensor()]))
    # dataset_train = NumbersDataset(X_train, y_train)
    dataloader_train = DataLoader(dataset_train, batch_size=4, shuffle=True)

    # dataset_valid = NumbersDataset(X_valid, y_valid)
    dataset_valid = NumbersDataset(X_valid, y_valid, transforms.Compose([ToTensor()]))
    dataloader_valid = DataLoader(dataset_valid, batch_size=4, shuffle=True)

    datas = DataBunch(train_dl = dataloader_train, valid_dl = dataloader_valid)
    # datas.show_batch()
    datas.c = 1
    learner = unet_learner(datas, models.resnet34)

I get the following error:

Traceback (most recent call last):
  File "dataset_test.py", line 104, in <module>
    learner = unet_learner(datas, models.resnet34)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/learner.py", line 121, in unet_learner
    bottle=bottle), data.device)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/core.py", line 66, in _init
    old_init(self, *args,**kwargs)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/models/unet.py", line 43, in __init__
    sfs_szs = model_sizes(encoder, size=imsize)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 113, in model_sizes
    x = dummy_eval(m, size)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 108, in dummy_eval
    return m.eval()(dummy_batch(m, size))
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 104, in dummy_batch
    return one_param(m).new(1, ch_in, *size).requires_grad_(False).uniform_(-1.,1.)
TypeError: new() argument after * must be an iterable, not builtin_function_or_method

What should I do?
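
Reading the traceback, size arrives in dummy_batch as a method rather than a tuple. One plausible cause (an assumption, not confirmed anywhere in this thread) is that unet_learner reads .size from the first training item expecting a fastai Image, where size is a property, while a plain torch.Tensor exposes size as a method. The sketch below reproduces that class of error in plain Python. TensorLike and dummy_shape are made-up stand-ins, not fastai or torch code.

```python
class TensorLike:
    """Stand-in for an object whose `size` is a method, like torch.Tensor."""

    def __init__(self, *shape):
        self.shape = tuple(shape)

    def size(self):
        return self.shape


def dummy_shape(ch_in, size):
    # Mirrors the `new(1, ch_in, *size)` call in the traceback: `size` must be iterable.
    return (1, ch_in, *size)


item = TensorLike(3, 240, 427)

try:
    dummy_shape(3, item.size)  # passes the method itself, not its result
    error = None
except TypeError as exc:
    error = str(exc)

ok = dummy_shape(3, item.shape[-2:])  # passing the actual (H, W) tuple works
print(error)
print(ok)
```

If that is what is happening, the dataset items would need to expose size as an (H, W) tuple, for example by wrapping the tensors in fastai's own item classes, before unet_learner can build its dummy batch.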

Hi, Bobak,

Thanks for sharing your work! I’ve been trying to create my custom training pipeline with the fastai library, and I find your notebook very useful. However, under the current version, fastai v1.0.60, your code doesn’t work anymore. We now have to import DataBunch from fastai.basic_data and Learner from fastai.basic_train. My problem is that after I fixed these imports and passed a custom model, a DataBunch, a loss function and an optimizer into Learner, calling learn.lr_find() raises AttributeError: 'Learner' object has no attribute 'lr_find'.
I’ve googled a lot and can’t find the solution. Can you help me?

I have not yet caught up on the latest version of the library. I don’t think that the missing lr_find() is related to the custom pipeline, but I am just not sure.
If you do find a solution, please post a link here so others can find in the future. Good luck!

I didn’t find the solution. I created my own lr_find with PyTorch’s torch.optim.lr_scheduler.LambdaLR. Normally, creating a fastai learner with Learner should provide the lr_find and fit_one_cycle attributes automatically, as the fastai docs say. However, even when I pass the lr_find callback explicitly into callback_fns when creating my learn object using learn = Learner(data, model, opt_func=optim.SGD, loss_func=loss_func, callback_fns=[lr_finder]), lr_find does not appear in my learn's attribute list.
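
For anyone taking the same route, a learning-rate range test usually grows the rate exponentially from a small value to a large one over a fixed number of iterations. Here is a minimal sketch of the multiplier function. lr_range_multiplier and its parameters are illustrative names, not the code used in this thread. LambdaLR multiplies the optimizer's initial learning rate by whatever lr_lambda returns, so the sketch assumes the optimizer was created with lr=start_lr.

```python
from typing import Callable


def lr_range_multiplier(start_lr: float, end_lr: float, num_iters: int) -> Callable[[int], float]:
    """Return f(i) such that start_lr * f(i) grows exponentially from start_lr to end_lr."""
    if num_iters < 2:
        raise ValueError("num_iters must be at least 2")
    ratio = end_lr / start_lr

    def multiplier(i: int) -> float:
        # i = 0 gives 1.0 (start_lr); i = num_iters - 1 gives ratio (end_lr).
        return ratio ** (i / (num_iters - 1))

    return multiplier


start_lr, end_lr, num_iters = 1e-7, 10.0, 100
mult = lr_range_multiplier(start_lr, end_lr, num_iters)
lrs = [start_lr * mult(i) for i in range(num_iters)]
print(lrs[0], lrs[-1])
```

Passing mult as lr_lambda to LambdaLR, stepping the scheduler once per batch, and recording the loss at each step gives the usual loss-versus-learning-rate curve.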