Create DataBunch from PyTorch DataLoader

Thanks sgugger, I will change to use the fastai datasets.

The fastai docs say we can use torch.utils.data.DataLoader or torch.utils.data.Dataset when constructing a DataBunch, but nowhere do they show how…


I am getting an error:

samples = collate_fn([dataset[i] for i in batch_indices])
TypeError: 'DataLoader' object does not support indexing

What am I doing wrong?
Why am I not able to create and train a DataBunch from PyTorch DataLoaders?

Hey,
Can you please help me with this error message?

As indicated by the docs, DataBunch.create takes datasets. It's the regular init that takes DataLoaders.
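
A minimal sketch of the two options (assuming fastai v1, with train_ds and valid_ds standing in for your own torch Datasets):

from fastai.basic_data import DataBunch
from torch.utils.data import DataLoader

# Option 1: pass Datasets and let DataBunch.create build the DataLoaders for you
data = DataBunch.create(train_ds, valid_ds, bs=64)

# Option 2: build the DataLoaders yourself and pass them to the regular init
train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)
valid_dl = DataLoader(valid_ds, batch_size=64)
data = DataBunch(train_dl, valid_dl)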

Oh okay.
This solved it.

Thank You.

Hey,
This solved the problem, but I am not getting the expected results.
Can you please take a look at my problem?
The link is given below.

It’s hard to say why a model doesn’t want to train. Did you try a higher learning rate?

Yeah, I did.
When I run the same model with Keras, it trains perfectly.

But I want to use fastai now.

You should check the initialization. There is a bug in the default initialization of PyTorch for conv layers, which might be the difference from Keras.
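
One way to do that re-initialization yourself is a small helper like the one below (a sketch, not fastai's exact code; init_cnn is a hypothetical name and model is assumed to be your network):

import torch.nn as nn

def init_cnn(model):
    # replace PyTorch's default init for conv and linear layers
    # with Kaiming (He) initialization
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

init_cnn(model)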

Hey,
I tried that and the accuracy is still 14%.
It is weird: no matter what validation loss I get, the accuracy is ~14%.

I have had val_loss=9.5 with accuracy ~14%, and also val_loss=1.93 with the accuracy still ~14%.

As you can see, all my predictions are exactly the same.

Hey,
So I searched for how to check the gradients of the different layers in the model.
It turns out all my gradients are zero.

Can you tell me a possible reason for this, or a solution?
Below is the code that I am using to initialize my weights now.

If you don’t get gradients, that’s the whole reason your model doesn’t train. How did you check them? Note that they are zeroed in the training loop after each step, so just looking after a fit of 1 epoch doesn’t mean they were all zeros.

You should manually check with

model.train()                     # put the model in training mode
x, y = next(iter(data.train_dl))  # grab a single batch from the training DataLoader
z = model(x)                      # forward pass
loss = criterion(z, y)            # compute the loss
loss.backward()                   # backward pass fills in the .grad attributes

and check whether you then see gradients, for instance in model.layer1.weight.grad.
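
For example, to look over all the gradients at once after the backward call above (a sketch using the same model as in the snippet):

for name, p in model.named_parameters():
    if p.grad is None:
        print(name, 'has no gradient')
    else:
        print(name, 'mean abs grad:', p.grad.abs().mean().item())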


Hey,
So when I try this, my gradients are not zero.


So I tried to go deeper into the problem, and as you can see in the snippet below, my parameters before and after the update step are not the same, i.e. they do get updated.

I am not sure if I am on the right track, but everything seems fine.

Have you tried creating a DataBunch from a PyTorch Dataset using DataBunch.create()?

Hi. I don't know if this is still applicable, but I want to ask how to train on my custom dataset. The thing is, I have images stored as .npz files since the images have negative values, so I need to load them through numpy before using the CNN. Hence, I have created my own data generator (as shown below):

import numpy as np
import cv2
from torch.utils.data import Dataset

class NumbersDataset(Dataset):
    def __init__(self, inputs, labels):
        # lists of .npz file paths for the input images and their masks
        self.X = inputs
        self.y = labels

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        # each .npz file stores its array under the key 'x'
        tmp = np.load(self.X[idx])
        img_train = tmp['x']
        tmp = np.load(self.y[idx])
        img_mask = tmp['x']
        # resize image and mask to the network input size
        img_train = cv2.resize(img_train, (224, 224), interpolation=cv2.INTER_LANCZOS4)
        img_mask = cv2.resize(img_mask, (224, 224), interpolation=cv2.INTER_LANCZOS4)
        return img_train, img_mask

I create DataLoaders and then a DataBunch for fastai to load into the UNet like this:

datas = DataBunch(train_dl = dataloader_train, valid_dl = dataloader_valid)
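
For reference, the two DataLoaders passed in above could be built from the dataset class roughly like this (a sketch; the path lists train_x, train_y, valid_x, valid_y and the batch size are assumptions):

from torch.utils.data import DataLoader

dataset_train = NumbersDataset(train_x, train_y)
dataset_valid = NumbersDataset(valid_x, valid_y)

dataloader_train = DataLoader(dataset_train, batch_size=8, shuffle=True)
dataloader_valid = DataLoader(dataset_valid, batch_size=8)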

I want to train a ResNet-based UNet from scratch, and for that I used the following code:

learner = unet_learner(data=datas, arch=models.resnet34, pretrained=False)

But I get the following error:

AttributeError: 'NumbersDataset' object has no attribute 'c'

which I figured out is for the number of classes (basically for classification). But I want to use the model for regression. How do I go about it then?

Just put data.c = the number of channels of the final layer of the unet.


Hi,

Thank you very much for the reply. The last layer is a convolution layer. I want to train the network to recreate the input image. Hence, it's not a classification but a regression problem.

What do you suggest I should do in this case?

Like I said, data.c = the number of channels of the final layer of the unet. If you want an image, it’s probably 3 channels.


Hello,

Thanks. I apologize for the confusion on my side. Instead of channels, I understood it as the number of classes.

However, can you tell me what the data object is? Where should I set data.c?

It’d be datas.c before you call unet_learner

So datas.c = 3
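
Putting it together (a sketch, reusing the names from the snippets above): set c on the DataBunch before building the learner.

datas.c = 3  # number of output channels of the final layer of the unet
learner = unet_learner(data=datas, arch=models.resnet34, pretrained=False)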
