I am trying to use it in Lesson 14’s Carvana example.
When I run learn.lr_find(), it gives this error message:
ValueError: Target size (torch.Size([8, 256, 256])) must be the same as input size (torch.Size([8, 3, 256, 256]))
This is quite odd. The target is the ground truth, so it's a black-and-white image with just one channel, while the input is an RGB image. This is how Unet34 from Lesson 14 works as well.
I tried to follow your code from GitHub, and the only thing I didn't copy was the loss function (which is exactly where the error happens). My loss is a simple BCEWithLogitsLoss.
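To make the mismatch concrete, here is a tiny standalone check (just a sketch with plain PyTorch, not code from the notebook): nn.BCEWithLogitsLoss needs the prediction and the target to have exactly the same shape, so a 3-channel output against a single-channel mask fails the way shown above.

import torch
import torch.nn as nn

crit = nn.BCEWithLogitsLoss()
mask = torch.rand(8, 256, 256)        # ground-truth mask: no channel dimension

bad = torch.randn(8, 3, 256, 256)     # 3-channel prediction
# crit(bad, mask)                     # raises the ValueError quoted above

good = torch.randn(8, 256, 256)       # prediction with the channel dim removed
loss = crit(good, mask)               # works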
I have replied to your previous problem in a DM; you can check it out.
For this problem, can you try to see where exactly your code fails using the pdb package? It's mentioned widely throughout the course. A better approach to seeking help on the forums would be to share your notebook.
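For example, something along these lines (just a rough sketch; any of the usual pdb entry points will do):

import pdb

pdb.set_trace()    # break here, then step into the call and inspect tensor shapes
learn.lr_find()

# or, right after the ValueError is raised in a notebook cell:
pdb.pm()           # post-mortem debugging (the %debug magic does the same in Jupyter)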
This looks great. I tried it out and am getting the following when I use DynamicUnet.
Input type (CUDAFloatTensor) and weight type (CPUFloatTensor) should be the same
I get it even when running the notebook in the OP; in my notebook I get it as soon as I do learn.summary().
Your new code chunk should look something like this.
f = vgg16
cut, cut_lr = model_meta[f]
encoder = get_encoder(f, 30)          # pretrained vgg16 body used as the encoder
m = DynamicUnet(encoder)
# initialize the upsampling path on the cpu with a dummy forward pass
inp = torch.ones(1, 3, 256, 256)
out = m(V(inp).cpu())
# move the model to the gpu if desired
m = m.cuda(0)
models = UpsampleModel(m, cut_lr=20)
learn = ConvLearner(md, models)
learn.opt_fn = optim.Adam
learn.crit = nn.BCEWithLogitsLoss()
learn.metrics = [accuracy_thresh(0.5), dice, soft_jaccard]
I am running into the exact problem you mentioned above.
ValueError: Target size (torch.Size([64, 128, 128])) must be the same as input size (torch.Size([64, 3, 128, 128]))
I am following the exact steps, but the error persists. I am using BCEWithLogitsLoss() with a ResNet, and I get the same error if I switch to vgg16. When I run on the CPU only, I get an error about tensors being on different GPUs.
Any hints on how you managed to resolve this? Thanks.
Solution: I managed to fix this by modifying the channel outputs to match the masks (grayscale). Very neat implementation, Kerem! I didn't know you already had a parameter for the n_classes output. Cheers!
f = resnet152
cut, lr_cut = model_meta[f]
encoder = get_encoder(f, cut)         # pretrained resnet152 body used as the encoder
encoder = encoder.cuda(1)
m = DynamicUnet(encoder).cuda(1)
models = UnetModel(m)
learn = ConvLearner(md, models)       # md is my model data
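With single-channel (grayscale) masks, the extra change is just on the DynamicUnet line, assuming the implementation exposes the n_classes argument mentioned above:

m = DynamicUnet(encoder, n_classes=1).cuda(1)   # single-channel mask output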
I ran into the same issue. Passing n_classes=1 got the right output, but it was delivered as a [64, 1, 128, 128] tensor rather than the [64, 128, 128] tensor the loss expected. I fixed this by adding
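(The exact line isn't quoted above, but purely as an illustration, one way to reconcile a [64, 1, 128, 128] output with [64, 128, 128] targets is a small loss wrapper that squeezes out the singleton channel dimension; the class name here is made up.)

import torch.nn as nn

class SqueezedBCE(nn.Module):
    # hypothetical wrapper: drop the channel dim before BCEWithLogitsLoss
    def __init__(self):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
    def forward(self, input, target):
        return self.bce(input.squeeze(dim=1), target)

learn.crit = SqueezedBCE()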
I’ve been playing around with a few different kinds of Unets and I’m having issues using progressive resizing. I’m training on 256 x 256 images, then 512 x 512.
When I use a resnet34 backbone, I can get a dice score of 0.81 on 256 x 256 and 0.84 on 512 x 512.
When I use a VGG16 backbone, I get a dice score of 0.9 on 256 x 256, which is a great improvement over resnet34. But when I scale up to 512 x 512, the model falls apart: BCE loss jumps by about 10x, the dice score drops to 0.71, and the model doesn't seem to train; it just oscillates. The behavior is the same over a range of learning rates.
There have been a lot of requests around this issue. For anyone interested in how to get the dynamic unet up and running on a real example, I've created a repo for an ongoing Kaggle competition. You may check it out.
It essentially demonstrates:
How to create a custom dataset that inherits from Fastai's BaseDataset (a rough sketch of the idea follows after this list)
How to get any custom model, in this case resnet18 (pretrained) to become a unet model
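The dataset part boils down to implementing BaseDataset's abstract methods. A stripped-down sketch (fastai v0.7 API from memory; the names are illustrative and this is not the exact code in the repo):

from fastai.dataset import BaseDataset, open_image

class SegDataset(BaseDataset):
    # minimal (image, mask) dataset; details are illustrative only
    def __init__(self, x_fnames, y_fnames, transform):
        self.x_fnames, self.y_fnames = x_fnames, y_fnames
        super().__init__(transform)
    def get_n(self): return len(self.x_fnames)      # number of samples
    def get_sz(self): return self.transform.sz      # image size used by the transforms
    def get_c(self): return 0                       # 0 -> segmentation, no class count
    def get_x(self, i): return open_image(self.x_fnames[i])
    def get_y(self, i): return open_image(self.y_fnames[i])
    @property
    def is_multi(self): return False
    @property
    def is_reg(self): return False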
I’ve been trying to use your U-Net implementation with my own dataset and I am facing some problems, hope someone can help!
My dataset is composed of a train and a test set. Inside the training set folder I have two other folders, one with the original images and the other with the segmentation ground truth (an original and its mask share the same file name). The same is true for the test set folder.
The problem is that I cannot find a way to load the data and feed it to the model with the fastai library.
Thanks
EDIT: Solved it; my problem was that I was not passing the mask images.
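In case it helps anyone else, the wiring ended up looking roughly like this (a sketch following the Lesson 14 Carvana pattern with the fastai v0.7 API; the folder names are just my layout, and MatchedFilesDataset comes from that notebook rather than the library):

import os
from glob import glob
import numpy as np
from torchvision.models import resnet34
from fastai.dataset import FilesDataset, ImageData, open_image, split_by_idx
from fastai.transforms import tfms_from_model, CropType, TfmType

class MatchedFilesDataset(FilesDataset):
    # pairs each image with its same-named mask (as in the Lesson 14 notebook)
    def __init__(self, fnames, y, transform, path):
        self.y = y
        assert len(fnames) == len(y)
        super().__init__(fnames, transform, path)
    def get_y(self, i): return open_image(os.path.join(self.path, self.y[i]))
    def get_c(self): return 0

PATH = 'data/mydataset/'                                   # hypothetical layout
x_names = np.array(sorted(glob(f'{PATH}train/images/*')))
y_names = np.array(sorted(glob(f'{PATH}train/masks/*')))   # the masks I was forgetting to pass

val_idxs = list(range(len(x_names) // 5))                  # simple hold-out split
((val_x, trn_x), (val_y, trn_y)) = split_by_idx(val_idxs, x_names, y_names)

tfms = tfms_from_model(resnet34, sz=256, crop_type=CropType.NO, tfm_y=TfmType.CLASS)
datasets = ImageData.get_ds(MatchedFilesDataset, (trn_x, trn_y), (val_x, val_y), tfms, path=PATH)
md = ImageData(PATH, datasets, bs=8, num_workers=4, classes=None)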