Nice, this error I know! There is a mismatch between the size of the input's class dimension (which should be 3 here) and the maximum value of the target: for cross-entropy, the target's max must be at most num_classes - 1. Can you check input.shape and target.max (and target.shape as a sanity check)?
Absolutely. So, in the case dim == 2, we have:
Input shape: torch.Size([100352, 3])
Target shape: torch.Size([100352])
Target max: 3
Should the input shape be transposed? Are tensors used to initialize the Image class supposed to have a shape of [H, W, C]?
No, they are supposed to be in [C, H, W] format, which seems to be what you are doing here. But somehow the target's max is 3, while it's supposed to be 2. Do you have 3 classes including background, or 3 classes plus background? If it's the latter, you should have self.c = 4 in your dataset class.
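For reference, if a tensor does come in as [H, W, C], plain PyTorch can rearrange it; a minimal sketch (the variable names are illustrative):

import torch

img_hwc = torch.rand(224, 224, 3)                # an illustrative [H, W, C] tensor
img_chw = img_hwc.permute(2, 0, 1).contiguous()  # -> [C, H, W]
print(img_chw.shape)                             # torch.Size([3, 224, 224])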
Yes, indeed, I am passing the images in the usual PyTorch style, [C, H, W].
HA! I think that was it! I tried with 4 classes and it seems to have worked its way through the pipeline without raising any errors.
So to recap:
- Pass the loss function directly to the learner
- Ensure the number of classes is correct (a quick sketch follows)
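A minimal sketch of both fixes together, assuming a fastai v1-style segmentation setup (the codes list is illustrative):

codes = ['background', 'class_a', 'class_b', 'class_c']  # 3 classes + background -> c = 4

learn = unet_learner(data, models.resnet18, metrics=dice,
                     loss_func=CrossEntropyFlat(axis=1))  # pass the loss function directly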
Thanks a lot, my friend, you really saved my fastai membership. I might have drifted back to PyTorch without your help!
Next stop: a working segmentation model !
Nice! No problem mate, always happy to help. If you plan on continuing to use fastai, you should still at some point dig a bit deeper into the DataBlock API. It feels pretty unnatural at first when you are used to classic PyTorch, but it soon gets very handy and integrates better with the rest of fastai (for instance, you can pass the loss function to the dataset). Anyway, good luck with your future work with fastai!
That solved the issue for me on the error:
“only batches of spatial targets supported (3D tensors) but got targets of dimension: 4”
The code:
learn = unet_learner(data, models.resnet18, metrics=dice, loss_func=CrossEntropyFlat(axis = 1))
I had thought that unet_learner handled this out of the box.
Hi All,
I'm participating in the Kaggle pulmonary embolism competition and am facing a challenge in writing the test pipeline: Kaggle requires a separate inference notebook for submission that has access to neither the training data nor an internet connection.
The idea is to train a model, save it, load it in a different file, and run the inference.
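A minimal sketch of that workflow, assuming fastai v2's export/load_learner API (file names are illustrative):

# in the training notebook
learn.export('pe_model.pkl')            # serializes the model together with its transforms

# in the separate, offline inference notebook
learn = load_learner('pe_model.pkl')
test_dl = learn.dls.test_dl(test_df)    # test_df: dataframe with filenames only
preds, _ = learn.get_preds(dl=test_dl)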
Following is my training pipeline, where df is a dataframe holding the filename components and the (encoded) target variables:
get_x = lambda x: f'{source}/train/{x.StudyInstanceUID}/{x.SeriesInstanceUID}/{x.SOPInstanceUID}.dcm'  # build the DICOM file path from a dataframe row
vocab = ['pe_present_on_image', 'negative_exam_for_pe', 'indeterminate',
'rv_lv_ratio_gte_1', 'rv_lv_ratio_lt_1', # Only one label should be true at a time
'chronic_pe', 'acute_and_chronic_pe', # Only one label can be true at a time
'leftsided_pe', 'central_pe', 'rightsided_pe', # More than one label can be true at a time
'qa_motion', 'qa_contrast', 'flow_artifact', 'true_filling_defect_not_pe'] # These are only informational. Maybe use it for study level inferences
get_y = ColReader(vocab)
block = DataBlock(blocks=(ImageBlock(cls=PILDicom), MultiCategoryBlock(vocab=vocab, encoded=True)),
get_x=get_x,
get_y=get_y,
batch_tfms=aug_transforms(size=224))
dls = block.dataloaders(df, bs=8, num_workers=0)
head = create_head(nf=1024, n_out=14, lin_ftrs=[256, 64], concat_pool=True)
config = cnn_config(custom_head=head)
learn = cnn_learner(dls, resnet34, config=config)
learn.fit_one_cycle(3, lr_max=0.05)
learn.save(file='resnet34_10epochs')  # saves the model/optimizer state only, not the dataloaders or transforms
Now for the inference pipeline. In this case, df is a dataframe with filenames only, no target variable columns. Also, loss_func in the training learner was automatically configured as BCEWithLogitsLoss()… so I used the same here as well.
get_x = lambda x:f'{source}/test/{x.StudyInstanceUID}/{x.SeriesInstanceUID}/{x.SOPInstanceUID}.dcm'
vocab = ['pe_present_on_image', 'negative_exam_for_pe', 'indeterminate',
'rv_lv_ratio_gte_1', 'rv_lv_ratio_lt_1', # Only one label should be true at a time
'chronic_pe', 'acute_and_chronic_pe', # Only one label can be true at a time
'leftsided_pe', 'central_pe', 'rightsided_pe', # More than one label can be true at a time
'qa_motion', 'qa_contrast', 'flow_artifact', 'true_filling_defect_not_pe'] # These are only informational. Maybe use it for study level inferences
block = DataBlock(blocks=(ImageBlock(cls=PILDicom)), # , MultiCategoryBlock(vocab=vocab, encoded=True)
get_x=get_x,
batch_tfms=aug_transforms(size=224))
dls = block.dataloaders(df[:1000], bs=64, num_workers=0)
head = create_head(nf=1024, n_out=14, lin_ftrs=[256, 64], concat_pool=True)
config = cnn_config(custom_head=head)
learn = cnn_learner(dls, resnet34, config=config, n_out=14, pretrained=False, loss_func=nn.BCEWithLogitsLoss())
test_data = dls.test_dl(df)
preds = learn.get_preds(dl=test_data)
On get_preds, I'm getting the following error:
RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[64, 1, 224, 224] to have 3 channels, but got 1 channels instead
I can interpret the issue: it is related to having only 1 channel instead of 3. But this DataBlock and dataloader are exactly like the ones in the training process. How come this issue did not come up during training but does come up during inference?
Need some help urgently!
Thanks in anticipation.
Adding to the above post:
If it is what it is and I'm not doing anything incorrectly, then I suppose I'll have to convert the 1-channel input to 3 channels somewhere in the process.
From looking at the DataBlock summary (shared below), it seems that this needs to be done at the batch_tfms level. But I'm clueless about how to do that: I could not find any transform in the list that modifies the structure of the tensor (one possible approach is sketched right after the summary).
Would be great if someone could help!
Setting-up type transforms pipelines
Collecting items from StudyInstanceUID SeriesInstanceUID SOPInstanceUID
2668 16997c0ce2d9 1dab76457905 4f43708a0732
2669 16997c0ce2d9 1dab76457905 adaaeef49da2
2670 16997c0ce2d9 1dab76457905 f1ab8e058c53
2671 16997c0ce2d9 1dab76457905 fa5228210e24
2672 16997c0ce2d9 1dab76457905 28f6a41eb73f
... ... ... ...
4958 018dbbbea69e 7c5b3db841d6 4f6f9b32ee00
4959 018dbbbea69e 7c5b3db841d6 f06ca93806ef
4960 018dbbbea69e 7c5b3db841d6 a39e84b22dcf
4961 018dbbbea69e 7c5b3db841d6 79c0b89bb9ce
4962 018dbbbea69e 7c5b3db841d6 d80591f64848
[306 rows x 3 columns]
Found 306 items
2 datasets of sizes 245,61
Setting up Pipeline: <lambda> -> PILDicom.create
Building one sample
Pipeline: <lambda> -> PILDicom.create
starting from
StudyInstanceUID 16997c0ce2d9
SeriesInstanceUID 1dab76457905
SOPInstanceUID be10058789c1
Name: 2696, dtype: object
applying <lambda> gives
../input/rsna-str-pulmonary-embolism-detection/test/16997c0ce2d9/1dab76457905/be10058789c1.dcm
applying PILDicom.create gives
PILDicom mode=I;16 size=512x512
Final sample: (PILDicom mode=I;16 size=512x512,)
Setting up after_item: Pipeline: ToTensor
Setting up before_batch: Pipeline:
Setting up after_batch: Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} -> Flip -- {'size': 224, 'mode': 'bilinear', 'pad_mode': 'reflection', 'mode_mask': 'nearest', 'align_corners': True, 'p': 0.5} -> Brightness -- {'max_lighting': 0.2, 'p': 1.0, 'draw': None, 'batch': False}
Building one batch
Applying item_tfms to the first sample:
Pipeline: ToTensor
starting from
(PILDicom mode=I;16 size=512x512)
applying ToTensor gives
(TensorDicom of size 1x512x512)
Adding the next 3 samples
No before_batch transform to apply
Collating items in a batch
Applying batch_tfms to the batch built
Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} -> Flip -- {'size': 224, 'mode': 'bilinear', 'pad_mode': 'reflection', 'mode_mask': 'nearest', 'align_corners': True, 'p': 0.5} -> Brightness -- {'max_lighting': 0.2, 'p': 1.0, 'draw': None, 'batch': False}
starting from
(TensorDicom of size 4x1x512x512)
applying IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} gives
(TensorDicom of size 4x1x512x512)
applying Flip -- {'size': 224, 'mode': 'bilinear', 'pad_mode': 'reflection', 'mode_mask': 'nearest', 'align_corners': True, 'p': 0.5} gives
(TensorDicom of size 4x1x224x224)
applying Brightness -- {'max_lighting': 0.2, 'p': 1.0, 'draw': None, 'batch': False} gives
(TensorDicom of size 4x1x224x224)
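A minimal sketch of one way to do the conversion at the batch_tfms level, assuming fastai v2 (OneToThreeChannels is an illustrative name, not a built-in fastai transform):

class OneToThreeChannels(Transform):
    "Repeat the single DICOM channel so the batch becomes [bs, 3, H, W]."
    def encodes(self, x: TensorDicom):
        return x.expand(-1, 3, -1, -1) if x.ndim == 4 else x.expand(3, -1, -1)

block = DataBlock(blocks=(ImageBlock(cls=PILDicom)),
                  get_x=get_x,
                  batch_tfms=[*aug_transforms(size=224), OneToThreeChannels()])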
Hi @imnishantg
That's an inconsistency in the batch shape.
When fastai downloads a pretrained model, it expects 3 channels (RGB) as input. Besides that, it adds an after_batch Normalize transformation with the ImageNet stats to the dataloader.
That's not true when we load a saved model or create one with pretrained=False (as you did in the inference).
The dataloader with the after_batch normalization outputs 3 channels (for example, [bs, 3, 224, 224]), and that fits the model. Your original dataloader outputs just 1 channel.
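A minimal sketch of one way to mirror that at inference time, assuming fastai v2 (note that broadcasting the 3-channel ImageNet stats against a 1-channel batch also expands it to 3 channels):

block = DataBlock(blocks=(ImageBlock(cls=PILDicom)),
                  get_x=get_x,
                  batch_tfms=[*aug_transforms(size=224),
                              Normalize.from_stats(*imagenet_stats)])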
I wrote a detailed answer with an example in the Kaggle notebook's comments:
https://www.kaggle.com/cordmaur/fastai2-medical-simple-training/comments
Hope that helps.
Thank you @cordmaur, the solution you detailed in the Kaggle link helped me after a couple of days of looking for a way to solve this.