I am using the fastai library to segment images with multiple classes on a personal dataset. Consequently, I have multiple masks per image. What is the best way to proceed? Do I consolidate the masks into a single image where the background is 0 and each subsequent class is assigned an integer (1, 2, 3, etc.)? Or do I extend the SegmentationDataset class to become more like ImageMultiDataset so that it can handle multiple masks? All the v1 segmentation examples I can find use single-class datasets. Thank you in advance!
The first solution is best if the masks don’t overlap; the camvid example covers a segmentation problem with multiple classes.
If one pixel can belong to several classes though, you should have multiple channels in your mask.
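For readers wondering what that consolidation looks like in practice, here is a rough sketch for a single image (the file names and class order are invented, not from the thread):

import numpy as np
from PIL import Image

# hypothetical per-class binary masks for one image, in a fixed class order
mask_files = ["car.png", "road.png", "sky.png"]
combined = None
for class_idx, fname in enumerate(mask_files, start=1):
    m = np.array(Image.open(fname).convert("L")) > 0      # boolean mask for this class
    if combined is None:
        combined = np.zeros(m.shape, dtype=np.uint8)      # 0 = background
    combined[m] = class_idx                               # 1, 2, 3, ...
Image.fromarray(combined).save("combined_mask.png")       # one integer-coded mask

If the masks can overlap, you would instead stack them along a channel dimension rather than overwriting pixels, as described above.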
Thank you for your prompt reply! For people who encounter a similar issue, the camvid lesson is linked below.
I chose to focus on implementing a model that can segment one feature first, before adding additional channels to handle overlapping features. When I call functions like show_batch() and normalize() on my DataBunch object, I get an error along the lines of AttributeError: 'SegmentationDatasetOneFeature' object has no attribute 'normalize'.
Why are the functions being called on the extended Dataset instead of the DataBunch class?
class SegmentationDatasetOneFeature(SegmentationDataset):
    def _get_y(self, i):
        # check whether a mask exists for this item
        image_path = glob.glob(self.y[i] + "/abc.*")
        if len(image_path) == 0:
            # if there is no abc mask, return an empty (all-background) one,
            # shaped 1 x sz x sz to match what open_mask returns
            tensor = torch.zeros((1, sz, sz), dtype=torch.int)
            return ImageSegment(tensor)
        else:
            return open_mask(image_path[0])
data = DataBunch.create(train_ds=train_ds, valid_ds=valid_ds, test_ds=None,
bs=bs, num_workers=nw, tfms=tfms)
data = data.normalize(imagenet_stats)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-30-f381a0614dd5> in <module>
----> 1 data = data.normalize(imagenet_stats)
~/anaconda3/envs/fastai-1.0/lib/python3.7/site-packages/fastai/basic_data.py in __getattr__(self, k)
114 return cls(*dls, path=path, device=device, tfms=tfms, collate_fn=collate_fn)
115
--> 116 def __getattr__(self,k:int)->Any: return getattr(self.train_dl, k)
117 def holdout(self, is_test:bool=False)->DeviceDataLoader:
118 "Returns correct holdout `Dataset` for test vs validation (`is_test`)."
~/anaconda3/envs/fastai-1.0/lib/python3.7/site-packages/fastai/basic_data.py in __getattr__(self, k)
49
50 def __len__(self)->int: return len(self.dl)
---> 51 def __getattr__(self,k:str)->Any: return getattr(self.dl, k)
52
53 @property
~/anaconda3/envs/fastai-1.0/lib/python3.7/site-packages/fastai/basic_data.py in DataLoader___getattr__(dl, k)
34 super().__init__(len(classes))
35
---> 36 def DataLoader___getattr__(dl, k:str)->Any: return getattr(dl.dataset, k)
37 DataLoader.__getattr__ = DataLoader___getattr__
38
AttributeError: 'SegmentationDatasetOneFeature' object has no attribute 'normalize'
That’s because DataBunch doesn’t have a normalize function (so it’s trying to find it in its train_ds). You have to use an ImageDataBunch for this.
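In other words, the fix is just to build the data with the image-specific class; a minimal sketch using the same datasets as above (assuming the fastai v1 version shown in the traceback):

from fastai.vision import *

data = ImageDataBunch.create(train_ds=train_ds, valid_ds=valid_ds,
                             bs=bs, num_workers=nw, tfms=tfms)
data = data.normalize(imagenet_stats)   # normalize is defined on ImageDataBunch, not the generic DataBunch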
@Jamie which tool did you use to create the masks for the raw images?
Can you clarify what you mean by ‘raw images?’
Raw images are the input features without any pre-processing.
In case you are looking for an annotation tool, try Labelme.
For a dataset that has multiple masks for a single image, is it also possible (and is it recommended?) to create multiple training instances of that single image, each paired with one of its masks, rather than combining all of the masks into one single mask where there are no overlaps? I guess this would increase training time, but would it improve or degrade accuracy?
Also, to add to the above: is there a way in fastai to combine these masks into one image in some sort of preprocessing step?
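Not that I know of a built-in preprocessing step for this, but it can be done once offline before building the DataBunch, with a loop along the lines of the consolidation sketch earlier in the thread (the paths and the combine_masks helper here are hypothetical):

from pathlib import Path

img_dir, mask_dir, out_dir = Path("images"), Path("masks"), Path("labels")  # made-up layout
out_dir.mkdir(exist_ok=True)
for img_path in img_dir.glob("*.png"):
    per_class = sorted((mask_dir / img_path.stem).glob("*.png"))  # one binary mask per class
    combined = combine_masks(per_class)   # hypothetical helper, e.g. the integer-coding loop shown above
    combined.save(out_dir / img_path.name)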
How do you recommend I proceed if we have masks with multiple channels (due to overlapping)? I currently have outputs as CHW tensors, but those don’t seem to work nicely with PIL if I use ImageSegment. Should I have multiple ImageSegment labels (1 per channel) for each input? Thank you.
You should use an ImageSegment with CHW channels. That probably requires a custom open method, so you will have to subclass ImageSegment, but then the transforms should be applied properly to the targets.
Does that also mean I have to create custom classes for SegmentationLabelList and SegmentationItemList if I need a CHW ImageSegment?
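For anyone attempting this later, here is a rough sketch of how the pieces could fit together. Note this is not from the thread: in practice the custom open tends to live on a SegmentationLabelList subclass rather than on ImageSegment itself, and the folder layout assumed below is invented and untested:

from fastai.vision import *

class MultiChannelSegmentLabelList(SegmentationLabelList):
    "Stacks several binary masks into a single CHW ImageSegment."
    def open(self, fn):
        # fn is assumed to point to a folder holding one binary mask per class
        paths = sorted(Path(fn).glob("*.png"))
        chans = [open_mask(p, div=True).px for p in paths]   # each mask is 1xHxW
        return ImageSegment(torch.cat(chans, dim=0))          # result is CxHxW

class MultiChannelSegmentItemList(SegmentationItemList):
    _label_cls = MultiChannelSegmentLabelList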
I have a multiclass image segmentation problem where an individual pixel can belong to any of 4 possible classes. This ends up being convenient, as I can actually store all the mask info within an RGBA .png.
I have created a Unet Learner and dataset in which:
- A batch (xb) is of size [2, 3, 350, 525]
- The labels are of size [2, 4, 350, 525] (1 for a positive match, 0 for background)
- The output of the model is of size [2, 4, 350, 525]
I am having problems training my model, and it appears to fail while calculating the loss for a batch. FlattenedLoss.__call__ is defined below:
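(Reproduced roughly from fastai v1's layers.py for context; exact line numbers vary between versions.)

def __call__(self, input, target, **kwargs):
    input = input.transpose(self.axis, -1).contiguous()
    target = target.transpose(self.axis, -1).contiguous()
    if self.floatify: target = target.float()
    input = input.view(-1, input.shape[-1]) if self.is_2d else input.view(-1)
    return self.func.__call__(input, target.view(-1), **kwargs)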
Line 242 checks whether or not we are working with 2D input (we are) and resizes the input from the original [2, 4, 350, 525] to [367500, 4]. However, on line 243 it resizes target from [2, 4, 350, 525] to [1470000,]. This triggers an exception within Torch due to the mismatched sizes.
ValueError: Expected input batch_size (367500) to match target batch_size (1470000).
I tried resizing both to [1470000,] but received:
~/.local/lib/python3.6/site-packages/torch/nn/functional.py in log_softmax(input, dim, _stacklevel, dtype)
1348 dim = _get_softmax_dim('log_softmax', input.dim(), _stacklevel)
1349 if dtype is None:
-> 1350 ret = input.log_softmax(dim)
1351 else:
1352 ret = input.log_softmax(dim, dtype=dtype)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
I also tried resizing both to [367500, 4] but received:
~/.local/lib/python3.6/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
1869 .format(input.size(0), target.size(0)))
1870 if dim == 2:
-> 1871 ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
1872 elif dim == 4:
1873 ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: multi-target not supported at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15
At this point I’m mostly blindly guessing at what I should do instead of thinking about it.
It feels like I should:
- Take the sigmoid of each output pixel in the volume, so as to limit each value to be between 0 and 1.
- Flatten the output and target into (1470000,).
- Compute cross-entropy between the output and target.
Does this sound correct? If so, what’s the best way to work this custom functionality into FastAI? Do I pass in a custom loss_func when creating my learner?
You are using the wrong loss function: if you have a multi-label classification problem, you should use BCEWithLogitsFlat (not sure of the name, but you can find the right one easily).
Perfect, this is exactly what I needed to make it all work!
For future visitors, I created my learner as:
learn = unet_learner(data, models.resnet18, loss_func=BCEWithLogitsFlat())
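If it helps to see what that loss is doing, BCEWithLogitsFlat is essentially a sigmoid followed by binary cross-entropy over the flattened predictions and targets; here is a small standalone sanity check (shapes taken from the post above, nothing fastai-specific):

import torch
import torch.nn.functional as F

preds  = torch.randn(2, 4, 350, 525)                     # raw logits from the model
target = torch.randint(0, 2, (2, 4, 350, 525)).float()   # 1 = positive match, 0 = background
# BCE-with-logits applies the sigmoid internally and averages over every element,
# so flattening does not change the result
loss = F.binary_cross_entropy_with_logits(preds.view(-1), target.view(-1))
print(loss)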
I notice that the last post here is from a few months ago so I’m wondering if perhaps anyone has created a working example of using or creating multi-class data sets for > 4 channels in the meantime?
It would be super helpful to see a working implementation.
Thanks in advance.