I am using the fastai library to segment images with multiple classes on a personal dataset. Consequently, I have multiple masks per image. What is the best way to proceed? Do I consolidate the masks into a single image where the background is 0, and each subsequent class is assigned an integer (1, 2, 3 etc…)? Or do I extend the SegmentationDataset class to become more like ImageMultiDataset so that it can handle multiple masks? All the examples of the V1 library that I can find of segmentation are of single class segmentation datasets. Thank you in advance!
The first solution is the best if the masks don’t overlap. There is an example with camvid of a segmentation problem with multiple classes.
If one pixel can belong to several classes though, you should have multiple channels in your mask.
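To make the two options concrete, here is a minimal NumPy sketch, assuming each class mask is already loaded as a binary array of the same height and width. The helper names (`masks_overlap`, `to_label_map`, `to_channels`) are made up for illustration and are not part of fastai.

```python
import numpy as np

def masks_overlap(masks):
    """Return True if any pixel is set in more than one binary mask."""
    stacked = np.stack([np.asarray(m).astype(bool) for m in masks])  # (C, H, W)
    return bool((stacked.sum(axis=0) > 1).any())

def to_label_map(masks):
    """Non-overlapping case: background is 0, the k-th mask (1-based) becomes k."""
    label = np.zeros(np.asarray(masks[0]).shape, dtype=np.uint8)
    for k, m in enumerate(masks, start=1):
        label[np.asarray(m).astype(bool)] = k
    return label

def to_channels(masks):
    """Overlapping case: one binary channel per class, shape (C, H, W)."""
    return np.stack([np.asarray(m).astype(np.uint8) for m in masks])

a = np.array([[1, 0], [0, 0]])
b = np.array([[0, 1], [0, 0]])
c = np.array([[1, 1], [0, 0]])  # shares pixels with both a and b

if not masks_overlap([a, b]):
    label_map = to_label_map([a, b])   # single integer-coded mask
channel_mask = to_channels([a, c])     # multi-channel mask for overlapping classes
```

Note that `to_label_map` silently lets later masks overwrite earlier ones, so it is only safe once `masks_overlap` has returned False.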
Thank you for your prompt reply! For people who encounter a similar issue, the camvid lesson is linked below.
I chose to focus on implementing a model that can segment one feature first before adding additional channels to handle overlapping features. When I call ImageBunch instance functions like show_batch() and normalize() on an ImageBunch object, I get an error along the lines of
AttributeError: 'SegmentationDatasetOneFeature' object has no attribute 'normalize'. Why are the functions being called on the extended Dataset instead of the ImageBunch class?
def _get_y(self,i):
    #checks if mask exists
    image_path = glob.glob(self.y[i] + "/abc.*")
    #if there is no abc, load an empty one
    if len(image_path) == 0:
        tensor = torch.tensor((), dtype=torch.int)
        #new_zeros returns a new tensor, so keep the result
        tensor = tensor.new_zeros((sz, sz))
        return ImageSegment(Image(tensor))
    else:
        #glob returns a list, so pass the first match
        return open_mask(image_path[0])
data = DataBunch.create(train_ds=train_ds, valid_ds=valid_ds, test_ds=None,
bs=bs, num_workers=nw, tfms=tfms)
data = data.normalize(imagenet_stats)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-30-f381a0614dd5> in <module>
----> 1 data = data.normalize(imagenet_stats)

~/anaconda3/envs/fastai-1.0/lib/python3.7/site-packages/fastai/basic_data.py in __getattr__(self, k)
    114         return cls(*dls, path=path, device=device, tfms=tfms, collate_fn=collate_fn)
    115
--> 116     def __getattr__(self,k:int)->Any: return getattr(self.train_dl, k)
    117     def holdout(self, is_test:bool=False)->DeviceDataLoader:
    118         "Returns correct holdout `Dataset` for test vs validation (`is_test`)."

~/anaconda3/envs/fastai-1.0/lib/python3.7/site-packages/fastai/basic_data.py in __getattr__(self, k)
     49
     50     def __len__(self)->int: return len(self.dl)
---> 51     def __getattr__(self,k:str)->Any: return getattr(self.dl, k)
     52
     53     @property

~/anaconda3/envs/fastai-1.0/lib/python3.7/site-packages/fastai/basic_data.py in DataLoader___getattr__(dl, k)
     34         super().__init__(len(classes))
     35
---> 36 def DataLoader___getattr__(dl, k:str)->Any: return getattr(dl.dataset, k)
     37 DataLoader.__getattr__ = DataLoader___getattr__
     38

AttributeError: 'SegmentationDatasetOneFeature' object has no attribute 'normalize'
DataBunch doesn’t have a normalize function (so it’s trying to find it in its train_ds). You have to use an ImageDataBunch for this.
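The fall-through seen in the traceback can be reproduced in plain Python. The sketch below uses made-up stand-in classes (`FakeDataset`, `Loader`, `Bunch`, `ImageBunch`), not the real fastai ones, and only assumes the `__getattr__` delegation pattern visible in the traceback.

```python
class FakeDataset:
    """Stand-in for a custom dataset; it defines no `normalize`."""

class Loader:
    def __init__(self, dataset):
        self.dataset = dataset

    def __getattr__(self, k):
        # Only reached when normal lookup fails: delegate to the dataset.
        return getattr(self.dataset, k)

class Bunch:
    def __init__(self, train_dl):
        self.train_dl = train_dl

    def __getattr__(self, k):
        # Only reached when normal lookup fails: delegate to the loader.
        return getattr(self.train_dl, k)

class ImageBunch(Bunch):
    def normalize(self):
        # Found by normal lookup, so __getattr__ is never consulted.
        return "normalized"

try:
    Bunch(Loader(FakeDataset())).normalize
except AttributeError as e:
    message = str(e)  # the error names the dataset, the last object asked

result = ImageBunch(Loader(FakeDataset())).normalize()
```

Because each level hands unknown attributes down one step, the error ends up naming the dataset class even though the call was made on the bunch.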
@Jamie which tool did you use to create the masks for the raw images?
Can you clarify what you mean by ‘raw images?’
raw images are input features without any pre-processing
In case you are looking for an annotation tool, try Labelme
For a dataset that has multiple masks for a single image, is it also possible (and also is it recommended?) to create multiple training instances of that single image, paired with each image mask, rather than say combining all of the masks into one single mask where there are no overlaps? I guess this would increase training time? But would this improve or degrade accuracy?
Also to add to the above, is there a way in fastai to combine these masks into one image in some sort of preprocessing step?
How do you recommend I proceed if we have masks with multiple channels (due to overlapping)? I currently have outputs as CHW tensors, but those don’t seem to work nicely with PIL if I use ImageSegment. Should I have multiple ImageSegment labels (1 per channel) for each input? Thank you.
You should use an ImageSegment with CHW channels. That probably requires a custom open method, so you will have to subclass ImageSegment, but then the transforms should be applied properly to the targets.
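As a rough illustration of the decoding step such a custom open method might need, the NumPy sketch below turns a channels-last array (for example, a multi-channel image loaded from disk) into a CHW array of 0s and 1s. The function name `hwc_to_chw_mask` and the `threshold` parameter are assumptions for this example; wiring the result into an ImageSegment subclass is not shown.

```python
import numpy as np

def hwc_to_chw_mask(arr, threshold=0):
    """Turn an (H, W, C) array into a (C, H, W) float32 array of 0s and 1s."""
    arr = np.asarray(arr)
    if arr.ndim != 3:
        raise ValueError(f"expected (H, W, C), got shape {arr.shape}")
    # Any value above the threshold counts as "this pixel belongs to the class".
    binary = (arr > threshold).astype(np.float32)
    # Move the channel axis to the front and make the memory contiguous.
    return np.ascontiguousarray(binary.transpose(2, 0, 1))

# Example: a 3x5 image with 4 channels, one positive pixel in channel 1.
image = np.zeros((3, 5, 4), dtype=np.uint8)
image[0, 0, 1] = 255
mask = hwc_to_chw_mask(image)
```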
Does that also mean I have to create custom class for SegmentationLabelList and SegmentationItemList if I need a CHW ImageSegment?
I have a multiclass image segmentation problem where an individual pixel can belong to any of 4 possible classes. This ends up being convenient as I can actually store all the mask info within an RGBA
I have created a Unet Learner and dataset in which
- A batch (xb) is of size [2, 3, 350, 525]
- The labels are of size [2, 4, 350, 525] (1 for positive match,
- The output of the model is of size [2, 4, 350, 525]
I am having problems training my model and it appears to fail while calculating the loss for a batch.
FlattenedLoss.__call__ is defined below:
Line 242 checks whether or not we are working with 2D input (we are) and resizes from the original [2, 4, 350, 525] to [367500, 4]. However, on line 243 it resizes [2, 4, 350, 525] to [1470000,]. This triggers an exception within Torch due to the mismatched sizes.
ValueError: Expected input batch_size (367500) to match target batch_size (1470000).
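The two numbers in that error follow directly from the shapes. Here is a small NumPy sketch that mimics the two reshapes described above (it is not the fastai source), assuming the class axis is moved last before the output is flattened to one row per pixel.

```python
import numpy as np

n, c, h, w = 2, 4, 350, 525
output = np.zeros((n, c, h, w), dtype=np.float32)
target = np.zeros((n, c, h, w), dtype=np.float32)

# Cross-entropy style flattening: class axis last, one row of scores per pixel.
flat_output = np.moveaxis(output, 1, -1).reshape(-1, c)
# The target is flattened completely, as if it held one class index per pixel.
flat_target = target.reshape(-1)

print(flat_output.shape)  # (367500, 4)
print(flat_target.shape)  # (1470000,)
```

A cross-entropy target for this layout would need one integer per pixel, i.e. 2 * 350 * 525 = 367500 values. A 4-channel target has four times as many, which is where the mismatch comes from.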
I tried resizing both to [1470000,] but received

~/.local/lib/python3.6/site-packages/torch/nn/functional.py in log_softmax(input, dim, _stacklevel, dtype)
   1348         dim = _get_softmax_dim('log_softmax', input.dim(), _stacklevel)
   1349     if dtype is None:
-> 1350         ret = input.log_softmax(dim)
   1351     else:
   1352         ret = input.log_softmax(dim, dtype=dtype)

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
I also tried resizing both to [367500, 4] but received

~/.local/lib/python3.6/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   1869                          .format(input.size(0), target.size(0)))
   1870     if dim == 2:
-> 1871         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   1872     elif dim == 4:
   1873         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

RuntimeError: multi-target not supported at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15
At this point I’m mostly blindly guessing at what I should do instead of thinking about it.
It feels like I should:
output pixel in the volume so as to limit each to be between 0 and 1.
- Flatten the output and target into
- Compute cross-entropy between the
Does this sound correct? If so, what’s the best way to work this custom functionality into FastAI? Do I pass in a custom loss_func when creating my learner?
You are using the wrong loss function: if you have a multi-label classification problem (each pixel can belong to several classes at once), you should use BCEWithLogitsFlat (not sure of the name but you can find the right one easily).
Perfect, this is exactly what I needed to make it all work!
For future visitors, I created my learner as:
learn = unet_learner(data, models.resnet18, loss_func=BCEWithLogitsFlat())
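For readers wondering what this kind of loss computes, below is a NumPy sketch of binary cross-entropy on raw logits with both tensors flattened to the same length. It is an illustration under that assumption, not fastai's or PyTorch's implementation, and the name `bce_with_logits_sketch` is made up.

```python
import numpy as np

def bce_with_logits_sketch(logits, targets):
    """Mean binary cross-entropy over every element, computed from raw logits."""
    x = np.asarray(logits, dtype=np.float64).reshape(-1)
    y = np.asarray(targets, dtype=np.float64).reshape(-1)
    if x.shape != y.shape:
        raise ValueError("logits and targets must have the same number of elements")
    # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(x).
    loss = np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x)))
    return float(loss.mean())

# Output and target share the shape [batch, classes, height, width],
# so every (pixel, class) pair is scored as its own yes/no decision.
logits = np.zeros((2, 4, 3, 5))
targets = np.zeros((2, 4, 3, 5))
loss = bce_with_logits_sketch(logits, targets)
```

Since the sigmoid is folded into the loss, the model's raw outputs go in unchanged, and both tensors flatten to the same number of elements, so the size mismatch from earlier in the thread does not arise.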