Issues building minimal image segmentation example using custom data
I’ve been working through the first few lectures, and decided that it might be interesting to try to reproduce the image segmentation problem from Lecture 01 / 01_intro.ipynb using a custom data set. In this case, I wanted to see if I could build a model that would segment bluebirds from an image. I downloaded a few (in this case, six) images, and annotated them using the Pixel Annotation Tool, which produced something like the following outputs, all of which have been resized to dimensions (w,h) = (300, 200):
Original Image
Masked images
"Watershed" Masked Image
Imperfect, but hey. My codes.txt file simply has
Bluebird
Branch
although I’ve tried with and without adding a Background line, and the full directory structure looks like
.
├── minimal_segmentation.ipynb
└── data
    ├── bluebird_config.json
    └── bluebirds
        ├── codes.txt
        ├── resized_images
        │   ├── 0001.png
        │   ├── 0002.png
        │   ├── 0003.png
        │   ├── 0004.png
        │   ├── 0005.png
        │   └── 0006.png
        └── resized_labels
            ├── 0001_watershed_mask.png
            ├── 0002_watershed_mask.png
            ├── 0003_watershed_mask.png
            ├── 0004_watershed_mask.png
            ├── 0005_watershed_mask.png
            └── 0006_watershed_mask.png
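As a sanity check on this layout, here is a minimal sketch (not part of the original notebook) that lists any image without a matching mask, using the same `{stem}_watershed_mask{suffix}` naming rule as the label function further down. The helper name `unpaired_images` and its `suffix` parameter are hypothetical, chosen only for illustration.

```python
from pathlib import Path

def unpaired_images(image_dir, label_dir, suffix="_watershed_mask"):
    """Return the names of .png images in image_dir that have no mask in label_dir."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    missing = []
    for img in sorted(image_dir.glob("*.png")):
        # Same naming rule as the label function: 0001.png -> 0001_watershed_mask.png
        mask = label_dir / f"{img.stem}{suffix}{img.suffix}"
        if not mask.exists():
            missing.append(img.name)
    return missing
```

Called as `unpaired_images("data/bluebirds/resized_images", "data/bluebirds/resized_labels")`, it should return an empty list if every image has its mask.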
Obviously, training a network from only six examples is not going to produce anything useful, but I’m really going for the absolute minimal example I possibly can, and then scaling it up from there.
This is the code I’m running in my notebook, broken up into cells.
from fastai2.vision.all import *
from pathlib import Path
# For debugging: force synchronous CUDA kernel launches so the error is reported at the call that triggers it
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"
path = Path("data/bluebirds")
# This dataloader structure is one I pulled from the online documentation, which itself runs correctly, but still causes the same error at the end
#fnames = get_image_files(path/'resized_images')
#def label_func(x): return path/'resized_labels'/f'{x.stem}_watershed_mask{x.suffix}'
#codes = np.loadtxt(path/'codes.txt', dtype=str)
# dls = SegmentationDataLoaders.from_label_func(path, fnames, label_func, bs=1,codes=codes)
# This dataloader is almost directly pulled from the example code in Lesson #1
dls = SegmentationDataLoaders.from_label_func(
    path, bs=2, fnames=get_image_files(path/"resized_images"),
    label_func=lambda o: path/'resized_labels'/f'{o.stem}_watershed_mask{o.suffix}',
    codes=np.loadtxt(path/'codes.txt', dtype=str)
)
learn = unet_learner(dls, resnet18)
learn.freeze()
# The following line is where the error occurs
learn.fine_tune(2,3e-3)
# I've also tried just using variations on learn.fit(), with no luck
with the error:
<ipython-input-6-59add907aa7c> in <module>
----> 1 learn.fine_tune(2,3e-3)
[...] // A stack trace that works its way through a bunch of PyTorch, which I'm happy to post if asked
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:118
At this point, I’m a bit confused about how and why this error is occurring. I’m able to successfully complete the segmentation using the camvid_tiny dataset, which seems to be in a very similar form. Some things I’ve tried playing around with:
- As mentioned, both the dataloader structures and the fit() vs. fine_tune() structure.
- Changing the number of classes in the codes.txt. This seems to create a similar issue, so maybe the error is because of a mismatched class size? Still, I’m not seeing the number of classes being manually set anywhere, so unless it’s an opaque abstraction between codes.txt and the labels/ images, I’m at a loss.
- Using resnet34 and resnet18, to no avail.
- Cutting the body/building a new head and model based on the “Chapter 15: Application Architectures Deep Dive”, although my inexperience may be the issue with that.
- Googling/DDGing all the things, and finding many examples repeating this using the CamVid data, but none using a “custom” set.
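One way to test the mismatched-class hypothesis from the list above: a device-side assert in ClassNLLCriterion is typically raised when a target value falls outside the range 0 to n_classes - 1, so it may be worth checking which pixel values the masks actually contain against the number of lines in codes.txt. The following is a minimal sketch under that assumption, not a confirmed diagnosis; the helper name `out_of_range_values` is hypothetical, and it takes a flat sequence of single-channel pixel values.

```python
def out_of_range_values(pixel_values, n_classes):
    """Return the sorted distinct values that are not valid class indices 0..n_classes-1."""
    return sorted({int(v) for v in pixel_values if not 0 <= int(v) < n_classes})

# Toy example: a flattened mask containing 0, 1 and 255, checked against two classes
toy_mask = [0, 0, 1, 255, 1, 0]
print(out_of_range_values(toy_mask, n_classes=2))  # prints [255]
```

For a real mask, the flat sequence could come from something like `np.array(Image.open(mask_path)).ravel()`, assuming Pillow and NumPy are available and the mask is stored as a single-channel image. A non-empty result would suggest the mask values need remapping to contiguous class indices before training.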
Image segmentation on custom classes using new data, based on transfer learning, seems like it would be a particularly useful application for people new to machine learning (I know it would be for me), so having some sort of canonical example for building a minimal solution to that is something I would find immensely helpful. It certainly feels like this shouldn’t be as hard as it’s become for me, so any advice would be deeply appreciated!