Image segmentation in fastai v1 (carvana notebook)

shbkan · October 30, 2018, 2:34pm

I am planning to use the new fastai v1 library with the Carvana notebook from the 2018 MOOC. Has anyone else tried this? Is it worth adapting, or does everything need to be completely rewritten? I am also curious whether it is worth, in general, trying to adapt the notebooks from the 2018 MOOC to the new library.

Kaspar · October 30, 2018, 3:45pm

@sgugger is still working on it

sgugger · October 30, 2018, 7:15pm

Not anymore, no.
@shbkan, it can be a good exercise, yes. After tonight’s lesson there will be some new material for segmentation, so you may want to wait for that.

tcapelle · October 30, 2018, 8:05pm

I implemented segmentation with v1 for the TGS salt competition. It works pretty well:

github.com

tcapelle/TGS/blob/master/TGS_fastai_v1.ipynb

shbkan · October 31, 2018, 10:11am

@sgugger When do we expect the new materials to be available? I’ll give it a try after watching the new lesson. I visited the data-institute website, but the lecture from 30th Oct isn’t available yet (I guess that is the lesson you were talking about).

shbkan · October 31, 2018, 10:12am

@tcapelle thanks for sharing that. I will check it out and probably adapt it for the Carvana dataset.

vikasbhandary · November 6, 2018, 1:08pm

I have been trying to implement image segmentation in fastai v1 (in Colab with 20 GB of shared memory), but I get the following error whenever I call learn.fit_one_cycle or learn.lr_find().

RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorCopy.cpp:20

I have tried both a pre-trained and an untrained resnet34 as the base arch. I copied most of the code from the notebook mentioned in this thread.

train_ds = SegmentationDataset(x=train_fn, y=train_mask, classes=[0,1])
valid_ds = SegmentationDataset(x=valid_fn, y=valid_mask, classes=[0,1])
train_ds, valid_ds = transform_datasets(train_ds, valid_ds, tfms=tfms, tfm_y=True, size=(128,128))
seg_data = DataBunch.create(train_ds, valid_ds, bs=bs)
metrics = [accuracy, dice]
def load_pretrained(model, path):
    weights = torch.load(path, map_location=lambda storage, loc: storage)
    model.load_state_dict(weights, strict=False)
    return model
learn = Learner.create_unet(data=seg_data, arch=resnet34, pretrained=False)
learn.lr_find()
learn.recorder.plot()

Let me know what I am missing. A hint would be enough.
Error info: the exception is raised on line 94 of basic_train.py.

sgugger · November 6, 2018, 3:30pm

This usually comes from a bad index. Try running your model on the CPU to get a clearer message.
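For segmentation, one place a bad index can come from is the mask itself: a pixel value that is not a valid class id. A minimal sketch of a check that runs entirely on the CPU, assuming the masks can be loaded as NumPy arrays. The helper name check_mask_labels is hypothetical and not part of fastai.

```python
import numpy as np

def check_mask_labels(mask, n_classes):
    """Return the label values in `mask` that fall outside [0, n_classes)."""
    values = np.unique(mask)
    return [int(v) for v in values if v < 0 or v >= n_classes]

# Hypothetical mask saved as a black/white image: background 0, foreground 255.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
bad = check_mask_labels(mask, n_classes=2)
print(bad)  # [255]
```

An empty list means every pixel is a usable class index. Anything else would need remapping before it reaches the loss.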

charming · November 8, 2018, 4:08am

I am working through the dev code, but in 006a_unet I cannot run anything after building the dynamic model. I get the following error:

body = create_body(tvm.resnet34(True), -2)
model = DynamicUnet(body, n_classes=2).cuda()
learn = Learner(data, model, metrics=metrics,
                loss_fn=CrossEntropyFlat())
learn.split([model[0][6], model[1]])
learn.freeze()
lr_find(learn)
---------------------------------------------------
RuntimeError: Traceback (most recent call last):
  File "/home/charm/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/charm/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/charm/Documents/fastai/zero_start/nb_006.py", line 58, in __getitem__
    if self.tfm_y: y = apply_tfms(self.tfms, y, **self.y_kwargs)
  File "/home/charm/Documents/fastai/zero_start/nb_003.py", line 83, in apply_tfms
    else: x = tfm(x)
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 396, in __call__
    return self.tfm(x, *args, **{**self.resolved, **kwargs}) if self.do_run else x
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 340, in __call__
    if args: return self.calc(*args, **kwargs)
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 345, in calc
    if tfm._wrap: return getattr(x, tfm._wrap)(tfm.func, *args, **kwargs)
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 244, in pixel
    self.px = func(self.px, *args, **kwargs)
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 217, in px
    self.refresh()
  File "/home/charm/Documents/fastai/zero_start/nb_006.py", line 15, in refresh
    return super().refresh()
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 209, in refresh
    self._px = grid_sample(self._px, self.flow, **self.sample_kwargs)
  File "/home/charm/Documents/fastai/zero_start/nb_002.py", line 509, in grid_sample
    return F.grid_sample(x[None], coords, mode=mode, padding_mode=padding_mode)[0]
  File "/home/charm/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2224, in grid_sample
    GRID_SAMPLE_PADDING_MODES[padding_mode])
RuntimeError: grid_sampler(): expected input and grid to have same dtype, but input has Long and grid has Float

charming · November 8, 2018, 12:04pm

I solved the problem.

'open_mask' should return a float tensor.
The correct code is:
def open_mask(fn:PathOrStr)-> ImageMask: return ImageMask(pil2tensor(PIL.Image.open(fn)).float())
----------
but nb_006.py returns a long tensor:
def open_mask(fn:PathOrStr) -> ImageMask: return ImageMask(pil2tensor(PIL.Image.open(fn)).long())
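The dtype mismatch can be reproduced outside the dev notebooks. The sketch below uses plain PyTorch rather than the fastai dev code, and assumes a PyTorch version where grid_sample and affine_grid accept align_corners. The tiny mask and the identity grid are made up for illustration. It casts the mask to float for sampling, uses nearest-neighbour mode so class labels are not blended, and casts back to long for a cross-entropy style loss.

```python
import torch
import torch.nn.functional as F

# Hypothetical 1x1x2x2 mask holding integer class labels (dtype int64).
mask = torch.tensor([[[[0, 1], [1, 0]]]])

# Identity sampling grid (float32), standing in for a real affine transform.
theta = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, size=list(mask.shape), align_corners=False)

# grid_sample needs input and grid to share a dtype, so sample in float.
warped = F.grid_sample(mask.float(), grid, mode="nearest", align_corners=False)

# Cast back to long so the result can be used as class indices.
target = warped.long()
print(target.dtype)  # torch.int64
```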

tamhash · November 17, 2018, 4:59pm

I am trying to follow the new notebook (https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson3-camvid.ipynb) with the Carvana data, but I get the following error from lr_find(learn).

Any thoughts, suggestions?

---------------------------------------------------------------------------

RuntimeError Traceback (most recent call last)
in
----> 1 lr_find(learn)

~/myenv/fastai/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, stop_div, **kwargs)
26 cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
27 a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 28 learn.fit(a, start_lr, callbacks=[cb], **kwargs)
29
30 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:

~/myenv/fastai/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
160 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
161 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162 callbacks=self.callbacks+callbacks)
163
164 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/myenv/fastai/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
92 except Exception as e:
93 exception = e
---> 94 raise e
95 finally: cb_handler.on_train_end(exception)
96

~/myenv/fastai/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
82 for xb,yb in progress_bar(data.train_dl, parent=pbar):
83 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 84 loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)
85 if cb_handler.on_batch_end(loss): break
86

~/myenv/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
20
21 if not loss_func: return to_detach(out), yb[0].detach()
---> 22 loss = loss_func(out, *yb)
23
24 if opt is not None:

~/anaconda3/envs/fastai1.0/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
477 result = self._slow_forward(*input, **kwargs)
478 else:
--> 479 result = self.forward(*input, **kwargs)
480 for hook in self._forward_hooks.values():
481 hook_result = hook(self, input, result)

~/myenv/fastai/fastai/layers.py in forward(self, input, target)
96 def forward(self, input:Tensor, target:Tensor) -> Rank0Tensor:
97 n,c,*_ = input.shape
---> 98 return super().forward(input.view(n, c, -1), target.view(n, -1))
99
100 class MSELossFlat(nn.MSELoss):

~/anaconda3/envs/fastai1.0/lib/python3.6/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
865 def forward(self, input, target):
866 return F.cross_entropy(input, target, weight=self.weight,
--> 867 ignore_index=self.ignore_index, reduction=self.reduction)
868
869

~/anaconda3/envs/fastai1.0/lib/python3.6/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
1691 if size_average is not None or reduce is not None:
1692 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 1693 return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
1694
1695

~/anaconda3/envs/fastai1.0/lib/python3.6/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
1561 target = target.contiguous().view(n, 1, -1)
1562 if reduction is not 'none':
-> 1563 return torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
1564 out = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
1565 return out.view(out_size)

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /opt/conda/conda-bld/pytorch-nightly_1541787457209/work/aten/src/THNN/generic/SpatialClassNLLCriterion.c:110
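The assertion says that some target pixel holds a label outside 0..n_classes-1. One possible cause, assumed here rather than confirmed in this thread, is that the mask files store the foreground as 255 instead of 1. Under that assumption, a minimal NumPy sketch of remapping the mask to {0, 1} before it reaches the loss. The function name binarize_mask is hypothetical.

```python
import numpy as np

def binarize_mask(mask):
    """Map every non-zero pixel to class 1 so the labels are exactly {0, 1}."""
    return (np.asarray(mask) > 0).astype(np.int64)

# Hypothetical mask with background 0 and foreground 255.
raw = np.array([[0, 255], [255, 0]], dtype=np.uint8)
labels = binarize_mask(raw)
print(labels.tolist())  # [[0, 1], [1, 0]]
```

If the masks turn out to use some other encoding, the same idea applies: build an explicit mapping from stored pixel values to class indices.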

Shashank509 · October 7, 2019, 8:26pm

Is it possible to convert the targets to a one-hot encoding with dimensions (M,N,2), where every entry is 0 or 1?
For this (M,N,2) representation of the targets, should we use cross-entropy loss, or should we switch to dice loss?
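To make the question concrete, here is a minimal NumPy sketch of the (M,N,2) one-hot encoding and of a soft Dice loss computed on it. Both helpers are illustrative and not fastai functions. Note that the cross-entropy call in the traceback earlier in this thread takes integer class indices as targets, so a one-hot target would likely need an argmax over the last axis before being passed to it.

```python
import numpy as np

def one_hot(mask, n_classes):
    """Turn an (M, N) integer mask into an (M, N, n_classes) one-hot array."""
    return np.eye(n_classes, dtype=np.float32)[mask]

def dice_loss(probs, target, eps=1e-7):
    """Soft Dice loss between two (M, N, C) arrays, averaged over classes."""
    inter = (probs * target).sum(axis=(0, 1))
    denom = probs.sum(axis=(0, 1)) + target.sum(axis=(0, 1))
    return float(1.0 - ((2.0 * inter + eps) / (denom + eps)).mean())

mask = np.array([[0, 1], [1, 1]])
target = one_hot(mask, 2)
print(target.shape)  # (2, 2, 2)

# Going back to class indices, e.g. for an index-based cross-entropy.
indices = target.argmax(axis=-1)
```

A perfect prediction (probs equal to target) gives a loss of about zero, and a prediction with no overlap gives a loss close to one.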