Image Segmentation based on Camvid

Hi,
I am trying out the method described in the lesson 3 CamVid notebook. My mask contains one particular object, and I understand that two classes (object and background) have to be used in such cases, like in the example shown here (https://www.kaggle.com/tanlikesmath/ultrasound-nerve-segmentation-with-fastai/data).

The error below is the traceback I get after setting the environment variable below; it happens while executing lr_find.

os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

I tried searching the forum for anything related, but was not able to find anything. Any suggestions on how to proceed?


RuntimeError Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 lr_find(learn)

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, stop_div, wd)
30 cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
31 epochs = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 32 learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
33
34 def to_fp16(learn:Learner, loss_scale:float=None, max_noskip:int=1000, dynamic:bool=True, clip:float=None,

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
194 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
195 if defaults.extra_callbacks is not None: callbacks += defaults.extra_callbacks
---> 196 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
197
198 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, learn, callbacks, metrics)
88 cb_handler = CallbackHandler(callbacks, metrics)
89 pbar = master_bar(range(epochs))
---> 90 cb_handler.on_train_begin(epochs, pbar=pbar, metrics=metrics)
91
92 exception=False

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in on_train_begin(self, epochs, pbar, metrics)
262 self.state_dict.update(dict(n_epochs=epochs, pbar=pbar, metrics=metrics))
263 names = [(met.name if hasattr(met, 'name') else camel2snake(met.__class__.__name__)) for met in self.metrics]
---> 264 self('train_begin', metrics_names=names)
265 if self.state_dict['epoch'] != 0:
266 self.state_dict['pbar'].first_bar.total -= self.state_dict['epoch']

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
248 if call_mets:
249 for met in self.metrics: self._call_and_update(met, cb_name, **kwargs)
---> 250 for cb in self.callbacks: self._call_and_update(cb, cb_name, **kwargs)
251
252 def set_dl(self, dl:DataLoader):

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in _call_and_update(self, cb, cb_name, **kwargs)
238 def _call_and_update(self, cb, cb_name, **kwargs)->None:
239 "Call cb_name on cb and update the inner state."
---> 240 new = ifnone(getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs), dict())
241 for k,v in new.items():
242 if k not in self.state_dict:

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/lr_finder.py in on_train_begin(self, pbar, **kwargs)
21 "Initialize optimizer and learner hyperparameters."
22 setattr(pbar, 'clean_on_interrupt', True)
---> 23 self.learn.save('tmp')
24 self.opt = self.learn.opt
25 self.opt.lr = self.sched.start

~/.conda/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in save(self, file, return_path, with_opt)
249 if not with_opt: state = get_model(self.model).state_dict()
250 else: state = {'model': get_model(self.model).state_dict(), 'opt':self.opt.state_dict()}
---> 251 torch.save(state, target)
252 if return_path: return target
253

~/.conda/envs/fastai/lib/python3.7/site-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol)
217 >>> torch.save(x, buffer)
218 """
---> 219 return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
220
221

~/.conda/envs/fastai/lib/python3.7/site-packages/torch/serialization.py in _with_file_like(f, mode, body)
142 f = open(f, mode)
143 try:
---> 144 return body(f)
145 finally:
146 if new_fd:

~/.conda/envs/fastai/lib/python3.7/site-packages/torch/serialization.py in <lambda>(f)
217 >>> torch.save(x, buffer)
218 """
---> 219 return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
220
221

~/.conda/envs/fastai/lib/python3.7/site-packages/torch/serialization.py in _save(obj, f, pickle_module, pickle_protocol)
296 f.flush()
297 for key in serialized_storage_keys:
---> 298 serialized_storages[key]._write_file(f, _should_read_directly(f))
299
300

RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1549636813070/work/torch/csrc/generic/serialization.cpp:23

The mask I am using is something like this:
[image of the mask]

I used PixelAnnotationTool (https://github.com/abreheret/PixelAnnotationTool) to create the annotations.
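As a general aside (not the confirmed cause in this thread): a device-side assert during segmentation often comes from mask pixel values lying outside the valid class range, e.g. an annotation tool writing 255 for the foreground while the model expects classes 0 and 1. A minimal, framework-free sanity check might look like this — the mask is represented here as a nested list of ints; in practice you would load it with PIL or numpy first:

```python
def check_mask_labels(mask, n_classes):
    """Return the set of pixel values in `mask` that fall outside
    the valid class range [0, n_classes)."""
    values = {v for row in mask for v in row}
    return {v for v in values if not (0 <= v < n_classes)}

# A binary mask saved with 255 as the foreground value:
mask = [[0, 0, 255],
        [0, 255, 255]]

print(check_mask_labels(mask, n_classes=2))  # -> {255}
```

If this returns a non-empty set, the mask values need to be remapped to contiguous class codes before training.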

The issue was that I had input files of varying dimensions; standardizing them by padding and/or cropping solved the problem.

I am having the same issue. Could you please elaborate on your solution? My image size is 500x830.

Resize/crop/pad the training image set to fit the standard size used by the pretrained model you are using, e.g. 224x224 for ResNet, 299x299 for Inception, etc.
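For illustration, here is a small sketch (stdlib-only, not fastai's own API) that computes the per-side crop/pad amounts needed to bring an arbitrary image to a fixed target size; you would then apply them with your image library of choice. The 224x224 target is just the ResNet example above:

```python
def fit_to_size(h, w, target_h, target_w):
    """Compute per-side adjustments to turn an h x w image into
    target_h x target_w. Positive values mean padding, negative
    values mean cropping."""
    def split(delta):
        # spread the difference as evenly as possible over both sides
        a = delta // 2
        return a, delta - a
    top, bottom = split(target_h - h)
    left, right = split(target_w - w)
    return (top, bottom, left, right)

# 500x830 image (from the question above) down to 224x224:
print(fit_to_size(500, 830, 224, 224))  # -> (-138, -138, -303, -303)
```

All-negative values here mean the 500x830 image would be center-cropped on every side to reach 224x224.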


Great, thanks for your quick response.

But in the lesson 3 camvid notebook (https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson3-camvid.ipynb) the image size is 720x960 and it works. Can you please explain this?

Agreed, but all the images used in the CamVid lesson have the same dimensions (720x960), whereas in my case the dimensions of each image were different.
If I remember right, there was something in the code that divided the input dimensions by 4, and if the division produced fractions (e.g. 24.5 or 22.7), the above-mentioned error popped up.

You can look up the code where the division happens and modify the input dimensions of your entire dataset to avoid this specific error.
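Following that suggestion, here is a tiny sketch of rounding each dimension up to the nearest multiple of the downsampling factor, so the repeated halving never produces fractional sizes. The factor of 4 is taken from the recollection above; for many U-Net-style encoders the safe multiple is actually 32:

```python
def round_up_to_multiple(h, w, factor=4):
    """Round each dimension up to the nearest multiple of `factor`,
    so successive downsampling steps stay integer-sized."""
    rounded = lambda x: ((x + factor - 1) // factor) * factor
    return rounded(h), rounded(w)

# 500x830 (from the question above):
print(round_up_to_multiple(500, 830))             # -> (500, 832)
# CamVid's 720x960 is already divisible, so it is unchanged:
print(round_up_to_multiple(720, 960))             # -> (720, 960)
```

You would then pad (or resize) every image in the dataset to the rounded size before training.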