Runtime Error

dillonfreed · September 12, 2020, 7:40pm

So I am totally, totally new this (and coding in general). I am trying to recreate Chapter 2 (PRODUCTION LESSON) on my own, just to get used to typing the code.

Everything was going fine, but then I hit a snag

epoch train_loss valid_loss error_rate time
0 0.000000 00:04

RuntimeError Traceback (most recent call last)
in
1 learn = cnn_learner(dls, resnet34, metrics=error_rate)
----> 2 learn.fine_tune(1)

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
452 init_args.update(log)
453 setattr(inst, ‘init_args’, init_args)
–> 454 return inst if to_return else f(*args, **kwargs)
455 return _f
456
I have no idea what to do?

orendar · September 12, 2020, 11:12pm

Hello and welcome to the community
As a rule of thumb, people generally won’t be able to help you unless you post your code along with the full stack-trace of the error (which you can format nicely using the “Preformatted text” option here on the forums). It would also help to know whether you’ve installed the latest fastai & fastcore versions before running your code.

dillonfreed · September 13, 2020, 3:57am

Ah thank you! Sorry!! I still am not sure if formatted this correctly (see below).
This is all like Greek to me lol. So I clicked </> and then pasted my code and the error. I think I did that right.

Also, maybe there is something I have not installed. I am working in a Paperspace, using a free Jupyter notebook.

So this was the code (everything before this seemed fine):
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)

And it lead to this error:
RuntimeError Traceback (most recent call last)
in
1 learn = cnn_learner(dls, resnet34, metrics=error_rate)
----> 2 learn.fine_tune(1)

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    452         init_args.update(log)
    453         setattr(inst, 'init_args', init_args)
--> 454         return inst if to_return else f(*args, **kwargs)
    455     return _f
    456 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/callback/schedule.py in fine_tune(self, epochs, base_lr, freeze_epochs, lr_mult, pct_start, div, **kwargs)
    159     "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
    160     self.freeze()
--> 161     self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    162     base_lr /= 2
    163     self.unfreeze()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    452         init_args.update(log)
    453         setattr(inst, 'init_args', init_args)
--> 454         return inst if to_return else f(*args, **kwargs)
    455     return _f
    456 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    111     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    112               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 113     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    114 
    115 # Cell

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    452         init_args.update(log)
    453         setattr(inst, 'init_args', init_args)
--> 454         return inst if to_return else f(*args, **kwargs)
    455     return _f
    456 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    202             self.opt.set_hypers(lr=self.lr if lr is None else lr)
    203             self.n_epoch,self.loss = n_epoch,tensor(0.)
--> 204             self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
    205 
    206     def _end_cleanup(self): self.dl,self.xb,self.yb,self.pred,self.loss = None,(None,),(None,),None,None

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_fit(self)
    192         for epoch in range(self.n_epoch):
    193             self.epoch=epoch
--> 194             self._with_events(self._do_epoch, 'epoch', CancelEpochException)
    195 
    196     @log_args(but='cbs')

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_epoch(self)
    186 
    187     def _do_epoch(self):
--> 188         self._do_epoch_train()
    189         self._do_epoch_validate()
    190 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_epoch_train(self)
    178     def _do_epoch_train(self):
    179         self.dl = self.dls.train
--> 180         self._with_events(self.all_batches, 'train', CancelTrainException)
    181 
    182     def _do_epoch_validate(self, ds_idx=1, dl=None):

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in all_batches(self)
    159     def all_batches(self):
    160         self.n_iter = len(self.dl)
--> 161         for o in enumerate(self.dl): self.one_batch(*o)
    162 
    163     def _do_one_batch(self):

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in one_batch(self, i, b)
    174         self.iter = i
    175         self._split(b)
--> 176         self._with_events(self._do_one_batch, 'batch', CancelBatchException)
    177 
    178     def _do_epoch_train(self):

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_one_batch(self)
    162 
    163     def _do_one_batch(self):
--> 164         self.pred = self.model(*self.xb);                self('after_pred')
    165         if len(self.yb) == 0: return
    166         self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torchvision/models/resnet.py in forward(self, x)
     57         identity = x
     58 
---> 59         out = self.conv1(x)
     60         out = self.bn1(out)
     61         out = self.relu(out)

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/conv.py in forward(self, input)
    417 
    418     def forward(self, input: Tensor) -> Tensor:
--> 419         return self._conv_forward(input, self.weight)
    420 
    421 class Conv3d(_ConvNd):

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    413                             weight, self.bias, self.stride,
    414                             _pair(0), self.dilation, self.groups)
--> 415         return F.conv2d(input, weight, self.bias, self.stride,
    416                         self.padding, self.dilation, self.groups)
    417 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py in handler(signum, frame)
     64         # This following call uses `waitid` with WNOHANG from C side. Therefore,
     65         # Python can still get and update the process status successfully.
---> 66         _error_if_any_worker_fails()
     67         if previous_handler is not None:
     68             previous_handler(signum, frame)

RuntimeError: DataLoader worker (pid 2199) is killed by signal: Killed.

Again, thank you so much.

KevinB · September 13, 2020, 4:12am

try setting num_workers=0 when you define your dataloaders (dls)

joedockrill · September 13, 2020, 9:05am

You’re going to struggle at this point in time. The pre-requisites for the course are about 1 year of programming and high-school maths.

That’s not to try and put you off or say that you shouldn’t try and do the course anyway, but you’re going to have to teach yourself programming at the same time. FWIW, you may find you make better progress if you go and find a Python course and do that for a few weeks and then come back to this afterwards.

It’s going to be hard trying to figure out what’s going wrong with things until you can look at a call-stack like that and know which part of it is the problem, what to start looking for, where to start looking (is it a fastai problem, a paperspace problem, a python problem? should you be searching here or google?).

That’s my $0.02 worth anyway.

Runtime Error

epoch train_loss valid_loss error_rate time 0 0.000000 00:04

epoch train_loss valid_loss error_rate time
0 0.000000 00:04