@zlapp what are the callbacks in cbs?

Also tried and learn.metrics=None before exporting but was still getting the same err.

Nothing else is missing, I don’t think… Try doing a git pull on your fastbook repo to get the latest, check in your fastbook folder to make sure there is, and run the cell you pasted (which has from utils import *), and then gv should work.

@zlapp see what happens without any callback

Thanks. I ran git pull on fastbook, fastai2 and fastcore, then ran !pip install utils before running the cell. Now it is working. Thank you.

However, when I tried to run “from utils import *” in 01_intro.ipynb, it showed “No module of azure”. I tried to install azure, but it has a compatibility issue. I guess it must be a Windows issue: I am running Jupyter Lab locally on Win10, not on the Azure platform, and Windows is not a high priority for the development team. For now I use it to look up information, and run the training etc. with GCP.

Thanks @boris, I just ran it and the error occurs without any callback passed in during initialization of the learner. I double-checked and do not have lambdas. I will try to create a reproducible Colab notebook and send it. From checking the learner summary, only the defaults are present:

  - TrainEvalCallback
  - Recorder
  - ProgressCallback


AttributeError                            Traceback (most recent call last)
<ipython-input-55-fa5b61306ef3> in <module>()
----> 1 learn.export()

2 frames
/usr/local/lib/python3.6/dist-packages/fastai2/ in export(self, fname, pickle_protocol)
    497         #To avoid the warning that come from PyTorch about model not being checked
    498         warnings.simplefilter("ignore")
--> 499, self.path/fname, pickle_protocol=pickle_protocol)
    500     self.create_opt()
    501     if state is not None: self.opt.load_state_dict(state)

/usr/local/lib/python3.6/dist-packages/torch/ in save(obj, f, pickle_module, pickle_protocol, _use_new_zipfile_serialization)
    327     with _open_file_like(f, 'wb') as opened_file:
--> 328         _legacy_save(obj, opened_file, pickle_module, pickle_protocol)

/usr/local/lib/python3.6/dist-packages/torch/ in _legacy_save(obj, f, pickle_module, pickle_protocol)
    399     pickler = pickle_module.Pickler(f, protocol=pickle_protocol)
    400     pickler.persistent_id = persistent_id
--> 401     pickler.dump(obj)
    403     serialized_storage_keys = sorted(serialized_storages.keys())

AttributeError: Can't pickle local object 'combine_scheds.<locals>._inner'

I was able to reproduce the error in a standalone colab notebook (based on notebook 6 from course-v4). Export fails after an interrupted fine_tune. Pretty sure this is a bug.

Thanks for the help @muellerzr and @boris
The error is in the last cell (not the KeyboardInterrupt)

It’s exactly that interrupt that’s causing the issue, @zlapp. Because of how fine_tune works, it really runs two fits, and we’re interrupting in the middle of one of them. If I let fine_tune run all the way, I can export the model successfully.

Interrupting the model I can recreate the error you caused. We can tell by looking in Notice the difference? (First one is before fine_tune, second is with the interrupt):

(#3) [TrainEvalCallback,Recorder,ProgressCallback]

(#4) [TrainEvalCallback,Recorder,ProgressCallback,ParamScheduler]

During fit, any callback can be added, and it is then removed at the end of the fit. Here we’re not letting the fit run to completion, so the fourth callback is never removed.
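The pickling failure itself can be reproduced without fastai. The sketch below is only an illustration: `combine`, `top_level` and `can_pickle` are hypothetical names, not fastai functions. It shows that a function defined inside another function (like `_inner` inside `combine_scheds` in the traceback) cannot be pickled, which is why a leftover callback holding such a function breaks export.

```python
import pickle

def combine(scale):
    # `_inner` is defined inside `combine`, so it is a local object:
    # pickle cannot look it up by a module-level name.
    def _inner(pos):
        return scale * pos
    return _inner

def top_level(pos):
    # A module-level function is pickled by reference to its name.
    return 2 * pos

def can_pickle(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False

print(can_pickle(top_level))   # module-level function
print(can_pickle(combine(2)))  # closure created inside another function
```

Anything reachable from the object being exported is pickled with it, so one such closure stored on one callback is enough to make the whole export fail.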


Very interesting. It’s nice to get a better look at the inner workings of fine_tune from this error I was getting. Would you consider this a bug or just not supported? Wondering if ProgressCallback should be cleaned up during/prior to export to avoid the error.

Could you see if it happens when installing fastai2 and fastcore from git?
The context manager added_cbs should have removed it even with a KeyboardInterrupt.
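A minimal sketch of why a context manager can clean up even after an interrupt, assuming the cleanup sits in a `finally` block. `MiniLearner` is a hypothetical stand-in that only tracks a list of callback names; it is not fastai's implementation of `added_cbs`.

```python
from contextlib import contextmanager

class MiniLearner:
    """Hypothetical stand-in for a learner; only tracks a list of callbacks."""
    def __init__(self, cbs):
        self.cbs = list(cbs)

    @contextmanager
    def added_cbs(self, extra):
        self.cbs.extend(extra)
        try:
            yield self
        finally:
            # Runs on normal exit and on any exception, including KeyboardInterrupt.
            for cb in extra:
                self.cbs.remove(cb)

learn = MiniLearner(["TrainEvalCallback", "Recorder", "ProgressCallback"])
try:
    with learn.added_cbs(["ParamScheduler"]):
        raise KeyboardInterrupt  # simulate stopping training mid-fit
except KeyboardInterrupt:
    pass
print(learn.cbs)
```

With this pattern the temporary callback is gone after the interrupt, so nothing unpicklable is left attached to the learner.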

Just tried installing from git, the err seems to not be reproducing. Also checked cbs and saw ProgressCallback wasn’t present. :+1:

Is there any way of preventing train from computing the losses??

This could be useful with models that return losses during training but not at inference.

Not sure what you mean here. At inference (e.g. predict) there will be no ‘loss’ since there is no target?


I have a model that returns the computed losses itself, so I don’t want one_batch to execute
self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')

This is because the model returns losses in training mode and predictions in validation mode.

I am trying to redefine the learner as follows:

class Mask_RCNN_Learner(Learner):
    def __init__(self, dls, model, loss_func=None, opt_func=Adam, lr=defaults.lr, splitter=trainable_params, cbs=None,
                 metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True):
        super().__init__(dls, model, loss_func, opt_func, lr, splitter, cbs,
                 metrics, path, model_dir, wd, wd_bn_bias, train_bn)
    def _split(self, b):
        i = getattr(self.dls, 'n_inp', 1 if len(b)==1 else len(b)-1)
        self.xb,self.yb = b[:i],b[i:]
    def _do_epoch_train(self):
        try:
            self.dl = self.dls.train
            # Modification
            self.n_iter = len(self.dl)
            for o in enumerate(self.dl):
                i, b = *o
                self.iter = i
                try:
                    self._split(b)
                    # Training mode: the model takes the targets and returns a dict of losses
                    loss_dict = self.model(*self.xb,*self.yb)
                    if len(self.yb) == 0: return
                    self.loss = sum(loss for loss in loss_dict.values())
                    if not self.training: return
                    self.loss.backward()
                    self.opt.step()
                    self.opt.zero_grad()
                except CancelBatchException as e:
                    raise e
        except CancelTrainException as e:
            raise e

    def _do_epoch_validate(self, ds_idx=1, dl=None):
        if dl is None: dl = self.dls[ds_idx]
        try:
            self.dl = dl
            with torch.no_grad():
                # Modification
                self.n_iter = len(self.dl)
                for o in enumerate(self.dl):
                    i, b = *o
                    self.iter = i
                    try:
                        self._split(b)
                        # Validation mode: the model returns predictions, not losses
                        detection = self.model(*self.xb)
                        self.loss = self.loss_func(detection, *self.yb)
                        # COMPUTING METRICS
                        if not self.training: return
                    except CancelBatchException as e:
                        raise e
        except CancelValidException as e:
            raise e
    def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):
        with self.added_cbs(cbs):
            if reset_opt or not self.opt: self.create_opt()
            if wd is None: wd = self.wd
            if wd is not None: self.opt.set_hypers(wd=wd)
            self.opt.set_hypers(lr=self.lr if lr is None else lr)

            try:
                self._do_begin_fit(n_epoch)
                for epoch in range(n_epoch):
                    try:
                        self.epoch=epoch
                        self._do_epoch_train()
                        self._do_epoch_validate()
                    except CancelEpochException as e:
                        raise e

            except CancelFitException as e:
                raise e

However, I am getting:

  File "<ipython-input-39-0a0657f193c1>", line 23
SyntaxError: can't use starred expression here

The error message points to your use of *, in the line i, b = *o. This SO page explains it, I think? You should try changing that line to i, b = o, or delete that line and just change the line above to for i, b in enumerate(self.dl):
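To make the suggested change concrete, here is a small sketch with a plain list standing in for the DataLoader (the `batches` list and the `one_batch` function are hypothetical placeholders). It shows both working forms of the unpacking, plus the one place where the star is valid: spreading a tuple into call arguments.

```python
batches = ["batch0", "batch1", "batch2"]  # hypothetical stand-in for a DataLoader

# Option 1: unpack the (index, item) tuple directly in the loop header.
seen = []
for i, b in enumerate(batches):
    seen.append((i, b))

# Option 2: keep the tuple and unpack it without a star.
for o in enumerate(batches):
    i, b = o

# A star is valid when spreading the tuple into call arguments,
# which is what a call like `self.one_batch(*o)` does.
def one_batch(i, b):
    return f"{i}:{b}"

labels = [one_batch(*o) for o in enumerate(batches)]
print(labels)
```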

Not sure how all these relate to your question about not computing losses – I did not read through your code, and don’t know what’s happening in it…!

Good luck.


I have solved it, although I don’t fully understand how.

class Mask_RCNN_Learner(Learner):
    def __init__(self, dls, model, loss_func=None, opt_func=Adam, lr=defaults.lr, splitter=trainable_params, cbs=None,
                 metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True):
        super().__init__(dls, model, loss_func, opt_func, lr, splitter, cbs,
                 metrics, path, model_dir, wd, wd_bn_bias, train_bn)
    def all_batches(self):
        self.n_iter = len(self.dl)
        for o in enumerate(self.dl): self.one_batch(*o)

    def one_batch(self, i, b):
        self.iter = i
        try:
            self._split(b);                                  self('begin_batch')
            # The model takes the targets as well and returns a dict of losses
            loss_dict = self.model(*self.xb,self.yb);        self('after_pred')
            if len(self.yb) == 0: return
            loss = sum(loss for loss in loss_dict.values())
            self.loss = loss;                                self('after_loss')
            if not self.training: return
            self.loss.backward();                            self('after_backward')
            self.opt.step();                                 self('after_step')
            self.opt.zero_grad()
        except CancelBatchException:                         self('after_cancel_batch')
        finally:                                             self('after_batch')

    def _do_begin_fit(self, n_epoch):
        self.n_epoch,self.loss = n_epoch,tensor(0.);         self('begin_fit')

    def _do_epoch_train(self):
        try:
            self.dl = self.dls.train;                        self('begin_train')
            self.all_batches()
        except CancelTrainException:                         self('after_cancel_train')
        finally:                                             self('after_train')

    def _do_epoch_validate(self, ds_idx=1, dl=None):
        if dl is None: dl = self.dls[ds_idx]
        try:
            self.dl = dl;                                    self('begin_validate')
            with torch.no_grad(): self.all_batches()
        except CancelValidException:                         self('after_cancel_validate')
        finally:                                             self('after_validate')

    def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):
        with self.added_cbs(cbs):
            if reset_opt or not self.opt: self.create_opt()
            if wd is None: wd = self.wd
            if wd is not None: self.opt.set_hypers(wd=wd)
            self.opt.set_hypers(lr=self.lr if lr is None else lr)

            try:
                self._do_begin_fit(n_epoch)
                for epoch in range(n_epoch):
                    try:
                        self.epoch=epoch;          self('begin_epoch')
                        self._do_epoch_train()
                        self._do_epoch_validate()
                    except CancelEpochException:   self('after_cancel_epoch')
                    finally:                       self('after_epoch')

            except CancelFitException:             self('after_cancel_fit')
            finally:                               self('after_fit')

If you look, I am just adjusting these lines of code:

loss_dict = self.model(*self.xb,self.yb);       self('after_pred')
if len(self.yb) == 0: return
loss = sum(loss for loss in loss_dict.values())
self.loss = loss;                                self('after_loss')

This is done because I am working with torchvision.models.detection.maskrcnn_resnet50_fpn. This model expects as input an image and a dict with the target.

The thing is that in evaluation mode it returns a dict with masks, boxes and labels.

I would like the accuracy metrics to be calculated on the mask only.

That’s why I was asking where to modify the data passed into metrics. The output of this model is not a usual one.
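A rough sketch of the two output shapes involved, using plain Python values instead of tensors. `ToyDetector` and `mask_accuracy` are hypothetical stand-ins, and the numbers are made up. As I understand the torchvision detection models, training mode returns a dict of named losses and evaluation mode returns a list with one prediction dict per image. The sketch sums the loss dict for training and pulls out only the masks before computing a metric.

```python
class ToyDetector:
    """Hypothetical stand-in that mimics the train/eval behaviour described above."""
    def __init__(self):
        self.training = True

    def eval(self):
        self.training = False
        return self

    def __call__(self, images, targets=None):
        if self.training:
            # Training mode: one named loss per head.
            return {"loss_classifier": 0.5, "loss_box_reg": 0.25, "loss_mask": 0.125}
        # Evaluation mode: one prediction dict per image.
        return [{"boxes": [], "labels": [], "masks": [1, 0, 1, 1]} for _ in images]

def mask_accuracy(pred_masks, true_masks):
    """Fraction of mask pixels that match, pooled over all images."""
    pairs = [(p, t) for pm, tm in zip(pred_masks, true_masks) for p, t in zip(pm, tm)]
    return sum(p == t for p, t in pairs) / len(pairs)

model = ToyDetector()
images = ["img0", "img1"]
targets = [{"masks": [1, 0, 1, 0]}, {"masks": [1, 0, 0, 1]}]

total_loss = sum(model(images, targets).values())  # training: sum the loss dict

preds = model.eval()(images)                        # evaluation: list of dicts
pred_masks = [p["masks"] for p in preds]            # keep only the masks
acc = mask_accuracy(pred_masks, [t["masks"] for t in targets])
print(total_loss, acc)
```

The point is that the metric never sees the boxes or labels: the selection of the `masks` entry happens between the model call and the metric call.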

Should an image look visually similar before and after normalization?
I’ve created dataloaders with no augmentations, so the images that come out of dls.one_batch() have only been converted to float tensors.

xb,_ = dls.one_batch()
norm = Normalize.from_stats(*imagenet_stats)
xb_n = norm(xb)

Then I applied the Normalize transform and viewed the image, and it looks totally distorted: some portions of the image appear masked out, while the visible portion seems to have gone through a major brightness/contrast change. I’ve also calculated my own dataset statistics and tried to normalize with those, but the images look equally distorted.
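One possible explanation, offered as an assumption rather than something confirmed in this thread: normalized values no longer lie in [0, 1], so a viewer that clips to that range will black out dark regions and saturate bright ones. The sketch below uses only the red-channel ImageNet mean and standard deviation on a few scalar pixel values; the helper names are hypothetical.

```python
mean, std = 0.485, 0.229  # red-channel ImageNet statistics

def normalize(x):
    return (x - mean) / std

def denormalize(z):
    return z * std + mean

def clip01(v):
    # What a viewer expecting values in [0, 1] effectively does.
    return min(1.0, max(0.0, v))

pixels = [0.0, 0.25, 0.5, 1.0]                # original values in [0, 1]
normed = [normalize(p) for p in pixels]       # now spans negative and >1 values
shown = [clip01(z) for z in normed]           # what gets displayed after clipping
restored = [denormalize(z) for z in normed]   # back in the original range
print(normed, shown, restored)
```

If this is what is happening, the normalized tensor is fine for training, and undoing the normalization before display should bring the picture back.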

I have seen many annotations like @patch and @typedispatch in the code.

What do these annotations do?
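Both are decorators. As far as I understand them, @patch adds a function as a method to the class named in the type annotation of its first parameter, and @typedispatch picks an implementation based on argument types. The sketch below is not fastcore's code: `my_patch` is a hypothetical, simplified re-implementation of the patching idea, and the standard library's `functools.singledispatch` is swapped in to show type-based dispatch (it looks at the first argument only).

```python
from functools import singledispatch

def my_patch(f):
    """Hypothetical, simplified stand-in for a patch decorator:
    attach `f` to the class named in the annotation of its first parameter."""
    cls = next(iter(f.__annotations__.values()))
    setattr(cls, f.__name__, f)
    return f

class Greeter:
    def __init__(self, name):
        self.name = name

@my_patch
def greet(self: Greeter):
    # After decoration, this is available as Greeter.greet.
    return f"hello, {self.name}"

@singledispatch
def describe(x):
    return "something else"

@describe.register
def _(x: int):
    return "an int"

@describe.register
def _(x: str):
    return "a str"

print(Greeter("fastai").greet())
print(describe(3), describe("x"), describe(2.0))
```

The benefit in both cases is that behaviour can be added from outside the original class or function definition, which is how a library can spread one class's methods across several notebooks or modules.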