Lesson 15 official topic

This is a wiki post - feel free to edit to add links from the lesson or other useful info.

<<< Lesson 14 | Lesson 16 >>>

Lesson resources

Links from the lesson


Jeremy, I think the slope kernels are incorrect, in that the order of -1, 0, 1 does not match the perpendicular and horizontal ones.

I have a basic question about Python syntax.

For the with_cbs decorator as defined below, how does o (in the inner function _f) get bound to the calling Learner instance?

class with_cbs:
    def __init__(self, nm): self.nm = nm
    def __call__(self, f):
        def _f(o, *args, **kwargs):
            try:
                o.callback(f'before_{self.nm}')
                f(o, *args, **kwargs)
                o.callback(f'after_{self.nm}')
            except globals()[f'Cancel{self.nm.title()}Exception']: pass
        return _f

To be honest, your question made me realise I sort of know how it works, but not to the point of being able to explain it clearly. Below is a simplified example for myself, and hopefully others.

self as an argument name is largely a convention; o is a self, just the self of another object. Because the __call__ body already uses the decorator's own self as self.nm, the newly created function has to use another name for the Learner's self.

But this is not exactly (or usefully) answering your question, I think.

If we define a simplified decorator, with a non-conventional self for illustration:

class with_print:
    # dec_self: the decorator's own self
    def __init__(dec_self, nm): dec_self.nm = nm
    def __call__(dec_self, f):
        def _f(self, *args, **kwargs):
            try:
                print(f'before {dec_self.nm}')
                f(self, *args, **kwargs)
                print(f'after {dec_self.nm}')
            except globals()[f'Cancel{dec_self.nm.title()}Exception']: pass
        return _f

and use it on a method of a Decorated class:

class Decorated:
    def __init__(self):
        self.counter = 0

    @with_print("incrementation")
    def increment(self):
        self.counter += 1

    def decrement(self):
        self.counter -= 1

d = Decorated()

Calling d.increment() now prints out the before/after messages.

d.increment?? in a notebook cell shows the body of the replacement function _f(self, *args, **kwargs), whereas d.decrement?? of course does not.
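As an aside, ?? shows _f precisely because the wrapper does not copy the original function's metadata. The course code doesn't do this, but a small sketch with functools.wraps shows what would change:

```python
from functools import wraps

def plain(f):
    # No metadata copied: introspection sees the wrapper itself.
    def _f(*args, **kwargs): return f(*args, **kwargs)
    return _f

def transparent(f):
    # @wraps copies __name__, __doc__, etc. and sets __wrapped__,
    # so tools that follow __wrapped__ can recover the original.
    @wraps(f)
    def _f(*args, **kwargs): return f(*args, **kwargs)
    return _f

def greet(): return 'hi'

assert plain(greet).__name__ == '_f'
assert transparent(greet).__name__ == 'greet'
assert transparent(greet).__wrapped__ is greet
```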

Now let’s emulate the effect of @with_print by adding a class method to Decorated:

    @classmethod
    def decorate_decrement(cls):
        """Decorates (wraps and replaces) the definition of the decrement method"""
        cls._decrement_deco = with_print("decrementation")
        cls.decrement = cls._decrement_deco(cls.decrement)

If we call Decorated.decorate_decrement(), d.decrement will now have the new body, and a call to d.decrement() prints out the extra before/after messages.

d.decrement() is actually the same as calling Decorated.decrement(d), as far as I understand, which, in a non-obvious way, is the answer to your initial question.
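To see that equivalence concretely, here is a tiny standalone demo (names made up for illustration):

```python
class Counter:
    def __init__(self):
        self.n = 0
    def increment(self):
        self.n += 1

c = Counter()
c.increment()          # bound call: c is passed implicitly as the first argument
Counter.increment(c)   # the same call, spelled explicitly through the class
assert c.n == 2

# A bound method is just the underlying function plus the instance:
assert c.increment.__func__ is Counter.increment
assert c.increment.__self__ is c
```

So when a decorated Learner method is called, the instance arrives as the first positional argument, and _f simply names it o instead of self.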

Hope this helps.


In the Learner class, it appears the statistics printed for each epoch are cumulative: the lists self.accs, self.losses, and self.ns used to store the per-batch statistics are not reset to empty lists at the beginning of each epoch, so the calculations also use values from previous epochs.
Am I getting this right, or is there something I’m missing here?

Edit: Ok, this is the case only for the first, basic Learner; in the callback-based Learner, MetricsCB does the reset before each epoch.
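A toy illustration of that reset idea (not miniai's actual MetricsCB, just a sketch): clearing the accumulators in before_epoch keeps each epoch's average independent.

```python
class SimpleMetricCB:
    # Toy stand-in: before_epoch clears the accumulators, so each
    # epoch's average covers only its own batches.
    def before_epoch(self):
        self.losses, self.ns = [], []
    def after_batch(self, loss, n):
        self.losses.append(loss * n)
        self.ns.append(n)
    def epoch_mean(self):
        return sum(self.losses) / sum(self.ns)

cb = SimpleMetricCB()
cb.before_epoch()                        # epoch 1
for loss in (4.0, 2.0): cb.after_batch(loss, n=10)
assert cb.epoch_mean() == 3.0

cb.before_epoch()                        # epoch 2 starts fresh
cb.after_batch(1.0, n=10)
assert cb.epoch_mean() == 1.0            # not polluted by epoch 1
```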

I am attempting to implement a VAE on color images in fastai2. I am using the Pets dataset and am shrinking the images to 16x16 for speed and simplicity. My notebook is here. The model is a 3-layer ConvNet.

I have implemented the loss function with the help of @johnowhitaker’s VAE notebook. Because the loss uses BCE, which expects the output values to be between 0 and 1, the input values also need to be in that range. That means the normalization was moved into the first part of the encoder model. For simplicity, I used a mean and standard deviation of 0.5, which is what Google’s MobileNet did, as opposed to the typical ImageNet stats.
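For reference, the objective I am using looks roughly like this (a plain-Python sketch for clarity; the real code operates on torch tensors):

```python
import math

def vae_loss(recon, x, mu, logvar):
    # Sketch of the standard VAE objective on plain Python lists.
    # BCE reconstruction term: requires recon and x in [0, 1],
    # which is why the inputs must be scaled into that range.
    bce = -sum(xi * math.log(ri) + (1 - xi) * math.log(1 - ri)
               for ri, xi in zip(recon, x))
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1).
    kld = -0.5 * sum(1 + lv - m * m - math.exp(lv)
                     for m, lv in zip(mu, logvar))
    return bce + kld

# With mu = 0 and logvar = 0 the KL term vanishes, leaving pure BCE:
loss = vae_loss([0.5, 0.5], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0])
assert abs(loss - 1.3862944) < 1e-5   # 2 * ln(2)
```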

The model’s loss goes from 11,830 to 11,382 and then plateaus. The result is a blurry splotch. Does anyone have experience getting a VAE to work? Is there a rule of thumb for the hidden dims at the bottleneck / number of layers that gives decent results? Am I shooting myself in the foot by downsampling the input images too much?


I wrote a blog post consolidating my understanding of the Learner class and the callback system.

Link : The Ultimate Training Loop

Hope others will find it useful. Also, as I am trying to learn and get better at it, any feedback would be greatly appreciated.


Looks good! Note that the latest Learner has switched from using a context manager to a decorator, so you might want to update that part of your post.
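For reference, a minimal sketch of the two styles (simplified stand-ins, not the actual miniai code):

```python
from contextlib import contextmanager

class CancelFitException(Exception): pass

# Older style: a context manager wraps each phase of the loop.
@contextmanager
def cb_ctx(learn, nm):
    try:
        learn.log.append(f'before_{nm}')
        yield
        learn.log.append(f'after_{nm}')
    except CancelFitException: pass

# Newer style: a decorator produces the wrapped method once, up front.
def with_cb(nm):
    def deco(f):
        def _f(self, *args, **kwargs):
            try:
                self.log.append(f'before_{nm}')
                f(self, *args, **kwargs)
                self.log.append(f'after_{nm}')
            except CancelFitException: pass
        return _f
    return deco

class Learner:
    def __init__(self): self.log = []
    def fit_ctx(self):
        with cb_ctx(self, 'fit'): self.log.append('training')
    @with_cb('fit')
    def fit_deco(self): self.log.append('training')

learn = Learner()
learn.fit_ctx(); learn.fit_deco()
assert learn.log == ['before_fit', 'training', 'after_fit'] * 2
```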


Sure Jeremy. Will do that. Thank you for the feedback :grin:

Hi everyone! So I tried to install the miniai library as described by Jeremy in the video, but I get the error message “miniai does not appear to be a Python project: neither ‘setup.py’ nor ‘pyproject.toml’ found”.
Searching online, I only found confirmation that yes, this file needs to be in the folder. But it does not seem to be generated, and it did not seem to appear in Jeremy’s folder in the lecture video either. I also could not find a reference to that error when searching the nbdev subforum.

I understand that I can just copy the folder into the nbs folder as a workaround, but given that we are taking the effort to build a framework library here, I would like to do it right.

I’m sure some of you can help me out here.

Just make sure you are inside the directory where setup.py lives when you run the command.


Instructions per GitHub - fastai/course22p2: course.fast.ai 2022 part 2 - under construction

Clone this repo, `cd` to it, and run:

pip install -e .

This installs the `miniai` library which we're creating in this course.

Hah, I knew it was something stupid but could not find it. Now it works, of course. Thanks a lot to both of you!


In this lesson we were able to use from torcheval.metrics import MulticlassAccuracy.

Since torcheval.metrics are supposed to share the same API, I don’t understand what is going on when I try to use the Throughput metric.

from torcheval.metrics import MulticlassAccuracy, Throughput
metrics = MetricsCB(accuracy=MulticlassAccuracy(), throughput=Throughput())

Here is the bug; it is quite cryptic to me:

Output exceeds the size limit. Open the full output data in a text editor
RuntimeError                              Traceback (most recent call last)
Cell In[6], line 15
     13 model = get_cnn(act_gr, norm=nn.BatchNorm2d).apply(iw)
     14 learn = TrainLearner(model, dls, F.cross_entropy, lr=lr, cbs=cbs+xtra, opt_func=optim.AdamW)
---> 15 learn.fit(epochs)

File ~/code/fredcourse22/miniai/learner.py:170, in Learner.fit(self, n_epochs, train, valid, cbs, lr)
    168     if lr is None: lr = self.lr
    169     if self.opt_func: self.opt = self.opt_func(self.model.parameters(), lr)
--> 170     self._fit(train, valid)
    171 finally:
    172     for cb in cbs: self.cbs.remove(cb)

File ~/code/fredcourse22/miniai/learner.py:121, in with_cbs.__call__.<locals>._f(o, *args, **kwargs)
    119 try:
    120     o.callback(f'before_{self.nm}')
--> 121     f(o, *args, **kwargs)
    122     o.callback(f'after_{self.nm}')
    123 except globals()[f'Cancel{self.nm.title()}Exception']: pass

File ~/code/fredcourse22/miniai/learner.py:158, in Learner._fit(self, train, valid)
    155 @with_cbs('fit')
    156 def _fit(self, train, valid):
    157     for self.epoch in self.epochs:
     77             f"Expected num_processed to be a non-negative number, but received {num_processed}."
     78         )
     79     if elapsed_time_sec <= 0:

RuntimeError: Boolean value of Tensor with more than one value is ambiguous

I have built a minimal working example reproducing the bug.

The problem seems to be that the Throughput metric’s update does not work like the other metrics’.
Instead of metric.update(tensor, tensor) it expects update(num_processed, elapsed_time_sec).
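To make the mismatch concrete, here is a tiny pure-Python stand-in mirroring that signature (an illustration, not torcheval's actual code):

```python
class TinyThroughput:
    # Stand-in mirroring torcheval Throughput's update signature:
    # it wants a count and a duration, so MetricsCB's generic
    # metric.update(preds, targets) call breaks.
    def __init__(self):
        self.num_processed, self.elapsed = 0, 0.0
    def update(self, num_processed, elapsed_time_sec):
        if num_processed < 0:
            raise ValueError("Expected num_processed to be a non-negative number.")
        if elapsed_time_sec <= 0:
            raise ValueError("Expected elapsed_time_sec to be a positive number.")
        self.num_processed += num_processed
        self.elapsed += elapsed_time_sec
        return self
    def compute(self):
        return self.num_processed / self.elapsed

m = TinyThroughput()
m.update(64, 0.5).update(64, 0.5)   # items processed, seconds elapsed
assert m.compute() == 128.0         # items per second
```

So to use Throughput with MetricsCB, the callback would need a special case that feeds it a batch count and an elapsed time instead of predictions and targets.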

I had the same question. @jmp’s answer helped me figure out what was missing in my understanding.
I had forgotten that in the Learner we had:

from functools import partial
from operator import attrgetter

def run_cbs(cbs, method_nm, learn=None):
    for cb in sorted(cbs, key=attrgetter('order')):
        method = getattr(cb, method_nm, None)
        if method is not None: method(learn)

class Learner():
    def __getattr__(self, name):
        if name in ('predict','get_loss','backward','step','zero_grad'): return partial(self.callback, name)
        raise AttributeError(name)

    def callback(self, method_nm): run_cbs(self.cbs, method_nm, self)

The __getattr__ dunder method makes lookups of ‘predict’, ‘get_loss’, etc. on a Learner object return partial(self.callback, name), so calling learn.predict() dispatches to the predict method of each registered callback. Beautiful code.
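A stripped-down, self-contained version of the trick (hypothetical names, not the real Learner):

```python
from functools import partial

class MiniLearner:
    # Stripped-down illustration: attribute lookups that would
    # otherwise fail are turned into callback dispatches.
    def __init__(self): self.calls = []
    def callback(self, method_nm): self.calls.append(method_nm)
    def __getattr__(self, name):
        # Only runs when normal attribute lookup fails.
        if name in ('predict', 'get_loss', 'backward'): return partial(self.callback, name)
        raise AttributeError(name)

learn = MiniLearner()
learn.predict()    # no predict method exists; __getattr__ supplies one
learn.backward()
assert learn.calls == ['predict', 'backward']
```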



I wrote a blog post on how to convert Hugging Face dataset images into PyTorch tensors without applying the transform on every batch. It trains faster even with only one CPU core. I used Hugging Face Features and Array2D in the post.

Hopefully this is helpful for those who are using Google Colab or on computers where CPU is limited.

This may be a beginner question, but I am trying to get on top of defining callbacks with a decorator (rather than a context manager).

Can someone explain the control flow of the with_cbs class below and the use of globals() in the except clause? This is the same refactored code as in Jeremy’s 09_learner notebook.

class with_cbs:
    def __init__(self, nm): self.nm = nm
    def __call__(self, f):
        def _f(o, *args, **kwargs):
            try:
                o.callback(f'before_{self.nm}')
                f(o, *args, **kwargs)
                o.callback(f'after_{self.nm}')
            except globals()[f'Cancel{self.nm.title()}Exception']: pass
            finally: o.callback(f'cleanup_{self.nm}')
        return _f
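Here is a simplified, runnable version I traced through (o.callback replaced by a log list): globals() returns the module’s global namespace as a dict, so the except clause looks the exception class up by name at runtime.

```python
class CancelFitException(Exception): pass

class with_cbs:
    # Simplified: callbacks replaced by appending to o.log.
    def __init__(self, nm): self.nm = nm
    def __call__(self, f):
        def _f(o, *args, **kwargs):
            try:
                o.log.append(f'before_{self.nm}')
                f(o, *args, **kwargs)
                o.log.append(f'after_{self.nm}')
            # 'fit'.title() -> 'Fit', so this fetches CancelFitException
            # from the module globals and silently swallows it.
            except globals()[f'Cancel{self.nm.title()}Exception']: pass
            finally: o.log.append(f'cleanup_{self.nm}')
        return _f

class Demo:
    def __init__(self): self.log = []
    @with_cbs('fit')
    def fit(self):
        self.log.append('running')
        raise CancelFitException()   # cancels the rest of fit

d = Demo()
d.fit()   # no exception escapes; after_fit is skipped, cleanup still runs
assert d.log == ['before_fit', 'running', 'cleanup_fit']
```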

Start with ChatGPT’s answer, and let us know if there’s anything unclear here: