Lesson 14 official topic

This is a wiki post - feel free to edit to add links from the lesson or other useful info.

<<< Lesson 13 | Lesson 15 >>>

Lesson resources


A very relevant and really edifying notebook by Jeremy:

What is torch.nn really?

I very highly recommend it as this particular notebook made multiple concepts just click for me and gave me many aha moments.


Trying to bridge between the pieces we’re covering in class and the fast.ai library bits… Using the MNIST dataset loaded as tensors in Jeremy’s 04_minibatch_training.ipynb notebook, I was able to use DataLoaders.from_dsets to load the training and validation sets and to use Learner to train. But I’m having trouble figuring out the right way to get Learner.predict to work after training…

import pickle,gzip,math,os,time,shutil,torch,matplotlib as mpl, numpy as np
from pathlib import Path
from torch import tensor
from fastcore.test import test_close

mpl.rcParams['image.cmap'] = 'gray'
torch.set_printoptions(precision=2, linewidth=125, sci_mode=False)
np.set_printoptions(precision=2, linewidth=125)

path_data = Path('…/nbs/data')
path_gz = path_data/'mnist.pkl.gz'
with gzip.open(path_gz, 'rb') as f: ((x_train, y_train), (x_valid, y_valid), _) = pickle.load(f, encoding='latin-1')
x_train, y_train, x_valid, y_valid = map(tensor, [x_train, y_train, x_valid, y_valid])
from fastai.basics import *

n,m = x_train.shape
c = y_train.max()+1
nh = 50

model = nn.Sequential(nn.Linear(m,nh),nn.ReLU(),nn.Linear(nh,c))

from torch.utils.data.dataset import Dataset
from torch.utils.data import TensorDataset

train_ds = TensorDataset(x_train,y_train)
valid_ds = TensorDataset(x_valid,y_valid)

dls = DataLoaders.from_dsets(train_ds,valid_ds)
learn = Learner(dls,model,F.cross_entropy,lr=0.01)

This trains as expected… and after training I can get correct results from the model by bypassing the learner and calling the model directly

print(f'target: {y_train[2]}, pred: {learn.model(x_train[2])}')


target: 4, pred: tensor([-22.40, -19.15, -9.69, -28.76, 7.94, -22.86, -9.99, -16.53, -12.23, -8.97], grad_fn=)

Which looks right. The largest output value is at index 4.

But when I call learn.predict on that same example


This gives

(tensor(-22.40), tensor(-22.40), tensor(-22.40))

The decoding machinery in predict/get_preds is chopping off everything but the first prob value. I’m sort of expecting learn.predict to give me the full Tensor back.

Would appreciate any tips on what’s the right way to set up the dataloader/dataset to work with .predict and give back usable results?

Thanks Much

Great Lesson
I’d missed the live stream and just watched the lesson. Even if you are having difficulty with the previous lessons, study this one anyway. From my perspective, this is the most enjoyable one.


Pass x_train[2] as a list: print(learn.predict([x_train[2]]))
You could also use fastai’s version of cross_entropy, CrossEntropyLossFlat, which adds two methods, decodes and activation, that help during inference (see the source code).

class CrossEntropyLossFlat(BaseLoss):
    "Same as `nn.CrossEntropyLoss`, but flattens input and target."
    y_int = True # y interpolation
    @use_kwargs_dict(keep=True, weight=None, ignore_index=-100, reduction='mean')
    def __init__(self, *args,
        axis:int=-1, # Class axis
        **kwargs):
        super().__init__(nn.CrossEntropyLoss, *args, axis=axis, **kwargs)
    def decodes(self, x:Tensor) -> Tensor:    
        "Converts model output to target format"
        return x.argmax(dim=self.axis)
    def activation(self, x:Tensor) -> Tensor: 
        "`nn.CrossEntropyLoss`'s fused activation function applied to model output"
        return F.softmax(x, dim=self.axis)

learn.predict uses these two methods to generate its output.
Now you can replace:
learn = Learner(dls,model,F.cross_entropy,lr=0.01)
with:
learn = Learner(dls,model,CrossEntropyLossFlat(),lr=0.01)
and call predict:
print(learn.predict([x_train[2]]))
Note that we are passing the example as a list.
This returns 3 values: dec_targ, dec_preds[0], preds[0]
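To make the roles of activation and decodes concrete, here is a minimal sketch in plain Python. It does not use fastai or PyTorch: the softmax and argmax helpers are hand-written stand-ins I am assuming behave like F.softmax and Tensor.argmax on a single vector, and the logits are copied from the model output quoted earlier in the thread.

```python
import math

# Logits copied from the learn.model(x_train[2]) output quoted above.
logits = [-22.40, -19.15, -9.69, -28.76, 7.94, -22.86, -9.99, -16.53, -12.23, -8.97]

def softmax(xs):
    "Hand-written stand-in for F.softmax on one vector (illustrative only)."
    m = max(xs)                            # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(xs):
    "Hand-written stand-in for Tensor.argmax: index of the largest value."
    return max(range(len(xs)), key=xs.__getitem__)

probs = softmax(logits)   # roughly what `activation` does: logits -> probabilities
pred = argmax(logits)     # roughly what `decodes` does: logits -> class index

print(pred)               # predicted class index
print(probs[pred])        # probability assigned to that class
```

The point is only that the raw model output is a vector of logits, and the loss object is what tells the Learner how to turn it into probabilities and a class label.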
Hope that helps :)


Awesome. That was the missing bit for me… Thanks Much.

Sorry about the PS; wrapping the input in a list was the whole solution.

The PS bit was good too… since it shows how to wire up the decode and activation bit… helped me understand the connection… Good Stuff. Thanks!

I felt like I had lost the thread, so I decided to move on to the next lesson video only once I fully understood what was going on. But in the end I now do both: watching and catching up. 4am is too early to join live with a small baby crying at night, haha. Still, there are 2 great live lessons left that I will try to join. Also, learning PyTorch here is the most efficient way to understand it, and to dive into the math behind it as well. Thank you Jeremy!


Hello everyone. Does anyone know a good book on advanced Python? After these couple of lessons I suspect Python has more secrets to offer…

a.pico, I highly recommend Fluent Python by Luciano Ramalho. It takes you from intermediate to advanced.


Piggy-backing on top of this, I found a good half-way house in between basic Python and the doorstopper tome of ‘Fluent Python’ is the book ‘Robust Python’ by Patrick Viafore. Covers a lot of quite deep topics, but pretty much everything in there is useful (in contrast to Fluent Python which I found covers a lot of stuff you might never need to know or dive deep on). You can get a taste of it via some summaries I wrote here.


We have 2 hours to do the more advanced part of the homework :)
Although the only two bits I remember were:

  • watching 3blue1brown from lesson 13
  • and the usual attempt at rewriting the lesson notebook, which is always a good idea.
    However, this time around it seems more so. We used nbdev exports for the first time, with the goal of “building a small library on top of huggingface”. So having our own implementation of the interface that Jeremy is building will let us experiment more fluently, and possibly contribute a bit later.

Do you remember any other tasks for the homework?

Good to have the extra time to work through the course work. I was trying to find the best way to recreate the class notebooks without just copying the cells from one to another, but it can be difficult to follow the flow if you work completely independently. One thing that works for me: when I need to create a new class or function, I first try to articulate the required inputs, outputs and methods in a markdown cell before writing any code. This really helps me to generate the code myself and helps the learning process. I guess everybody has their own approach. What do others do?


My current approach is to copy the notebooks, remove the body of every function / method (replacing it with ...), and then go back and implement each one again without looking at the original notebook.

This is a similar approach to yours, as it forces me to think about what the function was supposed to do and what its inputs / outputs were.
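For anyone who has not tried this, a minimal sketch of the idea follows. The function here is a hypothetical example of mine, not a cell from the course notebooks: the first version keeps only the signature and docstring, the second is the re-implementation written afterwards.

```python
# Step 1: the stubbed cell. Signature and docstring stay, the body becomes `...`.
def relu_stub(x):
    "Return a list with every negative entry of `x` replaced by 0."
    ...

# Step 2: the re-implementation, written later without looking at the original.
def relu(x):
    "Return a list with every negative entry of `x` replaced by 0."
    return [max(0.0, v) for v in x]

# A stub is still valid Python: it runs and simply returns None.
stub_result = relu_stub([-1.0, 2.0])

# Quick check of the re-implementation against a hand-computed answer.
result = relu([-1.0, 0.0, 2.5])
print(stub_result, result)
```

Because `...` is a valid expression, the stubbed notebook still imports and runs top to bottom, so you can fill in one function at a time.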

Here is what it looks like for 03_backprop

Here are the notebooks 03 and 04 if anyone wants to give it a try:


Are there chapters in the book that map to Lessons 13 & 14? In particular, building the individual components from scratch and getting an intuition for what they do and how they work?

Would love to go deeper on these to strengthen the fundamentals

@londonfog You want Chapters 17 and 19.


When you have replaced your code with ... it may be beneficial to write some pseudocode first; this also helps with comments and general documentation.

Pseudocode Tutorial

I don’t understand the im.shape[0] < 5 in the show_image function (05_datasets.ipynb), is there a way to interpret what it is trying to accomplish?
