Lesson 3 official topic

This is an interesting problem. I haven't found a solution, but here's what I've found.

Here is the source code for learn.predict:

def predict(self, item, rm_type_tfms=None, with_input=False):
    dl = self.dls.test_dl([item], rm_type_tfms=rm_type_tfms, num_workers=0)
    inp,preds,_,dec_preds = self.get_preds(dl=dl, with_input=True, with_decoded=True)
    i = getattr(self.dls, 'n_inp', -1)
    inp = (inp,) if i==1 else tuplify(inp)
    dec = self.dls.decode_batch(inp + tuplify(dec_preds))[0]
    dec_inp,dec_targ = map(detuplify, [dec[:i],dec[i:]])
    res = dec_targ,dec_preds[0],preds[0]
    if with_input: res = (dec_inp,) + res
    return res

I took that source code, pasted it into a cell, and ran the following (I ran your provided code first, which gives me train_x):

item = train_x[0]

dl = learn.dls.test_dl([item], rm_type_tfms=None, num_workers=0)
inp,preds,_,dec_preds = learn.get_preds(dl=dl, with_input=True, with_decoded=True)
i = getattr(learn.dls, 'n_inp', -1)
inp = (inp,) if i==1 else tuplify(inp)
dec = learn.dls.decode_batch(inp + tuplify(dec_preds))[0]
dec_inp,dec_targ = map(detuplify, [dec[:i],dec[i:]])
res = dec_targ,dec_preds[0],preds[0]

This gives the following error:

IndexError: too many indices for tensor of dimension 0

for the following line of code:

inp,preds,_,dec_preds = learn.get_preds(dl=dl, with_input=True, with_decoded=True)

I found this forum post which, although it isn't technically related to your situation, suggested something I thought I would try (unsqueezing the train_x[0] value):

item = train_x[0].unsqueeze(dim=0)

dl = learn.dls.test_dl([item], rm_type_tfms=None, num_workers=0)
inp,preds,_,dec_preds = learn.get_preds(dl=dl, with_input=True, with_decoded=True)
i = getattr(learn.dls, 'n_inp', -1)
inp = (inp,) if i==1 else tuplify(inp)
dec = learn.dls.decode_batch(inp + tuplify(dec_preds))[0]
dec_inp,dec_targ = map(detuplify, [dec[:i],dec[i:]])
res = dec_targ,dec_preds[0],preds[0]
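
For reference, here is what the unsqueeze changes (a quick check, assuming train_x rows are flattened 28*28 images as in the chapter):

train_x[0].shape                   # torch.Size([784])
train_x[0].unsqueeze(dim=0).shape  # torch.Size([1, 784]) -- adds a batch dimension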

This resolved the initial error but gave a new error:

AttributeError: 'list' object has no attribute 'decode_batch'

Caused by the following line:

dec = learn.dls.decode_batch(inp + tuplify(dec_preds))[0]

Looking at your DataLoaders, it doesn't seem to have a decode_batch attribute (I'm not sure why).


Here is a Colab notebook with the code.

Dropping a few more related links that I found; note the last one, where they had the same issue but no resolution:

Not sure if any of this helps, but hopefully you can find a solution to this.


Thank you!

Although the problem is still unresolved, I found a clue too.

It looks like activation() and decodes() need to be implemented properly for the loss_func().
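
If it helps, here is a sketch of that idea (my reading of the clue, following the pattern fastai's built-in BaseLoss-style losses use; the class below is hypothetical): learn.predict and get_preds look for optional activation() and decodes() methods on the loss function to turn raw model outputs into final predictions.

import torch
import torch.nn.functional as F

class MyBinaryLoss:
    # used as the actual loss during training
    def __call__(self, preds, targs):
        return F.binary_cross_entropy_with_logits(preds, targs.float())

    # fastai applies this to map raw logits to activated predictions
    def activation(self, preds):
        return torch.sigmoid(preds)

    # fastai applies this to map activated predictions to decoded labels
    def decodes(self, preds):
        return (preds > 0.5).long()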


Hugging Face & GitHub Synchronization

I am able to push my app.py, requirements.txt, and all the other required files to my Hugging Face Space. But I want to know how we can push those files simultaneously to a GitHub repo as well. I am using GitHub Desktop.
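
One approach that should work (a sketch, not specific to GitHub Desktop): a Hugging Face Space is itself a git repository, so from the command line you can add your GitHub repo as a second remote with git remote add github <your-github-repo-url>, then push to each remote separately with git push origin main and git push github main. As far as I know, GitHub Desktop can't push to two remotes in a single action, so the command line may be the simpler route here.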

My apologies if this has been discussed earlier. I didn't quite get why, to reduce the loss (when the initial gradient values were negative in the video here), we subtract grad * 0.01:

abc -= abc.grad * 0.01


Check this image; it should help you understand the point.

You should check out an algebra course; just cover the basic topics.
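
To make the arithmetic concrete, here is one gradient-descent step in plain PyTorch (my own example, not from the video): subtracting grad * 0.01 moves the parameter against the gradient, which reduces the loss whatever the gradient's sign. With a negative gradient, subtracting a negative number increases the parameter:

import torch

# one gradient-descent step on loss = abc**2, whose minimum is at 0
abc = torch.tensor([-3.0], requires_grad=True)
loss = (abc ** 2).sum()
loss.backward()

print(abc.grad)             # tensor([-6.]) -- a negative gradient
with torch.no_grad():
    abc -= abc.grad * 0.01  # subtracting a negative nudges abc up, toward 0
print(abc)                  # tensor([-2.9400], ...) -- loss drops from 9.0 to ~8.64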


People are quite inactive on this course; I'm not sure if anyone has hit this issue before or if I'm the first one to get it.

When I checked timm, the model names had changed from what the lesson shows.

While I am trying to use “convnext_tiny” I get an error, but it says timm is not defined.

This forum post might help.
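
In case it helps: “timm is not defined” is usually just a missing install or import, separate from the renames. A minimal check, assuming a notebook environment:

# install timm if it isn't present, then import it before using it
!pip install timm
import timm

# list the ConvNeXt variants to see the current naming scheme
timm.list_models('convnext*')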


I struggled with the same problem today and have just found the solution:
learn.model(valid_x[0])
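
For completeness, a minimal sketch of how that looks in the chapter-4 setup (assuming valid_x rows are flattened 28*28 images and the model outputs one activation per image):

with torch.no_grad():              # no gradients needed for inference
    act = learn.model(valid_x[0])  # raw model output for one image
    prob_of_3 = act.sigmoid()      # the chapter's loss applies sigmoid to get a probability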



I’ve just completed the 4th chapter of the book. The final model there has the following structure:

simple_net = nn.Sequential(
    nn.Linear(28*28,30),
    nn.ReLU(),
    nn.Linear(30,1)
)

We performed normalization on the input to the first linear layer by normalizing the image pixel data. However, there’s no normalization applied to the input of the second layer; values greater than 1 are present after the ReLU activation.
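
You can verify this directly; here is a minimal check with random inputs in [0, 1] standing in for normalized pixels:

import torch
import torch.nn as nn

simple_net = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 1))

x = torch.rand(64, 28*28)                 # a batch of "normalized" inputs in [0, 1]
hidden = simple_net[1](simple_net[0](x))  # activations after the first Linear + ReLU
print(hidden.min(), hidden.max())         # min is 0.0 after ReLU; the max can exceed 1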

Can anyone explain the reasoning behind this?

Use learn.model(x)

In the 4th chapter of the book, it is stated that there is no difference between a model with two large layers and one with multiple smaller layers, with the latter being easier to compute.

However, this reminded me of the multi-layered image classifier that Jeremy presented. It was demonstrated that the first layer identifies simple shapes, and the subsequent layers, building upon the previous ones, recognize increasingly complex features.

Doesn’t this contradict the thesis presented in the book? I mean, if there are only two large layers, there can't be such a hierarchy of feature recognition, since nodes within the same layer are not connected to each other.

While going through 04_mnist_basics notebook, I didn’t understand the reason behind using unsqueeze in

train_y = tensor([1]*len(threes) + [0]*len(sevens)).unsqueeze(1)
train_x.shape,train_y.shape

and

valid_x = torch.cat([valid_3_tens, valid_7_tens]).view(-1, 28*28)
valid_y = tensor([1]*len(valid_3_tens) + [0]*len(valid_7_tens)).unsqueeze(1)
valid_dset = list(zip(valid_x,valid_y))

I managed to replicate the training process from scratch in a new notebook without adding these extra dimensions, and it seems unintuitive to me why it was done this way.
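
For what it's worth, here is the shape issue I believe the unsqueeze avoids (a sketch with made-up numbers): the model's predictions come out with shape (N, 1), so the targets need a matching trailing dimension; otherwise broadcasting quietly produces an (N, N) result inside the loss.

import torch

preds   = torch.randn(5, 1)                  # the shape a Linear(..., 1) layer produces
targets = torch.tensor([1., 0., 1., 0., 1.]) # shape (5,) without the unsqueeze

print((preds - targets).shape)               # torch.Size([5, 5]) -- silent broadcasting bug
print((preds - targets.unsqueeze(1)).shape)  # torch.Size([5, 1]) -- the intended pairing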

After going through 04_mnist_basics, I found these resources helpful in understanding tensors:

I also came across https://minitorch.github.io where you get to build your own mini PyTorch library from scratch. I’m looking forward to doing it once I’m through with Part 1.


I also came across https://minitorch.github.io where you get to build your own mini PyTorch library from scratch. I’m looking forward to doing it once I’m through with Part 1.

Part 2 of the fastai course also has you recreating many of PyTorch’s functions from scratch. :wink:


Hi Folks,

I feel a little silly for asking, but what is the assignment for Lesson 3? I see some people discussing working with the MNIST dataset, but I can’t find any prompt or instructions to go off of.

Also, should we be sharing our work for each lesson here, or on the lesson thread, or maybe another place?

Thanks for your help!

Charles


To answer your first question: there is a “Further Research” section after each chapter (see the chapter notebook) which has prompts for us students to explore.

For lesson 3/chapter 4, the prompts are:

  1. Create your own implementation of Learner from scratch, based on the training loop shown in this chapter (see the rough sketch after this list).
  2. Complete all the steps in this chapter using the full MNIST datasets (that is, for all digits, not just 3s and 7s). This is a significant project and will take you quite a bit of time to complete! You’ll need to do some of your own research to figure out how to overcome some obstacles you’ll meet on the way.
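
For prompt 1, here is a rough outline of the shape such a Learner might take (my own sketch of the chapter's training loop, not the book's solution; it assumes dls is a (train, valid) pair of DataLoaders):

import torch

class MyLearner:
    def __init__(self, dls, model, opt_func, loss_func, metric):
        self.dls, self.model = dls, model
        self.opt_func, self.loss_func, self.metric = opt_func, loss_func, metric

    def fit(self, n_epochs, lr):
        opt = self.opt_func(self.model.parameters(), lr)
        train_dl, valid_dl = self.dls[0], self.dls[1]
        for epoch in range(n_epochs):
            for xb, yb in train_dl:                   # the chapter's training loop
                loss = self.loss_func(self.model(xb), yb)
                loss.backward()
                opt.step()
                opt.zero_grad()
            with torch.no_grad():                     # report the metric on the validation set
                accs = [self.metric(self.model(xb), yb) for xb, yb in valid_dl]
            print(epoch, torch.stack(accs).mean().item())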

Thanks vbakshi.

I guess my confusion is that for Lesson 1, the video and notebook explicitly state the homework, which is different from the textbook's. Then Lesson 2 is all about publishing a model, so the assignment (to create and publish a model) seems obvious, though I can't find where it was explicitly stated. For comparison, the Lesson 2 textbook assignment is to write a blog post (among other things).

After reviewing the Lesson 3 video, it does appear to suggest homework. I believe it is referring to this Kaggle notebook, but I don't see a clear link to it in the Lesson 3 resources. This, again, is different from the suggested homework in the textbook, which you've kindly pointed me to.

So all that is to say, it is easy enough to follow the textbook. However, the videos suggest that their material is different from the text's, and that one of the assignments for each video lesson is to read the corresponding chapter of the textbook. And given that the first two lessons appeared to have assignments that were different from the textbook's, it seems strange to pivot there for Lesson 3.

Maybe I am overthinking this ¯\_(ツ)_/¯

I guess doing “all the things” will be best for learning.


Ah, I see. Yeah, what I did in response to that segment of the video was run each cell in the “Getting started with NLP for absolute beginners” notebook before the Lesson 4 video. You are right that the material in the videos doesn't always match the textbook; I too took the “doing all the things” approach to cover it all.


Hi all: I have a question regarding using a spreadsheet to create a machine learning model. I don't understand how creating the Ones column with the value of 1 can replace the value of b, the constant, in the formula mx + b. Can someone please explain it to me? Thanks!
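
The idea, as I understand it: b times 1 is just b, so adding a Ones column lets the constant be treated as one more coefficient, and mx + b becomes a single sumproduct of [x, 1] with [m, b]. A minimal sketch (the numbers are made up):

import torch

m, b = 2.0, 5.0
x = torch.tensor([1., 3., 4.])

# append a "Ones" column so every row is [x_i, 1]
features = torch.stack([x, torch.ones_like(x)], dim=1)
coeffs = torch.tensor([m, b])

print(features @ coeffs)   # tensor([ 7., 11., 13.])
print(m * x + b)           # identical: m*x + b*1 == m*x + b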