Lesson 1 In-Class Discussion ✅

For the image classification of dogs and cats in Lesson 1, the filenames in the path_img folder are used to get the labels of the pictures. So what is the use of the path_anno folder? What type of annotation information is stored there and used? Is it fine to ignore the path_anno folder if the labels are present in the path_img filenames?

@jeremy path.ls() is convenient (as you point out) and discoverable, since it’s simply a method and comes up in tab completion. On the flip side, it builds muscle memory which will fail people whenever they encounter standard path objects or paths represented as strings.

In Jupyter/IPython, there’s an alternative which works with plain paths and strings alike: %ll {path}. It’s also fairly succinct IMHO, though definitely less discoverable, but it’s a good way to get used to variable interpolation in magics (or shell commands, !ls -l {path} works too of course) :slight_smile:
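
For example, in a notebook cell (the path value here is a hypothetical stand-in; any Path object or string works):

```python
from pathlib import Path

path = Path('data/oxford-iiit-pet')  # hypothetical location -- use your own
!ls -l {path}                        # shell command with {variable} interpolation
%ll {path}                           # the %ll alias does the same on Unix-like systems
```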


Hello everyone,

I just started this course and just finished lesson 1. Then I wanted to do some practice using lesson 1’s code, but I encountered a little problem and hope I can get an answer here.

So basically I want to do what Jeremy did in lesson 1 all over again, but on the CIFAR-10 dataset. However, after I downloaded CIFAR-10, it only has a test set and a labels.txt; there is no training or validation set. I did some searching in the forum and found that other people have a training set for CIFAR-10. I just want to know whether this is by design and I need to partition the training and validation sets myself, or whether something went wrong.

[Screenshot of my code and the output]

I just tried it on Colab. No problem. Could you check it again?


Here is my Colab notebook Click me
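
For reference, a minimal version of the check (a sketch, assuming fastai v1 and its built-in URLs.CIFAR):

```python
from fastai.vision import *

path = untar_data(URLs.CIFAR)  # downloads and extracts fastai's CIFAR-10 copy
path.ls()                      # should list train/, test/ and labels.txt
```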


Thank you! I will try again to see if it is different.

Hi, I just started this course and I'm working on the MNIST dataset, recognizing the digits 3 and 7. When I plotted the learning-rate graph, this is what I got. Can someone explain why the curve goes backwards?
learn.recorder.plot()
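
A setup along these lines produces that plot (a sketch, assuming fastai v1 and the MNIST_SAMPLE subset; your actual code may differ):

```python
from fastai.vision import *

path = untar_data(URLs.MNIST_SAMPLE)     # just the 3s and 7s
data = ImageDataBunch.from_folder(path)
learn = cnn_learner(data, models.resnet18, metrics=accuracy)
learn.lr_find()           # mock training pass over increasing learning rates
learn.recorder.plot()     # training loss vs. learning rate, log-scale x-axis
```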

I want to know: what is the **minimum version** of PyTorch required for this course?

INFO – I have a GPU with CUDA 10.0 support, so I can only install torch 1.2.0 and torchvision 0.4.0 with GPU support.

Is 1.2.0 sufficient?

I am not sure how to phrase this question: why does fit_one_cycle take 30 minutes when I run it?

Which platform are you using? It sounds like you're not using a GPU.

if you’re using Google Colab you have to switch it on for each notebook (Runtime->Change Runtime Type->Hardware Accelerator->GPU). For other platforms I’m not sure, check the Server Setup section at https://course.fast.ai/
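
A quick way to confirm the GPU is actually visible (standard PyTorch calls):

```python
import torch

print(torch.cuda.is_available())          # should print True on a GPU runtime
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. 'Tesla T4' on Colab
```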

When I try to check the source code, github.com refuses to connect. Why?

Why should we use Gradient or other servers?
I have an NVIDIA GPU in my laptop; should I go with that, or should I get a server?
I'm already using Jupyter with Miniconda.
Help me out.

Why am I not seeing the labels that the model got wrong?

You can… a lot of people do run it on their local computers. Though it's much more convenient and hassle-free on a cloud server! No setup is required, and there are a lot of free options as well!

Apparently, matplotlib cannot plot tensors because they don't have an attribute called ndim. That's fine, we can define it ourselves!
Run this:
torch.Tensor.ndim = property(lambda x: len(x.shape))
This should solve the error
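
For example (a hypothetical quick test; the data is made up):

```python
import torch
import matplotlib.pyplot as plt

torch.Tensor.ndim = property(lambda x: len(x.shape))  # the patch from above

t = torch.linspace(0, 1, 50)
plt.plot(t, t ** 2)  # no ndim error once the patch (or a newer PyTorch) is in place
plt.show()
```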

By the way, you're probably using an older version of PyTorch. PyTorch tensors do have an ndim attribute now!


Hi All,

Some questions from lesson 1:

  1. Regarding the resnet34 part of the course,

1.a. When re-training resnet34, I noticed that to get an error rate closer to 5%, I had to train for 5 epochs instead of 2. Why would that be the case?

1.b. How does one know how many epochs to train for? Is that the “art” part of deep learning, where you learn by trial and error?

1.c. What is the loss on the Y-axis of the lr_find plot? Is that the training loss, the validation loss, or the error rate?

  2. Regarding the resnet50 part of the lesson,

2.a. The loss curve and scale are very different from the loss for resnet34. I see that for resnet34 the loss ranges from 0.225 to 0.425, but for resnet50 it ranges from 3.0 to 5.5. Per my understanding, resnet50 should have a lower loss than resnet34, so why the huge difference?

2.b. For resnet50 the notebook chooses max_lr=slice(1e-6,1e-4), but when I look at the lr_find plot, the loss is very high for learning rates from 1e-6 through 1e-4. Why did we choose this range here?

See resnet34 plot below:
[resnet34 plot]

See resnet50 plot below:
[resnet50 plot]

Thank you!

Can you tell me the steps to do that? I think these are eating my storage :stuck_out_tongue:

thanks mate

Follow this thread. I haven't done it myself, ever, so I might not be able to guide you step by step. All the best.


(Assuming you're on a Windows PC.)


@nextlife
1.a It really depends on your problem. Are you just running the notebooks from the course, or are you training a model on a custom dataset? Different problems have different needs, so to speak. You may be able to get the same accuracy with fewer epochs if you set your hyperparameters (learning rate, weight decay, etc., which you'll learn about in the following lessons) correctly. That, too, requires some experience and, more than that, trial and error!

1.b Yes, it's the art. But it's not intuitive. As you'll see, intuition often fails us. Trial and error, plus some basic logical reasoning, is the best way to make things work. Don't worry too much about it.

1.c It's the training loss. lr_find calculates loss on the training set only. It basically tells you what learning rate will lead to the best reduction in loss on your data, and how the loss decreases (how fast/slow/smoothly, etc.), so it acts as a guide for choosing your learning rate. JH also teaches tricks for analysing this graph in the course.

2.a Different architectures have different loss landscapes. As you'll see later, a more complex architecture doesn't necessarily fit better; sometimes it's the opposite, and it makes things worse. So the architecture must be chosen according to the problem you're solving. This is discussed in detail in part 2 of the course.

2.b As I mentioned, JH discusses some tricks for choosing the optimal learning-rate slice. Keep in mind that the learning rate corresponding to the lowest loss on the graph is not the best learning rate. That's actually discussed in lesson 2, IIRC.
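
For reference, the overall pattern from the lesson 1 notebook looks roughly like this (a sketch, fastai v1 style):

```python
learn.lr_find()          # probe losses across a range of learning rates
learn.recorder.plot()    # inspect where the loss starts to blow up
learn.unfreeze()
# pick a slice ending well before the blow-up, not at the minimum point:
learn.fit_one_cycle(3, max_lr=slice(1e-6, 1e-4))
```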

Hope this helps.
Cheers, stay safe!


I am trying to download data using the untar_data function for the sample URLs below:
https://www.kaggle.com/scolianni/mnistasjpg/download
http://www.vlsitechnology.org/pharosc_8.4.tar.gz

I am getting the error OSError: Not a gzipped file (b'<h'). Can anyone explain the reason for this?
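
The b'<h' looks like the start of '<html', i.e. the server is returning a web page (for Kaggle, likely a login page) rather than a gzip archive. A quick way to check (a sketch using requests):

```python
import requests

# A real gzip file starts with the magic bytes b'\x1f\x8b'.
for url in ['https://www.kaggle.com/scolianni/mnistasjpg/download',
            'http://www.vlsitechnology.org/pharosc_8.4.tar.gz']:
    r = requests.get(url, stream=True)
    print(url, '->', r.raw.read(4))  # peek at the first bytes returned
```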