Lesson 1 In-Class Discussion ✅

I tried your solution, but it didn’t work. Mine isn’t working for either resnet34 or resnet50.

Which cloud/GPU are you using? Your GPU might have too little VRAM. I suggest you describe your problem in more detail so that people can help you better.
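When reporting this kind of problem, it helps to include what PyTorch can actually see. A quick plain-PyTorch check (nothing fastai-specific) you can paste into the notebook:

```python
import torch

# Report the GPU (if any) and its total memory; useful context for OOM reports.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 2**30:.1f} GiB")
else:
    print("No CUDA device visible to PyTorch")
```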

A TitanX/1080ti GPU with 12 GB of VRAM. I tried reducing the batch size, but it didn’t help.

Is there an error message, or does your kernel just restart whenever you run learn.fit_one_cycle(4)?

No error message; the kernel just keeps restarting.

DW - solved.

I was trying to scrape from DDG’s rendering of Google Images, which doesn’t use the ‘.rg_di .rg_meta’ classes.

All good now!


This turns out to be not really true - that statement was many Deep-Learning-Years ago, when we were all young and naive. 🙂


Ah FYI I didn’t see that at first, since edits don’t appear as notifications. Best to reply with extra info, for future reference.


Your stage-1 training shows the error rate still improving a lot at your last epoch:

That means you’re under-fitting. We’ll learn more about this soon. Try doing 20 epochs.

Also, your LR finder shows that you can use a much higher learning rate than 1e-6. Try 1e-4 or even 1e-3.
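For intuition: the LR finder just trains for a short while with the learning rate swept geometrically between two bounds, recording the loss at each step. A rough sketch of that schedule (the bounds and step count here are illustrative assumptions, not fastai’s exact defaults):

```python
# Sketch of an LR range-test schedule: sweep the learning rate
# geometrically from start_lr to end_lr over n mini-batches.
def lr_schedule(start_lr=1e-7, end_lr=10.0, n=100):
    mult = (end_lr / start_lr) ** (1 / (n - 1))
    return [start_lr * mult ** i for i in range(n)]

lrs = lr_schedule()
```

You then pick a rate from the region where the recorded loss is falling steeply (e.g. 1e-4 to 1e-3 here), not from the flat region near the smallest rates.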

Can you share your dataset? I’d be happy (with credit) to use this to do a deeper dive during class.


I think your galaxies could benefit from lots of transforms: flips, rotations, etc.
Have you verified which transforms are used in get_transforms()?
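In fastai v1, get_transforms() returns a (train, valid) tuple of transform lists, with flips and rotations controlled by arguments such as do_flip, flip_vert, and max_rotate. As a library-free illustration of the kind of augmentation meant here (PIL only; the stand-in image is an assumption, not the galaxy data):

```python
from PIL import Image

# Galaxies have no canonical orientation, so flips and arbitrary
# rotations are all label-preserving augmentations.
img = Image.new("RGB", (64, 64), color=(10, 10, 40))  # stand-in image

augmented = [
    img.transpose(Image.FLIP_LEFT_RIGHT),  # horizontal flip
    img.transpose(Image.FLIP_TOP_BOTTOM),  # vertical flip
    img.rotate(37),                        # rotate, keeping the same size
]
```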

From the gist:
Cobbled together this small dataset myself from google images using Sergiusz Bleja's https://github.com/svenski/duckgoose and wikipedia. It has approximately 200 images categorized into 4 types of galaxy.
Is it fine to publicly share a scraped image dataset without proper attribution?

Yes. Thanks. You have pioneered the pace & scale of change in many respects with your efforts.

Hi,

What should be the expected behaviour of the learning rate finder when used multiple times? I ran it twice on the resnet50 model using the first-lesson dataset and got two different results. Also, I’m not able to find an optimal lr on the first plot.

Here’s the gist: http://nbviewer.jupyter.org/gist/jpramos/154767ca8bf7b735a141a583f209ae73


Hey guys. I have a couple of questions about the first lesson. Hopefully someone can help me out here:

  1. I’ve noticed that I sometimes get CUDA out-of-memory errors on the second epoch of fine-tuning. Lowering the batch size fixed it, but I still thought it was odd. I would expect the memory usage to be the same every epoch (every batch, actually), or am I wrong here?
  2. Is there a recommended workflow for when you train for a couple of epochs using the fit_one_cycle approach and notice that the model is still improving? Do you start over completely with a higher number of epochs, or just call fit_one_cycle again for a couple more epochs? With a fixed learning rate it wouldn’t matter, but if I understand the one-cycle policy correctly, you would get different results.
  3. I found that on my dataset, applying the learning rate finder (after unfreezing) results in a plot where the loss increases for all learning rates in the range (even when I decrease the minimum of the range to 1e-9). But when I do fine-tuning the loss still goes down a little. Am I doing something wrong with the LR finder, or does this just happen sometimes?

My notebook is here if anyone’s interested: https://nbviewer.jupyter.org/gist/bvandepoel/1f1fe859cb02baf27ef7ba286e222780


In my case I installed PyTorch using the following command: pip install torch_nightly -f https://download.pytorch.org/whl/nightly/cu90/torch_nightly.html

I would recommend visiting pytorch’s official website and downloading the Preview version (not Stable) using the command suggested for your system.

After you unfreeze there are more trainable params, so it needs more memory - that’s expected. However, without making that change, the 2nd epoch shouldn’t be worse than the first.
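You can see this effect directly by counting trainable parameters before and after unfreezing. A toy PyTorch sketch (the tiny model is a stand-in for a frozen resnet body plus trainable head, not the fastai internals):

```python
import torch.nn as nn

# Toy transfer-learning setup: a "body" (pretrained backbone stand-in)
# and a "head" (newly added classifier layers).
body = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 16))
head = nn.Linear(16, 2)
model = nn.Sequential(body, head)

def n_trainable(m):
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

for p in body.parameters():          # frozen body, as when training starts
    p.requires_grad = False
frozen = n_trainable(model)

for p in model.parameters():         # the equivalent of learn.unfreeze()
    p.requires_grad = True
unfrozen = n_trainable(model)

# More trainable params means more gradient and optimizer state in VRAM.
print(frozen, unfrozen)
```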

I’m not sure anyone has figured out the best answer there. Personally, if I have time I go back and do it again; otherwise I continue with a lower max_lr.
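The reason restarting vs. continuing differ is the shape of the schedule: each call to fit_one_cycle runs one complete warm-up/anneal cycle, so two calls give two cycles back to back rather than one longer cycle. A rough sketch of the 1cycle LR shape (the cosine ramps and parameter defaults here are assumptions, not fastai’s exact implementation):

```python
import math

def one_cycle_lr(pct, max_lr=1e-3, div_factor=25.0, pct_start=0.3):
    """Learning rate at fraction `pct` (0..1) of the way through one cycle."""
    start_lr = max_lr / div_factor
    if pct < pct_start:                      # ramp up to max_lr
        p = pct / pct_start
        return start_lr + (max_lr - start_lr) * (1 - math.cos(math.pi * p)) / 2
    p = (pct - pct_start) / (1 - pct_start)  # anneal down well below start_lr
    end_lr = start_lr / 1e4
    return max_lr + (end_lr - max_lr) * (1 - math.cos(math.pi * p)) / 2
```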

It just means that you’re really close already to the best weights for the early layers.


Thanks. Francisco posted his guide last night, in case you want to try it:

I’m running the lesson-1 notebook and notice that when I call learn.fit_one_cycle(4) my training loss is consistently higher than my validation loss by a fair margin. This difference appears to decrease over epochs (see screenshot for precise numbers). I know the opposite – higher validation loss than training loss – could be a sign of overfitting. What does a higher training loss signify, if anything?
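One common cause (though not the only one): regularization such as dropout is active when the training loss is computed but switched off at validation time, so the training loss is measured on a handicapped network. A minimal PyTorch sketch of that asymmetry (the toy model and random data are placeholders):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(),
                      nn.Dropout(p=0.5), nn.Linear(10, 1))
x, y = torch.randn(256, 10), torch.randn(256, 1)
loss_fn = nn.MSELoss()

model.train()                     # dropout active: half the units are zeroed
with torch.no_grad():
    train_mode_loss = loss_fn(model(x), y).item()

model.eval()                      # dropout off: the full network is used
with torch.no_grad():
    eval_mode_loss = loss_fn(model(x), y).item()
```

The same data scored in the two modes gives different losses; fastai’s reported training loss comes from the train-mode network, its validation loss from the eval-mode one.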


Is there an example of how to use fastai v1.x for inference on a single image, or on a single document for text classification? I looked through the docs and couldn’t find anything. Thank you
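Not sure about the blessed fastai v1 call either, but the underlying PyTorch steps are: put the model in eval mode, preprocess the single item exactly as during training, and add a batch dimension. A plain-PyTorch sketch (the toy model and input shape are placeholders, not the fastai API):

```python
import torch
import torch.nn as nn

# Stand-in for a trained classifier over 4 classes.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
model.eval()                          # disable dropout / batchnorm updates

img = torch.rand(3, 8, 8)             # one preprocessed image (C, H, W)
with torch.no_grad():
    logits = model(img.unsqueeze(0))  # add batch dim -> shape (1, 4)
pred_class = logits.argmax(dim=1).item()
```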

Thrilled to see the fast.ai course and its teaching philosophy featured in the prestigious Economist magazine today! Great international coverage for the tireless efforts by Jeremy and Rachel to take AI to the masses.
“No PhD, no problem - New schemes teach the masses to build AI”
