Lesson 2 In-Class Discussion ✅

Damn, Francisco, you keep making me push the envelope! :grin: - which I appreciate… no, I didn’t even know where to set wd but now I’m trying it. But since I’m just shooting in the dark can you suggest how much to increase it? I just ran a quick test with wd=.01 (which I thought was the default) and it seems to have decreased my train_loss, which I don’t understand.

That is the default, actually. How much did the training loss decrease? This is probably due to randomness in training and not especially meaningful. What if you try 2e-2, 3e-2, 4e-2? That should increase your training loss and allow you to train for more epochs with a decreasing validation loss (maybe achieving a lower validation loss than before by the end of your training).
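For intuition: weight decay just pulls every weight toward zero a little on each update, which is why larger wd makes it harder for training loss to fall. A minimal plain-Python sketch of the idea (not fastai's actual optimizer, which handles wd for you via the wd argument):

```python
def sgd_step(w, grad, lr=0.01, wd=0.01):
    """One SGD step with weight decay: w <- w - lr * (grad + wd * w)."""
    return w - lr * (grad + wd * w)

w = 1.0
for _ in range(100):
    # With a zero gradient, weight decay alone shrinks the weight toward 0:
    # each step multiplies w by (1 - lr * wd) = 0.99 here.
    w = sgd_step(w, grad=0.0, lr=0.1, wd=0.1)
print(round(w, 4))  # prints 0.366 (i.e. 0.99 ** 100)
```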


I ran earlier with:
learn = create_cnn(data, models.resnet34, ps=0.1, metrics=error_rate)
learn.fit_one_cycle(24, max_lr=slice(1e-4,1e-2))
and got for the 8th epoch:
8 0.193530 0.075071 0.013333 (00:07)

Then I ran:
learn = create_cnn(data, models.resnet34, ps=0.1, wd=.01, metrics=error_rate)
learn.fit_one_cycle(8, max_lr=slice(1e-4,1e-2))
and got:
8 0.094480 0.009196 0.000000

So both losses were way down.

Then because you said to increase weight decay I tried:
learn = create_cnn(data, models.resnet34, ps=0.1, wd=.1, metrics=error_rate)
and got:


I really don’t understand what’s going on…

I should note: this is generally what I’ve seen with many variations in lr and epochs - the training loss is almost always higher than validation loss, and often much higher. Error rate is usually quite good, so it seems like the model is decent, but consistently underfitting.

Ah, this is interesting - following your suggested wd values:
learn = create_cnn(data, models.resnet34, ps=0.1, wd=0.04, metrics=error_rate)



Wait, I’m sorry — I reread your original question and realized I read it wrong when I first answered it. You want to get your training loss lower than your validation loss, so you should decrease dropout and/or weight decay, since both of them make it harder for your training loss to decrease. Could you try wd=0.001? That should help!


The funny thing is, I just ran:
learn = create_cnn(data, models.resnet34, ps=0.1, wd=0.04, metrics=error_rate)
for 24 epochs this time, and finally got train_loss < valid_loss!

So that feels like success, but I don’t understand it! Does it make sense to you?

I’ll run with wd=0.001 now.

You are using 1/5 of the default dropout probability (0.5), that’s why :sunglasses:. Now try with ps=0.1 and wd=0.001 and we should see some nice overfitting.
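To make the ps comparison concrete, here is a minimal sketch of inverted dropout in plain Python (purely illustrative, not fastai's implementation): with p=0.5 half the activations are zeroed on average, while p=0.1 barely perturbs the layer.

```python
import random

def dropout(xs, p):
    """Zero each activation with probability p; scale survivors by 1/(1-p)
    (inverted dropout) so the expected activation is unchanged."""
    return [0.0 if random.random() < p else x / (1 - p) for x in xs]

random.seed(0)
acts = [1.0] * 10000
kept = sum(1 for a in dropout(acts, p=0.5) if a != 0.0)
print(kept)  # roughly half the activations survive with p=0.5
```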

Sad news…
learn = create_cnn(data, models.resnet34, ps=0.1, wd=0.001, metrics=error_rate)

…so wd=0.04 looks like the optimum.

Weird. Can you try with wd=0?

Here you go:

learn = create_cnn(data, models.resnet34, ps=0.1, wd=0.0, metrics=error_rate)

Not as good as wd=0.04

We should continue talking about this in another thread. Could you create one and tag me please?

Ok done. I’ve never created a new thread before… should I move some of these posts over?

That would be great to give some context, thank you!

@lesscomfortable Is there an update on a standard way to do so? :slight_smile: This method seems broken since the switch to ImageDeleter. Actually, I think the current ImageDeleter may be broken too.

I tried hacking the validation split up to 0.99 so I could use the old way to clean up images, and I found that ImageDeleter is actually looking at my training dataset instead of the validation set. If I haven’t misunderstood, I should pass data.valid_ds; it’s weird that it takes train_ds as the default.

You can see that my training set has only 9 images, and ImageDeleter is complaining about the size.

Hey @nok ! You can send data.train_ds to ImageDeleter to work on your training set. I think you are not sending the correct inputs. Please see the example usage here and let me know if it answers the question.

Thanks for the pointer, it shows the usage of ImageDeleter very well. However, I believe the lecture may deserve a quick update. The reason is that the lesson 2 notebook had a commit two days ago that simply swapped FileDeleter for ImageDeleter; as a result, it deletes the wrong photos silently because the dataset is not passed correctly.

I think most people rely on the notebooks and only turn to the docs if something throws an error. It can be quite confusing, especially when a change does not break explicitly but causes unexpected behavior; people may simply not be aware that they have deleted the wrong photos.

I also added a test checking that the length of the dataset matches the length of the indexes, to make sure the correct dataset is passed. The reason it fails silently is that the training set is usually bigger than the validation set, so it happily deletes the wrong photos without complaint. I only found this out when I set valid_pct to 0.99, which threw an out-of-bounds error.
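The kind of sanity check described above might look something like this sketch (check_deletion_indexes is a hypothetical name for illustration, not the actual fastai API):

```python
def check_deletion_indexes(dataset_len, indexes):
    """Hypothetical sanity check: every index marked for deletion must fall
    inside the dataset actually passed in; otherwise you are probably
    deleting from the wrong dataset (e.g. train_ds instead of valid_ds)."""
    bad = [i for i in indexes if not 0 <= i < dataset_len]
    if bad:
        raise IndexError(
            f"indexes {bad} out of range for dataset of length {dataset_len}"
        )

# A 9-image training set cannot satisfy a validation-set index like 42:
check_deletion_indexes(9, [0, 3, 8])   # fine, no exception
try:
    check_deletion_indexes(9, [42])
    caught = False
except IndexError:
    caught = True
print(caught)  # prints True
```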

I have made a PR to address this issue and fix the notebook. It would be great if someone could help check it, as I believe a lot of people are working on their own classifiers over the weekend.



I was having trouble getting the download-images JavaScript code to run, and here is what I had to do:

  • Disabled my ad blocker, which was blocking the popup window.
  • Disabled the Chrome “Office Editing for Docs, Sheets & Slides” extension, as it tried to handle the CSV file download and left me with a blank window.

The JavaScript code from the lesson2-download notebook didn’t fully work for me.

The line below generated a list of URLs that I then had to copy & paste into a .txt doc:

document.body.innerHTML = `<a href="data:text/csv;charset=utf-8,${escape(Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou).join('\n'))}" download="links.csv">download urls</a>`;

If you want to save time opening the console and pasting in the text, you can (literally) drag and drop the text snippet below into your browser’s bookmarks toolbar. Then you can just click it and you will get a download button whenever you want:

javascript:document.body.innerHTML = `<a href="data:text/csv;charset=utf-8,${escape(Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou).join('\n'))}" download="links.csv">download urls</a>`;
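Once you have the links file, reading it back in Python before downloading can catch stray non-URL lines (like ones left over from copy-and-paste). A small sketch using only the standard library; read_image_urls is an illustrative helper, not part of fastai:

```python
import os
import tempfile
from pathlib import Path

def read_image_urls(path):
    """Read one URL per line from a downloaded links file, keeping only
    http(s) entries; a downloader can then fetch just the valid ones."""
    lines = Path(path).read_text().splitlines()
    return [u.strip() for u in lines
            if u.strip().startswith(("http://", "https://"))]

# Quick demo with a throwaway file standing in for links.csv:
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False)
tmp.write("https://example.com/a.jpg\nnot a url\nhttp://example.com/b.jpg\n")
tmp.close()
urls = read_image_urls(tmp.name)
os.unlink(tmp.name)
print(len(urls))  # prints 2 (the non-URL line is filtered out)
```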

Josh C., who runs the East Bay meetup Code Self Study, made the above code and gave it to me. Check out his meetup on Wednesdays or Saturdays:


Flask doesn’t have an async feature. There might be a workaround by using it with some other service, but I’m not sure.

But Flask doesn’t have an async feature. What did you do about that?
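For what it’s worth, one common workaround when the framework itself isn’t async is to hand slow work (like model inference) to a background thread pool and keep the request handler quick. A minimal standard-library sketch, with no Flask required; the view function is only hinted at in the comments:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# A shared pool so slow jobs don't block the web worker handling requests.
executor = ThreadPoolExecutor(max_workers=2)

def slow_job(x):
    time.sleep(0.1)  # stand-in for model inference or an image download
    return x * 2

# Inside a (hypothetical) Flask view you would call executor.submit(...)
# and either poll for completion or notify the client later; here we just
# demonstrate the mechanics:
future = executor.submit(slow_job, 21)
result = future.result()  # blocks only this caller, not the whole server
print(result)             # prints 42
executor.shutdown()
```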