Kaggle Comp: Plant Seedlings Classification

Hi,

I was writing my code from scratch in PyTorch. My log-loss and accuracy are similar to what you are getting in the screenshot above. Can you explain how you fixed this issue?

For reference :- https://github.com/brightertiger/kaggle/blob/master/playground/plant-seed/02-resnet-transfer-learning.ipynb

I removed F1 as the metric and redid everything without specifying any explicit metric. A smaller batch size and lower learning rates after a couple of epochs (after unfreezing) did the trick.
Btw, if you want, I could look at the code.

2 Likes

Remember, Jeremy had recommended using lrs=np.array([lr/9,lr/3,lr]) when the dataset is very different from what is in ImageNet. For this dataset, however, I think you made a good observation to lower the learning rates in the earlier layers :slight_smile:
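For anyone following along: the array itself is just NumPy; the library (fastai, as used in the course) interprets one rate per layer group, smallest for the earliest layers. A minimal sketch of the array, with an illustrative lr value (pick yours with the LR finder):

```python
import numpy as np

lr = 0.01  # illustrative base rate, not a recommendation

# One learning rate per layer group: earliest (most generic) layers
# train slowest, the newly added head trains at the full rate.
lrs = np.array([lr / 9, lr / 3, lr])
print(lrs)  # smallest first, base rate last
```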

4 Likes

I actually used what Jeremy had recommended for the 1st submission.
The output suggested that the model was learning more when the lrs were smaller and the cycle length was longer. So, I added 3 cycles with lower lrs and it helped.

4 Likes

I'm a bit confused: isn't np.array([lr/18,lr/6,lr/2]) == np.array([lr/9,lr/3,lr]) when lr is halved first?

Metrics are only printed - they don’t change the training at all. So that can’t be the source of any difference you saw…

2 Likes

Correct. Somehow the numbers weren't promising. The score started around 0.4 and remained around 0.6 after unfreezing. Maybe I made a mistake.

Correct. I further halved the lrs!

No, I don't think it's the same, because the value of lr itself didn't change; he just also lowered the head's learning rate by half here. I.e., if lr = 1, the head's lr in the array is now 0.5, and the lower layers are much lower than before (1/18 vs 1/9, 1/6 vs 1/3).

What if we do

lr /= 2
lrs = np.array([lr/9,lr/3,lr]) 

instead of

lrs = np.array([lr/18,lr/6,lr/2])

isn't it the same?

It's the same:
[lr/9, lr/3, lr] / 2 == [lr/18, lr/6, lr/2]
Or I could have done lr = lr/2
and then used [lr/9, lr/3, lr].

Yes, this time it's the same. But I don't think he changed the value of the variable lr itself... perhaps I'm wrong, though; I just didn't see the line lr /= 2 anywhere.

1 Like

Now I'm curious: is there any difference if I change it first and then use it?

Here is another example…
lr = 1
np.array([lr/9,lr/3,lr]) == np.array([1/9,1/3,1])
np.array([lr/18,lr/6,lr/2]) == np.array([1/18,1/6,1/2])

Only if he changed the first line redefining lr would they not be the same :slight_smile:
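The equivalence the thread keeps circling can be checked directly: dividing every entry of the array by 2 and dividing lr by 2 before building the array produce identical values. A quick sanity check in NumPy:

```python
import numpy as np

lr = 1.0  # any value; the identity holds for all lr

# Option 1: halve each entry in place
a = np.array([lr / 18, lr / 6, lr / 2])

# Option 2: halve lr first, then build the usual array
lr_halved = lr / 2
b = np.array([lr_halved / 9, lr_halved / 3, lr_halved])

print(np.allclose(a, b))  # True: both are [lr/18, lr/6, lr/2]
```

They only diverge if lr is redefined somewhere between building the two arrays.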

@vikbehal, can you confirm: did you redefine lr anywhere? Or did you just keep it constant and take a fraction of it each time?

2 Likes

Okay, we are on the same page. I would prefer lr /= 2 because it's just shorter. BTW, guys, do you have your loss/accuracy dynamics?

1 Like

I used this line, but the idea was to halve the lrs. I did not change the original 'lr'.
[lr/18, lr/6, lr/2]

2 Likes

So, I assume both are the same!
[lr/18, lr/6, lr/2] and
lr = lr/2
[lr/9, lr/3, lr]

We shouldn’t share that here since this is an ongoing competition.

Minor improvement:

5 Likes

Dude, you are about to topple me -