Kaggle Comp: Plant Seedlings Classification

(<^..^>) #112


I was writing my code from scratch in PyTorch. My log-loss and accuracy are similar to what you are getting in the screenshot above. Can you explain how you fixed this issue?

For reference: https://github.com/brightertiger/kaggle/blob/master/playground/plant-seed/02-resnet-transfer-learning.ipynb

(Vikrant Behal) #113

I removed F1 as a metric and redid everything without specifying any explicit metric. A smaller batch size and progressively lower lrs after a couple of epochs (after unfreezing) did the trick.
Btw, if you want, I could look at the code and see.

(James Requa) #114

Remember Jeremy had recommended using lrs=np.array([lr/9,lr/3,lr]) when the dataset is very different from what is in ImageNet. For this dataset, however, I think you made a good observation in lowering the learning rates for the earlier layers :slight_smile:
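For readers following along, that line just builds a three-element array of per-layer-group learning rates, smallest first for the earliest (most generic) layers. A minimal sketch; the base lr value here is an arbitrary stand-in, not a value from the thread:

```python
import numpy as np

lr = 0.01  # base rate for the last (most task-specific) layer group

# Earlier layer groups get progressively smaller rates,
# since pretrained low-level features need less adjustment
lrs = np.array([lr / 9, lr / 3, lr])

print(lrs)  # smallest rate first, for the earliest layers
```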

(Vikrant Behal) #115

I actually used what Jeremy had recommended for the 1st submission.
The output suggested that the model was learning more when the lrs were smaller and the cycle length was longer. So, I added 3 cycles with lower lrs and it helped.

(sergii makarevych) #116

I'm a bit confused - isn't np.array([lr/18,lr/6,lr/2]) the same as np.array([lr/9,lr/3,lr]) with lr halved?

(Jeremy Howard) #117

Metrics are only printed - they don’t change the training at all. So that can’t be the source of any difference you saw…

(Vikrant Behal) #118

Correct. Somehow the numbers weren't promising. The score started around 0.4 and remained around 0.6 after unfreezing. Maybe I made a mistake.

(Vikrant Behal) #119

Correct. I further halved the lrs!

(James Requa) #120

No, I don't think it's the same, because the value of lr didn't change; he just also decided to lower the first learning rate in the array by half here. I.e. if lr = 1, the first lr in the array is now 0.5, but the lower layers are much lower than before (1/18 vs 1/9, 1/6 vs 1/3).

(sergii makarevych) #121

What if we do

lr /= 2
lrs = np.array([lr/9,lr/3,lr]) 

instead of

lrs = np.array([lr/18,lr/6,lr/2])

isn't it the same?

(Vikrant Behal) #122

It's the same?
[lr/9, lr/3, lr] / 2 == [lr/18, lr/6, lr/2]
Or I could have done lr = lr/2
and then used [lr/9, lr/3, lr]

(James Requa) #123

Yes, this time it's the same. But I don't think he changed the variable lr value itself… perhaps I'm wrong though, I just didn't see the line lr /= 2 anywhere.

(Vikrant Behal) #124

Now I'm curious: is there any difference if I change it first and then use it?

(James Requa) #125

Here is another example…
lr = 1
np.array([lr/9,lr/3,lr]) == np.array([1/9,1/3,1])
np.array([lr/18,lr/6,lr/2]) == np.array([1/18,1/6,1/2])

Only if he had changed the first line defining lr would they not be the same :slight_smile:

@vikbehal can you confirm: did you redefine lr anywhere? Or did you just keep it constant and take a fraction of it each time?

(sergii makarevych) #126

Okay, we are on the same page. I would prefer lr /= 2 because it's just shorter. BTW, guys, do you have your loss/accuracy dynamics?

(Vikrant Behal) #127

I used this line, but the idea was to halve the lrs. I did not change the original ‘lr’:
[lr/18, lr/6, lr/2]

(Vikrant Behal) #128

So, I assume both are the same:
[lr/18, lr/6, lr/2] and
lr = lr/2
[lr/9, lr/3, lr]
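The equivalence the thread settles on is easy to verify numerically; a quick NumPy sanity check (the base lr value is arbitrary):

```python
import numpy as np

lr = 0.01  # any base learning rate works

# Halving each element of the original array directly...
a = np.array([lr / 18, lr / 6, lr / 2])

# ...is the same as halving lr first, then building the usual array
lr2 = lr / 2
b = np.array([lr2 / 9, lr2 / 3, lr2])

# True: the two spellings produce element-wise equal arrays
print(np.allclose(a, b))
```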

(Jeremy Howard) #129

We shouldn’t share that here since this is an ongoing competition.

(Vikrant Behal) #130

Minor improvement:


Dude, you are about to topple me -