Kaggle Comp: Plant Seedlings Classification

Hi,

I was writing my code from scratch in PyTorch. My log-loss and accuracy are similar to what you are getting in the screenshot above. Can you explain how you fixed this issue?

For reference :- https://github.com/brightertiger/kaggle/blob/master/playground/plant-seed/02-resnet-transfer-learning.ipynb

I removed F1 as the metric and redid everything without specifying any explicit metric. A smaller batch size and lower learning rates after a couple of epochs (after unfreezing) did the trick.
Btw, if you want, I could look at the code.

2 Likes

Remember, Jeremy had recommended using lrs=np.array([lr/9,lr/3,lr]) when the dataset is very different from what is in ImageNet. For this dataset, however, I think you made a good observation to lower the learning rates in the earlier layers :slight_smile:
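For anyone following along: the array itself is just NumPy; the library (fastai, as used in the course) interprets one rate per layer group, smallest for the earliest layers. A minimal sketch of the array, with an illustrative lr value (pick yours with the LR finder):

```python
import numpy as np

lr = 0.01  # illustrative base rate, not a recommendation

# One learning rate per layer group: earliest (most generic) layers
# train slowest, the newly added head trains at the full rate.
lrs = np.array([lr / 9, lr / 3, lr])
print(lrs)  # smallest first, base rate last
```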

4 Likes

I actually used what Jeremy had recommended for the 1st submission.
The output suggested that the model was learning more when the lrs were smaller and the cycle length was longer. So, I added 3 cycles with lower lrs and it helped.

4 Likes

I'm a bit confused: isn't np.array([lr/18,lr/6,lr/2]) == np.array([lr/9,lr/3,lr]) when lr is halved first?

Metrics are only printed - they don’t change the training at all. So that can’t be the source of any difference you saw…

2 Likes

Correct. Somehow the numbers weren't promising. The score started around 0.4 and remained around 0.6 after unfreezing. Maybe I made a mistake.

Correct. I further halved the lrs!

No, I don't think it's the same, because the value of lr itself didn't change; he just also lowered the head's learning rate by half here. I.e., if lr = 1, the head's lr in the array is now 0.5, and the lower layers are much lower than before (1/18 vs 1/9, 1/6 vs 1/3).

What if we do

lr /= 2
lrs = np.array([lr/9,lr/3,lr]) 

instead of

lrs = np.array([lr/18,lr/6,lr/2])

isn't it the same?

It's the same:
[lr/9, lr/3, lr] / 2 == [lr/18, lr/6, lr/2]
Or I could have done lr = lr/2
and then used [lr/9, lr/3, lr].

Yes, this time it's the same. But I don't think he changed the value of the variable lr itself... perhaps I'm wrong, though; I just didn't see the line lr /= 2 anywhere.

1 Like

Now I'm curious: is there any difference if I change it first and then use it?

Here is another example…
lr = 1
np.array([lr/9,lr/3,lr]) == np.array([1/9,1/3,1])
np.array([lr/18,lr/6,lr/2]) == np.array([1/18,1/6,1/2])

Only if he changed the first line redefining lr would they not be the same :slight_smile:
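The equivalence the thread keeps circling can be checked directly: dividing every entry of the array by 2 and dividing lr by 2 before building the array produce identical values. A quick sanity check in NumPy:

```python
import numpy as np

lr = 1.0  # any value; the identity holds for all lr

# Option 1: halve each entry in place
a = np.array([lr / 18, lr / 6, lr / 2])

# Option 2: halve lr first, then build the usual array
lr_halved = lr / 2
b = np.array([lr_halved / 9, lr_halved / 3, lr_halved])

print(np.allclose(a, b))  # True: both are [lr/18, lr/6, lr/2]
```

They only diverge if lr is redefined somewhere between building the two arrays.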

@vikbehal, can you confirm: did you redefine lr anywhere? Or did you just keep it constant and take a fraction of it each time?

2 Likes

Okay, we are on the same page. I would prefer lr /= 2 because it's just shorter. BTW, guys, do you have your loss/accuracy dynamics?

1 Like

I used this line, but the idea was to halve the lrs. I did not change the original 'lr'.
[lr/18, lr/6, lr/2]

2 Likes

So, I assume both are the same!
[lr/18, lr/6, lr/2] and
lr = lr/2
[lr/9, lr/3, lr]

We shouldn’t share that here since this is an ongoing competition.

Minor improvement:

5 Likes

Dude, you are about to topple me -