Revisiting IMDB: Can we push state of the art? - An attempt

muellerzr · August 25, 2019, 8:51pm

@Daniel.R.Armstrong thank you for your kind words!

What are your thoughts on label smoothing?

On label smoothing: I ran a small test on the sample just to see, and I did not see any improvement whatsoever, but this should be done on the entire dataset. For the next few days I am without a laptop, so I can’t run anything large yet.
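For anyone following along, here is a minimal sketch of what label smoothing does to the loss for a single example. It is plain Python rather than the fastai implementation, and the function name and the `eps=0.1` default are illustrative assumptions, not settings taken from the test above.

```python
import math

def label_smoothing_cross_entropy(logits, target, eps=0.1):
    """Cross-entropy of one example against a smoothed target distribution.

    Hypothetical helper for illustration: the target puts eps / n_classes on
    every class and an extra (1 - eps) on the true class.
    """
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    n_classes = len(logits)
    uniform_term = -sum(log_probs) / n_classes  # loss against a uniform target
    nll_term = -log_probs[target]               # ordinary cross-entropy
    return (1 - eps) * nll_term + eps * uniform_term

# A very confident correct prediction is penalised slightly more with smoothing.
plain = label_smoothing_cross_entropy([10.0, 0.0], target=0, eps=0.0)
smoothed = label_smoothing_cross_entropy([10.0, 0.0], target=0, eps=0.1)
print(plain, smoothed)
```

With `eps=0` this reduces to ordinary cross-entropy, so the only difference between the two runs is the extra penalty on over-confident predictions.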

I tried using the new RAdam, but I didn’t see any improvements; it seems like the biggest advantage is when you are doing 30+ epochs.

On RAdam: I did not test RAdam for that reason. What I did test was Ralamb (see “ImageNette/Woof Leaderboards - guidelines for proving new high scores?”, post #35 by grankin).

I found that on a small scale, Ralamb did improve it! I went from 78% on the IMDB sample to 80-81%, so I believe there’s enough there to try it out with the entire dataset. Also, my learning rate was increased to 3e-1; I have not tested any smaller values yet, though. I will update this post in a little bit when I have

Do you have any thoughts on how to convert the SentencePiece parts back into words?

For converting back… I struggled with that for days when I was working on getting SP to work. I have nothing yet. I tried looking at their source code, to no avail, I’m afraid.
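One possible starting point, offered as a sketch rather than a tested solution for this pipeline: SentencePiece marks the position of each original space with the "▁" character (U+2581), so joining the pieces and turning that marker back into spaces should recover the words. The helper names below are hypothetical, and the sketch assumes the pieces are raw SentencePiece output with no extra special tokens mixed in (any such tokens would need to be filtered out first).

```python
WORD_BOUNDARY = "\u2581"  # "▁", the marker SentencePiece puts where a space was

def pieces_to_text(pieces):
    """Join subword pieces and restore spaces at the boundary markers."""
    return "".join(pieces).replace(WORD_BOUNDARY, " ").strip()

def pieces_to_words(pieces):
    """Return the whitespace-separated words recovered from the pieces."""
    return pieces_to_text(pieces).split()

# Made-up pieces for illustration, not output from a trained model.
example = ["▁This", "▁mov", "ie", "▁was", "▁gre", "at", "."]
print(pieces_to_words(example))
```

If the trained model file is at hand, the `sentencepiece` Python package also exposes `SentencePieceProcessor.DecodePieces`, which should do the same joining using the model itself; I have not checked how it interacts with the rest of this setup.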

So the plan for later this week is to rerun it with LabelSmoothing on the entire dataset (to put it to rest) and to use Ralamb, since I saw small-scale success.