Lesson 8 - Official topic

harish3110 · May 11, 2020, 3:25am

Yes, it is quite fascinating. My guess is that Jeremy logged all his experiments as tabular data, noting down the hyperparameter values chosen and the final metric/score achieved in each run. He could then have fit a random forest model to this data and used a partial dependence plot to see which lower layer learning rates gave the best score.
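A minimal sketch of what that workflow could look like. This is only an illustration of the idea, not Jeremy's actual method or data: the experiment log is synthetic, and the column meanings, the made-up score function, and the helper `partial_dependence_1d` are all assumptions. The partial dependence is computed by hand (force one column to a fixed value, predict, average) so the logic is visible.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for an experiment log: one row per training run.
# Hypothetical columns: log10(lower-layer LR), log10(head LR), weight decay.
n_runs = 300
X = np.column_stack([
    rng.uniform(-6, -2, n_runs),
    rng.uniform(-4, -1, n_runs),
    rng.uniform(0.0, 0.1, n_runs),
])
# Made-up score that peaks when the lower-layer LR is near 1e-4.
y = (
    0.9
    - 0.02 * (X[:, 0] + 4) ** 2
    - 0.01 * (X[:, 1] + 2.5) ** 2
    + rng.normal(0, 0.005, n_runs)
)

# Fit a random forest that predicts the score from the hyperparameters.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)


def partial_dependence_1d(model, X, col, grid):
    """Average prediction when column `col` is forced to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, col] = value  # overwrite one hyperparameter for every run
        averages.append(model.predict(X_mod).mean())
    return np.array(averages)


# Partial dependence of the score on the lower-layer learning rate.
grid = np.linspace(-6, -2, 21)
pd_curve = partial_dependence_1d(model, X, col=0, grid=grid)
best_log_lr = grid[np.argmax(pd_curve)]
print(f"lower-layer LR with highest average predicted score: {10 ** best_log_lr:.1e}")
```

Plotting `pd_curve` against `grid` gives the partial dependence plot; the peak of the curve suggests which lower layer learning rate works best on average across the other hyperparameter settings in the log.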

That’s my best guess anyway! Maybe @muellerzr or @sgugger could provide more info and insight on this!

It would be amazing if we could see the data that Jeremy worked with!