I saw a work colleague training a neural network for 10k epochs. I had a strong feeling that this could be reduced if he knew how to find the right learning rate and then train for just a few epochs, as demonstrated in the lessons.
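For context, the "find the right learning rate" idea from the lessons is the LR range test: run one short pass with an exponentially increasing learning rate and pick a value just before the loss blows up. A minimal sketch of just the sweep schedule, in plain Python with no fastai dependency (function name and defaults are my own, not fastai's API):

```python
def lr_sweep(lr_min=1e-7, lr_max=10.0, n_iters=100):
    # One learning rate per mini-batch, growing geometrically
    # from lr_min to lr_max, as in the LR range test.
    ratio = (lr_max / lr_min) ** (1 / (n_iters - 1))
    return [lr_min * ratio ** i for i in range(n_iters)]

lrs = lr_sweep()
print(f"first={lrs[0]:.1e}, last={lrs[-1]:.1e}")
```

In fastai v1 (the 2019 lessons) this whole procedure is wrapped in `learn.lr_find()` followed by `learn.recorder.plot()`, so you normally never write the sweep yourself.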
So I did some research on this topic. I narrowed the scope to:
How to train a resnetXX from scratch (no pretraining), simply and economically? What is the current best practice?
I went through the top entries of the Stanford DAWNBench list and read their source code.
- XLRScheduler from Huawei Cloud - proprietary? It appears to schedule the learning rate and batch size automatically.
- fast.ai/DIUx - uses carefully crafted learning rates, image sizes, batch sizes, and learning-rate warm-up.
- Others (Intel, Google) - do not mention the learning rate explicitly in their code.
In conclusion, I did not find a simple way of doing this. The fast.ai recipe seems to have been arrived at through numerous trials and errors: 35-50 epochs with 4-5 learning-rate changes, 3 image-size changes, and 2 batch-size changes.
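To make the "carefully crafted" point concrete, that kind of recipe boils down to a hand-tuned phase table, something like the sketch below. Every number here is an illustrative placeholder I made up, not the actual DAWNBench values:

```python
# Hypothetical multi-phase plan in the fast.ai/DIUx style:
# each phase fixes an image size, batch size, and peak LR.
phases = [
    # (epochs, image_size, batch_size, peak_lr) -- placeholders
    (15, 128, 512, 1.0),
    (13, 224, 256, 0.4),
    (7,  288, 128, 0.04),
]
total_epochs = sum(epochs for epochs, _, _, _ in phases)
print(total_epochs)
```

The point is that every entry in such a table is a free parameter someone had to search over, which is exactly what makes the recipe hard to reproduce on a new problem.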
My next step is to try a fit_one_cycle() experiment without pretrained weights. Is it going to be expensive? How many epochs? How many runs?
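For reference, fit_one_cycle() implements Leslie Smith's 1cycle policy: the learning rate warms up to a peak and then anneals back down over a single cycle. A simplified, fastai-style sketch of the LR curve alone (plain Python; the function names and defaults are mine, not fastai's internals):

```python
import math

def cos_anneal(start, end, pct):
    # Cosine interpolation from start (pct=0) to end (pct=1).
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle_lrs(max_lr=1e-2, n_iters=1000, div=25.0,
                  final_div=1e4, pct_start=0.3):
    # Warm up from max_lr/div to max_lr over the first pct_start of
    # training, then anneal down to max_lr/final_div (simplified 1cycle).
    up = int(n_iters * pct_start)
    lrs = []
    for i in range(n_iters):
        if i < up:
            lrs.append(cos_anneal(max_lr / div, max_lr, i / up))
        else:
            lrs.append(cos_anneal(max_lr, max_lr / final_div,
                                  (i - up) / (n_iters - up)))
    return lrs

lrs = one_cycle_lrs()
print(f"start={lrs[0]:.2e}, peak={max(lrs):.2e}, end={lrs[-1]:.2e}")
```

In practice you just call `learn.fit_one_cycle(epochs, max_lr)` in fastai v1 and pick `max_lr` from the LR finder plot; the scheduling above happens for you.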
I would like to hear from those of you who have done something similar before.
Remark: my understanding is up to lesson 2 of the 2019 course.