Best practices to retrain a model

minh · August 19, 2019, 12:35pm

Hello everyone.

I built an image classifier after watching the first 2 lessons (it is available here: https://golf-discriminator.onrender.com/). I showed the model to my friends, and they found some cases where it does not work well. Now I want to retrain or update the model with more images. What is the best practice for doing this? Which of the following options is best?

1. Combine the new images with the old images and start training everything again from scratch.
2. Combine the new images with the old images and continue training with the existing model.
3. Train the existing model using ONLY the new images.

Please let me know what you think. Thanks.

dhoa · August 19, 2019, 2:28pm

I think your 2nd option is the better choice.

Training everything from scratch might give a better result, but it might take too long. Training on only the new images might overfit to those new images.

However, sometimes when I reuse the existing model, it is not easy to find the optimal learning rate (because the model has already been fit to a subset of the data). In that case I have to retrain everything from scratch.

Maybe someone else knows a better practice.

minh · August 19, 2019, 6:45pm

It’s tricky, right? But I think updating a model is a very common practice in production, so I believe there must be a good way to do it. I am looking for an answer elsewhere too and will post here if I find anything.

muellerzr · August 19, 2019, 6:51pm

2 is best. Just add more images for that class to your dataset, reuse the weights, and train further.
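A minimal sketch of the data side of option 2, assuming the dataset is kept as a list of (path, label) pairs. The file names, the labels `class_a` / `class_b`, and the helper `combine_datasets` are hypothetical and only for illustration; reloading the saved weights and training further is left to whatever library you use.

```python
from collections import Counter


def combine_datasets(old_items, new_items):
    """Merge two lists of (path, label) pairs, dropping duplicate paths."""
    seen = set()
    combined = []
    for path, label in list(old_items) + list(new_items):
        if path not in seen:  # keep only the first occurrence of each image
            seen.add(path)
            combined.append((path, label))
    return combined


# Hypothetical file names and labels, for illustration only.
old_items = [("old/a.jpg", "class_a"), ("old/b.jpg", "class_b")]
new_items = [
    ("new/c.jpg", "class_a"),
    ("new/d.jpg", "class_a"),
    ("old/a.jpg", "class_a"),  # duplicate of an old image, will be skipped
]

combined = combine_datasets(old_items, new_items)
class_counts = Counter(label for _, label in combined)
print(class_counts)
```

Checking the per-class counts after merging is a quick way to see whether the new images have made the dataset lopsided.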

If that’s not possible, perhaps use a subset of the other class to offset the imbalance. But 2 is ideal.
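One way to read "use a subset of the other class" is random undersampling: keep only as many images of the larger class as the smaller class has. The sketch below assumes that reading; the helper `undersample`, the file names, and the labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(items, seed=0):
    """Return (path, label) pairs with every class cut down to the size of the smallest class."""
    by_label = defaultdict(list)
    for path, label in items:
        by_label[label].append((path, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


# Hypothetical dataset: 10 images of one class, 3 of the other.
items = [(f"a_{i}.jpg", "class_a") for i in range(10)]
items += [(f"b_{i}.jpg", "class_b") for i in range(3)]

balanced = undersample(items)
balanced_counts = Counter(label for _, label in balanced)
print(balanced_counts)
```

Undersampling throws data away, which is presumably why it is only the fallback here when collecting more images for the weak class is not possible.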

dhoa · August 28, 2019, 9:21am

I’ll take advantage of this post to ask a question about how to train on a big dataset. I have a model that gives a good result after training for a lot of epochs (about 700). I remember reading somewhere in the forum that it might be better to train for 10 epochs, then find a new learning rate, then train for another 10 epochs, and so on, and that this way we can get a good result in a shorter time?! (I can’t find that post anymore.)

Does anyone have experience with this?
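To make the question concrete, the staged schedule I have in mind could be sketched as below. The function `train_in_stages` and the two toy callables are hypothetical stand-ins. In fastai v1 they would roughly correspond to reading a rate off the `learn.lr_find()` plot and calling `learn.fit_one_cycle(...)`, but that mapping is my assumption, not something from the post I remember.

```python
def train_in_stages(find_lr, fit, n_stages, epochs_per_stage=10):
    """Alternate between choosing a learning rate and training a short stage."""
    history = []
    for stage in range(n_stages):
        lr = find_lr()              # re-estimate the rate for the current weights
        fit(epochs_per_stage, lr)   # train a short stage at that rate
        history.append((stage, lr))
    return history


# Toy stand-ins so the sketch runs without any training library.
state = {"epochs": 0}


def toy_find_lr():
    # Pretend the suggested rate shrinks as training progresses.
    return 1e-3 / (1 + state["epochs"] // 10)


def toy_fit(epochs, lr):
    state["epochs"] += epochs  # only record how many epochs were run


history = train_in_stages(toy_find_lr, toy_fit, n_stages=3)
print(history, state["epochs"])
```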

hammao · August 28, 2019, 10:50am

I would like to suggest that you watch the Lesson 4 video from Part 1, especially the part where Jeremy talks about the planet dataset.

Best
H

rsomani95 · August 28, 2019, 11:06am

If it isn’t too expensive, then I think the best way to do this would be 1), i.e. retrain from scratch.
This way, your model will probably perform better after just one epoch compared with your old model (since you have more data). Overall, with the same number of epochs as before, you will get better performance. This is better than 2) because your model will have seen the older images fewer times, and the fewer the total number of epochs, the more your model tends to generalise rather than memorise.
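The "seen fewer times" point is just arithmetic, sketched below with made-up epoch counts. The helper `times_old_images_seen` is hypothetical, and it assumes every epoch is one full pass over whatever data is in the training set.

```python
def times_old_images_seen(epochs_original, epochs_retrain, from_scratch):
    """Count how many passes over the old images have shaped the final weights."""
    if from_scratch:
        # Option 1: the earlier training run is discarded with the old weights.
        return epochs_retrain
    # Option 2: the earlier passes are baked into the weights that get reused.
    return epochs_original + epochs_retrain


# Hypothetical numbers: 4 epochs originally, 4 epochs on the combined data.
epochs_original = 4
epochs_retrain = 4

option_1 = times_old_images_seen(epochs_original, epochs_retrain, from_scratch=True)
option_2 = times_old_images_seen(epochs_original, epochs_retrain, from_scratch=False)
print(option_1, option_2)
```

With these numbers the old images influence the final weights through 4 passes under option 1 and 8 passes under option 2, while the new images get 4 passes either way.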

However, if you’re only trying to take care of one tiny issue in the data, and you’re confident your new images will do that, then 2) or 3) are viable options too.

minh · August 28, 2019, 11:35am

Thank you all for your comments.
On my side, I asked two CS professors at my university, and both told me that option 1 is the best provided I have enough resources (time and computation). Option 2 is the next best. Option 3 is definitely not good, since the model will overfit to the new images.