Hi @SMEissa! If we focus on the second case (wd=0.1), by epoch 5 both your training and validation losses are still improving… so try training a bit longer, until your training loss keeps getting better but your validation loss starts getting worse. That is the time to stop.
Weight decay has a regularization effect that prevents overfitting (which is a good thing), but it also means it can take longer for your model to learn. That’s why in your second case more than 5 epochs may be required.
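For example (a minimal sketch, assuming a fastai learner named learn; the epoch count and patience values are placeholders, and EarlyStoppingCallback is just one way to automate the stopping rule above):

```python
from fastai.callback.tracker import EarlyStoppingCallback

# Keep training with wd=0.1 for more epochs; stop automatically once
# valid_loss has not improved for 2 consecutive epochs (the point
# described above where validation starts getting worse).
learn.fit_one_cycle(20, wd=0.1,
                    cbs=EarlyStoppingCallback(monitor='valid_loss', patience=2))
```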
I think you have it right. You can choose any filename you want for the model, but then when you use load_model you have to pass it that filename. So in your example, you can retrieve the saved model with my_model_objects = load_model('my_model.pth')
Then you can check whether you got everything you saved with dir(my_model_objects)
You should see the model and the optimizer that you saved.
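In case it’s useful, here’s a minimal sketch of that save/load round trip in plain PyTorch (assuming model and opt already exist; the dictionary keys are my own choice, not a fixed API):

```python
import torch

# Save whatever you want to restore later, under keys you choose.
torch.save({'model': model.state_dict(),
            'opt':   opt.state_dict()},
           'my_model.pth')

# Later: load the dictionary back and restore both states.
saved = torch.load('my_model.pth')
model.load_state_dict(saved['model'])
opt.load_state_dict(saved['opt'])
print(saved.keys())   # confirm everything you saved is there
```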
You could also use the SaveModelCallback, which has a parameter for the filename it will save to (I believe you can also have it save after every epoch). Then do a learn.load (or load_model) to bring it back in, as in the sketch below.
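Something like this (a sketch; 'my_model' is an arbitrary filename, and monitor/every_epoch are the SaveModelCallback options I had in mind):

```python
from fastai.callback.tracker import SaveModelCallback

# Saves models/my_model.pth each time valid_loss improves;
# pass every_epoch=True to save after every epoch instead.
learn.fit_one_cycle(5, cbs=SaveModelCallback(monitor='valid_loss',
                                             fname='my_model'))

learn = learn.load('my_model')   # bring it back in
```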
In the TabularPandas and TabularProc section of 09_tabular.ipynb, we split the data into a training set and a validation set by sale date: sales before November 2011 go to training, and sales from November 2011 onward go to validation.
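If I remember the notebook right, the split is built from a boolean condition on the sale date, roughly like this (a sketch; df, cat_names, cont_names, and the exact cutoff date are assumptions on my part):

```python
import numpy as np
from fastai.tabular.all import TabularPandas, Categorify, FillMissing

# Rows before November 2011 become the training set, the rest validation.
cond = df['saledate'] < np.datetime64('2011-11-01')   # assumed column/cutoff
train_idx = np.where( cond)[0]
valid_idx = np.where(~cond)[0]
splits = (list(train_idx), list(valid_idx))

to = TabularPandas(df, procs=[Categorify, FillMissing],
                   cat_names=cat_names, cont_names=cont_names,
                   y_names='SalePrice', splits=splits)
```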
On a related note (and apologies if I’ve misread the text), in this paragraph
To calculate the result for a particular movie and user combination, we have to look up the index of the movie in our movie latent factors matrix, and the index of the user in our user latent factors matrix, and then we can do our dot product between the two latent factor vectors. But look up in an index is not an operation which our deep learning models know how to do.
it seems to say that we cannot simply index into an embedding, but doesn’t the implementation later on in the chapter do exactly that?
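For reference, the implementation I mean is roughly this (reproduced from memory, so treat it as approximate):

```python
from fastai.collab import *
from fastai.tabular.all import *

class DotProduct(Module):
    def __init__(self, n_users, n_movies, n_factors):
        self.user_factors  = Embedding(n_users,  n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)

    def forward(self, x):
        # x[:,0] holds user indices and x[:,1] movie indices,
        # so this is an index lookup into each embedding.
        users  = self.user_factors(x[:,0])
        movies = self.movie_factors(x[:,1])
        return (users * movies).sum(dim=1)
```

My reading is that the text resolves this a little later: Embedding is exactly that index lookup, just implemented so that its gradient can be computed (it’s equivalent to multiplying by a one-hot-encoded matrix), which is why the forward pass above can index directly.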
If you don’t use small NNs, the combined size of your model gets large very fast. The question then is whether you shouldn’t just use a single large NN instead.