Thanks!
How come the validation loss is smaller than the training loss?
Is lesson1 using the test data at all? I don't see a cats/dogs split in the test data dir.
Why aren't we looking at validation accuracy? I thought validation accuracy and validation loss don't correspond linearly. Or was it validation error?
In Kaggle competitions you don't get the labels for the test dataset; you can only make predictions on it.
There's a recent paper called "Don't Decay the Learning Rate, Increase the Batch Size". Is adjusting the learning rate the most effective way to converge, or is adjusting the batch size effective as well?
Test data doesn't have labels; these are unlabeled images from the Kaggle competition. If you want to make a Kaggle submission, you should take these images, predict a class for each with your model, and upload the predictions to Kaggle.
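Roughly, the prediction loop looks something like this. A minimal sketch in plain PyTorch, assuming you already have a trained two-class `model` and a `test_dir` of unlabeled JPEGs; the `id`/`label` columns follow the Dogs vs. Cats Redux submission format, so check your competition's rules:

```python
import os
import torch
import pandas as pd
from PIL import Image
from torchvision import transforms

# NB: real code should apply the same normalization used during training.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

model.eval()
rows = []
with torch.no_grad():
    for fname in sorted(os.listdir(test_dir)):
        img = preprocess(Image.open(os.path.join(test_dir, fname)).convert("RGB"))
        logits = model(img.unsqueeze(0))               # shape (1, 2) for cat/dog
        prob_dog = torch.softmax(logits, dim=1)[0, 1].item()  # assumes index 1 = dog
        rows.append({"id": os.path.splitext(fname)[0], "label": prob_dog})

pd.DataFrame(rows).to_csv("submission.csv", index=False)  # upload this file to Kaggle
```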
So the labels weren't released after the fact?
As far as I know, correct labels for the test set are not available at kaggle.com.
You'd need really powerful GPUs to make the batch size large enough.
What's the atom optimizer?
Adam, he will explain it later.
Actually, it is Adam, another optimization method.
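For reference, the update rule from the Adam paper (Kingma & Ba, 2014) is simple enough to sketch in a few lines of NumPy; names like `theta`, `m`, and `v` are just illustrative, not from the lesson code:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# One step on a dummy 3-element parameter vector:
theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
theta, m, v = adam_step(theta, np.array([0.1, -0.2, 0.3]), m, v, t=1)
```

The per-parameter scaling by `sqrt(v_hat)` is what makes it less sensitive to the learning rate than plain SGD.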
Does this cyclical learning rate method work for all ML architectures, or is it meant for this sort of classification problem?
Does the learning rate finder work with all types of NNs or just CNNs?
For all.
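The finder doesn't depend on the architecture at all, which is why it works for any network. Here's a rough PyTorch sketch of the LR range test idea from Smith's cyclical learning rates paper, assuming an existing `model`, `train_loader`, and `criterion` on the same device; the actual fastai implementation differs in the details:

```python
import torch

def lr_range_test(model, train_loader, criterion,
                  lr_start=1e-7, lr_end=10, num_steps=100):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr_start)
    mult = (lr_end / lr_start) ** (1 / num_steps)   # LR multiplier per step
    lrs, losses, lr = [], [], lr_start
    for step, (x, y) in enumerate(train_loader):
        if step >= num_steps:
            break
        optimizer.param_groups[0]["lr"] = lr
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        lrs.append(lr)
        losses.append(loss.item())
        if losses[-1] > 4 * min(losses):            # stop once the loss blows up
            break
        lr *= mult
    return lrs, losses
```

You then plot `losses` against `lrs` on a log x-axis and pick a rate where the loss is still falling steeply.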
I didn't get the reason for not picking the minimum of the learning rate vs. loss curve, but a slightly higher rate instead.