Lesson 2 In-Class Discussion

Thanks!

How come the validation loss is smaller than the training loss?

Is lesson 1 using the test data at all? I don’t see a cats/dogs split in the test data directory.

Why aren’t we looking at validation accuracy? I thought validation accuracy and validation loss don’t correspond linearly. Or was it validation error?

In Kaggle competitions you don’t get the labels for the test dataset. You can only predict over the test dataset.

There’s a recent paper called “Don’t Decay the Learning Rate, Increase the Batch Size.” Is adjusting the learning rate the most effective way to get the model to converge, or is adjusting the batch size just as effective?
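
A hedged PyTorch sketch of that paper’s idea: at each milestone where you would normally divide the learning rate, multiply the batch size instead. The `milestones`, `growth` factor, and training setup below are hypothetical placeholders, not from the lesson.

```python
import torch
from torch.utils.data import DataLoader

def train_growing_batch(model, dataset, epochs=30, base_bs=64,
                        growth=5, milestones=(10, 20), lr=0.1):
    # Instead of dividing the LR by `growth` at each milestone,
    # multiply the batch size by it (the paper's core trade-off).
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    bs = base_bs
    for epoch in range(epochs):
        if epoch in milestones:
            bs *= growth  # grow the batch instead of shrinking the LR
        loader = DataLoader(dataset, batch_size=bs, shuffle=True)
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
```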

The test data doesn’t have labels; these are unlabeled images from the Kaggle competition. If you want to make a Kaggle submission, you take these images, predict a class for each with your model, and upload the predictions to Kaggle.
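
A rough sketch of that workflow, assuming a PyTorch-style `model`, a `test_loader` of unlabeled batches, a matching list of image `ids`, and the Dogs vs. Cats Redux `id,label` submission format (where `label` is the predicted probability of dog):

```python
import csv
import torch

def make_submission(model, test_loader, ids, path="submission.csv"):
    model.eval()
    probs = []
    with torch.no_grad():
        for xb in test_loader:                         # unlabeled test batches
            p = torch.softmax(model(xb), dim=1)[:, 1]  # P(dog); class index 1 assumed
            probs.extend(p.tolist())
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])
        writer.writerows(zip(ids, probs))
```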

So the labels weren’t released after the fact?

As far as I know, the correct labels for the test set are not available on kaggle.com.

You’d have to have super GPUs, with a lot of memory, to increase the batch size enough for that.

What’s the atom optimizer?

https://arxiv.org/abs/1506.01186

Cyclical Learning Rates for Training Neural Networks
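
A minimal sketch of the paper’s triangular policy; the `step_size`, `base_lr`, and `max_lr` values below are illustrative, not recommendations. (PyTorch’s `torch.optim.lr_scheduler.CyclicLR` implements the same schedule.)

```python
import math

def triangular_lr(it, step_size=2000, base_lr=1e-4, max_lr=1e-2):
    # One cycle = 2 * step_size iterations: the LR climbs linearly from
    # base_lr to max_lr, then descends back (Smith's "triangular" policy).
    cycle = math.floor(1 + it / (2 * step_size))
    x = abs(it / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
```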

Adam; he will explain it later.

Actually, it is Adam, another optimization method.
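
For the curious, a minimal NumPy sketch of a single Adam update (Kingma & Ba, 2014); the hyperparameter defaults are the paper’s, and `theta`, `grad`, `m`, `v` are placeholder arrays, with the step counter `t` starting at 1.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias correction for the zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```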

Does this cyclical learning rate method work for all ML architectures, or is it meant for this sort of classification problem?

Does the learning rate finder work with all types of NNs or just CNNs?

For all.
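
Roughly how a learning rate finder works, as a hedged PyTorch sketch (not the course library’s actual implementation): grow the LR exponentially each mini-batch, record the loss, and stop once it diverges. `model`, `loader`, and `loss_fn` are assumed to exist.

```python
import torch

def lr_find(model, loader, loss_fn, start_lr=1e-7, end_lr=10.0, num_steps=100):
    opt = torch.optim.SGD(model.parameters(), lr=start_lr)
    mult = (end_lr / start_lr) ** (1.0 / num_steps)  # exponential LR growth per batch
    lr, best = start_lr, float("inf")
    lrs, losses = [], []
    for step, (xb, yb) in enumerate(loader):
        if step >= num_steps:
            break
        for group in opt.param_groups:
            group["lr"] = lr
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
        lrs.append(lr)
        losses.append(loss.item())
        best = min(best, loss.item())
        if loss.item() > 4 * best:   # loss has diverged; stop the sweep
            break
        lr *= mult
    return lrs, losses  # plot loss vs. LR on a log-x axis to pick a rate
```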

I didn’t get the reason for not picking the minimum of the learning rate vs. loss curve, but a slightly lower rate where the loss is still decreasing.
