Training loss vs. validation loss?

I understand that we use the loss function to update the parameters of the network during training, but how is the validation loss calculated if it doesn’t serve to update the parameters? Is the loss function the same for the training and validation sets, and do you just ‘check’ what the loss of the SAME loss function is on the validation data?

I imagine having one loss function for both the training and validation sets: you do the forward pass, calculate the (training?) loss, and update the parameters of the network towards a lower (training?) loss; then you run each data point of the validation set through the network and compute the validation loss with that same loss function? Finally, you calculate the accuracy.

Validation loss is used to ascertain performance over an entire epoch, after the training loss has been calculated and the weight updates have been completed. The loss function should be the same; the purpose is to measure the model’s performance on unseen data, i.e. data that is not used to update the weights. This tells you how the model may perform on data that is NOT seen during training.
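
For example, here is a minimal PyTorch-style sketch of one epoch, assuming `model`, `train_loader`, `val_loader`, and `optimizer` already exist (these names are placeholders, not from the original post). Note that the same `criterion` is used for both losses, but only the training pass updates weights:

```python
import torch

# Assumed placeholders: model, train_loader, val_loader, optimizer.
# The SAME loss function is used for training and validation.
criterion = torch.nn.CrossEntropyLoss()

# --- Training: forward pass, loss, backward pass, weight update ---
model.train()
for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)  # training loss
    loss.backward()                           # compute gradients
    optimizer.step()                          # update the weights

# --- Validation: same loss function, but NO weight updates ---
model.eval()
val_loss = 0.0
with torch.no_grad():                         # no gradients needed here
    for inputs, targets in val_loader:
        val_loss += criterion(model(inputs), targets).item()
val_loss /= len(val_loader)                   # average loss per batch
```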

Conversely, think of what it would mean to only have a loss value for the training data. Since our model sees this data and works on improving its ability to predict it, an improvement in the training loss is expected. However, we have no idea whether this improvement will carry over to unseen data (generalizability). Holding out some of the data (as a validation or test set) allows us to determine how the model is doing on samples that it does not directly use to update its weights.

Since the loss function is chosen to satisfy various mathematical requirements for training (it needs to be differentiable, etc.), and those requirements do not always align with human intuition, we also use a metric (e.g. accuracy) to make sense of model performance. This metric is likewise calculated on the entire validation set after the weight updates, to ascertain the model’s performance on the unseen data, or the model’s ability to generalize to data it has not directly used in training.
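
Continuing the sketch above, accuracy could be computed during the same validation pass (again assuming a classification model whose output is one score per class):

```python
# Hypothetical accuracy computation over the validation set,
# reusing model and val_loader from the sketch above.
correct, total = 0, 0
model.eval()
with torch.no_grad():
    for inputs, targets in val_loader:
        preds = model(inputs).argmax(dim=1)   # predicted class per sample
        correct += (preds == targets).sum().item()
        total += targets.size(0)
val_accuracy = correct / total                # human-interpretable metric
```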
