Validation loss is computed over the entire validation set at the end of an epoch, after the training loss has been calculated and the weight updates have been completed. The loss function should be the same one used in training; the purpose is to measure the model's performance on unseen data, i.e. data that is not used to update the weights. This gives us an idea of how the model may perform on data that is NOT seen during training.
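A minimal sketch of this loop, using a toy one-weight linear regression in NumPy (the data, learning rate, and epoch count here are illustrative assumptions, not anything prescribed above): each epoch first updates the weight on the training data, then evaluates the same loss on the held-out validation set without updating anything.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3x + noise (purely illustrative)
X_train, X_val = rng.normal(size=(80, 1)), rng.normal(size=(20, 1))
y_train = 3 * X_train[:, 0] + rng.normal(scale=0.1, size=80)
y_val = 3 * X_val[:, 0] + rng.normal(scale=0.1, size=20)

w = 0.0   # single weight, no bias, to keep the sketch minimal
lr = 0.1

def mse(w, X, y):
    """Same loss function for training and validation."""
    return np.mean((X[:, 0] * w - y) ** 2)

history = []
for epoch in range(5):
    # Training: gradient step on the training loss (weights updated here)
    grad = np.mean(2 * (X_train[:, 0] * w - y_train) * X_train[:, 0])
    w -= lr * grad
    # Validation: identical loss, whole held-out set, NO weight update
    history.append((mse(w, X_train, y_train), mse(w, X_val, y_val)))
    print(epoch, history[-1])
```

Both losses should fall as `w` approaches 3; the validation loss is the one that tells us the fit is not just memorization of the training rows.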
Conversely, think of what it would mean to only have a loss value for the training data. Since our model is seeing this data and actively improving its ability to predict it, an improvement in loss is expected. However, this tells us nothing about whether the improvement carries over to unseen data (generalizability). Holding out some of the data (as a validation or test set) allows us to determine how the model is doing on samples that it does not directly use to update its weights.
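The hold-out itself can be sketched in a few lines (sample sizes and the 80/20 split are arbitrary assumptions for illustration): shuffle the indices, reserve a slice, and never let the training step touch the reserved rows.

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 samples, 4 features each (toy data; sizes are arbitrary)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# Shuffle, then hold out the last 20% as the validation set.
# The model never uses these rows to update its weights.
idx = rng.permutation(len(X))
split = int(0.8 * len(X))
train_idx, val_idx = idx[:split], idx[split:]

X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
```

Shuffling before splitting matters if the data has any ordering (by class, by time collected, etc.), so that both sets come from the same distribution.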
Since the loss function is chosen to satisfy various mathematical requirements for training (it needs to be differentiable, etc.), and these requirements do not always align with human understanding, we also utilize a metric (e.g. accuracy) to make sense of model performance. This metric is likewise calculated over the entire validation set after the weight updates, to ascertain the model's performance on the unseen data, or its ability to generalize to data that it has not directly used in training.