Hi,
I wanted to investigate what the average training loss displayed by the fit function equates to in terms of accuracy and loss per epoch. For this comparison I used the seedlings data set, with 80% of the data in the train folder and 20% in a valid folder.
To calculate the training accuracy, I first ran 5 epochs of learn.fit() with separate training and validation sets:
    arch = resnet34
    data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz),
                                          trn_name="train", val_name="valid")
    learn = ConvLearner.pretrained(arch, data, precompute=True)
    learn.fit(0.01, 5)
The output from fit, with the validation set containing 20% of the original data, is
    epoch  trn_loss  val_loss  accuracy
    0      1.496924  1.013778  0.642708
    1      1.051771  0.805401  0.727083
    2      0.831304  0.726356  0.747917
    3      0.729012  0.672901  0.76875
    4      0.644722  0.680163  0.758333
I then set val_name="train" so that the training and validation sets were identical, in order to see the training loss and accuracy at the end of each epoch:
    data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz),
                                          trn_name="train", val_name="train")
    learn = ConvLearner.pretrained(arch, data, precompute=True)
    learn.fit(0.01, 5)
The output from fit when setting the validation set to be the same as the training set is
    epoch  trn_loss  val_loss  accuracy
    0      1.534358  0.783512  0.757999
    1      1.082118  0.563303  0.833817
    2      0.866195  0.453154  0.870945
    3      0.73888   0.387739  0.877195
    4      0.655201  0.339189  0.901414
I think I have an error in my understanding, because the results reported when val_name="train" are much better (lower val_loss, higher accuracy) than those on the validation set for every corresponding epoch.
I know the epoch 0 results of the two runs will differ somewhat, but it looks like the training accuracy at epoch 0 is ~76% while the validation accuracy is ~64%. Or am I missing something?
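To make the question concrete, here is a toy sketch of one way the two numbers could diverge. It assumes, without my having confirmed it in the fastai source, that trn_loss is some average of the mini-batch losses seen during the epoch, while val_loss is measured once after the epoch's last update. The helper `simulate_epoch` and all of the loss values are made up for illustration only.

```python
import numpy as np

def simulate_epoch(start_loss, end_loss, n_batches):
    """Hypothetical per-batch losses that fall linearly as the model trains."""
    return np.linspace(start_loss, end_loss, n_batches)

# Made-up numbers: the loss drops from 2.5 to 0.8 over 100 mini-batches.
batch_losses = simulate_epoch(start_loss=2.5, end_loss=0.8, n_batches=100)

# Average of the losses seen during the epoch (includes the early, bad batches).
average_over_epoch = batch_losses.mean()

# Loss of the model as it stands after the last update of the epoch.
end_of_epoch = batch_losses[-1]

print(f"average over epoch: {average_over_epoch:.3f}")
print(f"end of epoch:       {end_of_epoch:.3f}")
```

If that assumption holds, the averaged number would always lag behind the end-of-epoch number while the model is still improving, even when both are computed on the same images.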
Alternatively, I tried running a single epoch as

    arch = resnet34
    data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz),
                                          trn_name="train", val_name="valid")
    learn = ConvLearner.pretrained(arch, data, precompute=True)
    learn.fit(0.01, 1)
with the following output
    epoch  trn_loss  val_loss  accuracy
    0      1.504691  1.008132  0.65
and then tried to calculate the accuracy on the training set as
    log_preds, y = predict_with_targs(learn.model, learn.data.trn_dl)
    probs = np.exp(log_preds)
    preds = np.argmax(probs, axis=1)
    sum(preds==y)/y.size
again resulting in ~76%.
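As a sanity check on the accuracy arithmetic itself, the same steps can be run on a tiny hand-made array in plain NumPy, with no fastai involved. The function name `accuracy_from_log_preds` and the four example rows are hypothetical, chosen so the result can be verified by eye.

```python
import numpy as np

def accuracy_from_log_preds(log_preds, y):
    """Fraction of rows whose highest-probability class matches the label."""
    probs = np.exp(log_preds)          # log-probabilities back to probabilities
    preds = np.argmax(probs, axis=1)   # predicted class index for each row
    return (preds == y).mean()

# Four made-up samples over three classes (each row sums to 1).
log_preds = np.log(np.array([
    [0.7, 0.2, 0.1],   # predicts class 0
    [0.1, 0.8, 0.1],   # predicts class 1
    [0.3, 0.3, 0.4],   # predicts class 2
    [0.6, 0.3, 0.1],   # predicts class 0
]))
y = np.array([0, 1, 2, 1])             # the last label disagrees with the prediction

acc = accuracy_from_log_preds(log_preds, y)
print(acc)
```

Three of the four predictions match, so this gives 0.75. Since exp is monotonic, taking the argmax of log_preds directly would give the same predictions, so I don't think that step is the source of the discrepancy.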
I think I am misunderstanding either what the training and validation losses represent or how they are calculated. Can anyone point me in the right direction?
Thank you