I love these curiosity questions about an anomaly. Sometimes they lead to great discoveries. Sometimes “only” to better understanding.
First, I do not see this issue with fastai2. I am not using fastai1 currently.
If I had to risk a wild guess about fastai1:
valid_loss: the mean of the squared errors over the whole validation set.
root_mean_squared_error: computed per mini-batch as the square root of that batch's MSE, then averaged across batches.
And mean(sqrt()) != sqrt(mean())
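A tiny numeric sketch makes the inequality concrete (the batch MSE values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-batch MSE values for a validation set split into 3 batches.
batch_mse = np.array([1.0, 4.0, 9.0])

# Averaging the per-batch RMSEs: mean of square roots.
mean_of_sqrt = np.sqrt(batch_mse).mean()   # (1 + 2 + 3) / 3 = 2.0

# Taking the RMSE of the pooled MSE: square root of the mean.
sqrt_of_mean = np.sqrt(batch_mse.mean())   # sqrt(14/3) ≈ 2.16
```

So if valid_loss were pooled first and RMSE were averaged per batch, the two numbers would disagree even on identical predictions.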
You could find out exactly what is happening by putting a debugger on the metrics. Please let us know if you find the right answer!
I find that fastai 1 calculates the valid loss and RMSE using the same method: it first calculates the score on each mini-batch, then takes a weighted average to get the final loss/score, where the weight is the number of samples in the mini-batch.
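That weighted-average step can be sketched like this (the scores and batch sizes are hypothetical; the point is that a smaller final batch contributes proportionally less):

```python
import numpy as np

# Hypothetical per-batch scores and batch sizes (last batch is smaller).
batch_scores = np.array([0.50, 0.40, 0.30])
batch_sizes = np.array([64, 64, 32])

# Weight each batch by its number of samples, not a plain mean of batch scores.
final_score = (batch_scores * batch_sizes).sum() / batch_sizes.sum()
# (0.5*64 + 0.4*64 + 0.3*32) / 160 = 0.42, vs. a plain mean of 0.40
```

With equal batch sizes the weighted and unweighted averages coincide, so any discrepancy between the two metrics would have to come from the per-batch vs. pooled computation, not the averaging.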