Problems about RMSE and MSE

ouou4516 · May 7, 2020, 1:28pm

Hi,

I am having trouble interpreting my training results. I use the default flattened MSE loss for regression on a dataset, and pass RMSE as "metrics" when creating the Learner.

However, the reported RMSE is not equal to the square root of the validation loss (which is the MSE). It is always slightly smaller.

epoch	train_loss	valid_loss	root_mean_squared_error	time
0	0.223325	0.192066	0.437771	00:08
1	0.170728	0.126881	0.355616	00:09
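
For reference, the gap can be checked directly from the two rows above. This is only a quick arithmetic check using the numbers printed by the training loop, not anything taken from the fastai source:

```python
import math

# (valid_loss, root_mean_squared_error) pairs copied from the table above
rows = [(0.192066, 0.437771), (0.126881, 0.355616)]

for epoch, (valid_loss, reported_rmse) in enumerate(rows):
    sqrt_of_loss = math.sqrt(valid_loss)
    gap = sqrt_of_loss - reported_rmse
    # In both epochs sqrt(valid_loss) comes out larger than the reported RMSE
    print(f"epoch {epoch}: sqrt(valid_loss)={sqrt_of_loss:.6f} "
          f"reported={reported_rmse:.6f} gap={gap:.6f}")
```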

The full code:

github.com

SmithLiu95/Perceived-Emotions-Baseline/blob/master/train_IAPS.ipynb

I am using FastAI 1.0.60.
Any idea why this happens?

Thank you very much!

Pomo · May 7, 2020, 7:19pm

Hi Chang and welcome,

I love these curiosity questions about an anomaly. Sometimes they lead to great discoveries. Sometimes “only” to better understanding.

First, I do not see this issue with fastai2. I am not using fastai1 currently.

If I had to risk a wild guess about fastai1:
valid_loss: the mean of all the squared errors over the whole validation set
root_mean_squared_error: the mean of the square roots of the MSE computed over smaller groups, such as batches.
And mean(sqrt()) != sqrt(mean())
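
A minimal sketch of that inequality, assuming three equally sized batches with made-up per-batch MSE values (the numbers are hypothetical, not from the notebook). Because the square root is concave, the mean of the roots can never exceed the root of the mean, which would be consistent with the metric always coming out smaller:

```python
import math

# Hypothetical per-batch MSE values for equally sized batches (not from the thread)
batch_mse = [0.10, 0.20, 0.30]

# With equal batch sizes, the overall MSE is the plain mean of the batch MSEs
overall_mse = sum(batch_mse) / len(batch_mse)

sqrt_of_mean = math.sqrt(overall_mse)                                   # sqrt(mean(...))
mean_of_sqrt = sum(math.sqrt(m) for m in batch_mse) / len(batch_mse)    # mean(sqrt(...))

print(f"sqrt(mean) = {sqrt_of_mean:.6f}")
print(f"mean(sqrt) = {mean_of_sqrt:.6f}")
```

The two values agree only when every batch has the same MSE.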

You could find out exactly what is happening by putting a debugger on the metrics. Please let us know if you find the right answer!

Cheers,
Malcolm

ouou4516 · May 8, 2020, 2:54am

Hi McLean,

Thanks for your answer!

I found that fastai 1 calculates the validation loss and the RMSE in the same way. It first computes the score on each mini-batch, then takes a weighted average of those scores to get the final loss/score, where each weight is the number of samples in that mini-batch.
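
The averaging described above can be sketched roughly as follows. This is an illustration in plain Python, not the fastai source: `weighted_average` is a hypothetical helper and the squared errors are invented. It shows that the size-weighted average of per-batch MSE recovers the MSE of the whole set, while the same average of per-batch RMSE falls below its square root:

```python
import math

def weighted_average(scores, sizes):
    """Average per-batch scores, weighting each by its batch size (hypothetical helper)."""
    return sum(s * n for s, n in zip(scores, sizes)) / sum(sizes)

# Hypothetical per-sample squared errors, split into mini-batches of unequal size
batches = [[0.04, 0.16, 0.36, 0.25], [0.01, 0.09, 0.49, 0.64], [0.81, 0.01]]
sizes = [len(b) for b in batches]

batch_mse = [sum(b) / len(b) for b in batches]   # loss on each mini-batch
batch_rmse = [math.sqrt(m) for m in batch_mse]   # metric on each mini-batch

valid_loss = weighted_average(batch_mse, sizes)    # matches the MSE of the whole set
rmse_metric = weighted_average(batch_rmse, sizes)  # averaged per-batch RMSE

# MSE computed directly over all samples, for comparison
all_errors = [e for b in batches for e in b]
true_mse = sum(all_errors) / len(all_errors)

print(f"valid_loss       = {valid_loss:.6f}")
print(f"true MSE         = {true_mse:.6f}")
print(f"sqrt(valid_loss) = {math.sqrt(valid_loss):.6f}")
print(f"rmse_metric      = {rmse_metric:.6f}")
```

To get an RMSE that matches the validation loss exactly, one option is to take the square root of the final averaged MSE rather than averaging per-batch roots.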

You may find the code here:

github.com

SmithLiu95/Uncategorized-Codes/blob/master/how RMSE is calculated in FastAI.ipynb

Thanks,
Chang

Pomo · May 8, 2020, 6:32pm

Chang, I am glad you figured it out. I think this issue has been fixed in fastai2.