Mean Absolute Error in Random Forest Regression


(odysseus.kaziolas) #1

I am trying to solve the Allstate Kaggle challenge to get a better feel for Random Forest regression.

The challenge is evaluated on the mean absolute error (MAE) between the predicted and actual values, averaged over all rows.

I’ve run RandomForestRegressor on my validation set, using the criterion="mae" parameter. To my understanding, this makes the forest use the MAE instead of the MSE as the split criterion at each node.

After that I used metrics.mean_absolute_error(Y_valid, m.predict(X_valid)) to calculate the MAE on the validation data.
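
For reference, here is a minimal sketch of what that call returns. The numbers are made up for illustration and are not Allstate data. As far as I can tell, mean_absolute_error gives back a single scalar averaged over all rows, so per-row absolute errors would have to be computed separately:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical targets and predictions, for illustration only.
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# Per-row absolute errors: one value per row.
abs_errors = np.abs(y_true - y_pred)

# mean_absolute_error collapses them into a single scalar (their mean).
mae = mean_absolute_error(y_true, y_pred)

print(abs_errors)
print(mae)
```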

What I would like to know is if the logic I’m following is sound. Am I making a fundamental mistake or missing something here?
Should I have used the default MSE-based regressor and then calculated the MAE with the mean_absolute_error function?