Confidence based on tree variance - Random Forest Interpretation

mdavala · November 2, 2018, 10:07am

Hi everyone,
I am using a Random Forest on one of my datasets (a classification problem). I see the following:
Training accuracy is 97.9
Validation accuracy is 95.5
Test accuracy is 80.4
I have a couple of questions about random forest interpretation.

I would like to check the confidence of my trees, so I ran:
%time preds = np.stack([t.predict(X_valid) for t in m.estimators_])

The total number of estimators is 250 here. I see preds.shape as (250, 18681).
What is 18681 here?

To check the confidence, I calculated the mean and std of the preds for the first column:
np.mean(preds[:,0]), np.std(preds[:,0])
For the mean and std I get (0.0, 0.0).

What does this mean? Should I take a different approach for a classification problem?
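For reference, a minimal sketch of the same computation on synthetic data. The dataset from make_classification, the sizes, and the names probs, mean_p and std_p are assumptions for illustration, not the original data or model. Here the stacked array has one row per tree and one column per validation row. Stacking each tree's predict_proba instead of its hard predict is one possible way to get a per-row spread for a classifier, since hard labels give a std of exactly 0 whenever every tree votes for the same class.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption, for illustration only)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0
)

m = RandomForestClassifier(n_estimators=25, random_state=0)
m.fit(X_train, y_train)

# Hard class predictions: one row per tree, one column per validation row
preds = np.stack([t.predict(X_valid) for t in m.estimators_])
print(preds.shape)  # (n_estimators, number of rows in X_valid)

# Per-tree probability of class 1, same layout as preds
probs = np.stack([t.predict_proba(X_valid)[:, 1] for t in m.estimators_])

# Mean and spread across trees, computed separately for every validation row
mean_p = probs.mean(axis=0)
std_p = probs.std(axis=0)
print(mean_p[0], std_p[0])
```

With this layout, preds[:,0] is the set of votes from all trees for the first validation row only, so a mean and std of 0 there would correspond to every tree predicting class 0 for that one row.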

Thanks,
Mohan Raj