Lesson 7 - Official topic

Similar in the sense that both are small samples of the data, but different in that the mini-batches within an epoch are all disjoint, whereas the bootstrap samples used to train each tree can share overlapping records/data points. I also think a random forest samples a subset of the features for each tree rather than using all of them, while a mini-batch uses all of the features (please correct me if I’m wrong).
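The contrast can be sketched with just the standard library (the ten-row "dataset" below is a toy, and the batch size is arbitrary):

```python
import random

random.seed(0)
dataset = list(range(10))  # ten toy row indices

# Random-forest style: each tree gets a bootstrap sample drawn WITH
# replacement, so rows can repeat within a sample and overlap across trees.
bootstrap_samples = [[random.choice(dataset) for _ in dataset] for _ in range(3)]

# SGD style: an epoch shuffles the rows once and partitions them into
# disjoint mini-batches, so every row appears exactly once per epoch.
shuffled = dataset[:]
random.shuffle(shuffled)
batch_size = 5
mini_batches = [shuffled[i:i + batch_size]
                for i in range(0, len(shuffled), batch_size)]

print(bootstrap_samples[0])  # usually contains duplicates
print(mini_batches)          # always a clean partition of the dataset
```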

At the end of the day it is almost the same, even though the two aren’t fully comparable.
The data points you calculate the out-of-fold error on pass through EVERY tree in the ensemble, whereas with OOB this is not guaranteed.
In that sense OOB is more prone to overfitting, but it still gives a good perspective.
And it comes for free.
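The difference is easy to see if you track which trees are allowed to score each row. A minimal pure-Python sketch (the forest here is just a list of bootstrap index samples, no actual trees):

```python
import random

random.seed(42)
n_rows, n_trees = 8, 5

# Each "tree" is represented only by its bootstrap sample of row indices.
bootstraps = [[random.randrange(n_rows) for _ in range(n_rows)]
              for _ in range(n_trees)]

# For the OOB score, row i is predicted only by trees whose bootstrap
# sample happened to leave row i out (often only ~1/3 of the forest).
# With out-of-fold CV, by contrast, the held-out fold is scored by a
# forest trained on all remaining rows, so every tree contributes.
oob_trees = {i: [t for t, sample in enumerate(bootstraps) if i not in sample]
             for i in range(n_rows)}

for i, trees in oob_trees.items():
    print(f"row {i}: OOB-eligible trees {trees} ({len(trees)}/{n_trees})")
```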

@jeremy, does that method of measuring feature importance end up being roughly the same as shuffling the values in a column and measuring how much the model’s performance degrades?

Yes. This is correct. It gives more or less the same results.
I actually like your approach better.
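The shuffling approach described above is usually called permutation importance, and the idea fits in a few lines of plain Python. The "model" and data below are made-up toys: the model leans heavily on feature 0, lightly on feature 1, and ignores feature 2 entirely.

```python
import random

random.seed(0)

# Hypothetical stand-in for a trained model.
def model(row):
    return 3.0 * row[0] + 0.1 * row[1]

rows = [[random.random() for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]  # noise-free toy targets, so baseline error is 0

def mse(candidate_rows):
    return sum((model(r) - t) ** 2
               for r, t in zip(candidate_rows, targets)) / len(candidate_rows)

baseline = mse(rows)

# Permutation importance: shuffle one column at a time and measure how much
# the error grows. Columns the model relies on hurt more when scrambled.
importance = {}
for j in range(3):
    col = [r[j] for r in rows]
    random.shuffle(col)
    shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, col)]
    importance[j] = mse(shuffled_rows) - baseline

print(importance)  # feature 0 should dominate; feature 2 is unused
```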

How is this true? I don’t think it is. Is there something I am missing?

To clarify: I guess bagging with 5 decision trees is equivalent to 5-fold CV of a decision tree.

Side note: when we are doing a time-based train/valid split (e.g. future sales prediction), the OOB score is less useful because the bootstrap sampling shuffles the data.

Is cluster_columns doing something like hierarchical clustering? The plot looks very similar…
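The essence of a cluster_columns-style plot can be sketched in plain Python: rank-correlate every pair of columns, turn similarity into a distance, then merge the closest pair first, which is exactly what agglomerative clustering does. The column names and values below are made up, and ties in the rank helper are broken arbitrarily for simplicity:

```python
def ranks(xs):
    # Map each value to its position in sorted order (ties broken arbitrarily).
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(a, b):
    # Pearson correlation of the ranks = Spearman rank correlation.
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra) ** 0.5
    vb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (va * vb)

cols = {
    "YearMade": [1990, 1995, 2000, 2005, 2010],
    "age":      [30, 25, 20, 15, 10],   # perfectly anti-monotone in YearMade
    "noise":    [3, 1, 4, 1, 5],
}
names = list(cols)

# Strongly (anti-)correlated columns get distance near 0, so a dendrogram
# would join them lowest in the tree.
dist = {(a, b): 1 - abs(spearman(cols[a], cols[b]))
        for a in names for b in names if a < b}
closest = min(dist, key=dist.get)
print(closest)  # the pair a dendrogram would merge first
```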

Yep, agreed.

I think we are going to discuss the concept behind boosting briefly in this chapter (so this lesson or the next).

The results and technical aspects of creating the model are interesting, but I am wondering about the utility of the interpretation. If the goal was to predict the sale price of the bulldozer, the tree seems to indicate that newer, larger vehicles fetch higher prices. This seems like a lot of work just to confirm what is already intuitive when selling used vehicles.

It’s great that your intuition aligns with what the model found in this simple case! Unfortunately, often it’s quite difficult to discern which variables are most important, so you can imagine that in other cases this would be quite valuable.

What I meant is that you generally calculate your out-of-fold error using the entire Random Forest, trained on the rest of the dataset.

Yes, I agree. I guess I replied too quickly and didn’t think through the depth of the point you raised.

Ah OK, makes sense. Glad we agree 🙂

Sure, but we’re talking about this case. Does this mean we need to frame a better question about the predictions?

I think that the ability to understand and compare dependence on other, subdominant variables (such as ProductSize) is illuminating.

It may mean that there are better (or additional) questions to ask!

I consider interpretable ML (together with fairness and bias in AI) one of the most fascinating and crucial aspects of being an effective data scientist.
A while ago I put together this tutorial with a review of the latest techniques/algorithms/approaches for “looking” into a model.

Side note: this is a true gem. Cannot recommend it enough. And it covers NN interpretation too!

On a related note, PyTorch has a great package for interpretable neural networks here:

Isn’t partial dependence a better way of understanding variable importance (better than scikit-learn’s automated way of doing it based on tree splits)?
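Partial dependence answers a slightly different question than importance: holding the other features as observed, how does the average prediction move as one feature is swept across a grid? A minimal sketch, where the model, feature names, and numbers are all made up for illustration:

```python
# Hypothetical trained model: price rises with YearMade, falls with size code.
def model(year_made, product_size):
    return 20000 + 500 * (year_made - 1990) - 1000 * product_size

# Toy "observed" rows: (YearMade, ProductSize).
rows = [(1992, 1), (1998, 0), (2004, 2), (2010, 1)]

def partial_dependence(grid):
    pd = []
    for v in grid:
        # Force YearMade to v in every row, keep ProductSize as observed,
        # then average the predictions over the dataset.
        preds = [model(v, size) for _, size in rows]
        pd.append(sum(preds) / len(preds))
    return pd

grid = [1990, 2000, 2010]
print(partial_dependence(grid))  # rises linearly with YearMade, as the toy model dictates
```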

Hi MJB, hope all is well!

The amazing part is that we are using software to model intuition; even if you have no prior intuition or domain experience, you can apply these techniques and probably compete with others who have many years of experience.

Cheers, mrfabulous1 😀😀

This is the original blog post from Ando Saabas who, back in 2014, invented the treeinterpreter approach.
True genius.
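The core idea of that approach can be sketched on a hand-built toy tree (the node values, features, and thresholds below are invented): every prediction decomposes exactly into the root's mean plus one contribution per split, attributed to the feature that split used.

```python
# Toy regression tree as nested dicts; leaves have feature=None.
tree = {
    "value": 10.0, "feature": "YearMade", "threshold": 2000,
    "left":  {"value": 7.0, "feature": None},
    "right": {"value": 14.0, "feature": "ProductSize", "threshold": 1,
              "left":  {"value": 12.0, "feature": None},
              "right": {"value": 16.0, "feature": None}},
}

def predict_with_contributions(node, row):
    bias = node["value"]  # mean target at the root
    contributions = {}
    while node["feature"] is not None:
        child = (node["left"] if row[node["feature"]] <= node["threshold"]
                 else node["right"])
        # Each split shifts the running mean; credit that shift to the
        # feature the split tested.
        delta = child["value"] - node["value"]
        contributions[node["feature"]] = (
            contributions.get(node["feature"], 0.0) + delta)
        node = child
    return node["value"], bias, contributions

pred, bias, contribs = predict_with_contributions(
    tree, {"YearMade": 2005, "ProductSize": 2})
print(pred, bias, contribs)
# The decomposition is exact: prediction == bias + sum of contributions.
assert abs(pred - (bias + sum(contribs.values()))) < 1e-9
```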

His entire blog is a goldmine btw.
