I’ve trained a random forest (RF) with 200 estimators and I’m trying to find the optimal number of trees. I’m doing it with the parallel_trees function that @jeremy explains in Lesson 3, like this:
preds = np.stack(parallel_trees(model, get_preds))
This takes about 1 minute (I have 230k rows). However, when I call
model.predict(train_x), it takes just 5 seconds.
Why does this happen? Aren’t they doing the same thing internally, i.e. passing the 230k rows through each of the 200 trees, with the final averaging of the results being the only difference? Both cases are parallelized (8 cores), so where does the difference come from? Some kind of vectorization, maybe?
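To make the comparison concrete, here is a minimal sketch of the two code paths on synthetic data. It assumes a scikit-learn RandomForestRegressor (the model type isn't stated above) and replaces parallel_trees with a plain serial loop over model.estimators_, so it is not the Lesson 3 helper itself. The dataset and sizes are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for train_x / train_y (assumption: a regression problem).
train_x, train_y = make_regression(n_samples=500, n_features=8, random_state=0)

model = RandomForestRegressor(n_estimators=20, n_jobs=-1, random_state=0)
model.fit(train_x, train_y)

# Path 1: per-tree predictions, stacked into shape (n_trees, n_rows).
# A serial list comprehension is used here instead of parallel_trees.
preds = np.stack([tree.predict(train_x) for tree in model.estimators_])

# Path 2: the forest's own prediction, which averages over its trees.
forest_preds = model.predict(train_x)

# Averaging the stacked per-tree predictions should match the forest output.
same = np.allclose(preds.mean(axis=0), forest_preds)
print(preds.shape, same)
```

Wrapping each of the two paths in a timer (for example time.perf_counter) on the real 230k-row data would show where the extra time goes.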