Somewhere in the video the professor explains that he changed n_jobs=-1 to a different value because it performed a bit better on his computer.
Since we all run on different systems, I was curious whether there is an easy way to pick a ‘good’ value that works for anyone (at least on Linux).
Here are my results on a 6-core system.
It seems to me that selecting ncores-1 (all cores minus one) gives a good result.
I am curious: what do your results look like?
Code:
# Get the number of cores/cpus on the machine
!nproc --all
# Save it
ncores=!nproc --all
ncores=int(ncores[0])
# See how different values of n_jobs affect the duration of the task
values_to_try = list(range(1, ncores + 1)) + [-1]
set_rf_samples(10000)
for cores_to_use in values_to_try:
    m = RandomForestRegressor(n_estimators=60, min_samples_leaf=3, max_features=0.5, n_jobs=cores_to_use, oob_score=False)
    # Report against the detected core count instead of a hardcoded 6
    print(f'\nUsing n_jobs={cores_to_use} on {ncores} cores it took:')
    %time m.fit(X_train, y_train)
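
If you want to try this outside a notebook (no `!nproc` or `%time` magics), here is a rough sketch of the same loop in plain Python. It assumes `os.cpu_count()` is an acceptable stand-in for `nproc --all`. The helper names `candidate_n_jobs` and `time_n_jobs` are my own, not from the course, and the dummy workload is only there so the script runs anywhere.

```python
import os
import time


def candidate_n_jobs(ncores=None):
    """Return [1, 2, ..., ncores, -1]; -1 means 'use all cores' in scikit-learn."""
    if ncores is None:
        # os.cpu_count() can return None when the count cannot be determined
        ncores = os.cpu_count() or 1
    return list(range(1, ncores + 1)) + [-1]


def time_n_jobs(fit_with, values):
    """Call fit_with(n_jobs) once per value and return {n_jobs: seconds}."""
    timings = {}
    for n_jobs in values:
        start = time.perf_counter()
        fit_with(n_jobs)
        timings[n_jobs] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Hypothetical stand-in workload. With scikit-learn and the course data
    # loaded, you would pass something like:
    #   lambda n: RandomForestRegressor(n_estimators=60, n_jobs=n).fit(X_train, y_train)
    def dummy_fit(n_jobs):
        sum(i * i for i in range(100_000))

    results = time_n_jobs(dummy_fit, candidate_n_jobs())
    for n_jobs, seconds in results.items():
        print(f"n_jobs={n_jobs}: {seconds:.3f}s")
    best = min(results, key=results.get)
    print(f"Fastest on this run: n_jobs={best}")
```

A single timing per value is noisy, so it is probably worth repeating each fit a few times before concluding that one setting is best.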