Lesson 3: Optimal value for n_jobs?

rgarcia · April 22, 2019, 6:55pm

Somewhere in the video the professor explains that he changed n_jobs=-1 to a different value because it performed a bit better on his computer.

As we all run on different systems, I was curious whether there is an easy way to pick a ‘good’ value that anyone could apply (at least on Linux).

Here are my results on a 6-core system.

It seems to me that selecting ncores-1 (all cores minus one) gives a good result.
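For reference, here is a minimal sketch of how negative n_jobs values map to a worker count under the convention documented by scikit-learn and joblib: -1 means all cores, and values below -1 mean n_cpus + 1 + n_jobs, so -2 is "all cores minus one". The helper name resolve_n_jobs is hypothetical and only illustrates the arithmetic; it is not part of either library.

```python
def resolve_n_jobs(n_jobs, ncores):
    """Illustrative helper (not a library function): worker count for a given n_jobs.

    Follows the documented convention: positive values are taken as-is,
    negative values resolve to ncores + 1 + n_jobs, with a floor of 1 worker.
    """
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        return max(ncores + 1 + n_jobs, 1)
    return n_jobs


# On a 6-core machine:
print(resolve_n_jobs(-1, 6))  # all cores
print(resolve_n_jobs(-2, 6))  # all cores minus one
```

So if ncores-1 also turns out to be the best setting on your machine, passing n_jobs=-2 should request the same thing without hard-coding the core count.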

I am curious, what about your results?

Code:

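# Imports, assuming the lesson notebook setup (fastai 0.7 provides set_rf_samples)
from sklearn.ensemble import RandomForestRegressor
from fastai.structured import set_rf_samples

# Get the number of cores/CPUs on the machine (IPython shell escape, Linux)
!nproc --all

# Save it as an integer
ncores = !nproc --all
ncores = int(ncores[0])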


# See how different values of n_jobs affect the duration of the task
values_to_try= list(range(1,ncores+1)) + [-1]


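# Give each tree a random subsample of 10,000 rows so the fits stay fast
set_rf_samples(10000)
for cores_to_use in values_to_try:
    m = RandomForestRegressor(n_estimators=60, min_samples_leaf=3, max_features=0.5, n_jobs=cores_to_use, oob_score=False)
    # Report the detected core count instead of a hard-coded 6
    print(f'\nUsing n_jobs={cores_to_use} on {ncores} cores it took:')
    %time m.fit(X_train, y_train)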
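If you want to run the same comparison outside a notebook, or on a system without nproc, the sketch below uses only the standard library. The names candidate_n_jobs and time_call are hypothetical helpers, not part of fastai or scikit-learn. Note that os.cpu_count() reports logical CPUs and does not account for container or affinity limits, so treat the number as a starting point.

```python
import os
import time


def candidate_n_jobs(ncores=None):
    """Return the n_jobs values to benchmark: 1..ncores, plus -1 (all cores)."""
    if ncores is None:
        # os.cpu_count() may return None on unusual platforms; fall back to 1
        ncores = os.cpu_count() or 1
    return list(range(1, ncores + 1)) + [-1]


def time_call(fn, repeats=3):
    """Run fn() `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


print(candidate_n_jobs(6))
```

In the loop above, time_call(lambda: m.fit(X_train, y_train)) would stand in for the %time line. Taking the best of a few repeats is a design choice to reduce noise from other processes; a single run can be misleading when the differences between n_jobs values are small.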