Fastai v2 docs page "Performance Tips and Tricks"

In fastai v1, the docs had a page on improving performance for vision tasks: https://fastai1.fast.ai/performance.html

That page is no longer in the fastai v2 docs. Is it still relevant to use libjpeg-turbo and Pillow-SIMD in fastai v2 if we want to increase training speed?

Thanks,


Yes it still is 🙂

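To check whether your environment actually picked them up, you can run a quick check in Python (the libjpeg_turbo feature check needs Pillow >= 5.4.0):

import PIL
import PIL.features

# Pillow-SIMD versions carry a ".postN" suffix, e.g. "7.0.0.post3"
print(PIL.__version__)

# True if Pillow was built against libjpeg-turbo
print(PIL.features.check_feature("libjpeg_turbo"))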

Has anybody compared performance on v1 vs v2?
I'm getting strange results and can't understand why.
With the same model, the same optimizer (I tried to set the same parameters), and the same machine, I get different results. On v2 I can't reach the same results as on v1. Sometimes the results are worse by more than 10-15%.
Same with epoch times: 55 sec per epoch on v1 vs 59 sec on v2.
I spent a lot of time trying to understand why.
What I found is that PyTorch 1.6 is slower than 1.5 (which v1 uses): 59 sec per epoch on 1.6 vs 55 sec on 1.5.
On Colab I tried fitting the model on v1 first, then upgraded fastai to v2 (2.0.8). On v1 the epoch time was 3.30 min, on v2 3.29.
But accuracy was 15% lower on v2.
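
For anyone reproducing this comparison, it helps to print the exact versions each environment is running before timing anything, so the PyTorch difference can be ruled in or out:

import torch, fastai

print(torch.__version__, torch.version.cuda, fastai.__version__)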

Any luck resolving the issue? I am getting the same problem of worse performance on v2 with the exact same model and hyperparameters.


Any luck? I have been having the same issue, using the same architecture and optimizer, but getting worse performance on v2. @muellerzr

We need a bit more information here:

Are you 100% sure it’s the same? How are you comparing? Are the DataLoaders the same?

If you (or anyone else) can write a gist of what they believe is a 1:1 comparison, we can go from there
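
As a starting point, a minimal sketch of such a check is to pull one batch from each pipeline and compare shapes and stats (assuming data is the v1 DataBunch and dls the v2 DataLoaders; both expose one_batch()):

xb, yb = data.one_batch(denorm=False)  # fastai v1; keep normalization applied
print(xb.shape, xb.mean().item(), xb.std().item())

xb, yb = dls.one_batch()               # fastai v2; batches stay normalized
print(xb.shape, xb.mean().item(), xb.std().item())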


Yes, sure. I trained a model for the task of image regression in fastai v1. I used the same dataset and created the DataBunch as shown below:

data = ImageList.from_df(df=data_f, path=path).split_by_idxs(train_idx, valid_idx).label_from_df().databunch(bs=192).normalize()

After creating the DataBunch, I simply created a cnn_learner with the resnet34 architecture as shown below:

learn = cnn_learner(data, models.resnet34, metrics=[mse, mae, r2_score, rmse])

After that, I used DataParallel to use all four available GPUs.

learn.model = nn.DataParallel(learn.model)

After this, I simply followed the fastai training style, where I first found a good learning rate and then called learn.fit_one_cycle(). I repeated the process until the model converged, giving a final RMSE of 0.68.
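
In v1 that loop looks roughly like this (the epoch count and learning rate here are hypothetical; the post doesn't give the actual values):

learn.lr_find()                        # sweep learning rates and record the losses
learn.recorder.plot()                  # pick a learning rate from the plot
learn.fit_one_cycle(10, max_lr=1e-3)   # hypothetical values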

For fastai v2, I followed the exact same process: I used the same dataset and created the DataLoaders using ImageDataLoaders as shown below:

dl = ImageDataLoaders.from_df(path='/data/', df=data_train, fn_col='Image_Path', label_col='HC', bs=128, y_block=RegressionBlock, valid_col='is_valid')

And similarly created the cnn_learner as shown:

learn = cnn_learner(dl, models.resnet34, metrics=[rmse, R2Score()])

Finally, I used DataParallel exactly as shown above and followed the exact same training steps, but the final RMSE didn't go below 1.

The loss function used in both experiments was MSELossFlat.
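
One way to make sure neither learner silently picks a different loss is to pass it explicitly; a sketch, reusing the names from above:

learn = cnn_learner(data, models.resnet34, loss_func=MSELossFlat(), metrics=[mse, mae, r2_score, rmse])  # v1

learn = cnn_learner(dl, models.resnet34, loss_func=MSELossFlat(), metrics=[rmse, R2Score()])  # v2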