RuntimeError: CUDA error: an illegal memory access was encountered

The following code runs fine on Gradient, but in a Kaggle notebook it throws an error a few seconds into the 1st epoch. With bs=32 it runs fine in the Kaggle notebook as well.

Code
from fastai.text.all import *
dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test', bs=16)
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)

Error
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Can anyone please help me out?

Could you confirm the PyTorch version?
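Something like the following should report it from inside the notebook. This is only a minimal sketch: the helper name installed_version is made up for illustration, and it reads the installed package metadata rather than importing the libraries.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions of the two libraries involved in this thread.
for name in ("torch", "fastai"):
    print(name, installed_version(name))
```

If torch is already imported, torch.__version__ and torch.version.cuda give the same information plus the CUDA build it was compiled against.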

Some of the suggested workarounds were (a) lowering the batch size and (b) selecting a specific GPU with torch.cuda.set_device(1).

This seems like a PyTorch issue. It is not clear to me that it has been completely fixed.
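To narrow it down, it may help to follow the hint in the error message and set CUDA_LAUNCH_BLOCKING=1, so that kernel launches run synchronously and the stack trace is more likely to point at the call that actually failed. A minimal sketch, assuming it is run in the first cell after restarting the kernel, before fastai or torch is imported:

```python
import os

# Assumption: nothing has touched CUDA yet in this process. The variable is
# read when CUDA is initialised, so set it before importing torch or fastai.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Confirm the setting is visible to the current process.
print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

Training will be slower with this set, so it is worth removing once the failing call has been identified.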

The current PyTorch version in the Kaggle notebook is 1.9.1.
If lowering the batch size is the suggested workaround, why does batch_size = 16 fail while batch_size = 32 works?