RuntimeError: DataLoader worker is killed by signal

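I think Kaggle still doesn’t have a high enough shared memory limit for their Docker containers. PyTorch DataLoader workers hand batches to the main process through shared memory, so a worker can get killed when that limit is hit.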
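
If you want to check the limit yourself, here is a minimal sketch using only the standard library. It assumes shared memory is mounted at /dev/shm, which is typical for Linux containers but not guaranteed; the helper name `shm_usage_mib` is made up for this example.

```python
import os
import shutil


def shm_usage_mib(path="/dev/shm"):
    """Return total/used/free size of the shared memory mount in MiB.

    Returns None if the path does not exist (for example on a
    non-Linux system). The default path is an assumption.
    """
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {
        "total": usage.total / mib,
        "used": usage.used / mib,
        "free": usage.free / mib,
    }


info = shm_usage_mib()
if info is None:
    print("No /dev/shm mount found on this system")
else:
    print(f"shared memory: {info['free']:.0f} MiB free of {info['total']:.0f} MiB")
```

If the total is small compared to what your workers need to hold a few batches in flight, that would be consistent with the error above.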

Some options:

  1. Reduce your batch size, for example to bs=16 instead of the default 64.
  2. Reduce the number of workers. This will slow down your training.
  3. Train on Colab instead of Kaggle. Colab fixed this issue in fall 2018.

I would favor option #1 or #3.
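
For options #1 and #2, here is a rough sketch in plain PyTorch. The helper names `low_shm_loader_kwargs` and `demo` and the random dataset are made up for illustration; `batch_size` and `num_workers` are the actual DataLoader arguments, and `batch_size` plays the role of `bs` above. Setting `num_workers=0` loads data in the main process, so no worker processes are involved at all.

```python
def low_shm_loader_kwargs(batch_size=16, num_workers=0):
    """Build DataLoader keyword arguments that reduce shared memory use.

    Hypothetical helper: a smaller batch_size means smaller batches in
    shared memory, and num_workers=0 skips worker processes entirely.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if num_workers < 0:
        raise ValueError("num_workers cannot be negative")
    return {"batch_size": batch_size, "num_workers": num_workers}


def demo():
    """Build a DataLoader over random data and return one batch's shapes."""
    # Imported here so the helper above can be used without PyTorch installed.
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Stand-in dataset: 256 fake 3x32x32 images with labels in 0..9.
    images = torch.randn(256, 3, 32, 32)
    labels = torch.randint(0, 10, (256,))
    dataset = TensorDataset(images, labels)

    loader = DataLoader(dataset, shuffle=True, **low_shm_loader_kwargs())
    xb, yb = next(iter(loader))
    return tuple(xb.shape), tuple(yb.shape)
```

Calling `demo()` needs PyTorch installed. In practice I would start with the smaller batch size and only drop the worker count if the error persists, since fewer workers costs you training speed.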
