This previous forum post identifies the issue as ThreadPoolExecutor greedily pulling batches into memory for each iteration of self.sampler in Python 3.6, versus pulling them in lazily in Python 3.5.
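As a minimal standalone illustration of the greedy behavior (not the fastai code itself): `concurrent.futures.Executor.map` submits every item of its input iterable up front, so the whole sampler is consumed even if only the first result is requested. The `fake_batches` generator below is a hypothetical stand-in for iterating self.sampler.

```python
from concurrent.futures import ThreadPoolExecutor

pulled = []

def fake_batches():
    # Stand-in for iterating self.sampler: record each batch index
    # as it is pulled from the generator.
    for i in range(100):
        pulled.append(i)
        yield i

with ThreadPoolExecutor(max_workers=4) as ex:
    # Executor.map submits every item of the iterable at call time...
    results = ex.map(lambda b: b, fake_batches())
    first = next(results)

# ...so all 100 "batches" were pulled even though we asked for one.
print(len(pulled))  # 100
```

With real batches of tensors instead of integers, that up-front submission is what drives memory consumption up.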
There are two workarounds:
- Set `num_workers` to 0, which runs batches in a single thread. This capped memory consumption at a maximum of 3GB in the scenario above.
- Use the dataloader iterator from Pytorch as described here.
I’m continuing to research a permanent fix. Any ideas on how to do that, or things to look into, would be much appreciated.
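One possible direction (a sketch under my own assumptions, not the fastai implementation): replace the eager `Executor.map` with a bounded-lookahead wrapper that keeps only a fixed number of batches in flight, so memory is bounded regardless of sampler length. The `lazy_map` name and `lookahead` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def lazy_map(fn, iterable, executor, lookahead=8):
    # Like Executor.map, but keeps at most `lookahead` items in flight
    # instead of submitting the whole iterable up front.
    it = iter(iterable)
    futures = [executor.submit(fn, x) for x in islice(it, lookahead)]
    while futures:
        result = futures.pop(0).result()
        # Top the in-flight queue back up with one more item, if any remain.
        for x in islice(it, 1):
            futures.append(executor.submit(fn, x))
        yield result
```

Results come back in submission order, and at most `lookahead + 1` items are ever pulled from the source ahead of what the consumer has seen, which is the lazy behavior the 3.5 code path apparently had.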
https://github.com/fastai/fastai/blob/master/fastai/dataloader.py