Very slow dataloader

I am creating an ObjectItemList and running into very slow loading of each batch.
x,y = next(iter(dl))
takes upwards of 1 minute for each batch, even when the batch size is small (16).

This slowness shows up in calls such as show_batch(), show_results() and training in each epoch. Am I missing something? Using 64GB RAM, i7 Extreme processor and GP100 GPU.
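For anyone trying to reproduce the measurement, here is a minimal, pure-Python timing sketch. `dl` below is a stand-in list of `(x, y)` tuples rather than a real DataLoader (so the snippet runs on its own); with a real `DataBunch` you would pass its train dataloader instead:

```python
import time

def time_batches(dl, n=3):
    # Time the first n batch fetches from dl, in seconds.
    it = iter(dl)
    timings = []
    for _ in range(n):
        t0 = time.perf_counter()
        next(it)  # same pattern as x,y = next(iter(dl)) above
        timings.append(time.perf_counter() - t0)
    return timings

# Stand-in "DataLoader": five (x, y) batches, so the snippet is self-contained.
timings = time_batches([('x', 'y')] * 5)
print(len(timings))  # 3
```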


Digging further, it looks like it might be related to https://github.com/pytorch/pytorch/issues/12831 (a Windows multiprocessing issue). However, in the fastai code num_workers is set to 0 before a batch is fetched:
https://github.com/fastai/fastai/blob/master/fastai/basic_data.py#L141

Why is num_workers being set to 0?

I have a workaround for this issue: in basic_data.py, num_workers needs to be forced to 0 inside the intercept_args function.

Rohit.
I have the same issue.
It takes about 40 seconds in my case to fetch a batch.
But I do not understand what you have done to solve it.
num_workers=0 is already there.

Would you have a minute to share how exactly I need to modify basic_data.py?

In fastai\basic_data.py, add a line setting num_workers = 0 inside the intercept_args() function to work around this problem:

def intercept_args(self, dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None,
                   num_workers=0, collate_fn=default_collate, pin_memory=True, drop_last=False,
                   timeout=0, worker_init_fn=None):
    num_workers = 0  # add this line
    self.init_kwargs = {'batch_size':batch_size, 'shuffle':shuffle, 'sampler':sampler, 'batch_sampler':batch_sampler,
                        'num_workers':num_workers, 'collate_fn':collate_fn, 'pin_memory':pin_memory,
                        'drop_last':drop_last, 'timeout':timeout, 'worker_init_fn':worker_init_fn}
    old_dl_init(self, dataset, **self.init_kwargs)
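For readers curious why this single line works: fastai replaces DataLoader.__init__ with intercept_args (monkey-patching), so every keyword argument a caller passes flows through that function before the original __init__ runs. A torch-free stand-in sketch of the pattern, with a dummy DataLoader class and a reduced set of kwargs in place of the real ones:

```python
class DataLoader:
    # Dummy stand-in for torch.utils.data.DataLoader (reduced kwargs).
    def __init__(self, dataset, num_workers=0, batch_size=1):
        self.dataset = dataset
        self.num_workers = num_workers
        self.batch_size = batch_size

old_dl_init = DataLoader.__init__  # keep a handle on the original __init__

def intercept_args(self, dataset, num_workers=0, batch_size=1):
    num_workers = 0  # the workaround: force single-process loading
    self.init_kwargs = {'num_workers': num_workers, 'batch_size': batch_size}
    old_dl_init(self, dataset, **self.init_kwargs)

DataLoader.__init__ = intercept_args  # monkey-patch, as fastai does

dl = DataLoader([1, 2, 3], num_workers=8, batch_size=2)
print(dl.num_workers)  # 0 -- the requested 8 workers were overridden
```

Because the override happens inside the intercepted __init__, it applies no matter how the DataLoader is constructed, which is why editing intercept_args fixes show_batch(), show_results() and training alike.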

Rohit


This explains a lot! I assumed the speed I was getting on Windows was normal until I tried fastai in Ubuntu on the same system with the exact same notebook, and it ran / trained ~3x faster.

On my hardware, your fix took a unet/resnet101, 128x128x3, BS=8 epoch from around 3:10 down to 1:25. I get around 0:55 in Ubuntu on the same hardware, but this is at least more reasonable on Windows.

Thanks.


I receive an

IndentationError: unexpected indent

when I add

num_workers = 0

line at the place you suggested. I tried everything from no indentation to multiple levels of indentation, but none of them works.
Do you have any clue about this?