Hello everyone. I'm currently taking the fast.ai 2020 course and I ran into a problem in lesson 2. When I try to run the learner I receive the following error:
RuntimeError: DataLoader worker (pid 4509) is killed by signal: Killed.
Then if I insist and run it again I receive:
OSError: [Errno 12] Cannot allocate memory
I'm working on a Paperspace Gradient GPU instance running on a Linux VM. Does anyone have a clue how to solve this issue?
It's weird, because yesterday I was able to run it without any trouble… I tried solving it by setting num_workers = 0, but the same problem appears: the kernel dies or it takes way too long. The same happens when I reduce the batch size.
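For reference, the two messages quoted above can be decoded with the Python standard library. This is only a small sketch, assuming a Linux machine as described in the post: it shows that errno 12 is ENOMEM and that "Killed" is the description of SIGKILL. A worker being killed with SIGKILL is often the kernel's out-of-memory killer at work, but that is an assumption here and the snippet does not confirm it.

```python
import errno
import os
import signal

# Symbolic name for the numeric error code in "OSError: [Errno 12]".
errno_name = errno.errorcode[12]

# Human-readable text the OS associates with that error code.
errno_text = os.strerror(errno.ENOMEM)

# Number and description of the signal behind "killed by signal: Killed".
sigkill_number = signal.SIGKILL.value
sigkill_text = signal.strsignal(signal.SIGKILL)

print(errno_name, "-", errno_text)
print(sigkill_number, "-", sigkill_text)
```

Both messages point at the process running out of system RAM rather than GPU memory.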
@Kornel Thank you for your response, I appreciate it. I will try all of the above. In case that fails, how can I check whether memory is already in use before running the first cell? With a memory profiler? Also, could the problem be caused by my not storing the images and the new notebook in the /storage folder?
And to correct myself: it doesn't say “Out of Memory”. It says:
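One way to check free memory before running the first cell, without installing a profiler, is to read /proc/meminfo, which Linux exposes as a plain text file. The following is a minimal sketch assuming a Linux VM; the helper names `parse_meminfo` and `available_gib` are made up for this example and are not part of fastai or any other library.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(path="/proc/meminfo"):
    """Return the kernel's estimate of memory available to new processes, in GiB."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info["MemAvailable"] / 1024 ** 2  # kB -> GiB


# Only attempt the read where the file exists (i.e. on Linux).
if os.path.exists("/proc/meminfo"):
    print(f"Available memory: {available_gib():.2f} GiB")
```

Running this in the first cell of a fresh notebook shows whether another kernel or a leftover process is already holding most of the RAM.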
I am having memory issues as well, attempting to train a model locally on a computer without a viable GPU. I have cut the training set down to 37 images now (~3MB each) and have code like:
So, I’ve set bs to a low value (as I gather that helps) and am using the smallest resnet arch I think, as well as resizing images to 100x100 (I think that’s what Resize(100) does, at least…).
Despite this, as soon as I start running this in a jupyter notebook I see the reported memory usage balloon to ~25GB of RAM in top (I only have 8GB on this machine) before the jupyter kernel crashes (I don’t see any progress on the progress bars at all before this happens).
I’m using:
python 3.8.5,
fastai 2.0.8,
notebook 6.1.3
I’ve no idea what’s causing such massive memory usage; is ~25GB expected for such a small data set and the above code? What can I do to reduce memory usage (aside from using a GPU)?
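As a rough aid for reasoning about the numbers above: the size of an image file on disk (~3MB here) says little about its size once decoded, which depends on its pixel dimensions. The sketch below is plain arithmetic and not a diagnosis. The 6000x4000 resolution is a hypothetical value chosen for illustration, since the post gives only the file size, and `decoded_bytes` is a made-up helper name.

```python
def decoded_bytes(width, height, channels=3, bytes_per_value=1):
    """Bytes needed to hold one uncompressed image in memory."""
    return width * height * channels * bytes_per_value


# Hypothetical full-resolution photo (assumed dimensions, not from the post).
full_uint8 = decoded_bytes(6000, 4000)                       # 8-bit RGB
full_float32 = decoded_bytes(6000, 4000, bytes_per_value=4)  # float32 RGB

# The same image after resizing to 100x100, stored as float32.
resized_float32 = decoded_bytes(100, 100, bytes_per_value=4)

# All 37 images held fully decoded at once, as 8-bit RGB.
all_images_uint8 = 37 * full_uint8

MIB = 1024 ** 2
print(f"one full image, uint8:    {full_uint8 / MIB:.1f} MiB")
print(f"one full image, float32:  {full_float32 / MIB:.1f} MiB")
print(f"one resized image:        {resized_float32 / MIB:.3f} MiB")
print(f"37 full images, uint8:    {all_images_uint8 / MIB:.1f} MiB")
```

Under these assumed dimensions, even holding every image fully decoded stays well below 25GB, so image size alone would not account for the usage reported above.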
I ran into exactly the same problem. I am working on a Gradient free account, which gives me 5 GB of memory. Could you please explain how you solved the problem?