I'm writing this post in case others run into either of these two errors while training models:
[Errno 12] Cannot allocate memory
DataLoader worker (pid 63) is killed by signal: Killed.
For me the solution was switching to a machine that has a GPU.
Hope that helps!
Hi @Salazar. I am getting the same error. I don’t quite follow your suggested solution. Do you mean I need to use a non-GPU instance? How can model training work without a GPU?
Ahh sorry, yes, that's confusing. I found that I was getting that error because I was on a machine with no GPU. Once I switched to a machine that has a GPU, it worked.
Thanks, that solved the problem for me too.
Mysteriously, I had set up Paperspace with a GPU and successfully trained a model before, but it then somehow reverted to a CPU-only instance.
One magic line that you should have at the top of every notebook (after importing torch):
assert torch.cuda.is_available(), "GPU not available"
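If you'd rather get a message than a hard stop, here is a rough sketch of a softer version of the same check. The helper name `pick_device` is made up for this example (it is not part of PyTorch or fastai), and it assumes that falling back to "cpu" when PyTorch isn't installed is acceptable for your setup:

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu".

    Hypothetical helper for illustration only.
    """
    # Assumption: a missing PyTorch install is treated the same as "no GPU".
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # torch.cuda.is_available() is the same check used in the assert above.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training will run on: {device}")
```

If this prints "cpu" on an instance you expected to have a GPU, that is probably the same situation described in this thread, so check the machine type before starting a long training run.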