Now I am getting good results with the learning again, but I am getting LOTS of warnings, hundreds or thousands per epoch. So many that it's clogging the output cells and making Chrome hiccup.
Any idea what this is? Or at least how to suppress it in Google Colab?
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7fc7d56086a0>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 717, in __del__
    self._shutdown_workers()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 713, in _shutdown_workers
    w.join()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 122, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Intermittently, yes. I don’t think it’s an actual problem though. I believe it’s an artifact of the IPython notebook layer. It’s very annoying because it makes the browser so slow, but the actual code appears to run fine.
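If the message really is harmless, one way to quiet it might be to filter just that one assertion rather than all output. The following is only a sketch and rests on assumptions: it needs Python 3.8 or later (where `sys.unraisablehook` exists, so not the 3.6 interpreter shown in the traceback), and `install_quiet_hook` and `NOISY_MESSAGE` are names made up here, not part of PyTorch or fastai. I have not checked how this interacts with Colab's output handling.

```python
import sys

# Text of the one message we want to drop (copied from the traceback above).
NOISY_MESSAGE = "can only join a child process"


def install_quiet_hook():
    """Install a hook that drops only the known noisy AssertionError.

    Hypothetical helper for illustration. Returns the previous hook so
    the caller can restore it later.
    """
    previous_hook = sys.unraisablehook  # available from Python 3.8

    def quiet_hook(unraisable):
        exc = unraisable.exc_value
        if isinstance(exc, AssertionError) and NOISY_MESSAGE in str(exc):
            return  # swallow this one message
        previous_hook(unraisable)  # report everything else as usual

    sys.unraisablehook = quiet_hook
    return previous_hook
```

Call `previous = install_quiet_hook()` once before training, and set `sys.unraisablehook = previous` afterwards to undo it. Other "Exception ignored in" reports still get through, so a genuinely new problem would not be hidden.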
I don’t think the problem is with the Jupyter notebooks. I have also encountered this annoying error, which litters up my notebook from time to time. But I get the same message if I run the code in plain Python.
I believe the problem is with the multiprocessing logic in PyTorch’s dataloader.py. There is a 214-line comment titled “Data Loader Multiprocessing Shutdown Logic”, and it is dizzyingly complex.
It sure looks like a race condition to me. I’m not sure whether the message is innocuous or not. For me, it is sporadic, sometimes doesn’t happen, and seems to depend on what else I’m doing on the machine.
It has “race condition” written all over it.
I don’t have time, or the inclination, to fix this kind of code for PyTorch. But I hope they fix it. I regret ever having switched over from my dataloader code to theirs.
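For reference, the assertion itself can be triggered with nothing but the standard library, which at least shows what the check in `process.py` is guarding against. This is a minimal sketch, not a reproduction of what dataloader.py does: it assumes a POSIX system where the `fork` start method is available, and `join_from_forked_copy` and `_idle` are names invented for the example. A forked copy of the parent still holds the `Process` object, but its PID no longer matches `_parent_pid`, so `join()` refuses to run.

```python
import multiprocessing as mp
import os


def _idle():
    """Worker body; it does nothing and exits immediately."""


def join_from_forked_copy():
    """Return the message raised when a non-parent process calls join().

    Hypothetical helper for illustration; POSIX only because it uses fork.
    """
    ctx = mp.get_context("fork")
    worker = ctx.Process(target=_idle)
    worker.start()  # this process is now the worker's parent

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Forked copy: it inherited `worker`, but it is not the parent.
        try:
            os.close(read_fd)
            try:
                worker.join()
                message = "join succeeded"
            except AssertionError as exc:
                message = str(exc)
            os.write(write_fd, message.encode())
        finally:
            os._exit(0)  # leave without running the parent's cleanup

    # Original parent: collect the result, then clean up both children.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        message = reader.read()
    os.waitpid(pid, 0)
    worker.join()  # allowed here, because this process started the worker
    return message
```

On a Linux machine, `join_from_forked_copy()` should return the same text as the last line of the traceback. Whether the DataLoader warnings come from this exact situation or from something else in its shutdown path is not something this sketch can confirm.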
I recently did a new machine build using Linux and am running into the same error when running data.show_batch in lesson 1:
AssertionError: can only join a child process
  File "/home/oos/anaconda3/envs/fastaiv1/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 713, in _shutdown_workers
    w.join()
Exception ignored in: <function _DataLoaderIter.__del__ at 0x7fe0ed4470d0>