Why doesn't opencv seem to work with multiprocessing?


I am new to using multiprocessing. This works


But if I use the open_image function from dataset.py instead of PIL's Image.open, a process seems to get stuck reading the first file.

I am curious why that is. I noticed that in the library we seem to be using PIL with multithreading. Is there something specific to opencv that causes this?

If I could expand a bit on this, is there some subset of libraries that doesn’t play nicely with multiprocessing and if so, why is that?

I get that this might be a very basic question - if someone had some nice resources on this and wouldn’t mind sharing or would be kind enough to write a couple of words on this that would be greatly appreciated.

(Jeremy Howard) #2

Yes, opencv doesn’t play nicely with multiprocessing in Python. There’s an issue filed on the opencv GitHub about it, but there doesn’t seem to be any interest there in fixing it.

However, because opencv releases the GIL, it works exceptionally well with multi-threading: several threads can run opencv calls at the same time, which makes loading much faster.
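As a rough illustration of the multi-threading approach, here is a minimal sketch using the standard library's ThreadPoolExecutor. The helper name `load_all` and the `max_workers` default are made up for this example and are not from the library; `loader` stands in for whatever function reads one image (for instance `open_image` or `cv2.imread`).

```python
from concurrent.futures import ThreadPoolExecutor


def load_all(paths, loader, max_workers=8):
    """Apply `loader` to every path on a pool of threads.

    `load_all` is a hypothetical helper for illustration only.
    Results come back in the same order as `paths`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if threads finish out of order
        return list(pool.map(loader, paths))


# Usage (assuming opencv is installed and `paths` is a list of image files):
#   import cv2
#   images = load_all(paths, cv2.imread)
```

Threads only help here because the loader spends its time in native code that releases the GIL; a pure-Python loader would see little benefit from this pattern.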

There are lots of reasons that libraries can fail in a multiprocessing environment, generally race conditions, deadlocks, and similar concurrency problems.
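One common source of such hangs, stated here as an assumption rather than something confirmed in this thread, is that a forked child inherits the parent's memory but not its running threads, so any internal thread pool or lock a library set up before the fork can be left in an unusable state. A workaround that is often suggested is to start workers with the "spawn" method, so that each one begins in a fresh interpreter. The sketch below uses only the standard library; `spawn_context` and `parallel_map` are hypothetical helper names.

```python
import multiprocessing as mp


def spawn_context():
    """Return a multiprocessing context that starts fresh interpreters.

    Hypothetical helper: unlike "fork", "spawn" does not copy the parent's
    memory, so no half-initialised library state is inherited.
    """
    return mp.get_context("spawn")


def parallel_map(func, items, processes=4):
    """Map `func` over `items` in spawned worker processes.

    `func` must be defined at module top level so it can be pickled.
    """
    with spawn_context().Pool(processes=processes) as pool:
        return pool.map(func, items)


# Usage (must be called from under `if __name__ == "__main__":`):
#   results = parallel_map(load_one, paths)   # load_one is a placeholder
```

Spawned workers are slower to start than forked ones, and whether this resolves a particular opencv hang depends on the version and platform, so treat it as something to try rather than a guaranteed fix.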