In imagenet_process.ipynb of Part 2, @jeremy uses t1 = threading.local()
to hold the results from each thread when resizing images. I don't fully understand this usage. Could I just use a global variable instead?
No, if you’re using multi-threading you’ll need this to avoid a race condition. It’s discussed in the video - let me know if anything there is unclear.
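For concreteness, here is a minimal sketch of how threading.local works (my own names — tls, worker, buf — not the notebook's actual code): each thread sees its own independent attributes on the same object, so workers can keep per-thread scratch data without locking.

```python
import threading
import concurrent.futures

# threading.local() gives every thread its own independent attributes,
# so workers can stash per-thread data without a lock.
tls = threading.local()

def worker(i):
    if not hasattr(tls, 'buf'):   # first call in this thread
        tls.buf = []              # create this thread's private list
    tls.buf.append(i)             # visible only to the calling thread
    return len(tls.buf)           # counts only this thread's items

with concurrent.futures.ThreadPoolExecutor(4) as ex:
    counts = list(ex.map(worker, range(8)))

# The main thread never ran worker(), so it never set `tls.buf`
# and cannot see any other thread's buf.
print(hasattr(tls, 'buf'))   # False
```

With a plain global list instead of tls, every thread would append to the same object and the counts would interleave unpredictably.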
@jeremy Thanks for your reply.
I wrote a simple piece of code:
import time
import concurrent.futures
import threading
import random
import copy

def pp(i):
    time.sleep(10 * random.random())
    return i

#t1 = threading.local()
#t1.data = None
a = []
with concurrent.futures.ThreadPoolExecutor(5) as e:
    s = range(10)
    res = e.map(pp, s)
    print(res)
    for i in res:
        a.append(i)
        print(a)
I ran this code a number of times, and the output was the same each time.
Output:
<generator object Executor.map.<locals>.result_iterator at 0x7f256438f360>
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Could you explain the race condition in this code in more detail?
Because you're running pure Python code, the global interpreter lock (GIL) is protecting you (but it's also why your code sees no speed-up). You'll only see a race when calling external C/Fortran code (e.g. numpy). Maybe try creating a single big numpy array, firing off lots of processes (not threads), and having them all modify whole columns of the array at the same time?
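That suggestion could look something like the sketch below — my own code, not from the notebook, and it assumes a fork-capable OS (Linux/macOS) so the child processes inherit the shared buffer. Several processes write the whole array at once, so the final contents depend on how the writes interleave.

```python
import multiprocessing as mp
import numpy as np

ctx = mp.get_context('fork')           # assumes fork is available

N = 100_000
# lock=False gives a raw shared-memory buffer with no synchronization,
# which is exactly what lets the writes race.
shared = ctx.Array('d', N, lock=False)

def writer(val):
    a = np.frombuffer(shared, dtype='d')  # numpy view onto shared memory
    a[:] = val                            # whole-array write, not atomic

ps = [ctx.Process(target=writer, args=(v,)) for v in (1.0, 2.0, 3.0)]
for p in ps:
    p.start()
for p in ps:
    p.join()

a = np.frombuffer(shared, dtype='d')
# If the writes interleaved, the array is a mixture of 1.0/2.0/3.0;
# seeing a single value everywhere is just luck of scheduling.
print(np.unique(a))
```

Each element's final value is whichever writer touched it last, which is why nothing stronger than "some mixture of 1.0, 2.0, 3.0" can be guaranteed.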
@jeremy Thanks.
I modified the code as follows:
import numpy as np

def pp(i):
    time.sleep(10 * random.random())
    return i

#t1 = threading.local()
#t1.data = None
a = np.zeros((300, 100))
with concurrent.futures.ProcessPoolExecutor(15) as e:
    s = range(50)
    res = e.map(pp, s)
    print(res)
    for i in res:
        a[:, i] = i
        print(a[:, i])
But the output is still confusing; the race condition does not emerge as expected.
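If I read the modified code right, a likely reason is that a[:, i] = i runs in the parent process inside the for loop over res, after each result comes back, so the column writes never overlap at all. And even if pp itself wrote to a, each worker process would only modify its own copy of the array, invisible to the parent. A minimal demo of that copy behaviour (my own code, assuming a fork-capable OS):

```python
import concurrent.futures
import multiprocessing as mp
import numpy as np

a = np.zeros((4, 3))

def pp(i):
    a[:, i] = i + 1     # mutates this *worker's copy* of `a` only
    return i

# fork start method so workers inherit `a` (Linux/macOS assumption)
ctx = mp.get_context('fork')
with concurrent.futures.ProcessPoolExecutor(3, mp_context=ctx) as e:
    res = list(e.map(pp, range(3)))

print(res)   # the return values come back fine: [0, 1, 2]
print(a)     # but the parent's array is untouched: still all zeros
```

To actually see processes race, the workers themselves have to write concurrently into genuinely shared memory (e.g. a multiprocessing.Array buffer), not into ordinary per-process arrays.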