Hi, I have a large dataset (around 100 GB), so I tried the bcolz_array_iterator.py discussed in the forum (chunklen is set to 16, batch size to 1024). Training takes 2-3 hours for one epoch. For comparison, I instead tried loading everything into memory first and fitting with the plain fit method; that takes less than 10 minutes per epoch. (I use the multi_gpu.py discussed here, but it doesn't seem to work with fit_generator.)
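To make the setup concrete, here is a minimal pure-NumPy sketch of the kind of iterator I'm using (bcolz replaced by an in-memory array so it runs standalone; the `chunklen` and `batch_size` values are the ones from my run, the function name is my own):

```python
import numpy as np

def chunked_batch_iter(arr, batch_size=1024, chunklen=16):
    """Yield batches assembled from fixed-size chunks, mimicking how
    bcolz_array_iterator pulls chunklen-row chunks from storage.
    With chunklen=16 and batch_size=1024, every batch needs
    1024 / 16 = 64 separate chunk reads."""
    chunks_per_batch = batch_size // chunklen
    for start in range(0, len(arr) - batch_size + 1, batch_size):
        # each slice below stands in for one chunk read; with a real
        # bcolz carray each of these would decompress a chunk from disk
        chunks = [arr[i:i + chunklen]
                  for i in range(start, start + batch_size, chunklen)]
        yield np.concatenate(chunks)

data = np.arange(4096).reshape(4096, 1)
batches = list(chunked_batch_iter(data))
print(len(batches), batches[0].shape)  # 4 (1024, 1)
```

I'm wondering whether those 64 chunk reads per batch are part of the problem, i.e. whether chunklen should be closer to the batch size.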
I also tried tuning the workers parameter (from 2 up to 16) as discussed in the forum, but it made no difference. I observed that GPU utilization is very low (almost 0%) with fit_generator.
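In case it matters: plain Python generators are not thread-safe, so when tuning workers I guarded mine with a lock, roughly like this (a minimal sketch; the wrapper class is my own, not from the forum script):

```python
import threading

class ThreadSafeIter:
    """Wrap an iterator so multiple worker threads can pull from it
    without interleaving calls to next() (plain Python generators
    raise or misbehave when advanced from several threads at once)."""
    def __init__(self, it):
        self.it = it
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # only one thread at a time may advance the underlying iterator
        with self.lock:
            return next(self.it)

gen = ThreadSafeIter(iter(range(10)))
out = list(gen)
print(out)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Is this the right approach, or does the locking itself serialize the workers and defeat the purpose?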
So why is fit_generator so slow, and how can I improve its speed? (I have even larger datasets that cannot fit into memory anyway.) How do you deal with very large datasets? Hoping for a concrete solution. Thanks a lot!