How large is your dataset? The BrokenProcessPool error is a known bug ([L14] Problem with futures and ProcessPoolExecutor) that shows up with large datasets. As a workaround, you can set n_cpus=1 when creating the databunch, which avoids the process pool altogether. Keep in mind that preprocessing will be slower, since it then runs on a single core.
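To illustrate why a single CPU sidesteps the error, here is a minimal sketch using only the Python standard library. It is not the library's actual code: `tokenize` and `process_all` are hypothetical names made up for this example, and only the `n_cpus` idea comes from the workaround above.

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def tokenize(text):
    # Hypothetical stand-in for an expensive per-item preprocessing step.
    return text.lower().split()


def process_all(texts, n_cpus=1):
    """Apply tokenize to every text, using worker processes only if n_cpus > 1."""
    if n_cpus <= 1:
        # Single-process path: no pool is created, so BrokenProcessPool cannot occur.
        return [tokenize(t) for t in texts]
    try:
        with ProcessPoolExecutor(max_workers=n_cpus) as pool:
            return list(pool.map(tokenize, texts))
    except BrokenProcessPool:
        # A worker process died unexpectedly; fall back to the slower serial path.
        return [tokenize(t) for t in texts]
```

With n_cpus=1 everything runs in the main process, which is slower but never touches the pool. If you try n_cpus > 1 on Windows or macOS, call `process_all` from inside an `if __name__ == "__main__":` block so the worker processes can start cleanly.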