I have two learners on which I am trying to call get_preds(). Calling them one after the other takes too much time, so I am trying to use the parallel function to call them at the same time.
from fastai import *
from fastai.core import parallel
from fastai.text import *

learner0 = load_learner('./ulmfit_0/')
learner1 = load_learner('./ulmfit_1/')

def get_preds(learner, index):
    return learner.get_preds(DatasetType.Test, ordered=True)

def parallelize_preds(func, learners):
    return parallel(func, learners)

def main():
    learner0.data.add_test(values)
    learner1.data.add_test(values)
    preds = parallelize_preds(get_preds, [learner0, learner1])
    preds0 = [i[1].item() for i in preds[0][0]]
    preds1 = [i[1].item() for i in preds[1][0]]

if __name__ == "__main__":
    main()
When I run the above code, I receive the following error:
_pickle.PicklingError: Can't pickle <function get_preds at 0x7fc4835e8510>: attribute lookup get_preds on __main__ failed
After receiving the error, the script also hangs and doesn't finish unless I stop the kernel.
Is this a problem with my code, or is the parallel function not able to be used with a learner object?
I also tried parallelizing the two lines
    learner0.data.add_test(values)
    learner1.data.add_test(values)
in the same way as I did above with get_preds. This also failed, so if you have any suggestions on how to parallelize these, please let me know.
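One alternative that might sidestep the pickling step is a thread pool instead of a process pool, since threads share memory and nothing has to be serialized. Below is a minimal sketch using the standard library's ThreadPoolExecutor. StubLearner and threaded_preds are hypothetical names used only to keep the example runnable, and I have not verified that this actually speeds up get_preds on real fastai learners.

```python
from concurrent.futures import ThreadPoolExecutor


class StubLearner:
    """Hypothetical stand-in for a fastai Learner, only here to keep this runnable."""

    def __init__(self, name):
        self.name = name

    def get_preds(self):
        # A real learner would run inference here.
        return f"preds from {self.name}"


def get_preds(learner):
    return learner.get_preds()


def threaded_preds(learners):
    # Threads share memory, so neither the function nor the learners are pickled.
    with ThreadPoolExecutor(max_workers=len(learners)) as pool:
        # map() yields results in the same order as the input list.
        return list(pool.map(get_preds, learners))


results = threaded_preds([StubLearner("learner0"), StubLearner("learner1")])
```

Whether the two calls really overlap would depend on how much of the work releases the GIL, so this is something to measure rather than assume.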
Edit:
After further investigation, I was able to produce a more specific error message:
  File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'TextClasDataBunch.create.<locals>.<lambda>'
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'TextClasDataBunch.create.<locals>.<lambda>'
It seems the issue lies with the following line found in the TextClasDataBunch class:
train_sampler = SortishSampler(datasets[0].x, key=lambda t: len(datasets[0][t][0].data), bs=bs)
Python can't pickle a lambda function with the pickle module, so I tried importing dill to circumvent this. However, that didn't work either. I also tried going into the source code and replacing the lambda with a defined function, but I received the same error.
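To isolate the pickling behaviour from fastai, here is a small standalone sketch. All names in it (seq_len, make_key_with_lambda, make_key_with_nested_def, make_key_with_partial, can_pickle) are hypothetical and only mimic the shape of the key= argument above. It illustrates that a lambda and a function defined inside another function are both "local objects" as far as pickle is concerned, while a module-level function wrapped in functools.partial is stored by name. It is not a fix for TextClasDataBunch.

```python
import pickle
from functools import partial


def seq_len(items, i):
    # Module-level function: pickle can store it by its qualified name.
    return len(items[i])


def make_key_with_lambda(items):
    # Same pattern as the sampler key: a lambda closing over local data.
    return lambda i: len(items[i])


def make_key_with_nested_def(items):
    # A named function defined inside another function is still a local object.
    def key(i):
        return len(items[i])
    return key


def make_key_with_partial(items):
    # partial of a module-level function carries its data as arguments.
    return partial(seq_len, items)


def can_pickle(obj):
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


items = ["a", "bbb", "cc"]
lambda_ok = can_pickle(make_key_with_lambda(items))
nested_ok = can_pickle(make_key_with_nested_def(items))
partial_ok = can_pickle(make_key_with_partial(items))
print(lambda_ok, nested_ok, partial_ok)
```

If the replacement function in the source was still defined inside create, that could explain seeing the same error, although other unpicklable attributes on the learner may also be involved.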