Automated Learning Rate Suggester

Joan · May 1, 2019, 3:50pm

I have been trying different lr_diff values, and the best one seems to be quite specific to each dataset; I cannot find a value that generalizes nicely. That said, 40-45 seems to be a good starting point, but I have to run more tests.

On a related note, I am trying to get reproducible results using the function described here. However, when I set num_workers = 0 while creating the DataBunch and run the code, I get an error:

Traceback (most recent call last):
  File "/users/genomics/jgibert/Scripts/Lymphoma_Fastai_Neptune.py", line 63, in <module>
    selected_lr = find_appropriate_lr(learn)
  File "/users/genomics/jgibert/Scripts/Lymphoma_Fastai_Neptune.py", line 40, in find_appropriate_lr
    model.lr_find()
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/train.py", line 32, in lr_find
    learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/basic_train.py", line 196, in fit
    fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/basic_train.py", line 111, in fit
    finally: cb_handler.on_train_end(exception)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/callback.py", line 322, in on_train_end
    self('train_end', exception=exception)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/callback.py", line 250, in __call__
    for cb in self.callbacks: self._call_and_update(cb, cb_name, **kwargs)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/callback.py", line 240, in _call_and_update
    new = ifnone(getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs), dict())
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/callbacks/lr_finder.py", line 40, in on_train_end
    self.learn.load('tmp', purge=False)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/fastai/basic_train.py", line 265, in load
    state = torch.load(source, map_location=device)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
    return _load(f, map_location, pickle_module)
  File "/soft/EB_repo/devel/programs/goolf/1.7.20/Python/3.6.2/lib/python3.6/site-packages/torch/serialization.py", line 549, in _load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: storage has wrong size: expected 4355518534081521830 got 2048

I am not quite sure why this is happening. I checked some posts, and it seems to be related to serialization. Any idea what causes it?
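For context, a seeding helper of the kind mentioned above usually looks roughly like the sketch below. The name seed_everything, the default seed and the exact set of calls are assumptions for illustration, not necessarily what the linked function does. numpy and torch are seeded only if they can be imported, so the snippet also runs without them.

```python
import os
import random


def seed_everything(seed: int = 42) -> None:
    """Seed the common sources of randomness (hypothetical helper)."""
    # Python's built-in RNG.
    random.seed(seed)
    # Only takes effect in Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # numpy not installed; nothing to seed

    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored when CUDA is unavailable.
        torch.cuda.manual_seed_all(seed)
        # Trade some speed for deterministic cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # torch not installed; nothing to seed


seed_everything(42)
```

As far as I understand, this has to be called before the DataBunch and the Learner are created, and num_workers = 0 is there to keep the data loading order fixed.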

Thanks!