fit_one_cycle division by zero error

I'm using fast.ai for computer vision. Before unfreezing, I used fit_one_cycle to train for several epochs and it worked fine, but after unfreezing it threw this error. How can I fix it? (I had other versions of the kernel training after unfreezing and they worked fine; just this one threw an error.)

It seems that len(self.layer_groups) being 1 in lr_range(self, lr) in basic_train.py caused the error. Here's the error message:

learn.fit_one_cycle(4, max_lr=slice(1e-6,1e-4))

ZeroDivisionError                         Traceback (most recent call last)
<ipython-input> in <module>
      1 #learn.fit_one_cycle(4, max_lr=slice(5e-6,1e-3))
----> 2 learn.fit_one_cycle(4, max_lr=slice(1e-6,1e-4))

/opt/conda/lib/python3.6/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
     16     wd:float=None, callbacks:Optional[CallbackList]=None, tot_epochs:int=None, start_epoch:int=None)->None:
     17     "Fit a model following the 1cycle policy."
---> 18     max_lr = learn.lr_range(max_lr)
     19     callbacks = listify(callbacks)
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,

/opt/conda/lib/python3.6/site-packages/fastai/basic_train.py in lr_range(self, lr)
    185     "Build differential learning rates from lr."
    186     if not isinstance(lr,slice): return lr
--> 187     if lr.start: res = even_mults(lr.start, lr.stop, len(self.layer_groups))
    188     else: res = [lr.stop/10]*(len(self.layer_groups)-1) + [lr.stop]
    189     return np.array(res)

/opt/conda/lib/python3.6/site-packages/fastai/core.py in even_mults(start, stop, n)
    151     "Build log-stepped array from start to stop in n steps."
    152     mult = stop/start
--> 153     step = mult**(1/(n-1))
    154     return np.array([start*(step**i) for i in range(n)])
    155

ZeroDivisionError: division by zero
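The root cause is visible in the last frame: even_mults computes step = mult**(1/(n-1)), so n = len(self.layer_groups) = 1 makes the exponent 1/0. A quick standalone reproduction (even_mults re-implemented here with a plain list instead of np.array, so it runs without fastai or numpy):

```python
def even_mults(start, stop, n):
    """Build a log-stepped list from `start` to `stop` in `n` steps
    (same math as fastai.core.even_mults, minus the np.array wrapper)."""
    mult = stop / start
    step = mult ** (1 / (n - 1))   # 1/(n-1) blows up when n == 1
    return [start * (step ** i) for i in range(n)]

# With several layer groups this works fine: three log-spaced learning rates.
print(even_mults(1e-6, 1e-4, 3))

# With a single layer group it reproduces the error from the traceback.
try:
    even_mults(1e-6, 1e-4, 1)
except ZeroDivisionError as e:
    print("ZeroDivisionError:", e)
```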

This error is usually thrown whenever we do not generate layer groups. How are you creating your model? And was it designed for a transfer-learning type application? If so, we need to split the model at some point! If not, we can safely just pass in a single learning rate. 🙂
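For reference, "splitting" just means partitioning the model's layers into groups so that lr_range can assign each group its own learning rate. A toy sketch of the idea (plain Python strings standing in for the nn.Modules that fastai's Learner.split actually works on; the layer names and split points are made up):

```python
def split_layers(layers, split_points):
    """Partition a flat list of layers into groups at the given indices."""
    groups, prev = [], 0
    for idx in split_points:
        groups.append(layers[prev:idx])
        prev = idx
    groups.append(layers[prev:])
    return groups

# Hypothetical backbone + head; real fastai code splits nn.Module children.
layers = ["stem", "block1", "block2", "block3", "head"]
layer_groups = split_layers(layers, [1, 4])
print(layer_groups)
# [['stem'], ['block1', 'block2', 'block3'], ['head']]
print(len(layer_groups))  # 3, so lr_range can spread a slice() across groups
```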


Thanks for your reply. It is for transfer learning (densenet201), loaded using:

learn = load_learner('…/', file='model330.pkl')
learn.data = data

The model330.pkl was unfrozen when I saved it, so is it that all layers of learn belong to one group and nothing is frozen? Could you please explain how to split? Thanks.

Hey, I have the same issue. Did you figure out the problem?

Hey, I figured out why this is happening for me. I had used learn.export, which does not save layer_groups. So when you export and load a model, you need to recreate layer_groups. The way I did it was to create another fresh model, then simply copy its layer_groups to the loaded model with the weights I want.


@akashgshastri can you please share your code? I am struggling with this problem as well.

What I understood from @akashgshastri's comment is:

  1. Create a fresh model

learn = cnn_learner(data, models.{YOUR MODEL}, metrics=[accuracy])

  2. Load the saved weights

saved_model = learn.load('{SAVED MODEL}')

  3. Copy the layer groups

saved_model.layer_groups = learn.layer_groups

Hi, I am a newbie to fastai. I am getting a ZeroDivisionError when using:

en_learn.fit_one_cycle(5, max_lr = slice(1e-6, 1e-5), callbacks=[ShowGraph(en_learn),SaveModelCallback(en_learn)])

Please find the error below:

ZeroDivisionError                         Traceback (most recent call last)
<ipython-input> in <module>
----> 1 en_learn.fit_one_cycle(5, max_lr = slice(1e-6, 1e-5), callbacks=[ShowGraph(en_learn),SaveModelCallback(en_learn)])

/opt/conda/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
     17     wd:float=None, callbacks:Optional[CallbackList]=None, tot_epochs:int=None, start_epoch:int=None)->None:
     18     "Fit a model following the 1cycle policy."
---> 19     max_lr = learn.lr_range(max_lr)
     20     callbacks = listify(callbacks)
     21     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,

/opt/conda/lib/python3.7/site-packages/fastai/basic_train.py in lr_range(self, lr)
    186     "Build differential learning rates from lr."
    187     if not isinstance(lr,slice): return lr
--> 188     if lr.start: res = even_mults(lr.start, lr.stop, len(self.layer_groups))
    189     else: res = [lr.stop/10]*(len(self.layer_groups)-1) + [lr.stop]
    190     return np.array(res)

/opt/conda/lib/python3.7/site-packages/fastai/core.py in even_mults(start, stop, n)
    151     "Build log-stepped array from start to stop in n steps."
    152     mult = stop/start
--> 153     step = mult**(1/(n-1))
    154     return np.array([start*(step**i) for i in range(n)])
    155

ZeroDivisionError: division by zero

I am using an EfficientNet model loaded via "from efficientnet_pytorch import EfficientNet". I would love to hear your thoughts on how to resolve this. Thanks in advance.

More than likely you are not splitting your model, so there are no layer groups. Just the one.

Can you please let me know how to split the model and create layer groups for it? I tried the method mentioned above ("saved_model.layer_groups = learn.layer_groups"), but it didn't work for me. Thanks in advance.

Could it be that you are loading a model from an export rather than a save? I had this error as well and that was my problem.

As far as I understand, export is for inference and the model layer group is lost. If you want to save/load your model to avoid losing your work and keep the ability to further train, you should use save instead of export.

You can check your model's layer groups with
len(learn.layer_groups); if it is 1, there's only one group.
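The two branches of lr_range (quoted in the tracebacks above) also explain the earlier tip about passing a single learning rate: a slice with both start and stop calls even_mults, which needs at least two layer groups, but a stop-only slice never does. A standalone mirror of that logic (pure Python, no fastai needed; returns a list where fastai returns an np.array):

```python
def lr_range(lr, n_groups):
    """Mirror of fastai's Learner.lr_range for a model with n_groups layer groups."""
    if not isinstance(lr, slice):
        return lr
    if lr.start:
        # log-spaced rates from start to stop -- needs n_groups >= 2
        mult = lr.stop / lr.start
        step = mult ** (1 / (n_groups - 1))
        return [lr.start * (step ** i) for i in range(n_groups)]
    # stop-only slice: stop/10 for every group but the last
    return [lr.stop / 10] * (n_groups - 1) + [lr.stop]

print(lr_range(slice(1e-4), 1))       # [0.0001] -- fine with a single group
try:
    lr_range(slice(1e-6, 1e-4), 1)    # start + stop with a single group
except ZeroDivisionError:
    print("ZeroDivisionError, as in the traceback")
```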
