Fit_one_cycle division by 0 error

(JianWen Yang) #1

I’m using fast.ai for computer vision. Before unfreezing, I used fit_one_cycle to train for several epochs and it worked fine, but after unfreezing it threw this error. How can I fix it? (Other versions of this kernel train fine after unfreezing; only this one throws an error.)

(JianWen Yang) #2

It seems the error is caused by len(self.layer_groups) being 1 in lr_range(self, lr) in basic_train.py, so even_mults gets called with n=1. Here’s the error message.

learn.fit_one_cycle(4, max_lr=slice(1e-6,1e-4))

ZeroDivisionError Traceback (most recent call last)
in
1 #learn.fit_one_cycle(4, max_lr=slice(5e-6,1e-3))
----> 2 learn.fit_one_cycle(4, max_lr=slice(1e-6,1e-4))

/opt/conda/lib/python3.6/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
16 wd:float=None, callbacks:Optional[CallbackList]=None, tot_epochs:int=None, start_epoch:int=None)->None:
17 "Fit a model following the 1cycle policy."
---> 18 max_lr = learn.lr_range(max_lr)
19 callbacks = listify(callbacks)
20 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,

/opt/conda/lib/python3.6/site-packages/fastai/basic_train.py in lr_range(self, lr)
185 "Build differential learning rates from lr."
186 if not isinstance(lr,slice): return lr
--> 187 if lr.start: res = even_mults(lr.start, lr.stop, len(self.layer_groups))
188 else: res = [lr.stop/10]*(len(self.layer_groups)-1) + [lr.stop]
189 return np.array(res)

/opt/conda/lib/python3.6/site-packages/fastai/core.py in even_mults(start, stop, n)
151 "Build log-stepped array from start to stop in n steps."
152 mult = stop/start
--> 153 step = mult**(1/(n-1))
154 return np.array([start*(step**i) for i in range(n)])
155

ZeroDivisionError: division by zero
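
The failure can be reproduced in isolation. The snippet below is a minimal sketch: a plain-Python copy of the even_mults logic shown in the traceback, returning a list instead of a NumPy array. It is an illustration only, not the fastai source.

```python
def even_mults(start, stop, n):
    "Build a log-stepped list from start to stop in n steps (mirrors the traceback)."
    mult = stop / start
    step = mult ** (1 / (n - 1))  # with n == 1 this evaluates 1/0
    return [start * (step ** i) for i in range(n)]

# Three layer groups: one learning rate per group, log-spaced.
lrs = even_mults(1e-6, 1e-4, 3)

# A single layer group: the exponent 1/(n-1) divides by zero.
try:
    even_mults(1e-6, 1e-4, 1)
    raised = False
except ZeroDivisionError:
    raised = True
```

So a slice with both a start and a stop only works when the learner has at least two layer groups.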

(Zachary Mueller) #3

This error is usually thrown when no layer groups have been generated. How are you creating your model? Was it designed for a transfer-learning application? If so, we need to split the model at some point! If not, we can safely just pass in a single learning rate.
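
To see why a single learning rate sidesteps the error, here is a rough sketch of the lr_range branching taken from the traceback in this thread. It is a standalone stand-in, not the fastai method: n_groups is a hypothetical parameter replacing len(self.layer_groups), and it returns plain lists rather than NumPy arrays.

```python
def even_mults(start, stop, n):
    "Log-stepped list from start to stop in n steps (copied from the traceback)."
    mult = stop / start
    step = mult ** (1 / (n - 1))
    return [start * (step ** i) for i in range(n)]

def lr_range(lr, n_groups):
    "Stand-in for Learner.lr_range; n_groups replaces len(self.layer_groups)."
    if not isinstance(lr, slice):
        return lr                      # a plain float is passed through untouched
    if lr.start:
        return even_mults(lr.start, lr.stop, n_groups)
    # slice(stop) only: earlier groups get stop/10, the last group gets stop
    return [lr.stop / 10] * (n_groups - 1) + [lr.stop]

single = lr_range(1e-4, 1)             # float: never reaches even_mults
stop_only = lr_range(slice(1e-4), 1)   # slice without start: also safe

try:
    lr_range(slice(1e-6, 1e-4), 1)     # start and stop with one group
    failed = False
except ZeroDivisionError:
    failed = True
```

With only one layer group, a float or a stop-only slice goes through, while slice(start, stop) reaches even_mults and fails.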

(JianWen Yang) #4

Thanks for your reply. It is for transfer learning (densenet201), loaded using:

learn = load_learner('…/', file='model330.pkl')
learn.data = data

The model in model330.pkl was unfrozen when I saved it, so does that mean all layers of learn belong to one group and nothing is frozen? Could you please explain how to split it? Thanks.

(Akash ) #5

Hey, I have the same issue. Did you figure out the problem?

(Akash ) #6

Hey, I figured out why this is happening for me. I had used learn.export, which does not save layer_groups, so when you export and load a model you need to create layer_groups again. The way I did it was to create another fresh model, then simply copy its layer_groups to the loaded model with the weights I want.
