Learner layer_groups of length 1? ZeroDivisionError with max_lr slice

meli · November 14, 2018, 9:15am

Hi,
I am using fastai v.1.0.20, and I am trying to pass a slice as max_lr when calling fit_one_cycle, but I am running into the following error:

ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-42-c91c52def3d0> in <module>()
----> 1 learner.fit_one_cycle(5, max_lr=slice(1e-5, 1e-4))

~/anaconda3/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     16                   wd:float=None, callbacks:Optional[CallbackList]=None, **kwargs)->None:
     17     "Fit a model following the 1cycle policy."
---> 18     max_lr = learn.lr_range(max_lr)
     19     callbacks = ifnone(callbacks, [])
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,

~/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in lr_range(self, lr)
    148         "Build differential learning rates."
    149         if not isinstance(lr,slice): return lr
--> 150         if lr.start: res = even_mults(lr.start, lr.stop, len(self.layer_groups))
    151         else: res = [lr.stop/3]*(len(self.layer_groups)-1) + [lr.stop]
    152         return np.array(res)

~/anaconda3/lib/python3.7/site-packages/fastai/core.py in even_mults(start, stop, n)
    102     "Build evenly stepped schedule from `start` to `stop` in `n` steps."
    103     mult = stop/start
--> 104     step = mult**(1/(n-1))
    105     return np.array([start*(step**i) for i in range(n)])
    106 

ZeroDivisionError: division by zero

I created my model with :

body = create_body(models.resnet34(True), -2)
custom_model = models.unet.DynamicUnet(body, n_classes = len(classes))
learner = Learner(bunch, custom_model, loss_func=nn.BCEWithLogitsLoss(), 
                  metrics=[get_jaccard],  callback_fns=ShowGraph)

But when I check the length of layer_groups, I get 1, which is why I get a division by zero error. layer_groups looks like this:

[Sequential(
   (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
   (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (2): ReLU(inplace)
   (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
   (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
...)]

Is it normal that I have to run:

learner.layer_groups = learner.layer_groups[0]

to be able to use a slice in max_lr? Is that even the correct approach?

Thank you very much !

sgugger · November 14, 2018, 2:22pm

If you don’t use the create_unet function, the learner object doesn’t know how to split the layers for the different learning rates. Look at the source code to see how we usually do it (or use the create_unet function).
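
To see why a single layer group triggers the error, here is a minimal plain-Python sketch of the two functions from the traceback. It is an illustration only, not the fastai source: lists stand in for numpy arrays, and `lr_range` takes a hypothetical `n_groups` argument in place of `len(self.layer_groups)`.

```python
def even_mults(start, stop, n):
    """Plain-Python copy of the helper shown in the traceback (lists, not numpy)."""
    mult = stop / start
    step = mult ** (1 / (n - 1))  # with n == 1 this is 1 / 0
    return [start * (step ** i) for i in range(n)]


def lr_range(lr, n_groups):
    """Mimics the slice handling of Learner.lr_range.

    n_groups is a stand-in (assumption) for len(learner.layer_groups).
    """
    if not isinstance(lr, slice):
        return lr
    if lr.start:
        return even_mults(lr.start, lr.stop, n_groups)
    return [lr.stop / 3] * (n_groups - 1) + [lr.stop]


# Three layer groups: one learning rate per group, geometrically spaced.
rates = lr_range(slice(1e-5, 1e-3), 3)

# One layer group: the same kind of call divides by zero.
try:
    lr_range(slice(1e-5, 1e-4), 1)
    failed = False
except ZeroDivisionError:
    failed = True
```

With three groups the rates come out at roughly 1e-5, 1e-4 and 1e-3, while the one-group call raises the same ZeroDivisionError as in the question. Judging from the `lr_range` code in the traceback, a slice without a start, such as `slice(1e-4)`, takes the other branch and would not hit the division even with one group, but it also gives no differential learning rates. The model still needs to be split into several layer groups for that.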