I’m working on Colab and have been experimenting with several of the tweaks offered by XResNet. I’ve made some strange observations that prompted me to write this post.
- When I choose the clean approach of `cnn_learner` with no tweaks in the model, I’m able to train the network without any memory error:
```python
arch = xresnet50
learn = cnn_learner(dls, arch, opt_func=ranger, metrics=error_rate)
```
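(For completeness, here is roughly my full working setup; the Imagenette data below is just a stand-in for my actual dataset:)

```python
from fastai2.vision.all import *

# Stand-in data pipeline; my real DataLoaders are built the same way
path = untar_data(URLs.IMAGENETTE_160)
dls = ImageDataLoaders.from_folder(path, valid='val', item_tfms=Resize(160))

arch = xresnet50
learn = cnn_learner(dls, arch, opt_func=ranger, metrics=error_rate)
learn.lr_find()  # works fine, no memory error
```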
- Now, since I want to customize the architecture, I opted for `Learner` instead and tried instantiating the `model` as done by @ducha-aiki in this notebook:
```python
model = xresnet50(n_out=dls.c)
learn = Learner(dls, model, opt_func=ranger, metrics=error_rate,
                splitter=lambda m: L(m[0][:3], m[0][3:], m[1:]).map(params))
```
This causes the following error when I run `learn.lr_find()`:
```
TypeError                                 Traceback (most recent call last)
<ipython-input-38-d81c6bd29d71> in <module>()
----> 1 learn.lr_find()

/usr/local/lib/python3.6/dist-packages/fastai2/callback/schedule.py in lr_find(self, start_lr, end_lr, num_it, stop_div, show_plot, suggestions)
    221     n_epoch = num_it//len(self.dls.train) + 1
    222     cb=LRFinder(start_lr=start_lr, end_lr=end_lr, num_it=num_it, stop_div=stop_div)
--> 223     with self.no_logging(): self.fit(n_epoch, cbs=cb)
    224     if show_plot: self.recorder.plot_lr_find()
    225     if suggestions:

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    180     def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):
    181         with self.added_cbs(cbs):
--> 182             if reset_opt or not self.opt: self.create_opt()
    183             if wd is None: wd = self.wd
    184             if wd is not None: self.opt.set_hypers(wd=wd)

/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in create_opt(self)
    129     def _bn_bias_state(self, with_bias): return bn_bias_params(self.model, with_bias).map(self.opt.state)
    130     def create_opt(self):
--> 131         self.opt = self.opt_func(self.splitter(self.model), lr=self.lr)
    132         if not self.wd_bn_bias:
    133             for p in self._bn_bias_state(True ): p['do_wd'] = False

<ipython-input-34-1c418864616f> in <lambda>(m)
      1 learn = Learner(dls,model,opt_func=ranger,metrics=metrics,
----> 2     splitter=lambda m: L(m[0][:3],m[0][3:],m[1:]).map(params))

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in __getitem__(self, idx)
     66     def __getitem__(self, idx):
     67         if isinstance(idx, slice):
---> 68             return self.__class__(OrderedDict(list(self._modules.items())[idx]))
     69         else:
     70             return self._get_item_by_idx(self._modules.values(), idx)

TypeError: __init__() missing 1 required positional argument: 'nf'
```
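If I read the traceback right, my guess is that a raw `xresnet50` is one flat `Sequential` rather than the `(body, head)` pair the notebook's splitter expects, so `m[0]` here is the first stem `ConvLayer`. `ConvLayer` subclasses `nn.Sequential` with an `__init__` that requires `ni` and `nf`, and slicing makes PyTorch try to re-instantiate that subclass, hence the missing `'nf'`. A sketch of a splitter that slices the child list instead of the module object (the group boundaries below are my guess for the flat XResNet layout, not gospel):

```python
def xresnet_splitter(m):
    # Slice a plain Python list of children, so Sequential.__getitem__
    # never tries to rebuild a ConvLayer (or other subclass) from a slice.
    ch = list(m.children())
    return L(nn.Sequential(*ch[:4]),    # stem convs + maxpool (guessed boundary)
             nn.Sequential(*ch[4:-4]),  # residual stages (guessed boundary)
             nn.Sequential(*ch[-4:])    # pool + flatten + dropout + linear
            ).map(params)

learn = Learner(dls, model, opt_func=ranger, metrics=error_rate,
                splitter=xresnet_splitter)
```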
- I’m not sure, but I thought this had something to do with the `head` of the model, so I went ahead and wrote a custom `create_cnn_model`-style method as shown by @muellerzr in this notebook:
```python
def create_custom_model(arch):
    # create_body needs a callable, so wrap the arch with its custom args
    def get_arch(pretrained=True):
        return arch(sa=True, pool=MaxPool, act_cls=Mish, pretrained=pretrained)
    # alternative: arch = partial(xresnet50, sa=True, pool=MaxPool, act_cls=Mish)
    body = create_body(get_arch, cut=-4)             # drop the default head
    nf = num_features_model(body) * 2                # x2 for concat pooling in the head
    body = convert_MP_to_blurMP(body, nn.MaxPool2d)  # swap MaxPool for blur pooling
    head = create_head(nf, dls.c)
    model = nn.Sequential(body, head)
    return model
```
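For reference, this is roughly how I wire it up. Since `create_body` returns a plain `nn.Sequential`, the slicing splitter from before should work on `m[0]` here:

```python
model = create_custom_model(xresnet50)
learn = Learner(dls, model, opt_func=ranger, metrics=error_rate,
                splitter=lambda m: L(m[0][:3], m[0][3:], m[1:]).map(params))
learn.lr_find()
```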
Note: `create_body` needs a callable, hence one of the options was using `partial`. I tried using `partial` with all the required args predefined, which led to an “out of memory” error (see the sketch after the list below). Then I tried:
- removing `Mish`
- removing all the custom args (`partial(xresnet50)`), just to check whether any single param was causing the issue
- defining a closure `get_arch` as shown above
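The `partial` variant that hits the “out of memory” error looks like this (the rest of `create_custom_model` stays the same):

```python
from functools import partial

# All the custom args predefined; this is the version that OOMs for me
arch = partial(xresnet50, sa=True, pool=MaxPool, act_cls=Mish)
body = create_body(arch, cut=-4)
```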
Nothing helped. So apparently either the `partial` or the `Learner` is somehow causing the “out of memory” errors, since `cnn_learner` has no issues and I’m able to train the model.

So could we please discuss the ideal ways of dealing with custom architectures that won’t cause any memory errors?