Dear @wgpubs, sorry if I missed this yesterday.
I was wondering about this chart: are the green bars the ones that influence our predictions the most?
Or should we just be looking at the length of the bars?
Also, this part doesn’t make sense to me: he says ModelID and fiModelDesc have a high number…
so why did he decide to remove fiModelDescriptor?
I am a bit confused.
I am wondering the same thing.
I think it may be a typo … what they meant to do is drop fiModelDesc,
because, based on the numbers, it seems like it may have a strong correlation with ModelID.
Could you link to this post from the official lecture thread? It would be nice if either Jeremy or Sylvain could confirm things either way.
As I understand it, what this shows for a given row (a specific record) is how we got from the mean value of our target/dependent variable (the prediction before any splits) to the final prediction for that record (the column labeled “net”).
So in the example, that mean value is 10.104 (where we start) … and the prediction for this record is 9.982 (where we end up). “YearMade” had a negative effect on the price, dropping it by .423, whereas “ProductSize” had a positive effect, raising it by .201 over that mean. The waterfall plot shows the effect each of our independent variables had in getting us to that 9.982 prediction.
-.423 + .201 + .046 + .127 - .043 + .038 - .098 + .084 - .055 ≈ -0.122
9.982 - 10.104 = -0.122
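For reference, here is roughly how the lesson gets those per-record contributions with the treeinterpreter package (a sketch; m is assumed to be the trained random forest and row the single record from the lesson):

from treeinterpreter import treeinterpreter as ti

# By construction: prediction = bias + sum of per-feature contributions
prediction, bias, contributions = ti.predict(m, row.values[None])  # [None] makes the single row 2-D
print(bias[0])                 # mean of the dependent variable (10.104 here)
print(contributions[0].sum())  # net effect of all the splits (-0.122 here)
print(prediction[0])           # final prediction for this record (9.982 here)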
Thank you so much!
I had a question regarding the max_card value set in the cont_cat_split method in the tabular lesson.
Any thoughts?
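In case it helps others, here is a minimal sketch of what max_card controls (the toy DataFrame is made up for illustration): integer columns with more than max_card unique values, and all float columns, are treated as continuous; everything else is treated as categorical.

import pandas as pd
from fastai2.tabular.all import *

df = pd.DataFrame({
    'color': ['red', 'green', 'blue', 'red'],  # 3 unique strings -> categorical
    'age':   [23, 45, 31, 52],                 # 4 unique ints (> max_card) -> continuous
    'price': [10.5, 20.0, 15.2, 30.1]})        # dependent variable
cont, cat = cont_cat_split(df, max_card=3, dep_var='price')
print(cont, cat)  # ['age'] ['color']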
Are you on an old version of fastai2? I think databunch has been renamed to dataloaders.
Anyway, to me it looks like data.databunch is returning an ImageDataBunch object. So you would need to do something like:
db = data.databunch(bs=64)
db.train
Though again I am running on a different version of fastai, so I can’t test to make sure.
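On newer fastai2 builds, the renamed call would look something like this (a sketch, assuming data supports the v2 API):

dls = data.dataloaders(bs=64)  # databunch was renamed to dataloaders
dls.train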
Quick question about Object Detection. I was trying to run the COCO dataset for object detection. I followed the instructions here to create the DataLoaders: https://dev.fast.ai/tutorial.datablock#Bounding-boxes
coco = DataBlock(blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(),
                 get_y=[lambda o: img2bbox[o.name][0], lambda o: img2bbox[o.name][1]],
                 item_tfms=Resize(128),
                 batch_tfms=aug_transforms(),
                 n_inp=1)
coco_dls = coco.dataloaders(coco_source)  # coco_source: the image folder, as in the tutorial
I am trying to define a learner using the unet architecture:
coco_learn = unet_learner(coco_dls, resnet18)
However, when I try to fit it with coco_learn.fit_one_cycle(2),
I am getting the following error.
Note that the get_y here is unusual: it returns both the bounding box and the label: get_y=[lambda o: img2bbox[o.name][0], lambda o: img2bbox[o.name][1]]
What am I getting wrong here? Is unet_learner not suitable for object detection? I have seen it being used for segmentation. Or am I missing something else?
Here is the error:
TypeError Traceback (most recent call last)
<ipython-input-30-743c03b6f6bf> in <module>()
----> 1 coco_learn.fit_one_cycle(2)
/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in _f(*args, **kwargs)
429 init_args.update(log)
430 setattr(inst, 'init_args', init_args)
--> 431 return inst if to_return else f(*args, **kwargs)
432 return _f
433
/usr/local/lib/python3.6/dist-packages/fastai2/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
111 scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
112 'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 113 self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
114
115 # Cell
/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in _f(*args, **kwargs)
429 init_args.update(log)
430 setattr(inst, 'init_args', init_args)
--> 431 return inst if to_return else f(*args, **kwargs)
432 return _f
433
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
198 try:
199 self.epoch=epoch; self('begin_epoch')
--> 200 self._do_epoch_train()
201 self._do_epoch_validate()
202 except CancelEpochException: self('after_cancel_epoch')
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in _do_epoch_train(self)
173 try:
174 self.dl = self.dls.train; self('begin_train')
--> 175 self.all_batches()
176 except CancelTrainException: self('after_cancel_train')
177 finally: self('after_train')
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in all_batches(self)
151 def all_batches(self):
152 self.n_iter = len(self.dl)
--> 153 for o in enumerate(self.dl): self.one_batch(*o)
154
155 def one_batch(self, i, b):
/usr/local/lib/python3.6/dist-packages/fastai2/learner.py in one_batch(self, i, b)
159 self.pred = self.model(*self.xb); self('after_pred')
160 if len(self.yb) == 0: return
--> 161 self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')
162 if not self.training: return
163 self.loss.backward(); self('after_backward')
TypeError: __call__() takes 3 positional arguments but 4 were given
It’s been great having you host the sessions, @wgpubs.
I really appreciate all the work you’ve done
to help us beginners understand the lessons!
Best regards,
Butch
My pleasure. It was a lot of fun.
Stay safe and sane for however long your lockdown continues. See ya in part 2.
Hello @wgpubs. Do you know how to do this? (“you can define how the layers in a model are split up”)
For example, how can I define 1 group = 1 layer and pass lr_max an array of learning rates (one for each group, which in this case means one for each layer)?
Sure.
Take a look at how I do it with Hugging Face here: https://ohmeow.github.io/blurr/modeling-core/#hf_splitter
Lmk if you have any questions on it.
-wg
Splitting the model is kind of easy if you build it with nn.Sequential and can index into the model (model[0][5], for example).
You can define a splitter function that creates, say, three groups:
- Group 1: layers 0…5 of the body (model[0][:6])
- Group 2: layers 6…end of the body (model[0][6:])
- Group 3: the head of the model (model[1])
def splitter(m):
    return L(m[0][:6], m[0][6:], m[1]).map(params)
And pass it to the learner:
learn = Learner(dls, model, metrics=accuracy, splitter=splitter, cbs=ShowGraphCallback())
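And, to answer the original question, with the splitter in place you can then pass one learning rate per group when training (a sketch, with arbitrary values):

learn.fit_one_cycle(2, lr_max=[1e-5, 1e-4, 1e-3])  # one LR per parameter group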
See this notebook for the complete code:
Hi @wgpubs.
Do you know the difference between learn.opt.param_lists and learn.opt.param_groups?
len(learn.opt.param_groups) and len(learn.opt.param_lists) give the same result.
Do you know how to check the Learning Rate value(s) with a PyTorch and/or fastai v2 function?
For example, learn.lr gives only the Learning Rate of the last layer group (the head of the model).
Not really. You can print out the param groups, but it’s not intuitive as to what layers they correspond to (I can’t remember the syntax exactly, but I think it’s something like learn.opt.param_groups).
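Something like this should at least let you see the groups (a sketch; param_lists is the fastai2 attribute holding the per-group parameter tensors):

print(len(learn.opt.param_lists))  # number of parameter groups
for i, group in enumerate(learn.opt.param_lists):
    print(i, [p.shape for p in group])  # shapes of the tensors in each group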
Here’s the documentation for how discriminative LRs are defined and applied in fastai.
[ EDIT 05/25/2020 ] I found the fastai2 function that lets you get the Learning Rate values used by the optimizer. See my post about that.
Thank you, but this does not answer my question (and you gave a link to the fastai v1 docs, not v2).
For example, if you use lr_max=slice(10e-6, 10e-4), how do you get the Learning Rate values actually used by the optimizer? (I’m not talking about getting the parameter groups.)
Hi @florianl. You’re right about that. I went on to test in this notebook (nbviewer version) how to create parameter groups for a more complicated model like resnet18.
I also tested all the methods for passing different Learning Rates (one per parameter group).
In fact, there are 4 possibilities, not 3.
For example, with 3 parameter groups, you can do the following (for an unfrozen Learner, using learn.fit_one_cycle()):
1. if lr_max = 1e-3 -> [0.001, 0.001, 0.001]
2. if lr_max = slice(1e-3) -> [0.0001, 0.0001, 0.001]
3. if lr_max = slice(1e-5, 1e-3) -> array([1.e-05, 1.e-04, 1.e-03]) # LRs evenly geometrically spaced
4. if lr_max = [1e-5, 1e-4, 1e-3] -> array([1.e-05, 1.e-04, 1.e-03]) # an explicit list of LRs, evenly spaced or not
Explanations:
1. All parameter groups use the same Learning Rate (and the same optimizer setup, e.g. Adam + 1cycle policy, for all).
2. The last layer group’s (max) Learning Rate is set to lr, and those of all previous parameter groups to lr/10.
3. The very first layers train at a Learning Rate of 1e-5, the very last at 1e-3, and the Learning Rates of the other parameter groups are evenly geometrically spaced between these two values.
4. The very first layers train at a Learning Rate of 1e-5, the very last at 1e-3, and the Learning Rates of the other parameter groups are evenly linearly spaced between these two values; or you can pass any Learning Rate values as a list.
WARNING
Points 3 and 4 are not equivalent when the number of parameter groups is greater than 3!
- Point 3: the Learning Rates are calculated geometrically.
- Point 4: you pass an array with exactly the Learning Rate values you want.
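To see the difference concretely, here is a quick sketch with 5 parameter groups (np.geomspace reproduces the even geometric spacing that a slice produces, per point 3; the linspace line is just one list you could pass explicitly instead, per point 4):

import numpy as np

n_groups = 5  # e.g. a model split into 5 parameter groups
print(np.geomspace(1e-5, 1e-3, n_groups))  # point 3: 1e-05, 3.16e-05, 1e-04, 3.16e-04, 1e-03
print(np.linspace(1e-5, 1e-3, n_groups))   # point 4: 1e-05, 2.575e-04, 5.05e-04, 7.525e-04, 1e-03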
List of Learning Rates: last values of the cosine annealing in the 1cycle policy
To get the list of Learning Rates (one per parameter group) that the Optimizer applied to the Learner during training, you can display the hyperparameters using learn.opt.hypers, as follows:
for i,h in enumerate(learn.opt.hypers):
    print(i, h)
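Each h here is a dict of that group’s hyperparameters (for an Adam-style optimizer this typically includes 'lr', 'mom', 'sqr_mom', 'eps' and 'wd'), so the 'lr' entries give you the Learning Rate actually used by each parameter group.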