Documentation improvements

In this case it will freeze group (0), keeping group (1) trainable. In the case of transfer learning, that means everything but the custom head (1) gets frozen.

Sequential(
  (0): Sequential(...)
  (1): Sequential(...)
)

You can use learn.summary to see the effects before and after this call (hint: Trainable column).
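
For instance, a quick way to see it for yourself (a sketch for fastai v1; assumes a `learn` object already built for transfer learning):

learn.freeze()     # freeze group (0); the custom head in group (1) stays trainable
learn.summary()    # the Trainable column shows False for the layers of group (0)

learn.unfreeze()   # make everything trainable again
learn.summary()    # the Trainable column now shows True throughout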

Since you posted this question in the doc improvement thread, once you get the clarity please consider expanding that one-line entry to be clearer, perhaps with some examples, and submit a PR with it. Thank you.

Thanks! This is very helpful. @stas

I tried a few times to make a PR but found it quite tedious to see through to the end. Also, I am not sure whether what I did meets the documentation requirements.

As for freeze, I could not find it in basic_train.ipynb. Could you point me to the right notebook?

Before I can do the PR, I would like to contribute back too. Here is what I learned about freeze in a Kaggle kernel, based on your suggestion.

I found the idea of converting documentation into online kernels very appealing.

As for freeze, I could not find it in basic_train.ipynb. Could you point me to the right notebook?

Just replace .html with .ipynb in the docs URL; all the freeze functions are here: https://docs.fast.ai/basic_train.html#Discriminative-layer-training
So the source is here: https://github.com/fastai/fastai/blob/master/docs_src/basic_train.ipynb

Your proposed doc looks great, Daniel, with perhaps just this modified:

all transfer learning models have only two layer groups, printed out by learn.model: (0) Sequential and (1) Sequential; the second or last one is the head, which includes multiple layers inside (this is why it is called a group).

It’s probably not guaranteed that it will always be 2 groups, as it depends on the model. The last group in transfer learning, which is the last fully connected layers (everything after the last conv layer), is replaced with a custom head of fully connected layers and a few other non-convolutional layers to adapt for the desired regression/classification output.
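
For example (a sketch, assuming `learn` was created with cnn_learner/create_cnn on a resnet):

len(learn.layer_groups)  # 3 for a resnet-based learner, not necessarily 2
learn.model[1]           # the custom head: pooling, Flatten, BatchNorm, Dropout and Linear layers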

Finally, adding just the doc part of your kernel should be sufficient, with perhaps a few hints on how users can see for themselves the Trainable column and know for sure what their freeze/unfreeze command did. For an example please see: https://docs.fast.ai/callbacks.hooks.html#model_summary (hint: it doesn’t have to be code, you can just paste a small chunk of the output into the markdown cell).

It should be relatively easy to make a PR using the helper tool https://docs.fast.ai/dev/git.html#helper-program as it does all the setup for you. Then the only nuance is to sort out how to edit the docs_src nbs (and remembering to save them) as explained at https://docs.fast.ai/gen_doc_main.html#process-for-contributing-to-the-docs. Once you’ve done that a few times it’s easy to remember. If you get stuck, please don’t hesitate to ask for support here in this forum.

This is very helpful @stas , Thanks!

I have edited accordingly; could you have another look at it?

If the doc explanation part is ok, I will try to push a PR for both freeze and freeze_to in basic_train.ipynb.

Excellent improvements, @Daniel.

These need to be improved next:

  • freeze_to(-3) is equivalent to unfreeze(); all trainable parameters are ready to train.
  • freeze_to(-2) only freezes a small proportion of conv layers in the (0) Sequential

freeze_to operates only on groups, and not what you wrote above.

Here is the simplified (meta) code:

    def freeze_to(self, n:int)->None:
        # groups before index n get frozen, the rest get unfrozen
        for g in self.layer_groups[:n]: freeze(g)
        for g in self.layer_groups[n:]: unfreeze(g)

freeze_to is useful when there are more than 2 groups, in more complex models. So just saying that n is the index of the group is probably good enough.
See https://github.com/fastai/fastai/blob/master/fastai/basic_train.py#L208.

Then freeze_to(0) == unfreeze
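
And here is a self-contained sketch of the same idea in plain PyTorch, if you want to play with it (`layer_groups` below is a hypothetical stand-in for learn.layer_groups):

import torch.nn as nn

# a hypothetical model split into 3 groups, standing in for learn.layer_groups
layer_groups = [nn.Sequential(nn.Conv2d(3, 8, 3)),
                nn.Sequential(nn.Conv2d(8, 16, 3)),
                nn.Sequential(nn.Linear(16, 2))]

def freeze_to(n):
    for g in layer_groups[:n]:   # groups before index n: frozen
        for p in g.parameters(): p.requires_grad = False
    for g in layer_groups[n:]:   # groups from index n onward: trainable
        for p in g.parameters(): p.requires_grad = True

freeze_to(0)    # no group before index 0, so everything is trainable == unfreeze
freeze_to(-1)   # all but the last group frozen == what freeze does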

@stas Thanks a lot for your help on freeze_to and you are right that seeing the source code is very illuminating.

I have noticed in the run-tests step of the PR guide:

In the docs_src folder, if you made changes to the notebooks, run:

 cd docs_src
 ./run_tests.sh

You will need at least 8GB free GPU RAM to run these tests.

I guess that is the reason why running the tests takes ages, as my Mac has no GPU.

Do we have to run tests on all docs_src notebooks even though I only edited basic_train.ipynb? Is there a way to run the tests faster locally?

Thanks!

Please have a look at: https://docs.fast.ai/gen_doc_main.html#process-for-contributing-to-the-docs
No need to do anything beyond what it says, i.e. no need to run any tests :wink:
Just edit the ipynb, save it and commit. 4 steps as it says, no more.

This is a relief, thanks! @stas
I have just pushed a PR for the first time, but found that an additional file, fastai-make-pr-branch, has been added. Is it my error? Should I remove this file and push the PR again?
Thanks a lot!

I followed up in the PR.

Hi @stas
Is keeping my new-feature-branch updated with master or upstream/master not necessary for a PR?
If I want to update my new-feature-branch with upstream/master before PR or before PR is accepted, should I do git merge --no-edit upstream/master?

Hi @stas Thanks for your help to get my first PR merged!

Although my PR procedure is still clumsy, it works! I will keep practising and when it is more fluent I will make a video guide in Chinese so that it may be easier for others to follow.

Hi @stas
I have followed your guide and made corrections to my previous understanding of freeze_to in a new Kaggle kernel.

Should I add this understanding to the freeze_to docs? What do you think?

Thanks!

If github doesn’t indicate a conflict (which happens when the same files you edited have diverged since you checked them out), you don’t need to keep it in sync.

If I want to update my new-feature-branch with upstream/master before PR or before PR is accepted, should I do git merge --no-edit upstream/master?
Yes, after fetch:

git fetch upstream
git merge --no-edit upstream/master
# resolve conflicts if any followed by `git add` for resolved files
git push

Alternatively, you can sync your forked master first:

git fetch upstream
git checkout master
git merge --no-edit upstream/master
# resolve conflicts if any followed by `git add` for resolved files
git push --set-upstream origin master

and then update your branch:

git checkout your-branch-name
git merge origin/master
git push

We should make a script to automate this, except that if you’re doing this because github indicates a conflict, the merge is likely to fail and require manual conflict resolution.

But as I said it’s rare when you need to rebase your PR branch.

Sure, why not. I’d just tweak it to show that your first code sample is pseudo-code, but I can do it from your PR.

Thanks @stas
I have pushed a PR to explain how freeze_to works under the hood here, and made a few edits to the kaggle kernel version trying to make the wording clearer. Could you have a look? Thanks!

Hi @stas
Since I have done two PRs by now and have a little more confidence in the procedure, I have created a visual guide for myself and other beginners.

Could you have a look? Thanks!

Since I have done two PRs by now and have a little more confidence in the procedure, I have created a visual guide for myself and other beginners.

Looks great, @Daniel! I linked to it from https://docs.fast.ai/gen_doc_main.html#step-4-submit-a-pr-with-your-changes - just please don’t change the url or if you do, send a PR that fixes it. Thanks.

The only recommendation I’d add is to include the commands from your console snapshots as text, so that users can copy-n-paste them.

Also I expanded on the branch update section: https://docs.fast.ai/dev/git.html#how-to-keep-your-feature-branch-up-to-date

I see Sylvain beat me to merging your commit, @Daniel! Your final edits were good. Thank you.

BTW, you don’t need to notify us in the forums when you make a PR; github sends everybody who is interested in watching PRs and Issues a notification (email or browser), so it will be seen and attended to when the maintainers get a chance.

Thanks @stas

The url won’t be changed, and I have added the commands to the guide for easy copy-n-paste, together with the relevant links you recommended.

Hi @stas

When we do freeze or unfreeze, we do it to ‘layer groups’. We know ‘layer groups’ are groups of layers of a model. Different models may have different numbers of layer groups; some have 2, some have more.

The natural question to follow is: why do we use ‘layer groups’ instead of individual layers? Of course, it is much simpler to deal with a few layer groups than with dozens or hundreds of layers. But how does the model designer choose which layers to group together and how many groups to have? What purpose does it serve besides convenience?

One small thing I want to check is whether learn.layer_groups comes with the Resnet model itself or is a feature of fastai.

I dug a little into vision.models.resnet34 and found that the model has 4 types of ‘layers’ (which rather look like learn.layer_groups), but when looking into learn.model the layer groups are not quite the same. Also, learn has 3 layer groups, while Resnet34 has 4 so-called ‘layers’. Is there a relationship between Resnet34’s layer1, layer2, layer3, layer4 and learn.layer_groups? If so, what is it?

The paragraphs and links above are from my kaggle kernel, you can see details there.
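
Here is a quick sketch of how I probed it (fastai v1; assumes a `data` ImageDataBunch is already built):

from fastai.vision import *
import torchvision

learn = cnn_learner(data, models.resnet34)  # create_cnn in older fastai v1
len(learn.layer_groups)                     # 3 -- a split made by fastai, not by the model
hasattr(torchvision.models.resnet34(), 'layer_groups')  # False -- not part of the torchvision model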

Thanks!