Mostly convenience, and we empirically found it worked. As for where we put the split between the first and the second group, that came out of empirical trials too.
No, there is no link. A Learner in vision usually has three layer groups: the lower part of the body, the upper part of the body, and the head. The 4 layers you see in the resnet correspond to the 4 different parts, each ending with the image size being divided by 2.
Thank you so much for your explanation! it is very helpful!
When you say ‘empirically found it worked’, do you mean that, although in the lesson notebooks we only see two ways of freezing the model for training (normal mode: freeze everything up to the head, the last layer group; unfreeze mode: freeze no layers), there is actually a third way worth trying in practice, namely a middle mode (freeze everything up to the middle group, if there are three layer groups)?
The two different groups are intended for discriminative LRs (give a lower one to the first group), but you could certainly try unfreezing one group after the next.
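For anyone following along, here is a rough sketch of what the two freezing modes (plus the "middle" one discussed above) and discriminative LRs look like with the fastai v1 API; data is assumed to be an existing DataBunch and the learning rates are just placeholders:

from fastai.vision import *

learn = cnn_learner(data, models.resnet34, metrics=accuracy)

# normal mode: everything frozen except the head (the last layer group)
learn.freeze()
learn.fit_one_cycle(1)

# "middle" mode: keep only the first layer group frozen
learn.freeze_to(1)
learn.fit_one_cycle(1, max_lr=slice(1e-5, 1e-3))  # discriminative LRs, lower for earlier groups

# unfreeze mode: no layers frozen
learn.unfreeze()
learn.fit_one_cycle(1, max_lr=slice(1e-6, 1e-4))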
@ashaw, here is another small doc gen improvement request.
We get quite a few PRs from users modifying the autogenerated HTML files, since they don’t realize these are autogenerated.
In fastai_docs, when we autogenerate .py code, we inject this header at the top of the file:
#################################################
### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ###
#################################################
# file to edit: dev_nb/01_matmul.ipynb
So I was thinking perhaps it’d work to inject something similar in our html files? e.g. for docs/basic_train.html:
<!--
#################################################
### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ###
#################################################
# file to edit: docs_src/basic_train.ipynb
# instructions: https://docs.fast.ai/gen_doc_main.html
-->
I added ample vertical whitespace so that it will hopefully stand out from the dense HTML once the user opens the file in their editor. I’m not sure whether it can appear at the very top or has to go after the Jekyll front matter.
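If it helps, here is a rough sketch of how the injection could be done (just an illustration; inject_autogen_note and its place in the doc-gen pipeline are hypothetical):

AUTOGEN_NOTE = """<!--

#################################################
### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ###
#################################################
# file to edit: docs_src/{name}.ipynb
# instructions: https://docs.fast.ai/gen_doc_main.html

-->
"""

def inject_autogen_note(html, name):
    "Insert the note right after the closing --- of the Jekyll front matter, else at the very top."
    note = AUTOGEN_NOTE.format(name=name)
    if html.startswith('---'):
        end = html.index('---', 3) + 3  # end of the closing front matter delimiter
        return html[:end] + '\n' + note + html[end:]
    return note + html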
In order to do so, the module dependencies are carefully managed (see next section), with each exporting a carefully chosen set of symbols when using import *. In general, for interactive computing, you’ll want to import from both fastai, and from one of the applications, such as:
from fastai.vision import *
It seems to suggest that for interactive computing we should import in the following way:
from fastai import *
from fastai.vision import *
However, if you experiment as I did here on Kaggle, you will notice that from fastai import * adds nothing on top of from fastai.vision import *.
Therefore, I would argue that from fastai import * is unnecessary.
Am I missing something here? If so, please correct me. Thanks!
Here is my proposed change to the doc. Please have a look.
In order to do so, the module dependencies are carefully managed (see next section), with each exporting a carefully chosen set of symbols when using import *. In general, for interactive computing, to play around with the core modules and the training loop you can do
from fastai.basics import *
If you want to experiment with one of the applications, such as vision, then you can do
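from fastai.vision import *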
On the page Welcome to fastai, all the data links point to vision.data, but according to the context, they should point to basic_data. Do I understand the context correctly? Could anyone double-check them for me? Thanks!
Then, there are three modules directly on top of torch_core:
data, which contains the class that will take a Dataset or PyTorch DataLoader, wrap it in a DeviceDataLoader (a class that sits on top of a DataLoader and is in charge of putting the data on the right device, as well as applying transforms such as normalization), and regroup them in a DataBunch.
This takes care of the basics, then we regroup a model with some data in a Learner object to take care of training. More specifically:
callback (depends on data) defines the basis of callbacks and the CallbackHandler. Those are functions that will be called at every step of the training loop and allow us to customize what is happening there;
From data we can branch out to one of the four main applications, each of which has its own module: vision, text, collab, or tabular. Each of those submodules is built in the same way with:
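As a side note, for anyone trying to connect these pieces, here is a minimal sketch of the flow from data to Learner using the vision application (fastai v1 API; path is assumed to point at an imagenet-style folder of images):

from fastai.vision import *

# the data block API goes from files on disk to a DataBunch, whose train/valid
# DataLoaders get wrapped in DeviceDataLoaders that put each batch on the right device
data = (ImageList.from_folder(path)
        .split_by_rand_pct(0.2)
        .label_from_folder()
        .transform(get_transforms(), size=224)
        .databunch(bs=64)
        .normalize(imagenet_stats))

# a Learner regroups the data with a model and runs the training loop,
# calling the registered callbacks at every step through the CallbackHandler
learn = cnn_learner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(1)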
The first learn is now train, and in the second, the submodule is {application}.learner, that’s correct.
Thanks so much for proofreading and making this consistent with the current stage of the library!!!
I am trying to work with the tabular module but found the documentation a bit incomplete (for beginners)… Now I am going to try and look for improvements! I think it is a great way to learn and help other beginners. (Although I am a bit afraid of making mistakes)
@sgugger Thank you for all the work done on the fastai library and documentation. However, I personally think that there is much more than just examples missing. I have been struggling for quite some time to understand the arguments of several basic functions, and I would gladly help to enrich the docs once I get a better understanding of them.
The most essential part I find missing is a clear description of each parameter, not just its type and default value. Without this information I am, for example, left hanging when simply trying to specify a validation folder while working through the first lesson with another dataset that has definite train and test sets separated into 2 folders, both labeled via their filenames.
The takeaway of my message is really, not only examples, but also parameters description.
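In case it helps, with that kind of layout (separate train and test folders, labels encoded in the filenames) something along these lines should work with the data block API; the folder names and the regex are just guesses for the layout described above:

from fastai.vision import *

# assumed layout: path/train/*.jpg and path/test/*.jpg, with the label in each filename
pat = r'/([^/]+)_\d+\.jpg$'   # same kind of pattern as in lesson 1, capturing the label

data = (ImageList.from_folder(path)
        .split_by_folder(train='train', valid='test')  # use the two existing folders
        .label_from_re(pat)                            # label both sets from the filenames
        .transform(get_transforms(), size=224)
        .databunch(bs=64))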
I ran into the same issue! It is hard to get a clear description of the functions when you want to use your own dataset or work on Kaggle competitions.
Having trouble verifying "items. create_func will default to open_image" in ImageList
I have tried to read the source code of ItemList, ImageList, and their from_folder methods to figure out how ImageList.from_folder works.
I can use pdb to walk through the code flow, but I can’t find the exact step that turns an image file path object into an Image object; see below for comparison.
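For what it’s worth, my reading of the fastai v1 source (which may differ slightly between versions) is that the conversion happens when an item is fetched: ItemList.__getitem__ calls ImageList.get, which calls ImageList.open and in turn open_image. A breakpoint placed like this should land on it:

import pdb
from fastai.vision import *

il = ImageList.from_folder(path)   # path: a folder of images (placeholder)
pdb.set_trace()
img = il[0]   # __getitem__ -> ImageList.get(0) -> self.open(fn) -> open_image(fn) -> Image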