Documentation improvements

These need to be reworked:

  1. the paragraph of https://docs.fast.ai/basic_train.html#TTA:

“We take the average of our regular predictions (with a weight beta) with the average of predictions obtained …”

That sentence doesn’t quite make sense. And later that large paragraph needs some complete sentences too.

  2. https://docs.fast.ai/vision.transform.html#Data-augmentation

“Images are often rectangles of different ratios, so to get them to the target size, we can either/ By default”

Looks like the second part of the sentence got lost at “either/” and the next sentence merged into it.
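For the TTA sentence in the first item, here is one possible reading, written as a small sketch. It assumes that beta weights the regular predictions and that 1 - beta weights the average over the augmented runs. The function name, the plain-list inputs and the beta value are made up for illustration, and this is not the library's code.

```python
def blend_tta(regular, augmented_runs, beta):
    """Blend regular predictions with the mean of several augmented runs.

    Assumption: final = beta * regular + (1 - beta) * mean(augmented_runs).
    """
    n_runs = len(augmented_runs)
    # Element-wise mean over the augmented prediction runs.
    avg_aug = [
        sum(run[i] for run in augmented_runs) / n_runs
        for i in range(len(regular))
    ]
    # Weighted average of the two sets of predictions.
    return [beta * r + (1 - beta) * a for r, a in zip(regular, avg_aug)]


regular = [0.8, 0.2]
augmented_runs = [[0.6, 0.4], [0.4, 0.6]]
blended = blend_tta(regular, augmented_runs, beta=0.5)
print(blended)
```

If that reading is right, the docs sentence could say "a weighted average of our regular predictions (weight beta) and the average of the predictions obtained on augmented images (weight 1 - beta)".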

Thanks.

Yes, not sure why the third one is here, but the other three come from old notebooks that were renamed, so you can remove those four files.

Nope, they’re not even clear to me!

Type abbreviations are currently over-used in fastai. Where I notice an abbreviation is only used a couple of times, I tend to remove it and replace it with the full version.

There is a dictionary of type abbreviations provided in fastai and these are used to create links to more info in the docs, but sometimes we forget to add things there. It would be nice if it were autogenerated.

Thank you, Jeremy. I’m glad to hear it wasn’t just me.

So let’s deal with them as we encounter them, starting with the mysterious T_co in:

add_test(items:Iterator[T_co], label:Any=None)

There is a dictionary of type abbreviations provided in fastai and these are used to create links to more info in the docs, but sometimes we forget to add things there. It would be nice if it were autogenerated.

Which dict is that? It surely isn’t this one, https://docs.fast.ai/dev/abbr.html, so please share which dict you’re referring to and where it is documented. Thanks.
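On T_co itself: as far as I can tell it is not a fastai abbreviation at all. It is the name that Python's typing module conventionally gives to a covariant type variable, and it probably shows up in the signature because Iterator is declared generic over it. A minimal sketch of what such a variable is (ReadOnlyBox is a hypothetical class, only there to show the usage):

```python
from typing import Generic, TypeVar

# A covariant type variable, named by the same convention the typing module uses.
T_co = TypeVar("T_co", covariant=True)


class ReadOnlyBox(Generic[T_co]):
    """Hypothetical container that only hands its item out, so covariance is safe."""

    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item


box = ReadOnlyBox(3)
print(T_co.__covariant__)  # True
print(box.get())           # 3
```

So for a reader of the docs, Iterator[T_co] can be read as "an iterator over items of some type".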

I took a stab at a few examples in core and submitted the PR. Let me know if this is along the lines of what you are looking for (more examples…) and I can continue to proceed with more of the functions there. If not, any direction on what you think is relevant is appreciated.

Hi, I was examining the to_data function, and while doing so I found some strange behaviour in a simple piece of code. Below is the code sample:

from fastai import *
from fastai.vision import *

path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)

# Examining the labels
print(set(data.y)) 

The output of the above is given below:

{Category 3, Category 3, Category 3, ...<repeated many times>, Category 7, Category 7, Category 7}

Whereas the expected output IMO is:

{Category 3, Category 7}

Just wanted to know: is this expected behavior or a bug? I guess a new Category instance is being created for each image, which looks a bit inefficient from a very high-level understanding. Sorry if this is a silly question or if I missed something.
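A possible explanation, sketched with stand-in classes rather than fastai's actual Category: a Python set deduplicates using __hash__ and __eq__, and a class that defines neither falls back to object identity, so every instance counts as distinct even when the labels match. Both class names below are hypothetical.

```python
class PlainCategory:
    """Stand-in without __eq__/__hash__: instances compare by identity."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"Category {self.label}"


class HashableCategory(PlainCategory):
    """Same stand-in, but equality and hashing are based on the label."""

    def __eq__(self, other):
        return isinstance(other, HashableCategory) and self.label == other.label

    def __hash__(self):
        return hash(self.label)


labels = [3, 3, 7, 7, 7]
plain = {PlainCategory(label) for label in labels}
hashable = {HashableCategory(label) for label in labels}

print(len(plain))     # 5: one entry per instance
print(len(hashable))  # 2: one entry per distinct label
```

If that is what is happening here, the repeated entries would be a consequence of how the items compare, not necessarily a bug in how the labels are loaded.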

Also, I have submitted a PR. I have completed most of the functions in a specific section of torch_core. Please send me feedback on what can be improved.

PR Link: https://github.com/fastai/fastai/pull/1485

Thanks
NVS Abhilash

At http://docs.fast.ai/, can those blocks be displayed with markdown formatting?
Like this:


(It was confusing to me, and I think it can be confusing for other beginners)

update: resolved.

Here are the things that need fixing in the doc generating code: fastai/gen_doc/nbdoc.py

  1. Should the backticks around all arguments and defaults in show_doc be removed? Each component is already wrapped in <code></code>.

    PointsItemList(`args`, `convert_mode`=`'RGB'`, `kwargs`) ::
    

    So the backticks are just unnecessary noise.

  2. show_doc ignores *, ** in function definitions, e.g. showing kwargs instead of **kwargs.

    Inside format_param, p.name returns kwargs - not sure how to retrieve ** from p. It does know that the original arg was <Parameter "**kwargs">, but none of the attributes indicate that.

update: both were resolved by Andrew! Thank you.
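For anyone who runs into the question from item 2 again: the standard library's inspect.Parameter records this in its kind attribute, which is VAR_POSITIONAL for *args and VAR_KEYWORD for **kwargs. Below is a minimal sketch of restoring the prefix. param_display_name and example are hypothetical names for illustration, and this is not necessarily how the fix in nbdoc.py was done.

```python
import inspect


def param_display_name(p: inspect.Parameter) -> str:
    """Return the parameter name with its * or ** prefix restored."""
    if p.kind is inspect.Parameter.VAR_POSITIONAL:
        return "*" + p.name
    if p.kind is inspect.Parameter.VAR_KEYWORD:
        return "**" + p.name
    return p.name


def example(*args, convert_mode="RGB", **kwargs):
    """Hypothetical function with the same shape as the signature shown above."""


names = [param_display_name(p) for p in inspect.signature(example).parameters.values()]
print(names)  # ['*args', 'convert_mode', '**kwargs']
```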

The docs css needs to be improved to have consistent font sizes:

  1. the show_doc signature uses larger fonts than the rest of the html, so it’s a bit painful to read. I guess we need to tweak the .css so that mono and non-mono fonts end up at a similar size.

  2. same for quoting - uses a much bigger font for blockquote, e.g. the top of https://docs.fast.ai/dev/abbr.html

I added a basic entry for the resize transform. It could use an expansion that covers more efficient ways to do the resize once, rather than doing it on the fly.

If you’d like to make a PR with various ways one could do the resize on the filesystem once, instead of re-doing it on every run, that would be great. I usually use imagemagick but I don’t know if it’s still the best method. There are a few threads about the subject matter here on the forums, so perhaps if you could make a summary of the best methods including the nuances of resizing mixed sized images, etc. that would be useful to have documented in one place.
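As a starting point for such a summary, here is a rough sketch of a one-time resize in Python. It assumes Pillow is installed and that the images live under a source folder as .jpg/.jpeg/.png files. fit_within and resize_tree are hypothetical helper names, and capping the longer side while keeping the aspect ratio is only one of several ways to handle mixed-size images.

```python
from pathlib import Path


def fit_within(width: int, height: int, max_side: int) -> tuple:
    """Scale (width, height) so the longer side is at most max_side.

    Keeps the aspect ratio and never upscales smaller images.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_tree(src: Path, dst: Path, max_side: int = 256) -> int:
    """Write resized copies of every image under src into dst, once.

    dst should live outside src so the copies are not picked up again.
    """
    from PIL import Image  # assumption: Pillow is available

    count = 0
    for path in sorted(src.rglob("*")):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        out = dst / path.relative_to(src)
        out.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(path) as img:
            new_size = fit_within(img.width, img.height, max_side)
            img.resize(new_size, Image.LANCZOS).save(out)
        count += 1
    return count


print(fit_within(1000, 500, 256))  # (256, 128)
print(fit_within(100, 50, 256))    # (100, 50)
```

Training can then point at the destination folder, so the resize cost is paid once instead of on every run.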

Another essential area where help is needed is fixing broken links in the docs:

Hi Stas, I can take care of the links. Can you please quickly point me to what I need to modify to fix them? Thanks!

Thank you, @PierreO!

  1. Edit the source *.ipynb notebooks under docs_src - don’t forget [Save]!!!
  2. Convert them to html https://docs.fast.ai/gen_doc_main.html#updating-html-only
  3. Install everything you need to get the docsite locally and see that you can start it https://docs.fast.ai/gen_doc_main.html#testing-site-locally but then shut it down
  4. Run the link-checker locally https://github.com/fastai/fastai/blob/master/tools/checklink/README.md#checking-the-site-locally to validate that the links/anchors have been fixed (it will start the local server that you enabled in step 3). This step will also require a one time link-checker setup stage, which you will find in the same document.

If you have any difficulties please let me know.

p.s. I know @sgugger is currently doing some major API updates (non-breaking: replacing kwargs with explicit args), which will require doc updates, so I recommend you ask him whether it’s a good time for this effort (because two people editing the same notebooks at the same time is difficult).

I just finished updating the docs accordingly, so you can go ahead!

I’m trying to fix the broken links to CMScores and RegMetrics in metrics. Those two classes aren’t shown on the metrics page (there’s no show_doc for them), but when I try to add one to the notebook I get the following errors:

NameError                                 Traceback (most recent call last)
<ipython-input-7-d36052e817e9> in <module>
----> 1 show_doc(CMScores, title_level=3)

NameError: name 'CMScores' is not defined

NameError                                 Traceback (most recent call last)
<ipython-input-26-3ff80097d200> in <module>
----> 1 show_doc(RegMetrics, title_level=3)

NameError: name 'RegMetrics' is not defined

Not really sure why… Any idea?

EDIT: same issue for CategoryListBase in data_block

That’s because they are not in the __all__ variable of that module, because they are internal classes. You can put them there if you want to document them, or manually import them (from bla import bli will still work).
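To illustrate the point about __all__, here is a small self-contained sketch. It builds a throwaway in-memory module instead of touching fastai, and every name in it (demo_metrics, accuracy, RegMetricsDemo) is made up.

```python
import sys
import types

# Build a throwaway module whose __all__ lists only one of its two names.
mod = types.ModuleType("demo_metrics")
exec(
    "__all__ = ['accuracy']\n"
    "def accuracy(): return 'public'\n"
    "class RegMetricsDemo: pass\n",
    mod.__dict__,
)
sys.modules["demo_metrics"] = mod

# A star import only brings in what __all__ lists.
star_ns = {}
exec("from demo_metrics import *", star_ns)
print("accuracy" in star_ns)        # True
print("RegMetricsDemo" in star_ns)  # False

# An explicit import still works for names left out of __all__.
from demo_metrics import RegMetricsDemo

print(RegMetricsDemo.__name__)  # RegMetricsDemo
```

The doc notebooks rely on star imports, which is why a class that is missing from __all__ raises NameError there until it is imported by name.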

Oh right, thanks.

One last question: I did a first PR 2 days ago (that I later closed because I saw some mistakes). In it, it seems some markdown got converted to html tags (** [...]** converted to <b> [...] </b> for example). Could you point out how to fix this?

Hey dear maintainers:

In my current understanding, at least for the Training section, each documentation page corresponds to a Python module file. So the page of basic_train corresponds to basic_train.py.

However, in basic_train, it also shows the docs of fit_one_cycle and lr_find, both of which are actually defined in train.py and do have docs in train. I understand that these two methods are very commonly used and, in this sense, are very basic. But assuming that we still want to stick to our current organization of documentation, that is, a one-to-one correspondence between the docs page and the Python module file, should we then remove fit_one_cycle and lr_find from basic_train?