Documentation improvements

That’s because I didn’t regenerate the website. I was going to do it yesterday and then forgot. I’ll run the script sometime today.

Oh okay. So what does “Site last generated: Dec 20, 2018” on the bottom right mean?

I think it’s linked to the last commit, or @stas is it something the building tool adds?

It means that is when github rebuilt the site. It’s done automatically. It’s unrelated to when the html docs were generated.

As @sgugger said your commit to docs_src, @PierreO, wasn’t converted to html yet, so it’s not live yet.

The flow is: docs_src (.ipynb) -> docs (md/html)-> githubpages website.

I see! Thanks for the explanation :slight_smile:

For better documentation, I think it’d also be a good idea to migrate some of the notes in the forums to the corresponding fastai docs. It’s often very difficult to find some of those gems in the forums.

So if you stumble upon useful summaries, tips, performance notes, and other useful information that can be extracted/compiled into a stand-alone document or a section of an existing one, that would be a great service to this community.

In general we put normal user docs under fastai’s repo docs/*md, and more of the core dev docs under docs/dev/*md, but don’t worry too much about the proposed placement - it’s easy to reorganize those later.

Reading through some of the posts by @jeremy and @sgugger is probably a good starting point, but there are plenty of other wonderful contributions by others hidden in the forums.

I compiled some of the answers in this thread into extra notes in the first post. The first post is now a wiki, so please don’t hesitate to improve it.

But also, ideally, improve the main document on doc authoring: https://docs.fast.ai/gen_doc_main.html.

Thank you.

I created a new step-by-step section on “Process for contributing to the docs”: https://docs.fast.ai/gen_doc_main.html#process-for-contributing-to-the-docs

Please let me know if you find any problems/difficulties with it.

Here is another project, but this one is on the system-level wrt function definitions:

fastai uses an elaborate system of arg types encoded in the functions and shown via show_doc.

How does one find out what those types are? e.g.: https://docs.fast.ai/data_block.html#LabelLists.add_test

add_test[source] 
add_test(items:Iterator[T_co], label:Any=None)

What is T_co? (please don’t answer this question literally) Perhaps the docs need to include a page with the different types and have those types crosslinked to that page?

and why is there a need for an extra note?

“Note: Here items can be an ItemList or a collection.?”

I thought the purpose of the complex typestrings was to take care of not needing to explain what each argument is? And to a user this note is not very clear. ItemList is clear, a collection is not - a collection of what?

The point I’m trying to convey is that a lot of those things are very clear to the person who designed the system, but to a user they appear overly cryptic and confusing, IMHO.

The doc website needs a cleanup - it contains files that shouldn’t be there.

There is https://docs.fast.ai/tta.html, which I think shouldn’t be there, as there is no tta.ipynb. There seem to be 4 such files:

perl -le 'for (@ARGV) { $x=$_; s#docs/(.*?)\.html#docs_src/$1.ipynb#; print "$x" unless -e $_} ' docs/*html
docs/callbacks.tracking.html
docs/text.models.qrnn.html
docs/tooltips.html
docs/tta.html

@sgugger, is it OK to nuke those 4?
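For anyone who prefers Python over perl, here is a rough equivalent of the check above. It is only a sketch: the function name `orphaned_html` is made up for illustration, and it assumes the same layout as the one-liner (html files directly under docs/, notebooks directly under docs_src/).

```python
from pathlib import Path

def orphaned_html(docs_dir, src_dir):
    """Return the .html files in docs_dir that have no matching .ipynb in src_dir.

    Hypothetical helper, not part of fastai. It mirrors the perl one-liner:
    docs/NAME.html is reported when docs_src/NAME.ipynb does not exist.
    """
    docs_dir, src_dir = Path(docs_dir), Path(src_dir)
    return sorted(
        html for html in docs_dir.glob("*.html")
        # html.stem drops only the final ".html", so "callbacks.tracking.html"
        # is checked against "callbacks.tracking.ipynb"
        if not (src_dir / (html.stem + ".ipynb")).exists()
    )

# Usage, from the root of the repo:
#   for html in orphaned_html("docs", "docs_src"):
#       print(html)
```

It only lists candidates and deletes nothing, so the result can be reviewed before removing any files.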

These need to be reworked:

  1. the paragraph of https://docs.fast.ai/basic_train.html#TTA:

“We take the average of our regular predictions (with a weight beta) with the average of predictions obtained …”

That sentence doesn’t quite make sense. And later that large paragraph needs some complete sentences too.

  2. https://docs.fast.ai/vision.transform.html#Data-augmentation

“Images are often rectangles of different ratios, so to get them to the target size, we can either/ By default”

Looks like the second part of the sentence got lost at “either/” and the next sentence merged into it.

Thanks.

Yes, not sure why the third one is here, but the other three come from old notebooks that were renamed, so you can remove those four files.

Nope, they’re not even clear to me!

Type abbreviations are currently over-used in fastai. Where I notice an abbreviation is only used a couple of times, I tend to remove it and replace it with the full version.

There is a dictionary of type abbreviations provided in fastai and these are used to create links to more info in the docs, but sometimes we forget to add things there. It would be nice if it were autogenerated.

Thank you, Jeremy. I’m glad to hear it wasn’t just me.

So let’s deal with them as they get encountered, let’s start with the mysterious T_co in:

add_test(items:Iterator[T_co], label:Any=None)

“There is a dictionary of type abbreviations provided in fastai and these are used to create links to more info in the docs, but sometimes we forget to add things there. It would be nice if it were autogenerated.”

Which dict is that? It surely isn’t this one, https://docs.fast.ai/dev/abbr.html, so please share which dict you’re referring to and where it is documented. Thanks.

I took a stab at a few examples in core and submitted the PR. Let me know if this is along the lines of what you are looking for (more examples…) and I can continue to proceed with more of the functions there. If not, any direction on what you think is relevant is appreciated.

Hi, I was examining the to_data function. While doing so, I found some strange behaviour in a simple piece of code. Below is the code sample:

from fastai import *
from fastai.vision import *

path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)

# Examining the labels
print(set(data.y)) 

The output of the above is given below:

{Category 3, Category 3, Category 3, ...<repeated many times>, Category 7, Category 7, Category 7}

Whereas the expected output IMO is:

{Category 3, Category 7}

Just wanted to know: is this expected behavior or a bug? I guess a new Category object is being instantiated for each image, which looks a bit inefficient from a very high-level understanding. Sorry if this is a silly question or if I missed something.
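To illustrate what I suspect is going on, here is a minimal sketch in plain Python with no fastai involved. Both classes are hypothetical stand-ins, and it is only my assumption (not checked against the fastai source) that Category does not define `__eq__`/`__hash__`. Without those methods, a set compares objects by identity, so every instance counts as distinct even when the labels match.

```python
class PlainCategory:
    """Hypothetical stand-in for a label class with default identity-based equality."""
    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"Category {self.label}"


class HashableCategory(PlainCategory):
    """Same class, but equality and hashing are based on the label value."""
    def __eq__(self, other):
        return isinstance(other, HashableCategory) and self.label == other.label

    def __hash__(self):
        return hash(self.label)


labels = [3, 3, 7, 7, 7]
plain = {PlainCategory(l) for l in labels}        # one set entry per object
hashable = {HashableCategory(l) for l in labels}  # one set entry per distinct label

print(len(plain))     # 5
print(len(hashable))  # 2
```

If that assumption holds, the output I saw would be expected Python behaviour rather than a bug in set.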

Also, I have submitted a PR. I have completed most of the functions in a specific section of torch_core. Do send me feedback on what can be improved. :slight_smile:

PR Link: https://github.com/fastai/fastai/pull/1485

Thanks
NVS Abhilash

At http://docs.fast.ai/, can those blocks be displayed with markdown formatting?
Like this:


(It was confusing to me, and I think it can be confusing for other beginners)

update: resolved.

Here are the things that need fixing in the doc generating code: fastai/gen_doc/nbdoc.py

  1. backticks around all arguments and defaults in show_doc should be removed? Each component is already wrapped in <code></code>

    PointsItemList(`args`, `convert_mode`=`'RGB'`, `kwargs`) ::
    

    So the backticks are just unnecessary noise.

  2. show_doc ignores *, ** in function definitions, e.g. showing kwargs instead of **kwargs.

    Inside format_param, p.name returns kwargs - not sure how to retrieve ** from p. It does know that the original arg was <Parameter "**kwargs">, but none of the attributes indicate that.
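For reference, the standard library does expose this: each `inspect.Parameter` has a `kind` attribute, which is `VAR_POSITIONAL` for `*args` and `VAR_KEYWORD` for `**kwargs`. Below is a small stand-alone sketch. `format_param_name` and `example` are made-up names for illustration, not the actual fastai `format_param`.

```python
import inspect

def format_param_name(p):
    """Return the parameter name with its leading * or ** restored."""
    if p.kind == inspect.Parameter.VAR_POSITIONAL:
        return "*" + p.name
    if p.kind == inspect.Parameter.VAR_KEYWORD:
        return "**" + p.name
    return p.name

# Hypothetical function, used only to exercise the helper
def example(a, *args, convert_mode="RGB", **kwargs):
    pass

names = [format_param_name(p) for p in inspect.signature(example).parameters.values()]
print(names)  # ['a', '*args', 'convert_mode', '**kwargs']
```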

update: both were resolved by Andrew! Thank you.

The docs css needs to be improved to have consistent font sizes:

  1. the show_doc sig uses larger fonts than the rest of the html, so it’s a bit painful to read. I guess we need to tweak the .css so that mono and non-mono fonts use similar sizes.

  2. same for quoting - uses a much bigger font for blockquote, e.g. the top of https://docs.fast.ai/dev/abbr.html