First modules: core training

sgugger · August 16, 2018, 12:57pm

You may have noticed that the first modules of fastai_v1 appeared on GitHub yesterday. They cover notebooks 1, 3 and 4 (everything in dev_nb apart from data augmentation), which together represent everything that has to do with training.

Even though a lot is still missing, the structure is already there: the imports folder will contain import modules separated by topic (core, torch, vision, text, structured/tabular). Basically, each module will depend on imports.core, and then we add what we need depending on the situation.

Inside the library, the lowest modules are core and torch_core, which contain the utility functions used in the other modules. Following the conclusion of this topic, there won’t be systematic import * in fastai_v1, so the convention is to import core as c and torch_core as tc. In the other modules, we import the specific functions/classes needed.

Above core is data, which contains the Dataset/DataLoader and DataBunch (an object grouping the different data loaders). It will depend on the transforms module once that exists.

Just on top of data is callback, the basic module that defines the callbacks. It also contains the wrapper around the optimizer and the Recorder (since this callback is always created by Learner objects). Then we have basic_training, which holds the training loop as well as the Learner object.
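To make the layering concrete, here is a minimal pure-Python sketch of a callback-driven training loop. It is not the fastai_v1 code: the hook names (on_train_begin, on_batch_end, on_train_end), the fit signature and the toy "loss" are all assumptions made for illustration. The only thing taken from the description above is that the Learner always creates a Recorder.

```python
# Hypothetical sketch: none of these classes are the actual fastai_v1 ones.
class Callback:
    """Base class: every hook is a no-op unless a subclass overrides it."""
    def on_train_begin(self): pass
    def on_batch_end(self, loss): pass
    def on_train_end(self): pass


class Recorder(Callback):
    """Stores the loss of every batch, like a minimal training log."""
    def on_train_begin(self):
        self.losses = []

    def on_batch_end(self, loss):
        self.losses.append(loss)


def fit(batches, step, callbacks):
    """Run `step` on each batch and notify every callback along the way."""
    for cb in callbacks:
        cb.on_train_begin()
    for batch in batches:
        loss = step(batch)
        for cb in callbacks:
            cb.on_batch_end(loss)
    for cb in callbacks:
        cb.on_train_end()


class Learner:
    """Bundles the data and the step function, and always adds a Recorder."""
    def __init__(self, batches, step):
        self.batches, self.step = batches, step
        self.recorder = Recorder()

    def fit(self, extra_callbacks=()):
        fit(self.batches, self.step, [self.recorder, *extra_callbacks])


# Toy usage: the "loss" here is just the mean of each batch.
learn = Learner(batches=[[1.0, 3.0], [2.0, 2.0], [0.0, 1.0]],
                step=lambda b: sum(b) / len(b))
learn.fit()
print(learn.recorder.losses)
```

The point of the split is that the loop itself knows nothing about what the callbacks do, so new behaviour can be added without touching it.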

The more advanced features are defined in callbacks, a separate folder where each callback is defined in its own file for readability. The imports.callbacks module is there to group all the names of those callbacks, and the convention is to import it as cb in the other modules.

Then the highest module is train, which contains the helper functions that create the callbacks and launch training (for instance lr_find(learn), or fit_one_cycle(learn, lr, cyc_len)). Lastly, there is a fastai.everything just to be able to type from fastai.everything import *, which is going to be the norm in the notebooks of the course.

Here is the graph of dependencies:

The full file is here if you can’t see it properly, and the bigger one with external modules is here.
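For readers who can’t see the image, the layering described in this post can be approximated in a few lines. The edges below are only my reading of the text (which module sits "above" which), so treat them as assumptions rather than the real import graph. The sketch uses graphlib from the standard library (Python 3.9+) to produce one valid bottom-up order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed edges, read off the description in this post.
# Each key maps a module to the modules it is assumed to import.
deps = {
    "core": set(),
    "torch_core": set(),
    "data": {"core", "torch_core"},
    "callback": {"data"},
    "basic_training": {"callback"},
    "callbacks": {"basic_training"},
    "train": {"callbacks"},
    "everything": {"train"},
}

# static_order() yields every module after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

If an edge were ever added that made two modules depend on each other, TopologicalSorter would raise a CycleError, which is a cheap way to check that the structure stays layered.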

jeremy · August 16, 2018, 7:07pm

This may change a little. Thru careful use of __all__ and keeping track of the dependency graph, we may find we can use import * in some limited cases without any significant downside. This is still a somewhat open topic and will require more experimentation to come to a final conclusion.

What we can definitely say is that subsets of the library will be usable without importing anything unnecessary, and each module’s namespace will only include stuff that makes sense to be there.
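As a small illustration of what __all__ buys you with import *, here is a self-contained sketch. The module demo_mod and the names fit and helper are made up for the example and are not part of fastai. The module is built in memory so the snippet runs without any extra files.

```python
import sys
import types

# Build a throwaway module in memory; `demo_mod` and its contents are
# hypothetical names used only for this illustration.
mod = types.ModuleType("demo_mod")
exec(
    "import os\n"
    "__all__ = ['fit']\n"
    "def fit(): return 'fitting'\n"
    "def helper(): return 'internal'\n",
    mod.__dict__,
)
sys.modules["demo_mod"] = mod

# With __all__ defined, `import *` copies only the listed names.
with_all = {}
exec("from demo_mod import *", with_all)

# Without __all__, every name not starting with an underscore is copied,
# including the `os` module that demo_mod imported for its own use.
del mod.__all__
without_all = {}
exec("from demo_mod import *", without_all)


def exported(namespace):
    """Names that ended up in the namespace, ignoring dunder entries."""
    return sorted(k for k in namespace if not k.startswith("__"))


print(exported(with_all))
print(exported(without_all))
```

The first list contains only fit, while the second also picks up helper and os, which is the kind of namespace pollution a carefully maintained __all__ avoids.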

PranY · August 19, 2018, 7:45am

In a serious attempt to understand the fastai library structure and the functioning of callbacks, I tried to manually recreate and, where possible, prune the complete dependency graph using the link you shared.

I couldn’t simplify it much, but I added a decent level of symmetry and spatial relatedness. This version helps me understand the picture better, so I’m passing it on for everyone.

I have tried my best to keep related things as close together as possible and to use colored arrows to indicate the flow. I have intentionally removed fastai.everything for now and left space under fastai.imports to account for various upcoming things. Once the imports are approximately fixed (core, torch, vision, text, structured/tabular), I’ll rethink how to connect everything more neatly with fastai.everything.

I will try to explore what jeremy mentioned above, one connection at a time, as I’m very new to selective dependency parsing and the use of __all__.