I’ve been working on making it as easy as possible to set up multitask models with fastai. For reference, in multitask learning you use a single NN to solve several problems at once (e.g. several classifications, regressions, etc.). The usual way to do it is to have the NN output a vector containing the concatenated predictions for each sub-task.
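To make the idea concrete, here is a minimal sketch (not the notebook's actual code) of a head that concatenates per-task outputs; `MultiTaskHead` and `task_sizes` are hypothetical names for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a shared feature vector feeds one linear head per
# sub-task, and the heads' outputs are concatenated into a single vector.
class MultiTaskHead(nn.Module):
    def __init__(self, n_features, task_sizes):
        super().__init__()
        # task_sizes: e.g. [3, 5, 1] -> two classifications + one regression
        self.heads = nn.ModuleList([nn.Linear(n_features, n) for n in task_sizes])

    def forward(self, x):
        # Concatenate each sub-task's predictions along the feature dimension
        return torch.cat([head(x) for head in self.heads], dim=1)

head = MultiTaskHead(16, [3, 5, 1])
out = head(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 9])
```

The loss function then slices this vector back into per-task chunks and sums the sub-losses.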
Here is a work-in-progress notebook:
The goal is to support it natively in Fastai and make it compatible with all tooling (metrics, top losses interpretation, import/export, etc.). Could you guys have a look and give your feedback on the implementation details?
To maintainers: do you think this functionality could be brought into the core library? If so, I will work on a proper PR; otherwise, I’m thinking about publishing it on its own repo as a sort of plugin.
Besides, I have a few questions regarding the implementation:
- Ideally, we would like to support null values in Y vectors. As Andrew Ng explains in his DL course (https://www.youtube.com/watch?v=UdXfsAr4Gjw), we can simply ignore them when computing the loss. This turns out to be tricky to implement:
  - fastai throws an error when it sees null values, so we need to disable that check
  - PyTorch’s `long()` converts `nan` to large negative values
  - Tensor dimensionality handling becomes complex across all functions (training, `get_preds`, etc.)
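One way to sketch the "ignore missing labels" idea (these helpers are illustrative, not fastai API): mask out `nan` targets in regression sub-losses, and for classification sub-tasks encode missing labels as a sentinel *before* the `long()` cast so `cross_entropy`'s built-in `ignore_index` can skip them:

```python
import torch
import torch.nn.functional as F

# Sketch: compute a regression sub-loss only over positions with a label.
def masked_mse(pred, targ):
    mask = ~torch.isnan(targ)
    if mask.sum() == 0:
        return pred.new_zeros(())  # no labeled examples in this batch
    return F.mse_loss(pred[mask], targ[mask])

# Sketch: for classification, map missing labels to -1 before casting to
# long, then let cross_entropy skip them via ignore_index.
def masked_ce(logits, targ):
    return F.cross_entropy(logits, targ, ignore_index=-1)

targ = torch.tensor([1.0, float('nan'), 3.0])
pred = torch.tensor([1.0, 2.0, 3.0])
print(masked_mse(pred, targ))  # tensor(0.) -- the NaN entry is ignored
```

Replacing `nan` with a sentinel before the `long()` cast sidesteps the big-negative-values problem mentioned above.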
Do you have any advice or comment regarding this?
- To avoid having to manually adjust the weights of sub-losses, we would need to normalize float values in regression sub-tasks, so that the RMSE takes values in a similar range to the CrossEntropy sub-losses. As far as I can tell there’s nothing in fastai to do this (the existing `Normalize` processor is specific to Tabular data and only handles x values). Do you have any suggestions? Options I can see:
  - Generalize the existing `Normalize` module
  - Write a dedicated module
  - Ask users to pre-normalize their data (which I assume is the current advice for folks running regression NNs today?), but that requires saving the normalization stats separately, which is a bit annoying
  - Simply run the RMSE sub-losses through a sigmoid (not sure if this would be OK?)
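For the second or third option, the core of what's needed is small; here is a hypothetical sketch (`YNormalizer` is an invented name, not a fastai class) that normalizes targets and keeps the stats so predictions can be denormalized at inference/export time:

```python
import torch

# Hypothetical sketch: normalize regression targets and remember the
# stats, so predictions can be mapped back to the original scale.
class YNormalizer:
    def __init__(self):
        self.mean, self.std = None, None

    def fit(self, y):
        self.mean, self.std = y.mean(), y.std()
        return self

    def transform(self, y):
        return (y - self.mean) / self.std

    def inverse(self, y_norm):
        return y_norm * self.std + self.mean

y = torch.tensor([10.0, 20.0, 30.0])
norm = YNormalizer().fit(y)
y_n = norm.transform(y)
print(norm.inverse(y_n))  # recovers the original values
```

The annoying part is exactly what the third option notes: `mean`/`std` have to travel with the exported model, which is why building this into a y-processor would be nicer.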
- I ended up subclassing and overriding a lot of classes/methods/functions, often just to work around fastai’s code structure: for example, the use of global functions, or classes that create instances of other classes (e.g. `LabelLists` creates instances of `LabelList`, but we can’t easily tell it to use a custom `LabelList` class). This raises the broader question of fastai’s extensibility. In some cases I’m not sure whether there are reasons for the existing code structure, or whether we could suggest some refactoring and improvements to make the library easier to override and extend.
I would really appreciate hearing your thoughts on this!