Hello, I have been trying to read through the fastai code in order to better understand how the loss function is being calculated in lesson 4.
Particularly, in this line:
learn = collab_learner(data, n_factors=20, y_range=y_range)
I was unsure where the loss function is (implicitly?) being defined. It would make more sense to me for the loss to be part of the fitting function, so that we can change it without changing the 'learner', although perhaps this is a matter of taste.
Tracing through: the last line of code in the fit_one_cycle(...) source calls the fit() function in basic_train.py / learner.py… OK. This in turn finishes by calling fit_gen(), and I don't see any loss function defined there either.
Anyone want to help me walk through this to better understand what’s going on? I know Jeremy mentioned that it’s just a MSE loss being used in this case, but I don’t see where that’s happening.
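Just for reference, MSE is nothing mysterious: the mean of the squared differences between predictions and targets. A minimal plain-Python sketch of the arithmetic (fastai/PyTorch actually use torch.nn.functional.mse_loss; this toy version is only to show what's being computed):

```python
# Plain-Python sketch of mean squared error, the loss Jeremy mentions.
# Not fastai's implementation -- that uses PyTorch's F.mse_loss.

def mse_loss(preds, targets):
    """Mean of squared differences between predictions and targets."""
    assert len(preds) == len(targets) and len(preds) > 0
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

# Example: predicted ratings vs. actual ratings
print(mse_loss([4.0, 3.5, 5.0], [4.0, 3.0, 4.0]))  # (0 + 0.25 + 1.0) / 3
```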
collab_learner defines the structure or architecture of your model. fit_one_cycle is only for training the architecture you have defined in the collab_learner. To answer your question: the loss function is an essential part of your model architecture and should therefore be defined in the collab_learner and not in the training process.
From a more holistic viewpoint, remember that the loss function is the main objective of the model: it tells us about the performance of our model. With this in mind, why would you want to be able to change the loss function, the central part of your model, during the training process?
I had these same questions! If memory serves, the loss function is chosen automatically when you create the Learner, on the basis of the DataBunch. You can see the chosen loss function afterwards at learn.loss_func. And you can specify it when the Learner is created, or change it by simple assignment afterwards. The same automatic process applies to the final activation function too.
I think the best way to understand such fastai internals is to trace with a source code debugger, watching variables, deducing from function names, taking notes, and working at it. You won’t find any design overview or “why” explanations in the source code. I admit to having frustration with this style. Once you go past the initial stage of simply trusting fastai, it makes it difficult to confirm that fastai is making the right choices for more complex problems. On the other hand, you will learn a tremendous amount about training loops, model architectures, and ML/Python best practices by tracing fastai code.
I use PyCharm, and I hear that VSCode is a good choice too. The "Find usages" function is also very helpful for finding where specific fields are ultimately set. HTH with getting oriented, M
Aha, I see the relevant lines of code now. The Learner class does this:
self.loss_func = self.loss_func or self.data.loss_func
In the collab_learner case, there is no self.loss_func defined, so it reverts to self.data.loss_func.
OK… Now the DataBunch class has a loss_func() which does this:
return getattr(self.train_ds.y, 'loss_func', F.nll_loss) if hasattr(self.train_ds, 'y') else F.nll_loss
which basically says: if the data has a y, then use that y's loss_func (falling back to F.nll_loss). train_ds is a DataSet object… and now I'm getting a bit lost, but getting closer!
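The quoted one-liner can be exercised with dummy objects to see all three branches (everything here except the getattr/hasattr shape is invented for the sketch: fallback_loss stands in for F.nll_loss, and the Dummy classes are not fastai's):

```python
# Exercising the shape of the quoted DataBunch.loss_func logic.
# fallback_loss stands in for F.nll_loss; the classes are invented.

def fallback_loss(pred, target):   # stand-in for F.nll_loss
    return 0.0

def custom_loss(pred, target):     # stand-in for a dataset-specific loss
    return 1.0

def pick_loss(train_ds):
    # Same structure as the fastai one-liner:
    return getattr(train_ds.y, 'loss_func', fallback_loss) \
        if hasattr(train_ds, 'y') else fallback_loss

class YWithLoss:
    loss_func = staticmethod(custom_loss)

class YWithoutLoss:
    pass

class DummyDS:
    def __init__(self, y=None):
        if y is not None:
            self.y = y

print(pick_loss(DummyDS(YWithLoss())) is custom_loss)       # True: y provides one
print(pick_loss(DummyDS(YWithoutLoss())) is fallback_loss)  # True: y lacks loss_func
print(pick_loss(DummyDS()) is fallback_loss)                # True: no y at all
```

So the y object attached to the training dataset is what ultimately carries the loss choice, which is why a collab (regression-style) target ends up with MSE rather than the nll default.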