Having a low training loss but a high validation loss does not necessarily mean that your model is already overfitting. This is a common point of confusion when starting out and trying to understand these ideas.
As mentioned in the book, it's when the validation set accuracy stops getting better, and instead goes in the opposite direction and gets worse, that you can start suspecting the model is memorizing the training set rather than generalizing, i.e. that it has started to overfit.
Taken straight from the book, Chapter 1:
In other words, as training progresses, once the validation loss starts getting consistently worse than it was before (meaning the model is generalizing less well to unseen data even as it keeps improving on the training data), the model has started to overfit.
Not to be taken as an exact example, but I've tried to quickly sketch out below what this could mean in practice.
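To make that concrete, here's a toy sketch (all the numbers are invented, not from a real training run) of the pattern to look for: training loss keeps falling, while validation loss bottoms out and then climbs back up.

import matplotlib.pyplot as plt

epochs = list(range(1, 11))
train_loss = [0.90, 0.62, 0.45, 0.35, 0.28, 0.22, 0.18, 0.15, 0.12, 0.10]  # keeps improving
valid_loss = [0.95, 0.70, 0.55, 0.48, 0.45, 0.44, 0.46, 0.50, 0.55, 0.61]  # bottoms out, then worsens

plt.plot(epochs, train_loss, label='training loss')
plt.plot(epochs, valid_loss, label='validation loss')
plt.axvline(x=6, linestyle='--', color='gray')  # roughly where overfitting begins in this toy example
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()

Around epoch 6 in this made-up example, the validation loss starts consistently rising while the training loss keeps dropping - that divergence is the overfitting signal.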
The save() method is intended to be used during the training process, e.g. after you've run some training and would like to save your work so that you can resume later. Say your model is 98% accurate and you've worked on it enough for today. You can save('awesome-model') it, and then load('awesome-model') it back into the same notebook at a later time/tomorrow without re-training everything. Very handy for saving a good model so you can come back to it later!
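In fastai that looks roughly like this (assuming you already have a DataLoaders called dls; the model name is just an example):

from fastai.vision.all import *

learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)
learn.save('awesome-model')  # writes models/awesome-model.pth under learn.path

# ...later, after re-creating the same Learner:
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.load('awesome-model')  # restores the weights (and optimizer state) so you can resume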
On the other hand, the export() method is intended for inference/deployment purposes, e.g. when you're all done with training and want to export just the necessary bits so you can use the model in another project/script/production setting. You've already seen this used for inference in the Gradio projects we've been working on.
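A rough sketch of that workflow (the filenames here are placeholders):

learn.export('awesome-model.pkl')  # bundles the model plus its data transforms for inference

# ...then in your deployment script:
from fastai.vision.all import load_learner
learn_inf = load_learner('awesome-model.pkl')
pred, pred_idx, probs = learn_inf.predict('some_image.jpg')  # hypothetical input file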
I’m not sure if fastai has a built-in function to list all the models available. If somebody’s aware of such a function, please do share it.
One way to figure this out is to browse the source and see what’s supported in the codebase.
Now, you can import the module below in your notebook and poke around to see what's actually provided by the all.py module. This gives you a good starting point:
import fastai.vision.models.all as fastai_vision_models
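For a quick first look, plain dir() already gets you most of the way (this just filters out the private names):

print([name for name in dir(fastai_vision_models) if not name.startswith('_')])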
If you prefer a slightly more cleaned-up list, here's another quick one.
import inspect
import fastai.vision.models.all as fastai_vision_models

# keep only the classes and functions, i.e. the things that could be architectures
models = inspect.getmembers(fastai_vision_models,
                            lambda x: inspect.isclass(x) or inspect.isfunction(x))
print([name for name, member in models])
Now, this will only list the models provided by fastai directly.
In the coming chapters, Jeremy will also most likely share more info on integrating another library called timm, which has a lot more pre-trained models available.
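As a sneak peek, timm ships its own helper for listing models - timm.list_models(), which optionally takes a glob pattern (the pattern below is just an example):

import timm

print(len(timm.list_models()))          # total number of architectures timm knows about
print(timm.list_models('resnet*')[:5])  # filter by a wildcard pattern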
There are some tutorials that I started to work through, but they are distracting me from finishing my current task: the app export from Kaggle. It's not clear how the GitHub focus of these tutorials interacts with Kaggle, i.e. mainly getting settings.ini into the correct file location, and whether any fields need Kaggle-specific values.
I think Jeremy might be using a screen multiplexing program like tmux or screen here. It's a Linux utility that lets you run multiple sessions from the same terminal window; you can detach from it (if you're logged in remotely) and your session will keep running on the machine you logged into (and started the tmux session on).
Thanks for the info! I was wondering if you had any advice on methods for reducing the error rate on my problem. It was still quite high at ~0.26 even after I unfroze and ran 15 more epochs, and the training was very slow. I've tried adding more images, and today I'll try a deeper architecture (resnet50 instead of resnet18).
This may sound like a roundabout way of doing it, but if you have a local install of fastai/nbdev, you could probably export the notebook out of Kaggle, load it into your local Jupyter, and do the nbdev export there. I checked my local install and it has a bare-bones settings.ini file; I think it gets auto-generated when nbdev is installed.
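If you go that route, the local export step would be something like this (the notebook filename is a placeholder, and this is the older nbdev API used in the course - newer nbdev versions renamed it to nb_export):

from nbdev.export import notebook2script

notebook2script('app.ipynb')  # writes the #|export cells out to a .py script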
It seems Colab has the same issue. I created a new notebook here to test it. I adjusted cells to make it easier to test both ways.
By the way, I made an interesting discovery that I'd like to confirm… I had assumed that when notebook2script() ran, it parsed the whole text of the notebook from top to bottom. However, I see that after executing an "#|export" cell containing "a = 2" several times, the assignment gets duplicated in the generated file, like this:
# AUTOGENERATED! DO NOT EDIT! File to edit: . (unless otherwise specified).
__all__ = ['a', 'a', 'a']
So “#|export” must be appending to a cache which “%notebook -e testnbdev.ipynb” pulls from. More to dig into there, but I’ll park it for now.
One naive first impression… given that all the other generation meta-commands start with "#|", calling a normal function notebook2script() to read a file from disk and write "app.py" to disk breaks the paradigm. I feel it's missing a "#!generate" cell that would read from the "#|export" cache and directly write "app.py".
No, it’s because of the weird way we have to hack this to make it work in Colab and Kaggle, which is to use the %notebook magic. That saves a copy of every cell that has been run, in the order it was run. That’s why you should just run once from top to bottom when doing this. Here are the docs for %notebook.
Having said that, I recommend using your local machine for stuff like this - i.e. when you’re not actually training a model - as I showed in the lesson. Then you don’t have to deal with any of this messiness. Or at least use a “real” notebook platform, like Paperspace or SageMaker Studio Lab, rather than Colab or Kaggle.
tl;dr: Use Colab or Kaggle for free model training, but not for developing scripts/apps.