Live coding 10

Yes, I have seen the same. Looking at the bug mentioned in the other thread, it may be a rounding/precision error, so if things get loaded into memory differently, the results can come out fine.


For anyone who finds this thread with a search:

One other thing that took me some time to work out: if you are loading a model to resume training (or even to run inference on large amounts of data), the load_learner function loads to the CPU by default, so everything is very slow. If you plan to do additional training, pass the cpu flag to load_learner like this:

load_learner('path/to/file', cpu=False)

This puts it on the GPU. Took me some time to figure out what was going on and how to fix it.
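
A minimal sketch of the difference, assuming an exported learner at a hypothetical path/to/file:

from fastai.learner import load_learner

# Default: the learner is loaded onto the CPU (cpu=True).
learn_cpu = load_learner('path/to/file')

# For further training or bulk inference, load straight onto the GPU.
learn_gpu = load_learner('path/to/file', cpu=False)

# Quick check of where the model parameters actually live.
print(next(learn_gpu.model.parameters()).device)  # e.g. cuda:0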

tags for searches: learner is slow, put learner on GPU


FYI, I tried it both ways and your “divide original LR by 5” way worked better.
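
For reference, a minimal sketch of that heuristic; learn, the epoch count, and the base_lr value are assumptions, since the post only specifies dividing the original LR by 5:

base_lr = 2e-3  # assumed learning rate from the original training run

# Resume fine-tuning at one fifth of the original rate.
learn.fit_one_cycle(4, base_lr / 5)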


So I did my homework and I think I understand tta and its rationale.

I presume that if you use tta on your validation/test set to evaluate your model, then you would also use tta for inference once the model is put into production?


Yeah, otherwise you wouldn’t get similar results in production.
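
A minimal sketch of applying tta consistently at evaluation and inference time; learn and test_files are assumed to already exist:

from fastai.vision.all import *

# Evaluate with test-time augmentation on the validation set
# (ds_idx=1, the validation split, is the default).
preds, targs = learn.tta()
print(accuracy(preds, targs))

# Apply the same TTA averaging to new data at inference time,
# so production predictions match the evaluation protocol.
test_dl = learn.dls.test_dl(test_files)
test_preds, _ = learn.tta(dl=test_dl)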


I copied and ran Jeremy’s code twice using the convnext_small_in22k model and the from_folder factory method. The overall result on the validation set using tta was the same, but each epoch’s metrics (and the final accuracy and validation loss) were slightly different from the corresponding epochs in the other experiment. I used the same seed for each experiment (and obviously the same hyperparameters).

There must be some randomness, but I thought that using the same seed was supposed to eliminate it. What is the source of the randomness? Does this have any implications for creating models (e.g. you may get slightly different results if you rerun, so run more than once)?

(I also ran using the DataBlock API and got a slightly different answer.)
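
The usual culprit is nondeterministic GPU kernels (e.g. in cuDNN), which seeding alone doesn’t rule out. A sketch of the stricter settings, with no guarantee they remove all variation on every op/hardware combination:

from fastai.torch_core import set_seed
import torch

# Seed python, numpy and torch; reproducible=True also forces
# deterministic cuDNN kernels and disables cudnn.benchmark.
set_seed(42, reproducible=True)

# Optionally make PyTorch raise an error if a nondeterministic
# CUDA op is used (some ops also need CUBLAS_WORKSPACE_CONFIG set).
torch.use_deterministic_algorithms(True)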

I encountered a problem running this notebook:
course22/10-scaling-up-road-to-the-top-part-3.ipynb at master · fastai/course22 · GitHub

To fix it, I modified the code; the change is in a pull request right now. Please take a look at the pull request. Thanks!
