Sorry, I saw the FileDeleter in the video, but I wasn’t paying full attention at that moment. Now I want to use it in my notebook, but I can’t find it anymore.
Train loss larger than validation could be because of dropout? So why is Jeremy saying this is a sign the model isn’t fitting…
I would love to see a clear definition of error… I’m not sure what the fast.ai code is actually reporting
error is equivalent to 1-accuracy, where accuracy is defined as the number of correct predictions divided by the validation set size. Both accuracy and error are calculated on the validation set.
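To make that definition concrete, here is a minimal sketch in plain Python. The function names and the tiny label lists are made up for illustration; this is not the fast.ai implementation, just the same arithmetic applied to a hypothetical validation set.

```python
def accuracy(preds, targets):
    """Fraction of predictions that match the true labels."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not targets:
        raise ValueError("validation set is empty")
    correct = sum(p == t for p, t in zip(preds, targets))
    return correct / len(targets)


def error_rate(preds, targets):
    """Error is simply 1 - accuracy, computed on the same (validation) data."""
    return 1 - accuracy(preds, targets)


# Hypothetical validation labels and model predictions (illustration only).
valid_targets = ["teddy", "grizzly", "black", "teddy", "grizzly"]
valid_preds = ["teddy", "grizzly", "teddy", "teddy", "black"]

acc = accuracy(valid_preds, valid_targets)
err = error_rate(valid_preds, valid_targets)
print(f"accuracy={acc:.2f} error={err:.2f}")
```

Three of the five hypothetical predictions match, so accuracy and error always add up to 1 here.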
If your validation set contains qualitatively different data than your training set, you should change one of them so they reflect the same distribution.
The validation set is there to mimic the real world where the model will be tested. The distribution that generated the data in the validation set must be the same as the distribution that generated the data in the training set. If you are training the model on black bears and then show orange bears on validation, you would not be overfitting, you would be validating on a dataset generated by a different distribution. This is called ‘distribution shift’.
Many techniques. Weight decay, dropout and higher learning rates all have regularizing effects and help reduce overfitting.
Not sure if you already got the answer on this, but if not … the answer is that the pre-trained weights do not need to be downloaded, and that can be made to happen if you set pretrained=False. So this is how you’d need to modify line 3 above:
learn = create_cnn(data2, models.resnet34, pretrained=False)
With that set, you load your saved (trained) weights in the same way you have it above.
I’d also love it if FileDeleter had the option to show the images in a grid … with a dropdown to enable moving it to a different class (because it was misclassified by a human or by google) in addition to the option to delete it.
Can we get one added for “Local”?
It might be nice to include at least the basic steps required to use the library and keep it updated for folks who are using their own DL rig. Might even be nice to include some links there to threads on the forums describing approaches to building such setups for folks interested in moving from the cloud based systems to something of their own design.
I was looking at the code last night to see how difficult it would be. We can probably change what we send to FileDeleter from just file_paths to indexes and Dataset - that way, it has all the information it needs to display an actual label and a way to update it if it’s in a wrong class.
I believe that @zachcaceres is planning to add that.
@admin
[lesson2-sgd.ipynb]
In the SGD lesson there seem to be a couple of errors at the beginning of the notebook. I updated my course-v3 working copy from GitHub, but it was still giving the errors.
a = tensor(3.,2); a
gives an error:
TypeError Traceback (most recent call last)
in
----> 1 a = tensor(3.,2); a
TypeError: tensor() takes 1 positional argument but 2 were given
I think this should be
a = tensor([3.,2]); a
also a bit later
a = tensor(-1.,1)
should be
a = tensor([-1.,1])
Does anyone know if lesson slides are available somewhere?
You need to conda update.
thanks
conda install -c fastai fastai
I didn’t see this instruction in the Paperspace Gradient instructions for returning to work.
When you run the trained model on the data used to train it, you will usually get a lower error than on anything else, because the model has already been fitted to exactly that data. When you show the model new data it will generally not perform as well, so the error will be higher.
I like Lawrie’s answer. You could define a class called ‘other’ and then classify an image as ‘other’ if its probability of belonging to each of the bear classes is lower than an empirical threshold, which you could define from your own investigations. Say you set a threshold of 0.15; then any image whose probabilities for the teddy, brown and grizzly classes are all < 0.15 would be declared as ‘other’.
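A minimal sketch of that thresholding idea, in plain Python. The function name, the label names and the example scores are hypothetical. It also assumes the per-class scores are independent (for example sigmoid outputs): softmax probabilities over three classes sum to 1, so at least one of them is always at least 1/3 and a 0.15 threshold would never trigger.

```python
def classify_with_other(probs, threshold=0.15, other_label="other"):
    """Return the most likely class, or `other_label` if every class
    score is below `threshold`.

    probs: dict mapping class name -> independent per-class score in [0, 1].
    """
    if not probs:
        raise ValueError("probs must contain at least one class")
    best_label = max(probs, key=probs.get)
    # If even the best score is below the threshold, all of them are.
    if probs[best_label] < threshold:
        return other_label
    return best_label


# Hypothetical scores for two images (illustration only).
unknown_image = {"teddy": 0.05, "brown": 0.10, "grizzly": 0.08}
bear_image = {"teddy": 0.90, "brown": 0.20, "grizzly": 0.10}

label_unknown = classify_with_other(unknown_image)
label_bear = classify_with_other(bear_image)
print(label_unknown, label_bear)
```

The threshold itself is something you would tune on held-out images, as the post suggests.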
Jeremy mentioned that training error should always be lower than validation error. What happens when using dropout, which is applied only during training and so increases the training loss?
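For anyone following along, here is a rough sketch of the mechanism behind the question, written in plain Python rather than with any particular library. The `dropout` function below is an illustrative version of standard inverted dropout, not the fast.ai or PyTorch code: activations are randomly zeroed only while training, and passed through unchanged at evaluation time, so the training loss is measured on a handicapped network while the validation loss is not.

```python
import random


def dropout(activations, p, training, rng):
    """Inverted dropout on a list of floats.

    While training, each activation is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p). At evaluation time the input is
    returned unchanged.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the run is repeatable
acts = [1.0, 2.0, 3.0, 4.0]

train_out = dropout(acts, p=0.5, training=True, rng=rng)
eval_out = dropout(acts, p=0.5, training=False, rng=rng)
print("training:  ", train_out)
print("evaluation:", eval_out)
```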
Can anyone help me understand this?
I like your skepticism! Of course the training images are used to train your resnet. But your model doesn’t work by just using the training images as a lookup table.
You will know this when you apply your trained model to the new unseen images in your validation set (which you have been careful to quarantine from the model-building process). Achieving good accuracy on the validation set proves that your model is able to generalize beyond the training set data!