No, the learning rate after unfreezing is very often different.
This is the idea of progressive resizing. One example from the literature: Progressive Growing of GANs for Improved Quality, Stability, and Variation.
https://www.fast.ai/2018/04/30/dawnbench-fastai/
Instead, we turned to a method we’d developed at fast.ai, and teach in lessons 1 & 2 of our deep learning course: progressive resizing. Variations of this technique have shown up in the academic literature before (Progressive Growing of GANs and Enhanced Deep Residual Networks) but have never to our knowledge been applied to image classification.
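To make the idea concrete, here is a minimal sketch of progressive resizing in plain PyTorch. The model (`TinyNet`) and the size schedule are hypothetical stand-ins, not fastai's actual implementation; the key point is that a model with adaptive pooling before its head can be trained on small images first and then on progressively larger ones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Toy classifier that accepts any input size (adaptive pooling)."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.head = nn.Linear(8, n_classes)

    def forward(self, x):
        x = F.relu(self.conv(x))
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)  # size-agnostic pooling
        return self.head(x)

model = TinyNet()
opt = torch.optim.SGD(model.parameters(), lr=3e-3)
loss_fn = nn.CrossEntropyLoss()

# Train in phases, growing the image size each phase.
for size in (64, 128, 224):
    x = torch.randn(4, 3, size, size)   # stand-in for a resized batch
    y = torch.randint(0, 10, (4,))
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```

In practice you would re-create the DataLoader with larger images for each phase rather than faking batches as above; the early small-image phases are much faster per epoch, which is where the DAWNBench speedup came from.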
I am trying to update fastai but I get permission denied. Anything I need to add specifically? I tried conda and pip and both had permission issues.
I’ve been reading Leslie Smith’s paper and he provides some guidance on batch size with one cycle training. Basically, I think you should go as large as you can, up to the point where you get diminishing returns.
3e-3 == 0.003
1e-3 == 0.001
Isn’t the Dice score a more relevant metric for segmentation problems?
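For reference, the Dice coefficient for binary masks is 2·|A∩B| / (|A|+|B|). A minimal sketch in PyTorch (the function name and `eps` smoothing term are my own choices for illustration):

```python
import torch

def dice_score(pred, target, eps=1e-8):
    """Dice coefficient for binary masks: 2*|A ∩ B| / (|A| + |B|)."""
    pred = pred.float().flatten()
    target = target.float().flatten()
    inter = (pred * target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

# Identical masks score 1.0; disjoint masks score ~0.
mask = torch.tensor([[1, 0], [1, 1]])
```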
What about the loss function for multi-label classification? Does the same loss function (cross entropy) work for multi-label classification as well?
In Keras, I used to use binary cross entropy + sigmoid for the last layer; it’s not clear how fastai takes care of this.
How can we ignore a specific pixel value in an image while training? i.e. ignore pixel value = 255
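One standard way to do this in PyTorch is the `ignore_index` argument of `nn.CrossEntropyLoss`: pixels whose target equals that value contribute nothing to the loss or gradients. A minimal sketch (the tensor shapes are illustrative):

```python
import torch
import torch.nn as nn

# Pixels labelled 255 (a common "void"/unlabelled sentinel in
# segmentation datasets) are excluded from the loss entirely.
loss_fn = nn.CrossEntropyLoss(ignore_index=255)

logits = torch.randn(1, 3, 4, 4)        # (batch, classes, H, W)
target = torch.randint(0, 3, (1, 4, 4)) # class index per pixel
target[0, 0, 0] = 255                   # this pixel is ignored
loss = loss_fn(logits, target)
```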
Hey @rachel I finally have 8 votes on this
This is the metric that was used in the paper introducing CamVid; that’s why Jeremy is using it.
It almost feels like some form of data augmentation
Given it usually takes a minute or so to train a whole epoch, how is lr_find so fast when looking at lots of different learning rates? Does it run just a few iterations for each learning rate? I looked at the documentation but still don’t quite understand how it works.
It does it for you. If you want to know more, you should ask on the advanced forum for now, this will be covered later in the course.
Any recommendations for making sense of cutting edge academic papers? I often see an interesting-looking paper on something I’m generally familiar with, but the jargon in academic papers can be overwhelming.
As per my understanding, it plots the loss for different learning rates across different mini-batches. That’s why it doesn’t take long.
It does 100 iterations from 1e-5 to 10, growing the learning rate exponentially.
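The schedule described above (100 steps from 1e-5 to 10, growing exponentially) can be sketched as follows; the function name and exact endpoints are illustrative, not fastai's internals:

```python
# Exponential learning-rate sweep: multiply by a constant ratio each
# step so that n steps span start_lr to end_lr.
def lr_schedule(start_lr=1e-5, end_lr=10.0, n=100):
    ratio = (end_lr / start_lr) ** (1 / (n - 1))
    return [start_lr * ratio ** i for i in range(n)]

lrs = lr_schedule()
```

Each mini-batch is trained once at the next learning rate in this list, so the whole sweep costs only ~100 iterations, far less than a full epoch.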
You’d want to use BCEWithLogitsLoss in PyTorch; it’s binary cross entropy + sigmoid combined.
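A small sketch showing the fused loss and its (less numerically stable) two-step equivalent; the example values are arbitrary:

```python
import torch
import torch.nn as nn

# BCEWithLogitsLoss fuses sigmoid + binary cross entropy in one op,
# which is more numerically stable than applying them separately.
loss_fn = nn.BCEWithLogitsLoss()

logits = torch.tensor([[2.0, -1.0, 0.5]])  # raw model outputs
targets = torch.tensor([[1.0, 0.0, 1.0]])  # multi-label targets
fused = loss_fn(logits, targets)

# Equivalent two-step version for comparison.
manual = nn.BCELoss()(torch.sigmoid(logits), targets)
```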
Welcome to Microsoft’s GitHub.
Some tips around reading (and implementing) new papers are covered in part 2 of the course.