In case anyone missed it, James Dellinger wrote about the highlights of this interview; check out what Leslie has to say:
If you are on Twitter, I recommend you tag Leslie so he is aware of your blog post.
Finally, I had a chance to revisit this interview; the following are my short notes:
I watched the interview through the live stream and got a chance to ask questions and have them answered!
Hiromi: Thank you so much for broadcasting this. Opportunities like this are hard to come by for some of us.
Abhishek Sharma​: If there is one researcher everyone knows in FastAI community, it’s Leslie Smith.
Comments and questions that I asked:
Regarding Leslie’s few-shot learning research direction:
Some of us did a literature search around this “incremental learning” problem. The goal is to be able to learn as you go: online learning of new classes. Regarding Leslie’s idea, I think we have yet to see anything where the same classes are used in each epoch but with a stage-wise increase in batch size (a rough sketch follows below). I believe Jeremy’s idea is related to curriculum learning. We couldn’t find work related to class-incremental learning (adding new classes as we run more batches).
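To make the stage-wise idea concrete, here is a minimal sketch of a training loop where the classes stay the same but the batch size grows stage by stage. This is only my interpretation, not Leslie’s actual setup; the dataset, stage boundaries, and batch sizes are made-up values for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for a real one (hypothetical data, 10 classes).
dataset = TensorDataset(torch.randn(1024, 3, 32, 32),
                        torch.randint(0, 10, (1024,)))

# Same classes throughout training, but the batch size grows per stage.
stage_batch_sizes = [32, 64, 128, 256]  # assumed schedule for illustration
epochs_per_stage = 2

for stage, bs in enumerate(stage_batch_sizes):
    loader = DataLoader(dataset, batch_size=bs, shuffle=True)
    for epoch in range(epochs_per_stage):
        for xb, yb in loader:
            pass  # forward / backward / optimizer step would go here
    print(f"stage {stage}: trained {epochs_per_stage} epochs at batch size {bs}")
```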
Change data augmentation and hyperparameters in a “u-shaped” way: low, then high, then low again. The middle “high” should help the model generalize better. The last “low” should help to pinpoint the sweet spot. Or high, low, high, depending on the specific parameter. (This is similar to the 1-cycle policy; a sketch of such a schedule is below.)
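A minimal sketch of what such a low-high-low schedule could look like, e.g. for augmentation strength. The function name, the low/high values, and the split point are my own assumptions, analogous to how the 1-cycle policy ramps a value up and then back down:

```python
import numpy as np

def low_high_low_schedule(n_epochs, low=0.1, high=0.8, frac_up=0.5):
    """Piecewise-linear low -> high -> low schedule for a hyperparameter
    such as augmentation strength (values are illustrative assumptions)."""
    n_up = int(n_epochs * frac_up)
    up = np.linspace(low, high, n_up)
    down = np.linspace(high, low, n_epochs - n_up)
    return np.concatenate([up, down])

# Per-epoch values: ramp up for the first half, then anneal back down.
print(low_high_low_schedule(10).round(2))
```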
“Batchnorm zero” trick for ResNet blocks from Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Take the last batchnorm layer in the conv/residual path of the ResNet block and initialize its learnable multiplication parameter (gamma) to zero. Then, at the start of training, every ResNet block represents an identity function. This should improve the model (and is also included in the PyTorch ResNet models).
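For reference, torchvision’s ResNets expose this trick via the zero_init_residual flag; the loop below shows the same thing done by hand (a sketch that assumes the standard torchvision BasicBlock/Bottleneck layout, where bn2/bn3 are the last batchnorm layers in the residual branch):

```python
import torch.nn as nn
import torchvision

# torchvision implements the trick directly:
model = torchvision.models.resnet18(zero_init_residual=True)

# Doing it by hand: zero the learnable scale (gamma) of the last BN layer
# in each residual branch so every block starts out as an identity mapping.
for m in model.modules():
    if isinstance(m, torchvision.models.resnet.BasicBlock):
        nn.init.zeros_(m.bn2.weight)   # bn2 is the last BN in a BasicBlock
    elif isinstance(m, torchvision.models.resnet.Bottleneck):
        nn.init.zeros_(m.bn3.weight)   # bn3 is the last BN in a Bottleneck
```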