I was wondering why Jeremy picked resnet34. This paper shows resnet34 has better accuracy while using less computation than many alternatives; it covers ~10 other architectures too, including Inception.
Isn’t the current state of the art DenseNet?
This paper is ~2 years old
The Electronic Frontier Foundation maintains results on various benchmarks:
https://www.eff.org/ai/metrics
ResNet appears on the leaderboards 11 times.
I didn’t see DenseNet in there, but I only did a cursory search.
In some areas - particularly segmentation - DenseNet can work great. It’s not often seen in SoTA classification work, however.
@jeremy I would like to know your views on this paper. Looks like they managed to get SoTA for CIFAR.
https://arxiv.org/abs/1804.05340 (A sparse version of densenet)
It’s a good result, but their SoTA claim is wrong
That’s interesting! One of my colleagues also mentioned the results.
How can I tell whether a paper is legit, given that I don’t have the resources to replicate the results? It’s already hard to keep up with recent papers, and it feels a bit disappointing when their claims turn out to be wrong.
Thanks for mentioning the ShakeDrop paper. I wasn’t familiar with this kind of approach. It seems to work really well for wider networks, especially Wide ResNet. The paper also introduced me to new concepts like EraseReLU. I’ll come back to this after going through the earlier regularisation techniques (Shake-Shake and stochastic depth).
Last time I checked, the Squeeze-and-Excitation network (SE-ResNet) was the “state of the art” (https://arxiv.org/abs/1709.01507), or maybe NASNet (https://ai.googleblog.com/2017/11/automl-for-large-scale-image.html). Bottom line: it doesn’t matter much at this point, because ResNet works perfectly fine for most applications and it’s reasonably fast.
What about training time and number of parameters? Do you know of architectures that give reasonable benchmark results while keeping both of these low?
SENet and ResNet are very similar; SENet has slightly more params since it adds a small squeeze-and-excitation (channel-gating) module to every block. Even so, they should be comparable in training time. NASNet is off the charts in number of params. DenseNet has fewer params than ResNet but is deeper, so it trains a bit longer. If you have resource constraints and want something small and fast, there are MobileNet and SqueezeNet - both have fewer params and train fast, though they generally have worse accuracy. There are so many architectures now, it’s crazy. Then again, on average I’d say ResNet is the best choice considering accuracy, simplicity, number of params, and training speed.
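To make the “slightly more params” point concrete, here’s a back-of-envelope sketch (my own rough arithmetic, not figures from the SENet paper): each SE block adds two small fully connected layers per residual block, C → C/r and C/r → C with reduction ratio r, so roughly 2·C²/r extra weights per block. The ResNet-50-style stage layout below is an assumption for illustration.

```python
def se_extra_params(channels, reduction=16):
    """Extra weights an SE block adds for a block with `channels` channels.

    Two FC weight matrices: channels -> channels/reduction and back
    (biases ignored for this rough estimate).
    """
    squeezed = channels // reduction
    return channels * squeezed + squeezed * channels

# Assumed ResNet-50-style stages: (output channels, blocks per stage).
stages = [(256, 3), (512, 4), (1024, 6), (2048, 3)]
total = sum(se_extra_params(c) * n for c, n in stages)
print(f"~{total / 1e6:.1f}M extra params from SE blocks")  # ~2.5M
```

So the SE overhead comes to only a few million weights on top of ResNet-50’s ~25M, which is why training time stays comparable.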