I was wondering why Jeremy picked resnet34. This paper shows resnet34 has better accuracy while using less computation than many alternatives; it covers ~10 other architectures too, including Inception.
Isn’t the current state of the art DenseNet?
This paper is ~2 years old
The Electronic Frontier Foundation maintains results on various benchmarks:
https://www.eff.org/ai/metrics
ResNet appears on the leaderboards 11 times.
I didn’t see DenseNet in there, but I only did a cursory search.
In some areas - particularly segmentation - DenseNet can work great. It’s not often seen in SoTA classification work, however.
@jeremy I would like to know your views on this paper. Looks like they managed to get SoTA for CIFAR.
https://arxiv.org/abs/1804.05340 (A sparse version of densenet)
It’s a good result, but their SoTA claim is wrong
That’s interesting! One of my colleagues also mentioned the results.
How can I tell whether a paper is legit, given that I don’t have the resources to replicate the results? It’s already hard to keep up with recent papers, and it feels a bit disappointing when their claims turn out to be wrong.
Thanks for mentioning the ShakeDrop paper. I wasn’t familiar with this kind of approach. It seems to work really well for wider networks, especially Wide ResNet. The paper also introduced me to new concepts like EraseReLU. I’ll come back to this after going through the earlier regularisation techniques (Shake-Shake and stochastic depth).
Last time I checked, the Squeeze-and-Excitation network (SE-ResNet) was the “state of the art” (https://arxiv.org/abs/1709.01507), or maybe NASNet (https://ai.googleblog.com/2017/11/automl-for-large-scale-image.html). Bottom line: it doesn’t matter much at this point, because ResNet works perfectly fine for most applications and it’s reasonably fast.
What about training time and number of parameters? Do you know of architectures that give reasonable benchmark results while keeping both of these low?
SENet and ResNet are very similar; SENet has slightly more params since it adds a small squeeze-and-excitation (channel-gating) module to every block. Even so, they should be comparable in training time. NASNet is off the charts in number of params. DenseNet has fewer params than ResNet but is deeper, so it trains a bit longer. If you have resource constraints and want something small and fast, there are MobileNet and SqueezeNet - both have fewer params and train fast, though they generally have worse accuracy. There are so many architectures now, it’s crazy. Then again, on average I’d say ResNet is the best choice considering accuracy, simplicity, number of params, and training speed.
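To make the “slightly more params” point concrete, here’s a back-of-envelope sketch (my own rough arithmetic, not figures from the SENet paper): each SE block adds two small fully connected layers per residual block, C → C/r and C/r → C with reduction ratio r, so roughly 2·C²/r extra weights per block. The ResNet-50-style stage layout below is an assumption for illustration.

```python
def se_extra_params(channels, reduction=16):
    """Extra weights an SE block adds for a block with `channels` channels.

    Two FC weight matrices: channels -> channels/reduction and back
    (biases ignored for this rough estimate).
    """
    squeezed = channels // reduction
    return channels * squeezed + squeezed * channels

# Assumed ResNet-50-style stages: (output channels, blocks per stage).
stages = [(256, 3), (512, 4), (1024, 6), (2048, 3)]
total = sum(se_extra_params(c) * n for c, n in stages)
print(f"~{total / 1e6:.1f}M extra params from SE blocks")  # ~2.5M
```

So the SE overhead comes to only a few million weights on top of ResNet-50’s ~25M, which is why training time stays comparable.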