Wow, this is awesome work!
I’ve been looking in a somewhat orthogonal direction using network deconvolution, which I’ve described briefly in another forum post. In some early tests, I’ve found that these deconv layers (FastDeconv, implemented from the official repo by Ye et al. 2020) can be substituted into the (x)resnet stem, where they act as drop-in replacements for the Conv2d layers and completely obviate the BatchNorm layers!
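For anyone curious about the mechanics, here’s a minimal sketch of the swap. The `swap_stem_convs` helper is my own illustration (not from the official repo), and it assumes the replacement class accepts Conv2d-style constructor arguments, which FastDeconv does. I use `nn.Conv2d` itself as a stand-in below so the snippet runs without the deconv repo installed; in practice you’d pass `FastDeconv`.

```python
import torch
import torch.nn as nn

def swap_stem_convs(stem: nn.Sequential, deconv_cls) -> nn.Sequential:
    """Replace each Conv2d in a stem with deconv_cls (same hyperparameters)
    and drop the BatchNorm2d layers the deconv layer makes redundant.
    deconv_cls is assumed to take Conv2d-style arguments."""
    layers = []
    for m in stem:
        if isinstance(m, nn.Conv2d):
            layers.append(deconv_cls(
                m.in_channels, m.out_channels, m.kernel_size,
                stride=m.stride, padding=m.padding, bias=True))
        elif isinstance(m, nn.BatchNorm2d):
            continue  # obviated by the deconv layer
        else:
            layers.append(m)  # keep activations etc. unchanged
    return nn.Sequential(*layers)

# A typical (x)resnet stem: three conv-bn-relu blocks.
stem = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1, bias=False), nn.BatchNorm2d(32), nn.ReLU(),
    nn.Conv2d(32, 32, 3, stride=1, padding=1, bias=False), nn.BatchNorm2d(32), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=1, padding=1, bias=False), nn.BatchNorm2d(64), nn.ReLU(),
)

# Stand-in for FastDeconv; pass FastDeconv here when the repo is available.
new_stem = swap_stem_convs(stem, nn.Conv2d)
```

Since the deconv layer whitens its input itself, no normalization layer is needed afterward, which is why the BatchNorm layers can simply be dropped rather than replaced.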
The results are quite nice, too. Using an xse_resnext34 model with FastDeconv, I’ve gotten 79.92% ± 0.72%; the run can be found near the bottom of this notebook. I also think this model could be combined with your mixed depth-wise kernels and might achieve some really nice results.