Jeez they got some nice results! Thanks for sharing
This is my personal reading list; it has some very theoretical and math-intensive papers which provide much better insight into why things work the way they do - https://github.com/digantamisra98/Library
Funnel Activation for Visual Recognition (@Diganta might be interesting for you)
What Makes Training Multi-Modal Classification Networks Hard?
The code was also released.
If anyone is interested in trying to tackle porting this over, I’d be interested in helping/leading (when I have time to lead the push). The code is in Caffe2, so it requires a bit more effort, but this could be extremely valuable combined with the MixedDL project: GitHub - facebookresearch/VMZ: VMZ: Model Zoo for Video Modeling
Interesting read, but it seems to have high FLOPs because of the way they calculate the spatial conditioning. Anyway, a really nice approach.
DeLighT: Very Deep and Light-weight Transformer
Overall, DeLighT networks are 2.5 to 4 times deeper than standard transformer models and yet have fewer parameters and operations. Experiments on machine translation and language modeling tasks show that DeLighT matches the performance of baseline Transformers with significantly fewer parameters.
On the WMT’14 En-Fr high resource dataset, DeLighT requires 1.8 times fewer parameters and 2 times fewer operations and achieves better performance (+0.4 BLEU score) than baseline transformers. On the WMT’16 En-Ro low resource dataset, DeLighT delivers similar performance with 2.8 times fewer parameters than baseline transformers.
Thank you for sharing!
It seems like Transformers are taking over the world, or at least making a new start in computer vision:
Great survey paper here covering a selection of the efficient transformers that have come out in the past few years
Efficient Transformers: A Survey
Has anybody tried using SwAV? Any experience to be shared?
That’s a great overview, thanks for sharing! It’s easy to lose track with all the different transformer architectures getting published recently
Just adding the arxiv link too (without going through twitter): https://arxiv.org/abs/2009.06732
ShapeAssembly: Learning to Generate Programs for 3D Shape Structure Synthesis
The paper presents a deep generative model which learns to write novel programs in ShapeAssembly, a domain-specific language for modeling 3D shape structures. Executing a ShapeAssembly program produces a shape composed of a hierarchical, connected assembly of cuboid part proxies. Our method develops a well-formed latent space that supports interpolations between programs. Above, we show one such interpolation, and also visualize the geometry these programs produce when executed. In the last column, we manually edit the continuous parameters of a generated program, in order to produce a variant geometric structure with new topology.
code and paper https://rkjones4.github.io/shapeAssembly.html
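To make the "hierarchical assembly of cuboid part proxies" idea concrete, here is a tiny toy sketch in Python. This is purely illustrative and is not the actual ShapeAssembly DSL; the class and method names are my own invention.

```python
# Hypothetical sketch (NOT the real ShapeAssembly DSL): a shape as a
# hierarchy of cuboid part proxies, each attached to its parent part.

class Cuboid:
    def __init__(self, name, w, h, d):
        self.name = name
        self.dims = (w, h, d)      # width, height, depth of the part proxy
        self.children = []         # sub-parts attached to this cuboid

    def attach(self, child):
        """Attach a sub-part, forming the hierarchical assembly."""
        self.children.append(child)
        return child

    def count_parts(self):
        """Total number of cuboids in this assembly (recursive)."""
        return 1 + sum(c.count_parts() for c in self.children)

# Build a toy chair: a seat with four legs and a back.
seat = Cuboid("seat", 1.0, 0.1, 1.0)
for i in range(4):
    seat.attach(Cuboid(f"leg_{i}", 0.1, 0.5, 0.1))
seat.attach(Cuboid("back", 1.0, 0.8, 0.1))

print(seat.count_parts())  # 6 cuboids in total
```

The real DSL additionally constrains *where* parts attach to each other, which is what makes executed programs produce connected, well-formed geometry.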
MeLIME: Meaningful Local Explanation for Machine Learning Models
Most state-of-the-art machine learning algorithms induce black-box models, preventing their application in many sensitive domains. Hence, many methodologies for explaining machine learning models have been proposed to address this problem. In this work, we introduce strategies to improve local explanations taking into account the distribution of the data used to train the black-box models. We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models, operating on various types of data. MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models. Additionally, we introduce modifications to standard training algorithms of local interpretable models fostering more robust explanations, even allowing the production of counterfactual examples. To show the strengths of the proposed approach, we include experiments on tabular data, images, and text; all showing improved explanations. In particular, MeLIME generated more meaningful explanations on the MNIST dataset than methods such as GuidedBackprop, SmoothGrad, and Layer-wise Relevance Propagation.
https://arxiv.org/abs/2009.05818
@muellerzr might want to see this: code also exists.
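For anyone new to this line of work, here is a minimal sketch of the LIME idea that MeLIME generalizes (my own toy version, not the MeLIME code): explain a black-box model around one instance by sampling perturbations, weighting them by proximity, and fitting a weighted linear surrogate whose coefficients act as feature attributions.

```python
# Minimal LIME-style local explanation sketch (hypothetical, not MeLIME).
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    # Stand-in model: a nonlinear function we want to explain locally.
    return X[:, 0] ** 2 + 3.0 * X[:, 1]

def local_explanation(x, n_samples=500, sigma=0.1):
    # 1. Sample perturbations around the instance x.
    Z = x + sigma * rng.normal(size=(n_samples, x.shape[0]))
    y = black_box(Z)
    # 2. Weight samples by proximity to x (exponential kernel).
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * sigma ** 2))
    # 3. Fit a weighted linear surrogate; its slopes are the explanation.
    A = np.hstack([Z, np.ones((n_samples, 1))])  # add intercept column
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw.ravel(), rcond=None)
    return coef[:-1]  # feature attributions (drop intercept)

attr = local_explanation(np.array([1.0, 0.0]))
# Near x = (1, 0), the true local slopes are ~2 for x0 and 3 for x1.
print(attr)
```

MeLIME's contribution, per the abstract, is to make steps 1 and 3 more flexible: perturbations that respect the training-data distribution, and richer local interpretable models than this plain linear fit.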
Self-supervised Single-view 3D Reconstruction via Semantic Consistency
code https://github.com/NVlabs/UMR
paper https://arxiv.org/pdf/2003.06473.pdf
Great posts about GPT-3 that are very interesting and thought provoking for AI development in general:
Best to start with this section:
Then this:
And finally the entire long article if you want to go for more details:
SCOUTER: An explainable image classifier using a modified version of Slot Attention
Interesting report:
One-sentence Summary: Transformers applied directly to image patches and pre-trained on large datasets work really well on image classification
The code for An Image is Worth 16x16 Words: Transformers for Image Recognition…