Good readings 2020

Jeez, they got some nice results! Thanks for sharing.

This is my personal reading list. It has some very theoretical, math-intensive papers that provide much better insight into why things work the way they do: https://github.com/digantamisra98/Library

Funnel Activation for Visual Recognition (@Diganta might be interesting for you)

What Makes Training Multi-Modal Classification Networks Hard?


The code was also released.

If anyone is interested in porting this over, I’d be interested in helping or leading (when I have time to lead the push). The code is in Caffe2, so it takes a bit more effort, but this could be extremely valuable combined with the MixedDL project. The repo is facebookresearch/VMZ (VMZ: Model Zoo for Video Modeling).

Interesting read, but it seems to be high in FLOPs because of the way they calculate the spatial conditioning. Anyway, a really nice approach.
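To make the FLOPs remark concrete, here is a rough sketch, assuming the comment is about the Funnel Activation (FReLU) paper mentioned earlier in the thread. FReLU computes max(x, T(x)), where T(x) is a per-channel 3x3 convolution (the "spatial condition"). The function names and the toy kernel below are mine, not from the paper's code, and the batch norm that the paper applies inside T is left out for brevity.

```python
def frelu_channel(x, kernel, bias=0.0):
    """Apply max(x, T(x)) to one channel (a 2D list of floats).

    T is a 3x3 convolution with zero padding, standing in for the
    depthwise "spatial condition". Batch norm is omitted (assumption).
    """
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            t = bias
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        t += kernel[di + 1][dj + 1] * x[ii][jj]
            # Funnel condition: keep the larger of the input and T(x).
            out[i][j] = max(x[i][j], t)
    return out


def extra_macs(channels, height, width, k=3):
    """Multiply-accumulates added by one k x k depthwise condition."""
    return channels * height * width * k * k


zero_kernel = [[0.0] * 3 for _ in range(3)]
# With an all-zero kernel and zero bias, T(x) = 0 and FReLU reduces to ReLU.
print(frelu_channel([[-1.0, 2.0], [3.0, -4.0]], zero_kernel))
print(extra_macs(channels=64, height=56, width=56))
```

A plain ReLU needs only one comparison per element, while the condition above adds k*k multiply-accumulates per element per channel, which is where the extra cost comes from.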

DeLighT: Very Deep and Light-weight Transformer

Overall, DeLighT networks are 2.5 to 4 times deeper than standard transformer models and yet have fewer parameters and operations. Experiments on machine translation and language modeling tasks show that DeLighT matches the performance of baseline Transformers with significantly fewer parameters.

On the WMT’14 En-Fr high resource dataset, DeLighT requires 1.8 times fewer parameters and 2 times fewer operations and achieves better performance (+0.4 BLEU score) than baseline transformers. On the WMT’16 En-Ro low resource dataset, DeLighT delivers similar performance with 2.8 times fewer parameters than baseline transformers.

https://arxiv.org/pdf/2008.00623.pdf

Thank you for sharing!

It seems like Transformers are taking over the world, or at least are now starting to take over computer vision:

Here is a great survey paper covering a selection of the efficient transformers that have come out in the past few years:

Efficient Transformers: A Survey

Has anybody tried using SwAV? Any experience to share?

That’s a great overview, thanks for sharing! It’s easy to lose track with all the different transformer architectures getting published recently :slight_smile:

Just adding the arxiv link too (without going through twitter): https://arxiv.org/abs/2009.06732

A little comeback here for QRNNs? pQRNN, based on PRADO.

Useful for simple classification tasks

ShapeAssembly: Learning to Generate Programs for 3D Shape Structure Synthesis

The paper presents a deep generative model which learns to write novel programs in ShapeAssembly, a domain-specific language for modeling 3D shape structures. Executing a ShapeAssembly program produces a shape composed of a hierarchical connected assembly of part-proxy cuboids. Our method develops a well-formed latent space that supports interpolations between programs. Above, we show one such interpolation, and also visualize the geometry these programs produce when executed. In the last column, we manually edit the continuous parameters of a generated program, in order to produce a variant geometric structure with new topology.

Code and paper: https://rkjones4.github.io/shapeAssembly.html

MeLIME: Meaningful Local Explanation for Machine Learning Models

Most state-of-the-art machine learning algorithms induce black-box models, preventing their application in many sensitive domains. Hence, many methodologies for explaining machine learning models have been proposed to address this problem. In this work, we introduce strategies to improve local explanations taking into account the distribution of the data used to train the black-box models. We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models, operating on various types of data. MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models. Additionally, we introduce modifications to standard training algorithms of local interpretable models fostering more robust explanations, even allowing the production of counterfactual examples. To show the strengths of the proposed approach, we include experiments on tabular data, images, and text; all showing improved explanations. In particular, MeLIME generated more meaningful explanations on the MNIST dataset than methods such as GuidedBackprop, SmoothGrad, and Layer-wise Relevance Propagation. MeLIME is available on this https URL.
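For anyone who has not seen the LIME idea that MeLIME generalizes, here is a minimal sketch of a local explanation: sample points near the instance, weight them by proximity, and fit a weighted linear model whose slope serves as the local explanation. This is plain LIME-style Gaussian sampling in one dimension, not MeLIME's data-aware sampling, and every name and default value below is an assumption of mine rather than anything from the MeLIME code.

```python
import math
import random


def local_slope(black_box, x0, n_samples=500, sigma=0.1,
                kernel_width=0.2, seed=0):
    """Fit a proximity-weighted linear surrogate around x0; return its slope.

    black_box: any callable float -> float (the model being explained).
    sigma, kernel_width: hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    # 1. Perturb the instance (plain Gaussian noise, an assumption here).
    xs = [x0 + rng.gauss(0.0, sigma) for _ in range(n_samples)]
    ys = [black_box(x) for x in xs]
    # 2. Weight each sample by its closeness to x0 (RBF kernel).
    ws = [math.exp(-((x - x0) ** 2) / (2 * kernel_width ** 2)) for x in xs]
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    # 3. Weighted least squares for a one-feature linear model.
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    return cov / var


# The surrogate slope should land near the true derivative of x**2 at 3.
print(local_slope(lambda x: x * x, 3.0))
```

As I read the abstract, MeLIME changes step 1 so that samples follow the training data distribution, and lets step 3 use other interpretable models.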

https://arxiv.org/abs/2009.05818

@muellerzr you might want to see this one; the code is also available.

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

Code: https://github.com/NVlabs/UMR
Paper: https://arxiv.org/pdf/2003.06473.pdf

Great posts about GPT-3 that are very interesting and thought provoking for AI development in general:

Best start with this section:

Then this:

And finally, the entire long article if you want to go into more detail:

SCOUTER: An explainable image classifier using a modified version of Slot Attention

code: https://github.com/wbw520/scouter

Interesting report:

One-sentence Summary: Transformers applied directly to image patches and pre-trained on large datasets work really well on image classification
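A small sketch of what "applied directly to image patches" means in practice: the image is cut into non-overlapping 16x16 patches, and each patch is flattened into one vector that plays the role of a token. The function name and the nested-list image format are my own choices for illustration; the real model goes on to apply a learned linear projection, position embeddings, and a class token, none of which are shown.

```python
def image_to_patches(image, patch_size=16):
    """Split an H x W x C image (nested lists) into flattened patches.

    Returns patches in row-major order; each patch is a flat list of
    patch_size * patch_size * C values.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    patch.extend(pixel)  # append all channel values
            patches.append(patch)
    return patches


# A blank 224 x 224 RGB image as a stand-in input.
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
patches = image_to_patches(image)
print(len(patches), len(patches[0]))
```

For a 224x224 RGB input this gives a sequence of 14 x 14 = 196 tokens of length 16 x 16 x 3 = 768, which is short enough for a standard Transformer encoder.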

The code for An Image is Worth 16x16 Words: Transformers for Image Recognition…
