Good readings 2019

init_27 · March 1, 2019, 8:01pm

I think we can bother @sgugger to use his powers and wiki-fy this discussion.
Then we could collaborate and maintain a list above?
I’ll volunteer to help although all of the wonderful papers shared here seem very intimidating to me.

Please, do share!

sgugger · March 1, 2019, 8:15pm

Agreed, I’ve made the initial post a wiki so that you can put the list up there.

jamesrequa · March 2, 2019, 6:34am

@nbharatula Very interested in the topic of ethics, as I’m sure many others here would be too. As DL/ML practitioners, I think it would be irresponsible of all of us not to take ethics very seriously and always take it into consideration. For the fastai forum, though, my personal opinion is that AI Ethics is so important that it more than deserves its own separate topic, not just some posts that could get buried in some other topic. I actually started a similar topic in the previous Part 2 (2018). I anticipate that this topic would probably spark more conversation now, as this subject has gradually (finally) started to reach top-of-mind awareness as a critical issue in our industry.

Also I highly recommend following Rachel on twitter if you aren’t already as she has been championing AI Ethics for quite some time now (even before the big companies took serious notice) and she provides excellent content/references on this subject.

fabris · March 2, 2019, 10:12am

Could be great!! But to be very effective we should carefully select among the papers linked here. I am thinking of a list with no more than 10 entries, ordered by topic. We could start with 4 main topics:

Computer vision : #CV
Natural language Processing : #NLP
Optimizations and Training tricks : #OTT
Ethics and Good Practices : #EGP (seems there exists a new thread on this topic. To be removed)

Any suggestions? Might hashtags like #CV be useful here?
I can help with the NLP list btw.

nbharatula · March 3, 2019, 2:19am

Hi @jamesrequa, Yes I agree, this deserves its own topic/thread. But sprinkling it across other posts also helps keep it top of mind! I’ll summarise my findings here as well as on the other thread.

@init_27 thanks for starting the 2019 thread for this topic.

And yes, I follow Rachel and a few other awesome folks focused on ethics research. Thanks for the suggestion!

Borz · March 3, 2019, 9:03pm

There’re a few robotics papers I’m looking at. The RCAN paper seems interesting. There was also a link today on twitter about ‘Neural Task Graphs’ claiming to be essential to future robotics work – I haven’t taken a look at it yet.

SHAR1 · March 3, 2019, 11:01pm

AdaBound optimizer: a new optimizer that its authors claim trains as fast as Adam and generalizes as well as SGD

I still have to try this out, but the results look good.

blog --> https://www.luolc.com/publications/adabound/
code + demo (awesome!) --> https://github.com/Luolc/AdaBound/tree/master/demos

Moody · March 4, 2019, 10:37am

NLP: for logographic languages (SOTA results for almost 13 tasks)

Glyce: Glyph-vectors for Chinese Character Representations

harikrishnanrajeev · March 4, 2019, 3:34pm

Neumann optimizer: allows us to use larger batches for training
Link to paper and Openreview

swagman · March 5, 2019, 1:34pm

Training this is a pain; I’ll choose SSD or RetinaNet any day.

init_27 · March 11, 2019, 6:58am

@fabris I’ve deleted the OP. However, it’s not a “personal blog” that I had shared. I’ve started a “5-minute paper discussion series” to share quick highlights from papers, and I had requested feedback on my first attempt at this.

Here is the first one in the series, for the paper “Bag of Tricks for Image Classification with Convolutional Neural Networks”: Paper Discussion

bsalita · March 11, 2019, 9:43am

I’ve tried classifying papers into single categories. Eventually there was so much overlap between categories that I switched to using multiple tags/keywords.

init_27 · March 12, 2019, 1:32pm

I found the paper An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models to be very interesting:

Here is a 5 minute summary that I’ve written about the paper.

Quick TL;DR:

The authors share their approach to transfer learning in NLP, and with these tricks they claim to perform better than the ULMFiT approach on a few datasets, especially when using fewer training examples.

The most interesting one was defining an auxiliary loss by adding a new layer, while also accounting for the original loss value:

Loss = Loss (Aux) + Loss (LM)
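
A minimal sketch of what summing two loss terms looks like, in plain Python. It assumes both terms are cross-entropy losses over a probability vector; the function names and the weight `gamma` are hypothetical and only there for illustration (with `gamma=1.0` it reduces to the plain sum written above).

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])


def combined_loss(aux_loss, lm_loss, gamma=1.0):
    """Sum the two loss terms; gamma is a hypothetical weight on the LM term."""
    return aux_loss + gamma * lm_loss


# Toy example: a 2-class prediction for one term, a 3-word vocabulary for the other.
aux = cross_entropy([0.1, 0.9], target=1)
lm = cross_entropy([0.25, 0.25, 0.5], target=2)
total = combined_loss(aux, lm)
print(f"aux={aux:.4f} lm={lm:.4f} total={total:.4f}")
```

In a real training loop the same sum would be computed on tensors and backpropagated once, so both terms shape the shared layers.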

init_27 · March 13, 2019, 3:38pm

This paper mentioned an interesting approach of using “two loss terms” that led to better performance.

Thanks to @lesscomfortable, who discovered that the WaveNet paper had mentioned a very similar method of using two loss values to generalize better.

TIL, if you want to do 2 tasks using 1 NN or if you’re doing 1 task using a NN, you can create 2 loss values that you want to optimize and that would lead to better performance!

Edit: Here is the amazing explanation of the “2 loss strategy” by @lesscomfortable: (Copy paste from our slack group, Francisco’s words:)

“I think it’s more about generalizing predictive models. I don’t know if it applies to any model, it seems to be restricted to generalizing predictive series (predict next sound, next word, next stock price etc) to new tasks (classification).
It’s like being less ‘harsh’ when changing tasks. I think about it like ‘hey I know that you know how to do this but slowly you will have to use what you know to do that’; you’re telling the model ‘take your time to learn’.”

Borz · March 13, 2019, 4:54pm

If I remember correctly the RCNN methods are either obsolete or most of what they can do is done better by SSD-type models. Maybe they’re higher accuracy? I saw some Facebook research on detecting faces in crowds. SSDs don’t use the region-proposal part of RCNNs.

There’s a discussion on RCNN/SSD history in the CS231N course, and fastai part2-2018 talks about the SSD-type method too.

init_27 · March 14, 2019, 3:29pm

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
5 minute summary link:

The paper investigates 4 “Easy Data Augmentation” Techniques in NLP (!!)

Synonym Replacement
Random Insertion
Random Swap
Random Deletion

These are tested on 5 text classification datasets, using simple RNN and CNN (yes, CNN) architectures, and the authors demonstrate some performance improvements, especially when using “smaller” subsets of the training data.
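
The four operations listed above can be sketched roughly as follows. This is a minimal illustration, not the authors’ code: the `SYNONYMS` table is a toy, hypothetical stand-in for a real thesaurus, and the function names and parameters (`n`, `p`, `rng`) are assumptions made for the example.

```python
import random

# Toy synonym table, purely hypothetical; a real setup would use a thesaurus.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}


def synonym_replacement(words, n, rng):
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(words, n, rng):
    """Insert a synonym of a random known word at a random position, n times."""
    out = list(words)
    known = [w for w in words if w in SYNONYMS]
    for _ in range(n):
        if not known:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randrange(len(out) + 1), synonym)
    return out


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


rng = random.Random(0)  # seeded so runs are repeatable
sentence = "the quick dog is happy".split()
print(synonym_replacement(sentence, 1, rng))
print(random_insertion(sentence, 1, rng))
print(random_swap(sentence, 1, rng))
print(random_deletion(sentence, 0.2, rng))
```

Each function returns a new list and leaves the input untouched, so several augmented copies can be generated from one sentence.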

nbharatula · March 15, 2019, 5:20am

Sharing some papers that explore ethics issues of AI. Wanted to post a more comprehensive and better annotated list of blogs and papers, but I really need to stop letting the perfect get in the way of the good. So here is a start:

Ethics of AI: #Ethics

Category	Title / Link	Summary
General	In Favor of Developing Ethical Best Practices in AI Research	Best practices to make ethics a part of your AI/ML work.
General	Ethics of algorithms	Mapping the debate around ethics of algorithms
General	Mechanism Design for AI for Social Good	Describes the Mechanism Design for Social Good (MD4SG) research agenda, which involves using insights from algorithms, optimization, and mechanism design to improve access to opportunity
Bias	A Framework for Understanding Unintended Consequences of Machine Learning	Provides a simple framework to understand the various kinds of bias that may occur in machine learning - going beyond the simplistic notion of dataset bias.
Bias	Fairness in representation: quantifying stereotyping as a representational harm	Formalizes two notions of representational harm caused by “stereotyping” in machine learning and suggests ways to mitigate them.
Bias	Man is to Computer Programmer as Woman is to Homemaker?	Paper on debiasing word embeddings.
Accountability	Algorithmic Impact Assessments	AI Now paper defining the processes for auditing algorithms.

mb4310 · March 15, 2019, 12:57pm

https://openreview.net/pdf?id=ryxepo0cFX

Really enjoyed this paper; it gives very solid theoretical motivation for the new recurrent architecture, called “AntiSymmetric RNN”. Well-designed experiments demonstrate improvements along virtually every dimension of interest over LSTM and GRU (fewer parameters, faster training, more stability, better end results).

init_27 · March 15, 2019, 6:32pm

To Tune or Not to Tune?
Adapting Pretrained Representations to Diverse Tasks

5 minute summary: https://hackernoon.com/to-tune-or-not-to-tune-adapting-pretrained-representations-to-diverse-tasks-paper-discussion-2dabe678ef83

Quick TL;DR:

This paper focuses on sharing the best methods to “adapt” your pretrained model to the target task (answering the question: “To Tune or Not to Tune?”).
It compares two approaches: feature extraction vs. fine-tuning.

(Yes the authors use emojis!)

There is also a quick guideline for practitioners:

fabris · March 21, 2019, 5:05pm

#CV

Nice works on GANs:

In “High-Fidelity Image Generation With Fewer Labels”, the authors propose a new approach to reduce the amount of labeled data required to train state-of-the-art conditional GANs.
In “Self-Supervised Generative Adversarial Networks”, the authors exploit two popular unsupervised learning techniques, adversarial training and self-supervision, to close the gap between conditional and unconditional GANs.

The DeepMind team also updated the Compare GAN library, which contains all the components necessary to train and evaluate modern GANs.