Good readings 2019

I think this paper is very important because it shows how language models capture syntactic properties of sentences and solve various tasks. It evaluates CoVe, ELMo, BERT and GPT on tasks that require the model to answer whether some part of a sentence is a noun phrase, has a given POS or dependency tag, is coreferent with another word, etc. It shows that language models have the potential to improve results on these tasks in real settings.

It would also be cool to see how AWD-LSTM fits into this team of language models, because in my experiments on similar tasks it shows some nice results, e.g. here:

Also note that @sgugger has recently added this to fastai. Use the master version if you try it, since it’s being changed regularly.


Thank you for starting this forum library! So useful! May I suggest creating a system to classify papers around broad topics? Say something like: vision, NLP, GANs, ethics/FAT etc. Perhaps just defining the categories in the first post and listing the hashtag to use for each suffices?

I have been bookmarking a bunch of AI ethics papers (yet to read :frowning_face:) and will be happy to share a summary if others on here are interested in the topic.


I think we can bother @sgugger to use his powers and wiki-fy this discussion.
Then we could collaborate and maintain a list above?
I’ll volunteer to help although all of the wonderful papers shared here seem very intimidating to me.

Please, do share! :slight_smile:


Agreed, I’ve made the initial post a wiki so that you can put the list up there.


@nbharatula Very interested in the topic of ethics, as I’m sure many others here would be too. As DL/ML practitioners, I think it would be irresponsible of all of us not to take ethics very seriously and always take it into consideration. For the fastai forum, my personal opinion though is that AI Ethics is so important it more than deserves its own separate topic - not just some posts that could get buried in some other topic :slight_smile: I actually started a similar topic in the previous Part 2 (2018). I anticipate that this topic will probably spark more conversation now, as the subject has gradually (finally) started to reach top-of-mind awareness as a critical issue in our industry.

Also I highly recommend following Rachel on twitter if you aren’t already as she has been championing AI Ethics for quite some time now (even before the big companies took serious notice) and she provides excellent content/references on this subject.


Could be great!! But to be really effective we should select carefully among the papers linked here. I’m thinking of a list with no more than 10 entries, ordered by topic. We could start with 4 main topics:

  • Computer vision : #CV
  • Natural Language Processing : #NLP
  • Optimizations and Training tricks : #OTT
  • Ethics and Good Practices : #EGP (seems there exists a new thread on this topic. To be removed)

Any suggestions? Might hashtags like #CV be useful here?
I can help with the NLP list btw.


Hi @jamesrequa, Yes I agree, this deserves its own topic/thread. But sprinkling it across other posts also helps keep it top of mind! :slight_smile: I’ll summarise my findings here as well as on the other thread.

@init_27 thanks for starting the 2019 thread for this topic.

And yes, I follow Rachel and a few other awesome folks focused on ethics research. Thanks for the suggestion!


There’re a few robotics papers I’m looking at. The RCAN paper seems interesting. There was also a link today on twitter about ‘Neural Task Graphs’ claiming to be essential to future robotics work – I haven’t taken a look at it yet.

AdaBound optimizer: a new optimizer that is as fast as Adam and as good as SGD

I still have to try this out, but the results look good.

blog -->
code + demo (awesome!) -->


NLP: for logographic languages (SOTA results for almost 13 tasks)

Glyce: Glyph-vectors for Chinese Character Representations


Neumann optimizer: allows us to use larger batches for training
Link to paper and Openreview


Training this is a pain; I’ll choose SSD or RetinaNet any day.


@fabris I’ve deleted the OP. However, it’s not a “personal blog” that I had shared. I’ve started a “5-minute paper discussion series” to share quick highlights from papers, and I had requested feedback on my first attempt at this.

Here is the first one in the series, for the paper “Bag of Tricks for Image Classification with Convolutional Neural Networks”: Paper Discussion


I’ve tried classifying papers into single categories. Eventually there was so much overlap between categories that I switched to using multiple tags/keywords.

I found the paper An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models to be very interesting:

Here is a 5 minute summary that I’ve written about the paper.

Quick TL;DR:

The authors share their approach to transfer learning in NLP, and with these tricks they claim to perform better than ULMFiT on a few datasets, especially when using fewer training examples.

The most interesting trick was defining an auxiliary loss by adding a new layer, while also accounting for the original loss value:

Loss = Loss (Aux) + Loss (LM)
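
To make the formula above concrete, here is a minimal sketch in plain Python (no deep learning framework). It reads Loss (Aux) as the cross-entropy of the newly added layer and Loss (LM) as the original language-model cross-entropy; that reading, the function names, and the `lm_weight` knob are my own assumptions for illustration and are not taken from the paper's code. With `lm_weight=1.0` it reduces to the plain sum written above.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def combined_loss(aux_logits, aux_target, lm_logits, lm_target, lm_weight=1.0):
    """Loss = Loss(Aux) + lm_weight * Loss(LM).

    `lm_weight` is a hypothetical scaling factor added for illustration;
    1.0 gives the unweighted sum from the post.
    """
    aux_loss = cross_entropy(aux_logits, aux_target)  # loss from the new layer
    lm_loss = cross_entropy(lm_logits, lm_target)     # original LM loss
    return aux_loss + lm_weight * lm_loss


# Toy example: a 2-class head and a 4-word vocabulary.
loss = combined_loss([2.0, -1.0], 0, [0.1, 0.3, 1.5, -0.2], 2)
print(round(loss, 4))
```

Because both terms are backpropagated together, the model keeps being trained on the original objective while it learns the new one.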


This paper mentions an interesting approach of using “two loss terms” that led to better performance.

Thanks to @lesscomfortable, who discovered that the WaveNet paper mentions a very similar method of using two loss values to generalize better.

TIL, if you want to do 2 tasks using 1 NN or if you’re doing 1 task using a NN, you can create 2 loss values that you want to optimize and that would lead to better performance!

Edit: Here is the amazing explanation of the “2 loss strategy” by @lesscomfortable: (Copy paste from our slack group, Francisco’s words:)

“I think it’s more about generalizing predictive models. I don’t know if it applies to any model, it seems to be restricted to generalizing predictive series (predict next sound, next word, next stock price etc) to new tasks (classification).
It’s like being less ‘harsh’ when changing tasks. I think about it like ‘hey, I know that you know how to do this, but slowly you will have to use what you know to do that’. You’re telling the model ‘take your time to learn’.”


If I remember correctly the RCNN methods are either obsolete or most of what they can do is done better by SSD-type models. Maybe they’re higher accuracy? I saw some Facebook research on detecting faces in crowds. SSDs don’t use the region-proposal part of RCNNs.

There’s a discussion on RCNN/SSD history in the CS231N course, and fastai part2-2018 talks about the SSD-type method too.

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
5 minute summary link:

The paper investigates 4 “Easy Data Augmentation” Techniques in NLP (!!)

  • Synonym Replacement
  • Random Insertion
  • Random Swap
  • Random Deletion

These are tested on 5 text classification datasets, using simple RNN and CNN (yes, CNN) architectures, and the authors demonstrate some performance improvements, especially when using “smaller” subsets of the training data.
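
For anyone who wants to play with the four operations listed above, here is a rough sketch in plain Python. It is not the authors' implementation: the tiny `TOY_SYNONYMS` table, the function signatures, and the parameters `n` and `p` are all hypothetical stand-ins, and a real setup would need a proper thesaurus for the synonym lookups.

```python
import random

# Hypothetical toy synonym table, for illustration only.
TOY_SYNONYMS = {
    "quick": ["fast", "speedy"],
    "good": ["great", "fine"],
    "movie": ["film"],
}


def synonym_replacement(words, n, rng):
    """Replace up to n words that have an entry in the synonym table."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in TOY_SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(TOY_SYNONYMS[out[i]])
    return out


def random_insertion(words, n, rng):
    """Insert a synonym of a random known word at a random position, n times."""
    out = list(words)
    for _ in range(n):
        known = [w for w in out if w in TOY_SYNONYMS]
        if not known:
            break
        synonym = rng.choice(TOY_SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


rng = random.Random(0)  # seeded so runs are repeatable
sentence = "the quick movie was good".split()
print(synonym_replacement(sentence, 1, rng))
print(random_insertion(sentence, 1, rng))
print(random_swap(sentence, 1, rng))
print(random_deletion(sentence, 0.2, rng))
```

Each function returns a new list and leaves the input untouched, so several augmented copies can be generated from the same sentence.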


Sharing some papers that explore ethics issues of AI. Wanted to post a more comprehensive and better annotated list of blogs and papers, but I really need to stop letting the perfect get in the way of the good. :slight_smile: So here is a start:

Ethics of AI: #Ethics

| Category | Title / Link | Summary |
|---|---|---|
| General | In Favor of Developing Ethical Best Practices in AI Research | Best practices to make ethics a part of your AI/ML work. |
| General | Ethics of algorithms | Mapping the debate around ethics of algorithms |
| General | Mechanism Design for AI for Social Good | Describes the Mechanism Design for Social Good (MD4SG) research agenda, which involves using insights from algorithms, optimization, and mechanism design to improve access to opportunity |
| Bias | A Framework for Understanding Unintended Consequences of Machine Learning | Provides a simple framework to understand the various kinds of bias that may occur in machine learning - going beyond the simplistic notion of dataset bias. |
| Bias | Fairness in representation: quantifying stereotyping as a representational harm | Formalizes two notions of representational harm caused by “stereotyping” in machine learning and suggests ways to mitigate them. |
| Bias | Man is to Computer Programmer as Woman is to Homemaker? | Paper on debiasing word embeddings. |
| Accountability | Algorithmic Impact Assessments | AI Now paper defining the processes for auditing algorithms. |