Hi, I believe that some deep learning papers are worth reading regardless of their application domain. But because of the overwhelming number of deep learning papers published every day, my hope here is to have your help creating a curated list of cool deep learning papers: a kind of list we might consider must-reads in 2019. However, judging whether a paper is good is not an easy task. Luckily, a few customary rules of thumb can be useful.
A paper should be linked below if it satisfies at least one of the following desiderata:
none of the above but you still think it is a cool/crazy idea that is going to work
As a final note, PLEASE, add a few lines introducing the linked paper and DO NOT COMMENT HERE unless strictly necessary; just use “like this post” so as to guide other people’s reading. If you want to discuss an idea, just create a new topic.
Deep neural networks (DNNs) have recently been found vulnerable to well-designed input samples called adversarial examples. In this paper, the authors review recent findings on adversarial examples for DNNs, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods.
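As a toy illustration of one of the generation methods such surveys cover (the fast gradient sign method), here is a minimal NumPy sketch against a hand-rolled logistic-regression model; the `fgsm` helper, the model, and the `eps` value are all illustrative, not code from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, eps=0.1):
    """One FGSM step against a logistic-regression classifier:
    nudge x by eps in the direction of the sign of the gradient
    of the cross-entropy loss with respect to the input."""
    grad_x = (sigmoid(w @ x) - y) * w  # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)

# tiny demo: the perturbed input increases the model's loss
w = np.array([1.0, -2.0])          # toy "trained" weights
x = np.array([0.5, 0.5])           # clean input
x_adv = fgsm(x, y=1, w=w, eps=0.1)
```

Even in this two-dimensional example the perturbed point is barely different from the clean one, yet the model's loss on it strictly increases — the core intuition behind adversarial examples.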
Describes the Mechanism Design for Social Good (MD4SG) research agenda, which involves using insights from algorithms, optimization, and mechanism design to improve access to opportunity.
Provides a simple framework to understand the various kinds of bias that may occur in machine learning - going beyond the simplistic notion of dataset bias.
From the abstract: Noah A. Smith presents ideas developed by many researchers over many decades. After reading this document, you should have a general understanding of word vectors (also known as word embeddings): why they exist, what problems they solve, where they come from, how they have changed over time, and what some of the open questions about them are.
A novel algorithm for generating portmanteaus that utilizes word embeddings to identify semantically related words for use in portmanteau construction.
Bag of Tricks for Image Classification with Convolutional Neural Networks.pdf (538.6 KB)
In this paper, the authors study a series of classification improvements and empirically evaluate their impact on final model accuracy through ablation studies.
Very practical work! Most of these methods have been implemented in Fastai!
Agreed, this is a great paper. One concept mentioned in this paper that I find very interesting, and which surprisingly still doesn’t seem to get much attention, is Knowledge Distillation. There is a great paper on this concept specifically: Distilling the Knowledge in a Neural Network.
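For anyone who wants a feel for the idea, here is a rough NumPy sketch of the distillation loss from the Hinton et al. paper — a weighted sum of a soft-target term and the usual hard-label cross-entropy; the temperature `T=4.0` and mixing weight `alpha=0.7` are just illustrative defaults, not values from the paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    # temperature-scaled softmax; higher T gives softer distributions
    z = logits / T
    z = z - z.max()  # for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label,
                      T=4.0, alpha=0.7):
    """Mix a soft-target term (student vs. the teacher's softened
    distribution) with the standard hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # cross-entropy of the student against the teacher's soft targets,
    # scaled by T^2 as suggested in the Hinton et al. paper
    soft_term = -np.sum(p_teacher * np.log(p_student)) * T**2
    # standard cross-entropy against the one-hot ground truth
    hard_term = -np.log(softmax(student_logits)[hard_label])
    return alpha * soft_term + (1 - alpha) * hard_term
```

The softened teacher distribution carries the "dark knowledge" about inter-class similarity that a plain one-hot label throws away, which is what lets a small student model learn from a big teacher.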
Interestingly, Knowledge Distillation also seems to build on the idea of Label Smoothing, a concept which I believe was first introduced in the Inception v3 paper (again, this technique seems to have been largely overlooked by lots of people despite its effectiveness), and which is yet another one of the tricks from the Bag of Tricks paper.
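The label smoothing trick itself is tiny; here is a sketch of the Inception-v3-style smoothed target, which replaces the one-hot label with a mixture of the one-hot vector and the uniform distribution (the `smooth_labels` helper name and `eps=0.1` default are mine, chosen for illustration):

```python
import numpy as np

def smooth_labels(num_classes, true_class, eps=0.1):
    # spread eps of the probability mass uniformly over all classes,
    # leaving 1 - eps (plus its uniform share) on the true class
    target = np.full(num_classes, eps / num_classes)
    target[true_class] += 1.0 - eps
    return target
```

For example, with 10 classes and `eps=0.1` the true class gets probability 0.91 and every other class gets 0.01, which discourages the model from producing over-confident logits.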
This one was kind of surprising to me and goes against my intuition:
Showing that training from random initialization can be just as good as transfer learning for computer vision applications: “These observations challenge the conventional wisdom of ImageNet pre-training for dependent tasks and we expect these discoveries will encourage people to rethink the current de facto paradigm of ‘pre-training and fine-tuning’ in computer vision.” (coming from the authors of ResNet and Mask R-CNN)
Note that they’re using a dataset with a lot of labels - for the kind of things many folks are doing here on the fast.ai forums, often with 100 labels or even fewer, you won’t make any progress without fine-tuning!
If you have over 100,000 labels (as this paper does even in their “limited labels” scenarios) then pre-training may be less important (especially for object detection, where every object has 5 pieces of information attached - 4 coordinates and a classification).
Yeah, I fully agree. I read the paper, so I know their experiments weren’t exactly typical compared to what we usually train on. There’s no reason not to use transfer learning (even if just for the shorter training time). It still feels like an interesting finding and is worth reading.
This paper was rejected from ICLR but seems useful as it dramatically advances the baseline for the state of the art of a plain language model. The reviewers rejected the work because the authors didn’t demonstrate any downstream task that used the improved language model.
I think this paper is very important because it shows how language models are capable of capturing syntactic properties of sentences and solving various tasks. It includes evaluations of CoVe, ELMo, BERT, and GPT on tasks that require the model to answer whether some part of a sentence is a noun phrase, has a particular POS or dependency tag, is coreferent with another word, etc. It shows that language models have the potential to improve results on these tasks in real settings.
It would also be cool to see how AWD-LSTM fits into this set of language models, because in my experiments on similar tasks it shows some nice results, e.g. here:
Thank you for starting this forum library! So useful! May I suggest creating a system to classify papers around broad topics? Say something like: vision, NLP, GANs, ethics/FAT etc. Perhaps just defining the categories in the first post and listing the hashtag to use for each suffices?
I have been bookmarking a bunch of AI ethics papers (yet to read) and will be happy to share a summary if others on here are interested in the topic.