Good readings 2019

At ICML workshop “Climate change: How can AI help?”

Andrew Ng will speak on “Tackling climate change challenges with AI through collaboration”, livestreaming at 9:45 Pacific time!

2 Likes

I already posted the related paper in another thread, but I’ll repost it here…

1 Like

Scheduled speakers for the workshop:
https://icml.cc/Conferences/2019/Schedule?showEvent=3507

Edit: the link to the recordings I posted previously no longer works.

Hey, have any of you seen this SciHive Twitter account? It’s a new free, open-source service (I’m not affiliated in any way) that lets you read arXiv papers and highlight passages, comment, ask questions, etc. It looks really cool.

It has some nice features too: hovering over an acronym shows what it stands for, and hovering over a reference shows the paper and its title. I think it’d be incredible to have the papers we all read as individuals collectively annotated with questions, answers, additional resources, etc. Let me know if there’s a similar service you already use for this. Cheers.

3 Likes

Winning solution for some of the FGVC challenges at CVPR 2019, plus SOTA on Stanford Cars.

http://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Destruction_and_Construction_Learning_for_Fine-Grained_Image_Recognition_CVPR_2019_paper.pdf

2 Likes

New SOTA on ImageNet… :open_mouth: 85.4% top-1 when pretraining on Instagram data and fine-tuning on ImageNet.
Edit: I guess it’s from last year, but they just published the models?

An awesome visual intro to NumPy: https://jalammar.github.io/visual-numpy
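For a taste of the basics the guide visualizes (array creation, broadcasting, aggregation, transposing), here's a quick sketch of my own; the variable names are mine, not the article's:

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])  # a 3x2 array
ones = np.ones(2)                          # broadcast across each row

print(data + ones)                         # elementwise add via broadcasting
print(data.max(), data.min(), data.sum())  # aggregations: 6 1 21
print(data.T.shape)                        # transpose: (2, 3)
```

The broadcasting step is the one the article's diagrams make really click: the shape-(2,) vector is stretched to match every row of the 3x2 array.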

5 Likes

This fellow is a master of the art of visual information display. You can “read” and comprehend the article almost without even paying attention to the words. Thanks for posting, @Shubhajit !

1 Like

MMDetection: Open MMLab Detection Toolbox and Benchmark

MMDetection is an object detection toolbox that contains a rich set of object detection and instance segmentation methods, as well as related components and modules. The authors claim it is by far the most complete detection toolbox …
The toolbox and benchmark provide a good starting point for reimplementing existing methods and for developing your own new detectors.
Worth taking a look :wink:

3 Likes

The Matrix Calculus You Need for Deep Learning, by Terence Parr and Jeremy Howard (revised, v3)
Abstract: “This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don’t worry if you get stuck at some point along the way—just go back and reread the previous section, and try writing down and working through some examples. And if you’re still stuck, we’re happy to answer your questions in the Theory category at this http URL. Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here. See related articles at this http URL.”
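As a flavour of the kind of rule the paper derives, here's a quick numerical check (my own sketch, not from the paper) of the identity that the gradient of x^T A x with respect to x is (A + A^T) x:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
x = rng.normal(size=4)

def f(v):
    return v @ A @ v              # scalar-valued f(x) = x^T A x

analytic = (A + A.T) @ x          # the matrix-calculus rule

# central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(4)])
```

Finite-difference checks like this are a handy way to verify any of the paper's rules on concrete numbers as you work through it.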

MMDetection is really something…I usually start with Torchvision to get something basic running, but then switch to either MMDetection or maskrcnn-benchmark when I want to improve results.

Possibly something we could throw in with our regular data augmentation steps, i.e. changing up the texture of images with a few different variations?

1 Like

It’s not a reading, but still fascinating ideas and results (sparse NNs on MNIST and the Google Speech Dataset):

1 Like

There is also an interesting podcast from Lex Fridman with Jeff Hawkins.

1 Like

Yup - that’s where I found him too :slight_smile:

1 Like

FixRes

The most remarkable advances are often the simplest, based on leaps of intuition. A significant boost to ImageNet SOTA results.

“We experimentally validate that, for a target test resolution, using a lower train resolution offers better classification at test time. We then propose a simple yet effective and efficient strategy to optimize the classifier performance when the train and test resolutions differ. It involves only a computationally cheap fine-tuning of the network at the test resolution.

To the best of our knowledge this is the highest ImageNet single-crop, top-1 and top-5 accuracy to date.”

7 Likes

SlimYOLOv3: Narrower, Faster and Better for Real-Time Drone Applications
Drones, or general unmanned aerial vehicles (UAVs), endowed with computer vision by on-board cameras and embedded systems, have become popular in a wide range of applications. However, real-time scene parsing through object detection running on a UAV platform is very challenging, due to the limited memory and computing power of embedded devices. To deal with these challenges, in this paper we propose to learn efficient deep object detectors through channel pruning of convolutional layers. To this end, we enforce channel-level sparsity of convolutional layers by imposing L1 regularization on channel scaling factors, and prune less informative feature channels to obtain “slim” object detectors. Based on this approach, we present SlimYOLOv3, with fewer trainable parameters and floating-point operations (FLOPs) than the original YOLOv3 (Redmon et al., 2018), as a promising solution for real-time object detection on UAVs. We evaluate SlimYOLOv3 on the VisDrone2018-Det benchmark dataset; compelling results are achieved in comparison with its unpruned counterpart, including:

- a ~90.8% decrease in FLOPs,
- a ~92.0% decline in parameter size,
- running ~2 times faster, and
- comparable detection accuracy to YOLOv3.

Experimental results with different pruning ratios consistently verify that the proposed SlimYOLOv3, with its narrower structure, is more efficient, faster and better than YOLOv3, and thus more suitable for real-time object detection on UAVs. Our codes are made publicly available at this https URL.
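The core pruning recipe is simple enough to sketch. Below is a toy illustration (my own names and numbers, not the SlimYOLOv3 code): an L1 penalty on the batch-norm channel scaling factors pushes many of them toward zero during training, and channels whose |gamma| falls below a global threshold are then pruned away:

```python
import numpy as np

def l1_penalty(gammas, lam=1e-4):
    """Sparsity term added to the detection loss during training:
    lam * sum(|gamma|) over all channel scaling factors."""
    return lam * np.abs(gammas).sum()

def prune_channels(gammas, prune_ratio=0.5):
    """Keep only channels whose |gamma| exceeds the global quantile
    threshold implied by the desired pruning ratio."""
    thresh = np.quantile(np.abs(gammas), prune_ratio)
    keep = np.abs(gammas) > thresh
    return np.flatnonzero(keep)

# after sparsity training, most scaling factors have collapsed near zero
gammas = np.array([0.9, 0.01, 0.5, 0.002, 0.03, 0.7])
kept = prune_channels(gammas, prune_ratio=0.5)  # indices of surviving channels
```

In the real pipeline the pruned detector is then fine-tuned to recover accuracy; the sketch only shows the channel-selection step.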

5 Likes

“Adaptation via fine-tuning. Increasing the crop resolution at test time is effectively a domain shift. A natural way to compensate for this shift is to fine-tune the model. In our case, we fine-tune on the same training set, after switching from K_train to K_test. Here we choose to restrict the fine-tuning to the very last layers of the network.”

So, if I understand correctly, an easy way to get significantly better results is to train a network at a given resolution, then fine-tune it at a bigger resolution (which should be the same as the test resolution).

Isn’t that basically what has been taught in fastai for several years now? A kind of progressive resizing? It’s nice to finally read an explanation of it :smiley:
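The quoted recipe can be mimicked on a toy problem. In the sketch below (entirely my own construction, not the paper's code), the resolution change is stood in for by a shift in the feature distribution, and "fine-tuning only the very last layers" becomes re-fitting only the bias of a logistic classifier with the weights frozen:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)  # linearly separable labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, w, b, lr=0.5, steps=300, train_w=True):
    """Full-batch logistic regression; train_w=False freezes the weights
    and updates only the bias (the 'last layer' of this toy model)."""
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        g = p - y
        if train_w:
            w = w - lr * (X.T @ g) / len(y)
        b = b - lr * g.mean()
    return w, b

def acc(X, y, w, b):
    return float(((sigmoid(X @ w + b) > 0.5) == (y > 0.5)).mean())

# "train resolution": fit on the raw features
w, b = fit(X, y, np.zeros(d), 0.0)

# "test resolution": same data with a shifted feature distribution,
# standing in for the activation-statistics shift of a larger crop
X_shift = X + 4.0 * w / np.linalg.norm(w)

acc_raw = acc(X_shift, y, w, b)                  # degraded by the shift
_, b_ft = fit(X_shift, y, w, b, train_w=False)   # cheap last-layer fine-tune
acc_ft = acc(X_shift, y, w, b_ft)                # largely recovered
```

Run end to end, the cheap bias-only fine-tune recovers most of the accuracy the shift destroyed, mirroring the paper's observation that a computationally cheap fine-tune at the test resolution is enough.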

Article proposing an additional evaluation criterion (efficiency) for research, Green AI: https://arxiv.org/abs/1907.10597

2 Likes

Expectation-Maximization Attention Networks for Semantic Segmentation

The expectation-maximization attention (EMA) method is an augmented version of self-attention.
An expectation-maximization attention unit (EMAU) is applied to a semantic segmentation task.
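From the paper's description, the unit alternates an E-step (attention responsibilities of each spatial position to a small set of bases) with an M-step (bases updated as responsibility-weighted means), then reconstructs the features from the bases. A minimal NumPy sketch of that loop, with assumed shapes and my own variable names:

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def ema(X, mu, iters=3):
    """X: (N, C) flattened feature map; mu: (K, C) bases, K << N."""
    for _ in range(iters):
        z = softmax(X @ mu.T, axis=1)            # E-step: responsibilities (N, K)
        mu = (z.T @ X) / z.sum(axis=0)[:, None]  # M-step: weighted means (K, C)
    return z @ mu                                # low-rank reconstruction (N, C)

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 8))          # 32 spatial positions, 8 channels
out = ema(X, rng.normal(size=(4, 8))) # K = 4 bases
```

Note the reconstruction z @ mu has rank at most K, which is where the saving over full self-attention (an N x N affinity matrix) comes from.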