Research Paper Recommendations

(brett koonce) #42

Colorless green recurrent networks dream hierarchically

(RobG) #43

Deep painterly harmonization

(Kaitlin Duck Sherwood) #44

Data2Vis: Automatic Generation of Data Visualizations Using Sequence-to-Sequence Recurrent Neural Networks
Blog post:

Basically, they look at creating good data visualizations as a seq2seq problem: a sequence of data comes in, a sequence of code comes out as Vega-Lite declarative visualization code (which then gets turned into a nice graph). Quite nice.
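To make the idea concrete, here is a tiny hand-written example of the kind of Vega-Lite spec such a model learns to emit (this spec and the data are illustrative, not actual model output from the paper):

```python
import json

# A toy data sample of the kind the seq2seq model would read as input.
data_sample = [
    {"country": "US", "gdp": 21.4},
    {"country": "CN", "gdp": 14.7},
    {"country": "JP", "gdp": 5.0},
]

# A minimal Vega-Lite declarative spec: the model's output is a token
# sequence that deserializes into JSON like this, which a renderer then
# turns into a bar chart.
vega_lite_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": data_sample},
    "mark": "bar",
    "encoding": {
        "x": {"field": "country", "type": "nominal"},
        "y": {"field": "gdp", "type": "quantitative"},
    },
}

print(json.dumps(vega_lite_spec, indent=2))
```

The nice part is that the target "code" is purely declarative, so the seq2seq model only has to learn a mapping from field names/types to encoding channels, not a full program.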

(Kaitlin Duck Sherwood) #45

This paper, Universal Sentence Encoder, looks very applicable to Lesson 11:

(Jeremy Howard) #46

Yes it is. I know @narvind2003 has talked about benchmarking the TF Hub implementation against our language models. I’d be interested in the results of that.

(Mike Kunz ) #47

I'm still flabbergasted.

(Arvind Nagaraj) #48


(Igor Kasianenko) #49

I’d like to talk about DeepLab semantic image segmentation:
As far as I can tell, it's the second solution (alongside Mask R-CNN) that does segmentation and beats all the benchmarks, with one core difference: Mask R-CNN does instance segmentation, while DeepLab does semantic segmentation.
If anyone has suggestions about instance image segmentation, please @mention me or write a PM.

There is one more network that does instance image segmentation, called Fully Convolutional Instance-aware Semantic Segmentation (FCIS).

(Kaitlin Duck Sherwood) #50

Last night at our local Data Science Reading Group, we read a paper on topological data analysis which was interesting in a “huh, maybe that would be useful someday” kind of way. This is the paper which we read:
but it is VERY math-y. Here is one which I have only just barely skimmed, just enough to see that it covers the same material and is much more readable:
(It’s also more recent.)

Wikipedia article on topological data analysis might give you all of what you need for seeing the basic concept:

Basically, you use the shape of the data's point cloud to give clues for dimensionality reduction. TDA also gives you some interesting clustering algorithms.
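To give a flavour of the clustering side: the simplest TDA-style construction is the 0-dimensional piece of a Vietoris-Rips filtration, i.e. connect any two points closer than a threshold and take connected components. Here's a minimal sketch (my own toy code, not from either paper):

```python
import numpy as np

def epsilon_clusters(points, eps):
    """Cluster points by connecting any pair closer than eps, then taking
    connected components -- the 0-dimensional slice of a Vietoris-Rips
    filtration, which is the simplest TDA-flavoured clustering."""
    n = len(points)
    # all pairwise Euclidean distances via broadcasting
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # union-find over edges shorter than eps
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] < eps:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    # relabel components 0..k-1 in order of first appearance
    remap = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [remap[r] for r in roots]

# two well-separated blobs should come out as two components
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(epsilon_clusters(pts, eps=1.0))  # -> [0, 0, 1, 1]
```

Sweeping `eps` and watching components merge is exactly what a persistence barcode tracks; real TDA libraries do this for all scales at once.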

(Jeremy Howard) #51

FYI there’s a company called Ayasdi that is commercializing those algorithms.

(Kaitlin Duck Sherwood) #52

Do you know anything about them? Their name came up in the discussion last night, but nobody had ever heard of them, so we didn’t know how successful they are being.

(RobG) #53

A Visual Debugging Tool for Sequence-to-Sequence Models

(Marco) #54

DeepMind just published their ICLR papers list:

One which looks very interesting is MbPA (Memory-based Parameter Adaptation), which is a generalised version of the simple cache pointers that @sgugger has just tweeted about:

Sounds like a project to me!

(Kaitlin Duck Sherwood) #55

I’m attending the ICLR conference, and between recent Fast.AI material and ICLR, I’m seeing some definite themes:

  1. Start small, get bigger.
  2. Have the model look at a bunch of unlabelled data, then learn from a very small amount of labelled data.

(Kaitlin Duck Sherwood) #56

This poster about adversarial images was interesting. Basically, they proved that there will always be an adversarial image somewhere which will fool the model.

(This doesn’t prove that it is easy to find a successful adversarial image, just that they exist.)

(Kaitlin Duck Sherwood) #57

Oh, and this paper said it had a really cheap/easy way to improve the training of GANs:

So easy that I think it would be a good candidate to incorporate into the FastAI libraries if it works as advertised.

(Kaitlin Duck Sherwood) #58

I was fortunate to get to spend a fair bit of time with Leslie Smith at the ICLR conference, and wish to report that he thinks that one should cyclically vary the dropout rate in step with the learning rate variation.

Apologies, I don’t remember if he said that he had done experiments or if that was just his intuition.

It made sense to me that you’d want to ramp up the dropout rate, but it wasn’t immediately obvious to me why you would want to ramp down. He made the point that when you are doing inference, you don’t have any dropout. You’d like to end with no dropout so that your end model kind of matches the situation you’re in when you do inference… so you want to ramp down your dropout rate.
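As far as I know he hasn't published code for this, but here's a rough sketch of what "cycle the dropout rate in step with the learning rate, ending at zero" might look like, assuming a cosine-shaped 1cycle schedule (the shape and names here are my assumptions, not his):

```python
import math

def one_cycle(step, total_steps, lo, hi):
    """Cosine-annealed 1cycle-style schedule: rise lo->hi over the first
    half of training, fall hi->lo over the second half."""
    half = total_steps / 2
    if step < half:
        t = step / half                   # 0 -> 1 on the way up
    else:
        t = (total_steps - step) / half   # 1 -> 0 on the way down
    return lo + (hi - lo) * (1 - math.cos(math.pi * t)) / 2

def dropout_schedule(step, total_steps, max_p=0.5):
    # Dropout follows the same cycle as the LR, but starts and ends at 0,
    # so the final model matches inference-time conditions (no dropout).
    return one_cycle(step, total_steps, lo=0.0, hi=max_p)

for step in (0, 25, 50, 75, 100):
    print(step, round(dropout_schedule(step, 100), 3))
```

The key property is just the endpoint: whatever the shape in the middle, the schedule lands on p=0 at the last step, matching his "end the way you infer" argument.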

(Jeremy Howard) #59

Yes! I saw a paper that tried this - can’t remember where… Also for DAWNBench I reduced data augmentation at the end. Seems like we should gradually move everything towards inference-time settings - maybe gradually increase batchnorm momentum too…

(Kaitlin Duck Sherwood) #60

I found this paper on “Curriculum dropout”:

Skimming it (SKIMMING!), it looks like they increase dropout but never decrease it.

(Arvind Nagaraj) #61

Has anyone tried gradually reducing BPTT while increasing batch size for LM-backed classifiers? It seemed to work quite well for me, but so far I've only tried it on the Quora dataset.
Initially I dismissed the idea, thinking that a smaller bptt simply means fewer pad tokens for a dataset like Quora, where item length is approx. 20 (mean + std dev).
So I tried training with a fixed, small bptt of around 20 and tried:

  1. Steadily increasing bs
  2. A fixed large bs

Neither worked as well as the case where I gradually reduced bptt and increased bs.
The matrix has to go from short-and-stout to tall-and-lean. The LM was trained at bptt 20, and the classifier started from bptt 50 and came down to 20 towards the end of training.