Reproducibility Challenge 2020 - fastai folks interested

morgan · October 12, 2020, 1:44am

Hey all,

Just wondering if anyone has any interest in joining up for the reproducibility challenge this year?

The primary goal of this event is to encourage the publishing and sharing of scientific results that are reliable and reproducible. In support of this, the objective of this challenge is to investigate reproducibility of papers accepted for publication at top conferences by inviting members of the community at large to select a paper, and verify the empirical results and claims in the paper by reproducing the computational experiments, either via a new implementation or using code/data or other information provided by the authors.

There’s plenty of time until submissions (Dec 4th is the early submission deadline, Jan 8th is the late submission deadline). It could be fun to tackle a paper that has shown promise and would be a useful addition to fastai.

Comment below if you think you’ll have a bit of time to spare, and say which paper you think could be worth reproducing.

DanielLam · October 12, 2020, 2:27am

Hi Morgan,

Interesting challenge. Are you thinking of reproducing through the fastai framework, some other framework, or just PyTorch?

tyoc213 · October 12, 2020, 6:31am

I would love to do that… but I don’t have a data source or a paper to replicate.

But I’m interested in ASR (automatic speech recognition… is STT even a term?) and TTS…

If you know of something that would be nice to reproduce, let me know so that I can choose (if possible, but not restricted to, main languages with little corpora in audio or text, for example with no Wikipedia source).

stefan-ai · October 12, 2020, 8:31am

Hi Morgan,

Excellent idea! I would gladly join such a project.

I was thinking about the Reformer, which was presented at ICLR 2020 and could be an interesting and challenging paper to replicate. But I’m also open to other ideas.

morgan · October 12, 2020, 2:59pm

@DanielLam Probably keeping much of the boilerplate in fastai, I think. For example, if a new model architecture is proposed, it can probably be dropped into fastai pretty easily without too much concern that we don’t use the exact same PyTorch Dataloader, Dataset etc. (as long as the preprocessing/augmentation etc. are the same, of course).

@tyoc213 ASR/TTS are super interesting alright. I’ve been focussed on NLP recently but would enjoy dipping into another area too

@stefan-ai Yes Reformer is very cool, it would be a lot of fun to implement alright. Just wondering, when it comes to experiment replication, would the GPU compute needed be too much? But happy to give it a shot if you think it’s manageable! HuggingFace also had a nice blog explaining it: The Reformer - Pushing the limits of language modeling

(@Richard-Wang you should definitely enter your ELECTRA work into the reproducibility challenge too, you’ve done phenomenal work on it so far!)

I’ll have a look around at a few NLP/ASR/TTS papers and try to find some interesting, useful & low resource ones to replicate. I’ll share here when I do. The Good Readings threads probably have a few contenders.

stefan-ai · October 12, 2020, 3:13pm

Right, that’s a good point. I will have a closer look at the paper over the next few days, keeping this in mind. Btw, course 4 of the new deeplearning.ai NLP specialization includes an implementation of Reformer in trax. It will certainly be very challenging from an implementation point of view too.

Do you have any other ideas for recent NLP papers we could consider?

PriyanK7n · October 12, 2020, 8:12pm

I’m down for it. I’m currently working on Covid detection as a classification problem, using deep learning implemented in fastai, and trying to improve on the results of a previous paper published on 26 April, which had a baseline F1 score of 97.31%. So I’m either going to get closer to it… or try to beat it.

morgan · October 13, 2020, 12:41am

Encoding word order in complex embeddings

This has been in the back of my mind since ICLR. They improved transformer performance by using a new type of positional embedding. There were a couple of varieties if I recall; the second also modified the entire Transformer to be able to deal with complex numbers.

But the results seemed pretty impressive given that it was just a change to the positional embedding. This, or another positional embedding kind of paper, could be worth looking into.

Code: https://github.com/iclr-complex-order/complex-order
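To make the idea a bit more concrete, here is a rough sketch of how I understand the embedding, written from memory and not taken from the authors' repo: each word gets a per-dimension amplitude, frequency and initial phase, and the embedding at position `pos` is `amplitude * exp(i * (frequency * pos + phase))`. The function name and the toy parameter values below are made up for illustration.

```python
import cmath
import math

def complex_word_embedding(amplitude, frequency, phase, pos):
    """Return one complex value per dimension: r * exp(i * (w * pos + theta))."""
    return [
        r * cmath.exp(1j * (w * pos + theta))
        for r, w, theta in zip(amplitude, frequency, phase)
    ]

# Toy per-word parameters for a 3-dimensional embedding (made-up values).
amplitude = [1.0, 0.5, 2.0]
frequency = [0.1, 0.2, 0.3]
phase = [0.0, math.pi / 4, math.pi / 2]

e0 = complex_word_embedding(amplitude, frequency, phase, pos=0)
e5 = complex_word_embedding(amplitude, frequency, phase, pos=5)

# Changing the position only rotates each component in the complex plane,
# so the magnitudes stay equal to the amplitudes.
for r, z in zip(amplitude, e5):
    assert math.isclose(abs(z), r)

# Shifting by k positions multiplies each component by exp(i * w * k),
# regardless of the starting position.
k = 5
for w, z0, z5 in zip(frequency, e0, e5):
    assert cmath.isclose(z5, z0 * cmath.exp(1j * w * k))
```

In a real model the amplitudes, frequencies and phases would be learned parameters; their repo is the place to check how they actually parameterise it.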

DeLighT: Very Deep and Light-weight Transformer

Paper: https://arxiv.org/pdf/2008.00623.pdf
Code: https://github.com/sacmehta/delight

This was another interesting one with a funky layer structure. Again maybe training duration might be an issue, but at least one of their results was trained on a 1080 GPU! But then others were trained on 16 V100s soo…worth a look at least!

hallvagi · October 13, 2020, 7:42am

This sounds interesting! Most of the recent transformer papers are prohibitively compute intensive, I guess. Lately I’ve been looking into transformers for CV, such as this one for example. But not only do they use a ton of compute, they also use a private dataset…

morgan · October 13, 2020, 1:38pm

Yes Vision Transformer is super interesting, a few of us on the Discord looked at it last week. Have you seen the PyTorch implementation? vit-pytorch/vit_pytorch at main · lucidrains/vit-pytorch · GitHub

It works with a standard fastai vision workflow, just drop in the ViT model instead of a resnet etc…

hallvagi · October 13, 2020, 1:58pm

No, that was new to me - thanks! The code seems pretty understandable (gotta love einsum), but I guess training it is difficult?

morgan · October 13, 2020, 4:18pm

From the little testing we did, it didn’t perform that well. From the paper, it seems to only shine when given huge volumes of data. However, the architecture is super simple, so I feel there is a lot to be improved.

stefan-ai · October 13, 2020, 4:25pm

Alright, I had a closer look at the Reformer paper and here’s what I found.

Regarding compute:

They only specify vaguely that “Training for all experiments was parallelized across 8 devices (8 GPUs or 8 TPU v3 cores)”
Pricing for 8 TPU v3 cores on GCP is 8 USD per hour
Not sure which GPU type they use, but for 8 V100 GPUs it would be 19.84 USD per hour

Regarding data set sizes:

imagenet64: 12 GB training and 456 MB validation set
WMT’14 English-German: 4.5 million sentence pairs
enwik8: 100 MB (but I’m not entirely sure about this one)

I guess compute requirements and data set sizes are not too large in the world of transformers, but might be still prohibitively large for us here. Not sure how far 300 USD credits on GCP would get us.
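A quick back-of-the-envelope sketch of that budget question, using only the hourly prices quoted above (8 USD for 8 TPU v3 cores, 19.84 USD for 8 V100 GPUs) and the 300 USD credit figure. The function name is just for illustration, and this ignores storage, networking and any preemptible discounts, so treat it as a rough ceiling and not a real estimate:

```python
def hours_of_compute(budget_usd: float, price_per_hour_usd: float) -> float:
    """Hours of compute that a budget buys at a flat hourly price."""
    if price_per_hour_usd <= 0:
        raise ValueError("price_per_hour_usd must be positive")
    return budget_usd / price_per_hour_usd

# Hourly prices as quoted in this post (GCP, October 2020); not re-verified.
PRICES_USD_PER_HOUR = {
    "8x TPU v3 cores": 8.00,
    "8x V100 GPUs": 19.84,
}
GCP_CREDITS_USD = 300.0

for name, price in PRICES_USD_PER_HOUR.items():
    hours = hours_of_compute(GCP_CREDITS_USD, price)
    print(f"{name}: {hours:.1f} hours")
```

That works out to about 37.5 hours on the TPU setup or roughly 15 hours on 8 V100s, per person with credits.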

I recently came across this program here which gives free TPUs to certain research projects (no idea if this would qualify though): TPU Research Cloud

That paper sounds very interesting too. Would be especially nice to replicate since we would need to work with different models, i.e. fasttext, LSTM, CNN and Transformer. Any idea what the resource requirements would be here?

Dean-DAGs · October 16, 2020, 8:21pm

Hey @morgan, I’d love to try to help if I can.

SamJoel · October 18, 2020, 11:27am

Can I join? I’m trying out some self-supervised learning papers, like SimCLR, and other image representation models like CGD. I am also making a fastai implementation of them as an open source repo. But my models are not working yet though😅. I’ll try my best to make it work.

morgan · October 23, 2020, 4:39pm

Sorry I got swamped with work!

@stefan-ai I’m actually doing a little work with Reformer at the moment (well, getting a notebook to work with it), so maybe it might be worth giving it a shot. I have a 2080 Ti, and there are Kaggle GPUs and TPUs and some GCP credits… With some careful checkpointing and a little patience we might be able to get it done.
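By "careful checkpointing" I mean something like the pattern below: save progress every N steps and always start by trying to resume, so a killed Kaggle or GCP session only costs the work since the last save. This is a toy, standard-library-only sketch with a placeholder "loss" and hypothetical function names; a real run would save model and optimizer state instead (e.g. with `torch.save` or fastai's `SaveModelCallback`).

```python
import json
import os
import tempfile

def save_checkpoint(path, state):
    """Write to a temp file, then rename, so an interrupted save cannot corrupt the checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)

def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "loss_history": []}

def train(path, total_steps, save_every, stop_after=None):
    """Toy loop that resumes from `path`; `stop_after` simulates the session being killed."""
    state = load_checkpoint(path)
    steps_this_session = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_session >= stop_after:
            break  # pretend the session timed out here
        state["step"] += 1
        state["loss_history"].append(1.0 / state["step"])  # placeholder "loss"
        steps_this_session += 1
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
    return state

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt = os.path.join(tmp_dir, "ckpt.json")
    train(ckpt, total_steps=100, save_every=10, stop_after=35)  # session 1 dies at step 35
    resumed = load_checkpoint(ckpt)  # what session 2 will start from
    final = train(ckpt, total_steps=100, save_every=10)  # session 2 finishes the run
```

The trade-off is in `save_every`: saving more often loses less work per crash but spends more time writing checkpoints.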

Re the cost of the complex embeddings paper, the experiments with the smaller/older models should be fine, but they also have some experiments with Transformer XL, which might be prohibitive to try…

I’m happy to give Reformer a shot, if nothing else we’ll learn a huge amount with all of the new ideas they introduced!

@Dean-DAGs the more the merrier! Do you have a particular area of research or paper you are interested in?

@SamJoel have you seen the unsupervised learning code in the ImageWoof repo and also the “self-supervised” repo (https://github.com/KeremTurgutlu/self_supervised/)?

SamJoel · October 23, 2020, 5:12pm

@morgan I have not seen the repo, but I finished implementing some unsupervised and semi-supervised models like SimCLR and CGD for image representation learning and unsupervised learning. This is my repo: https://github.com/Samjoel3101/Self-Supervised-Learning-fastai2. Check it out and tell me your views on it.

Dean-DAGs · October 25, 2020, 11:25am

Umm, actually I don’t really mind, I’d just be happy to join people working on something and pitching in as I can. To be completely transparent, I’m working on a platform for data science collaboration (https://dagshub.com), so this would be a great learning experience for me.

stefan-ai · October 25, 2020, 6:02pm

Sounds great. How should we get started?

morgan · October 26, 2020, 5:43pm

I can do the registration to get things started. You need a (free) account with OpenReview, and with that I can add you as a team member.

Discord Call

Then maybe we can organise a quick call on the Discord server to draw up a plan and divide responsibilities? Maybe on Thursday, if it suits? That’ll also give us a bit of time to review the paper again in a bit more depth.

Resources

For anyone interested in joining Reformer Reproducibility, be sure to get familiar with it:

Paper: https://openreview.net/pdf?id=rkgNKkHtvB
Authors’ ICLR video: https://iclr.cc/virtual_2020/poster_rkgNKkHtvB.html
Yannic K explainer: https://www.youtube.com/watch?v=i4H0kjxrias&t=1s
HuggingFace blog post: https://huggingface.co/blog/reformer