Reproducibility Challenge 2020 - fastai folks interested

Hey all,

Just wondering if anyone has any interest in joining up for the reproducibility challenge this year?

The primary goal of this event is to encourage the publishing and sharing of scientific results that are reliable and reproducible. In support of this, the objective of this challenge is to investigate reproducibility of papers accepted for publication at top conferences by inviting members of the community at large to select a paper, and verify the empirical results and claims in the paper by reproducing the computational experiments, either via a new implementation or using code/data or other information provided by the authors.

There’s plenty of time until the submission deadlines (Dec 4th is the early submission deadline, Jan 8th the late one). Could be fun to tackle a paper that has shown promise and would be a useful addition to fastai :slight_smile:

Comment below if you think you’ll have a bit of time to spare, and which paper you think could be worth reproducing.


Hi Morgan,

Interesting challenge. Are you thinking of reproducing it with the fastai framework, plain PyTorch, or something else?


I would love to do that… but I don’t have a data source or paper to replicate.

But I’m interested in ASR (automatic speech recognition… is STT even a term?) and TTS…

If you know of something that would be nice to reproduce, let me know. If possible (but not necessarily) something beyond the main languages, with little corpora [in audio or text… e.g. no Wikipedia source], so that I can choose.

Hi Morgan,

Excellent idea! I would gladly join such project.

I was thinking about the Reformer, which was presented at ICLR 2020 and could be an interesting and challenging paper to replicate. But I’m also open to other ideas :slight_smile:

@DanielLam Probably keeping much of the boilerplate in fastai, I think. For example, if the paper proposes a new model architecture, that can probably be dropped into fastai pretty easily without too much concern that we don’t use the exact same PyTorch DataLoader, Dataset, etc. (as long as the preprocessing/augmentation are the same, of course).

@tyoc213 ASR/TTS are super interesting alright. I’ve been focused on NLP recently but would enjoy dipping into another area too.

@stefan-ai Yes Reformer is very cool, it would be a lot of fun to implement alright. Just wondering, when it comes to experiment replication, would the GPU compute needed be too much? But happy to give it a shot if you think it’s manageable! HuggingFace also had a nice blog post explaining it:

(@Richard-Wang you should definitely enter your ELECTRA work to the reproducibility challenge too, you’ve done phenomenal work on it so far!)

I’ll have a look around at a few NLP/ASR/TTS papers and try to find some interesting, useful & low-resource ones to replicate. I’ll share here when I do. The Good Readings threads probably have a few contenders:


Right, that’s a good point :thinking: I will have a closer look at the paper in the coming days, keeping this in mind. Btw, course 4 of the new NLP specialization includes an implementation of Reformer in trax. It will certainly be very challenging from an implementation point of view too.

Do you have any other ideas for recent NLP papers we could consider?

I’m down for it. I’m currently working on COVID detection using deep learning implemented in fastai, trying to improve on the results of a previous paper published on April 26th, which had a baseline F1 score of 97.31%. So I’m either gonna get closer to it… or try to beat it :grin:

Encoding word order in complex embeddings

This has been in the back of my mind since ICLR; they improved Transformer performance by using a new type of positional embedding. There were a couple of varieties if I recall — the second also modified the entire Transformer to be able to deal with complex numbers.

But the results seemed pretty impressive given that it was just a change to the positional embedding. This, or another positional embedding kind of paper, could be worth looking into.
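For anyone who hasn’t read the paper, the gist is easy to sketch: each word learns, per embedding dimension, an amplitude, a frequency and an initial phase, and the embedding value at a given position is a complex number whose magnitude carries word identity while its phase rotates linearly with position. A toy, dependency-free sketch (the three parameters are learned in the actual model; the values here are made up):

```python
import cmath

def complex_word_embedding(amplitude, frequency, phase, position):
    """One embedding dimension of one word at a given position.

    Returns amplitude * e^{i(frequency * position + phase)}, so word
    identity (the amplitude) and word order (the rotating phase) live
    in the same complex number.
    """
    return amplitude * cmath.exp(1j * (frequency * position + phase))

# Toy values for a single word/dimension (learned in the real model):
r, w, p = 2.0, 0.5, 0.1

e0 = complex_word_embedding(r, w, p, position=0)
e3 = complex_word_embedding(r, w, p, position=3)

# The magnitude (word identity) is position-independent...
print(abs(e0), abs(e3))                    # both ~2.0
# ...while the phase advances linearly with position: 3 * w = 1.5
print(cmath.phase(e3) - cmath.phase(e0))   # ~1.5
```

The vanilla sinusoidal positional encoding falls out as a special case with fixed amplitudes and phases, which is part of why the paper’s framing is appealing.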


DeLighT: Very Deep and Light-weight Transformer


This was another interesting one with a funky layer structure. Again, training duration might be an issue, but at least one of their results was trained on a single 1080 GPU! Then again, others were trained on 16 V100s, soo… worth a look at least!


This sounds interesting! Most of the recent Transformer papers are prohibitively compute-intensive, I guess. Lately I’ve been looking into transformers for CV, such as this one for example. But not only do they use a ton of compute, they also use a private dataset…


Yes Vision Transformer is super interesting, a few of us on the Discord looked at it last week. Have you seen the pytorch implementation?

It works with a standard fastai vision workflow, just drop in the ViT model instead of a resnet etc…
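For anyone wondering why it drops in so cleanly: the only unusual part of ViT is at the very front, where the image is cut into fixed-size patches and each patch is flattened into a token. A dependency-free toy sketch of just that step (the function name is mine; real implementations do this as a strided tensor reshape plus a learned linear projection):

```python
def image_to_patch_tokens(image, patch_size):
    """Split a 2-D image (a list of rows) into non-overlapping
    patch_size x patch_size patches and flatten each patch into a
    1-D token, scanning left-to-right then top-to-bottom — the
    tokenization step at the front of a Vision Transformer.
    """
    h, w = len(image), len(image[0])
    assert h % patch_size == 0 and w % patch_size == 0
    tokens = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patch = [image[top + r][left + c]
                     for r in range(patch_size)
                     for c in range(patch_size)]
            tokens.append(patch)
    return tokens

# A 4x4 "image" with distinct pixel values, split into 2x2 patches:
img = [[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]]
tokens = image_to_patch_tokens(img, 2)
print(len(tokens))   # 4 patches
print(tokens[0])     # [0, 1, 4, 5] — the top-left 2x2 patch
```

After this, the token sequence goes through a completely standard Transformer encoder, which is why the rest of the fastai pipeline doesn’t need to change.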


No, that was new to me - thanks! The code seems pretty understandable (gotta love einsum :slight_smile:), but I guess training it is difficult?


From the little testing we did, it didn’t perform that well; from the paper it seems to only shine when given huge volumes of data. However, the architecture is super simple, so I feel there is a lot of room for improvement.


Alright, I had a closer look at the Reformer paper and here’s what I found.

Regarding compute:

  • They only specify vaguely that “Training for all experiments was parallelized across 8 devices (8 GPUs or 8 TPU v3 cores)”
  • Pricing for 8 TPU v3 cores on GCP is 8 USD per hour
  • Not sure which GPU type they use, but for 8 V100 GPUs it would be 19.84 USD per hour

Regarding data set sizes:

I guess compute requirements and data set sizes are not too large by Transformer standards, but they might still be prohibitively large for us here. Not sure how far 300 USD of credits on GCP would get us.
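For a rough sense of how far 300 USD would go at the hourly rates above (back-of-the-envelope only — ignores preemption, storage, egress, etc.):

```python
# How many training hours 300 USD of GCP credit buys at the rates
# quoted above (8 USD/hr for a TPU v3-8, 19.84 USD/hr for 8x V100).
credits_usd = 300.0
rates_usd_per_hour = {
    "8x TPU v3 cores": 8.00,
    "8x V100 GPUs":   19.84,
}
hours = {name: credits_usd / rate for name, rate in rates_usd_per_hour.items()}
for name, h in hours.items():
    print(f"{name}: {h:.1f} hours")
# 8x TPU v3 cores: 37.5 hours
# 8x V100 GPUs: 15.1 hours
```

So the credits alone buy a day or two of the paper’s hardware setup at most — enough for the smaller experiments, probably not for the big ones.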

I recently came across this program here which gives free TPUs to certain research projects (no idea if this would qualify though):

That paper sounds very interesting too. Would be especially nice to replicate since we would need to work with different models, i.e. fasttext, LSTM, CNN and Transformer. Any idea what the resource requirements would be here?


Hey @morgan, I’d love to try to help if I can.

Can I join? I’m trying out some self-supervised learning papers like SimCLR, and other image representation models like CGD. I’m also making a fastai implementation of them as an open-source repo, but my models aren’t working yet 😅. I’ll try my best to make them work.


Sorry I got swamped with work!

@stefan-ai I’m actually doing a little work with Reformer at the moment (well, getting a notebook to work with it), so maybe it’s worth giving it a shot. I have a 2080 Ti, and there are Kaggle GPUs and TPUs and some GCP credits… With some careful checkpointing and a little patience we might be able to get it done.
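By "careful checkpointing" I mean the usual resume-from-disk loop, so a killed Kaggle session or spent GCP credit only costs us the steps since the last save. A minimal framework-agnostic sketch (in practice you’d use `learn.save`/`torch.save` with model and optimizer state; the file name and toy training step here are made up):

```python
import os
import pickle

CKPT = "train_state.pkl"  # hypothetical checkpoint file name

def load_state():
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "loss_history": []}

def save_state(state):
    """Write to a temp file then rename, so a killed session
    can't leave a half-written checkpoint behind."""
    tmp = CKPT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CKPT)

state = load_state()
total_steps = 10
while state["step"] < total_steps:
    state["step"] += 1
    # Stand-in for one real training step:
    state["loss_history"].append(1.0 / state["step"])
    if state["step"] % 5 == 0:   # checkpoint every N steps
        save_state(state)

print(state["step"])
```

Restarting the script after any interruption just picks up from the last saved step, which is the whole trick for stitching a long run across short free-tier sessions.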

Re the cost of the complex embeddings paper, the experiments with the smaller/older models should be fine, but they also have some experiments with Transformer XL, which might be prohibitive to try…

I’m happy to give Reformer a shot, if nothing else we’ll learn a huge amount with all of the new ideas they introduced!

@Dean-DAGs the more the merrier! Do you have a particular area of research or paper you’re interested in?

@SamJoel have you seen the unsupervised learning code in the ImageWoof repo and also the “self-supervised” repo?


@morgan I have not seen the repo, but I finished implementing some unsupervised and semi-supervised models like SimCLR and CGD for image representation learning. This is my repo — check it out and tell me your views on it. :slightly_smiling_face: