Reducing labeled data needs - CPC 2.0 from DeepMind

LessW2020 · December 16, 2019, 6:40pm

DeepMind published a new paper called “Data-Efficient Image Recognition with Contrastive Predictive Coding” and introduced CPC 2.0. They reach a new state of the art on object recognition by transfer learning from a CPC-trained ResNet and, more importantly, set new milestones for training with 2-5x less labeled data.

CPC for vision basically takes an image, cuts it into overlapping patches, creates a feature vector from each patch, and then trains the NN by asking it to pick the correct feature vector for a patch from the bottom of the image out of a set of negative feature vectors from other images.
In other words, it pushes the network to build better representations of the objects in the image.
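
The "pick the right feature vector among negatives" idea can be sketched as a softmax cross-entropy over similarity scores. This is only a minimal NumPy illustration, not the paper's code: the function name `contrastive_loss` and the toy vectors are made up for the example, and plain dot products are assumed as the similarity measure.

```python
import numpy as np

def contrastive_loss(pred, positive, negatives):
    """Cross-entropy of picking `positive` out of [positive] + negatives.

    pred:      [D]    prediction made from the context above the target patch
    positive:  [D]    feature vector of the true patch lower in the image
    negatives: [N, D] feature vectors of distractor patches
    """
    candidates = np.vstack([positive[None, :], negatives])  # [N+1, D], index 0 is the true patch
    logits = candidates @ pred                               # dot-product similarity, [N+1]
    logits = logits - logits.max()                           # for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())        # log-softmax over candidates
    return -log_probs[0]

# Toy data (assumed for illustration): one-hot feature vectors.
positive = np.array([1.0, 0.0, 0.0, 0.0])
negatives = np.array([[0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0]])

loss_good = contrastive_loss(5.0 * positive, positive, negatives)      # prediction points at the true patch
loss_bad = contrastive_loss(5.0 * negatives[0], positive, negatives)   # prediction points at a negative
```

The loss is small when the prediction lines up with the true patch and large when it lines up with a negative, which is the signal that shapes the representation.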

I wrote a summary article with more info here:

And full paper is here:

The authors indicate CPC 2.0 will be open sourced soon, so I'm hoping we can look at integrating it into FastAI 2.0.

Best regards,
Less

nirantk · January 8, 2020, 6:34pm

This is quite good. Thanks for sharing!

Can you please start a discussion here again when the official code is released?

LessW2020 · January 9, 2020, 12:37am

re: discussion when code is released - definitely

MicPie · January 9, 2020, 6:06am

In appendix A.2 on page 14 of the publication they have outlined the setup in pseudo code. It looks quite compact, but you have to be careful to keep track of the tensor dimensions.

The setup reminds me of language model pretraining for image data.

(A nice, detailed summary of other self-supervised representation learning approaches was also posted recently. The first image in the article, taken from a talk by LeCun, is a great visual explanation.)

MicPie · January 13, 2020, 6:31pm

I was looking through the pseudo code in detail:

batch_dim = B
batch of images [B×7×7×4096]

pixelCNN = context network
latents [B×7×7×4096]
cres [B×7×7×4096]

Downsampling in the pixelCNN:
[B×7×7×4096] → [B×L×L×256] → [B×7×7×4096] (L = pixel size, not calculated for the example)

However, I am wondering why the pixelCNN goes through the for loop 5 times and adds c to cres each time.
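
One possible reading, offered as a sketch and not as the paper's code: the loop looks like a stack of five residual blocks, where each block squeezes the 4096 channels down to 256, expands them back, and adds the result to the running tensor `cres`. Under that reading the 256 is a channel bottleneck and the 7×7 grid is never resized. The function `residual_stack` below is hypothetical, uses only 1×1 convolutions (written as matrix multiplies) with random weights, and leaves out the masked spatial convolutions of a real pixelCNN.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_stack(latents, hidden_dim=256, n_blocks=5, seed=0):
    """Apply n_blocks bottleneck residual blocks along the channel axis."""
    rng = np.random.default_rng(seed)
    dim = latents.shape[-1]
    cres = latents
    for _ in range(n_blocks):
        # Random weights stand in for learned 1x1 convolution kernels.
        w_down = rng.normal(scale=0.01, size=(dim, hidden_dim))
        w_up = rng.normal(scale=0.01, size=(hidden_dim, dim))
        c = relu(cres @ w_down)  # [B, 7, 7, hidden_dim]: channels squeezed, grid unchanged
        c = c @ w_up             # [B, 7, 7, dim]: back to the input width
        cres = cres + c          # residual connection: refine cres instead of replacing it
    return cres

latents = np.random.default_rng(1).normal(size=(2, 7, 7, 4096))
context = residual_stack(latents)
```

Adding `c` to `cres` means each pass only has to learn a correction to the current features, which is the usual motivation for residual connections in deep stacks.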

CPC loss
col_dim = 7
row_dim = 7
target_dim = 64
targets [B×7×7×64] → [(B×7×7)×64]
col_dim_i = 7 - i - 1
preds_i [B×7×7×64] → [(B×7×7)×64]
logits [(B×7×7)×64] @ [64×(B×7×7)] → [(B×7×7)×(B×7×7)]

However, I am still struggling with the labels part below in the code, i.e., b, col, labels, and loss calculation. Maybe somebody else is also trying to make sense out of it and wants to discuss it?
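
One way to make sense of the labels, as a sketch under stated assumptions and not a verified copy of the paper's pseudo code: assume the prediction at grid position (r, c) of image b should match the target i+1 rows further down in the same image and column, and that preds_i is cropped to the first col_dim_i rows so every prediction still has such a target. Then each label is simply the index of that target in the flattened [(B×7×7)×64] targets matrix. The helper names `cpc_labels` and `cross_entropy` are made up for this example, and `col_dim` is assumed to count positions along the vertical axis.

```python
import numpy as np

def cpc_labels(batch_dim, col_dim, row_dim, i):
    """Index of the matching target row for every cropped prediction."""
    col_dim_i = col_dim - i - 1                   # rows that still have a target i+1 below
    total = batch_dim * col_dim_i * row_dim
    idx = np.arange(total)
    b = idx // (col_dim_i * row_dim)              # which image the prediction belongs to
    pos = idx % (col_dim_i * row_dim)             # flat position inside the cropped grid
    # Jump to image b in the full grid, skip i+1 rows, keep the same position.
    return b * col_dim * row_dim + (i + 1) * row_dim + pos

def cross_entropy(logits, labels):
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

B, col_dim, row_dim, target_dim, i = 2, 7, 7, 64, 2
rng = np.random.default_rng(0)
targets = rng.normal(size=(B, col_dim, row_dim, target_dim))
preds = rng.normal(size=(B, col_dim, row_dim, target_dim))

preds_i = preds[:, :col_dim - i - 1]              # assumed crop: [B, col_dim_i, 7, 64]
logits = preds_i.reshape(-1, target_dim) @ targets.reshape(-1, target_dim).T
labels = cpc_labels(B, col_dim, row_dim, i)
loss = cross_entropy(logits, labels)
```

With this crop the logits have B×col_dim_i×7 rows instead of B×7×7, and every other entry in a row (other positions and other images in the batch) acts as a negative.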

(PS: cross post on reddit publication thread)

MicPie · January 14, 2020, 6:07pm

Fast.ai article in the direction of this topic:

LessW2020 · January 14, 2020, 7:23pm

Thanks for posting this @MicPie - now I’m very interested to check out the ‘fine tune’ option in FastAI2.

There’s another paper out that uses weakly labeled data first and then, with less than 10% of the labeled data, meets or beats SOTA. I’ll try to post that paper shortly (need to find it again).

muellerzr · January 14, 2020, 7:40pm

@LessW2020 from what I can see it’s basically unfreezing mid training:

http://dev.fast.ai/callback.schedule#Learner.fine_tune

Looked at the source a bit more as I said that. The frozen phase runs at 2x the chosen base learning rate with pct_start=0.99, so the learning rate ramps up for almost the whole cycle, and then the model is unfrozen for the main one-cycle run.

LessW2020 · January 14, 2020, 8:15pm

Oh, you’re right, thanks for pointing this out - I had visions of something much more intricate, but it’s basically compressing a couple of standard steps.

def fine_tune(self:Learner, epochs, base_lr=1e-3, freeze_epochs=1, lr_mult=100,
              pct_start=0.3, div=5.0, **kwargs):
    "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
    self.freeze()
    self.fit_one_cycle(freeze_epochs, slice(base_lr*2), pct_start=0.99, **kwargs)
    self.unfreeze()
    self.fit_one_cycle(epochs, slice(base_lr/lr_mult, base_lr), pct_start=pct_start, div=div, **kwargs)
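
To see what the two `pct_start` values do to the learning rate, here is a rough pure-Python sketch of a cosine one-cycle schedule. It is not fastai's implementation: `one_cycle_lr` is a made-up helper, and the defaults `div=25` and `div_final=1e5` are assumed to mirror `fit_one_cycle`. It also tracks only the highest learning rate in the `slice`, not the per-layer-group values.

```python
import math

def one_cycle_lr(pct, lr_max, pct_start=0.25, div=25.0, div_final=1e5):
    """Learning rate at training progress `pct` in [0, 1]."""
    def cos_anneal(start, end, p):
        # Cosine interpolation: returns `start` at p=0 and `end` at p=1.
        return end + (start - end) / 2 * (1 + math.cos(math.pi * p))
    if pct < pct_start:
        # Warm-up phase: lr_max/div up to lr_max.
        return cos_anneal(lr_max / div, lr_max, pct / pct_start)
    # Annealing phase: lr_max down to lr_max/div_final.
    return cos_anneal(lr_max, lr_max / div_final, (pct - pct_start) / (1 - pct_start))

base_lr = 1e-3

# Frozen phase: slice(base_lr*2) with pct_start=0.99, so nearly all warm-up.
frozen = [one_cycle_lr(p, base_lr * 2, pct_start=0.99) for p in (0.0, 0.5, 0.99)]

# Unfrozen phase: top learning rate base_lr with pct_start=0.3 and div=5.
unfrozen = [one_cycle_lr(p, base_lr, pct_start=0.3, div=5.0) for p in (0.0, 0.3, 1.0)]
```

Under these assumptions the frozen phase climbs from `2*base_lr/25` to `2*base_lr` and has almost no annealing, while the unfrozen phase starts at `base_lr/5`, peaks at `base_lr` after 30% of training, and then decays.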

MicPie · January 14, 2020, 8:44pm

Was it this one?

(GitHub repo)

It is hard to keep up with the output of publications from Google & Co.
(We need another thread for papers like that to share & discuss. #toolongreadinglist)!

MicPie · January 20, 2020, 6:10pm

A very nice PDF slide deck on Self-Supervised Learning with a lot of nice figures:

MicPie · January 22, 2020, 9:06pm

FixMatch, a simpler yet powerful take on (Re)MixMatch:

LessW2020 · January 25, 2020, 5:03am

Thanks @MicPie for posting this paper. FixMatch looks much more straightforward to implement while being more powerful. Great to see!

zlapp · January 30, 2020, 1:33pm

Interesting repository; it could be a good reference for CPC in fastai2.

PyTorch implementation of Data-Efficient Image Recognition with Contrastive Predictive Coding

MicPie · February 1, 2020, 9:29am

A very nice blog post explaining contrastive self-supervised learning:

mrfabulous1 · February 5, 2020, 11:23am

Hi MicPie, hope you're having a wonderful day!
Wow, a very enlightening and informative post. I am convinced we need to change IP as a metaphor for the human brain.

Cheers mrfabulous1

zlapp · February 5, 2020, 2:28pm

MicPie · June 15, 2020, 12:56pm

This is the next level:
https://kdexd.github.io/virtex/index.html
No labels, just the text annotation needed to get very good embeddings.

muellerzr · June 15, 2020, 1:09pm

I was just reading about that today. I wonder if it has some applications in the text/tabular realm too (curious to see if anyone starts playing with it)

morgan · June 16, 2020, 7:55am

The DAIR paper reading meet-up is covering this paper this Saturday, worth joining if you have the time!