Part 2 Lesson 12 wiki

SHAR1 · April 17, 2018, 3:42am

Instead of taking a random vector, does it make sense to load a pretrained vector from another model? I think this is the key part of implementing transfer learning in GANs.

harveyslash · April 17, 2018, 3:44am

The idea of the random vector is that you can create a random image based on the vector.
So every time you generate a new random vector, you get a new image.

A_TF57 · April 17, 2018, 3:46am

cgan code that Jeremy’s talking about: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix

SHAR1 · April 17, 2018, 3:49am

So that means the weights in the deconvolution filters turn this randomly generated noise into something close to the actual input. Do you think that substituting these weights with ones from another model would allow us to learn faster/better?

harveyslash · April 17, 2018, 3:50am

The weights of the deconv filters are trained to convert random noise into realistic images.

The problem is that there aren't really any pretrained models that were trained on this task.

One thing that may work would be the decoder networks of autoencoders.

thunderingtyphoons · April 17, 2018, 3:53am

Another paper that does Unsupervised Machine Translation using Cycle Consistency Loss: https://arxiv.org/pdf/1710.11041.pdf

blakewest · April 17, 2018, 3:55am

WaveNet doesn’t actually use RNNs. RNNs (from what I saw in a WaveNet presentation) tend to max out their “memory” around 50+ steps (remember we used 70 steps for language data). But WaveNet generates raw audio samples, which means sequences of thousands, or many thousands, of steps. Beating this challenge was part of what made it cool. And they actually use convolutions! You can see more here: https://www.youtube.com/watch?v=YyUXG-BfDbE&t=523s
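
To make the "thousands of steps" point concrete, here is a rough sketch of how the receptive field of stacked dilated causal convolutions grows. The kernel size of 2 and the dilations doubling from 1 to 512 are my assumptions based on the WaveNet paper, not something stated in this thread, and `receptive_field` is just an illustrative helper name.

```python
def receptive_field(kernel_size, dilations):
    """Input time steps visible to one output of a stack of dilated causal convs.

    Each layer with dilation d and kernel size k extends the receptive
    field by (k - 1) * d steps.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed configuration: kernel size 2, dilations 1, 2, 4, ..., 512.
one_stack = [2 ** i for i in range(10)]

print(receptive_field(2, one_stack))      # 1024
print(receptive_field(2, one_stack * 3))  # 3070
print(receptive_field(2, [1] * 30))       # 31 (same depth, no dilation)
```

With the same 30 layers and no dilation, the network would see only 31 samples, which is why the dilation pattern matters so much for raw audio.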

ravivijay · April 17, 2018, 3:55am

Why are the discriminators not trained for longer than the generators in this case, as in WGANs?
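
For anyone comparing the two setups, here is a minimal sketch of the update schedules in question. It assumes the WGAN paper's default of 5 critic updates per generator update, and assumes "this case" means plain one-to-one alternation. `update_schedule` is a hypothetical helper that only prints the order of steps; it does not train anything.

```python
def update_schedule(total_steps, n_critic=5):
    """Return the order of optimizer steps: 'D' for discriminator/critic, 'G' for generator.

    n_critic is the number of discriminator updates per generator update
    (assumed 5 for WGAN, 1 for plain alternating training).
    """
    steps = []
    while len(steps) < total_steps:
        steps.extend(["D"] * n_critic)
        steps.append("G")
    return steps[:total_steps]


print("".join(update_schedule(12, n_critic=5)))  # DDDDDGDDDDDG
print("".join(update_schedule(12, n_critic=1)))  # DGDGDGDGDGDG
```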

erinjerri · April 17, 2018, 3:56am

Thank you for posting this!

harveyslash · April 17, 2018, 3:57am

Fixed my post. Thanks for notifying!

A_TF57 · April 17, 2018, 4:00am

Multimodal Unsupervised Image-to-Image Translation: https://arxiv.org/abs/1804.04732

alessa · April 17, 2018, 4:05am

By far Jeremy's best lecture. Thank you!
And that is saying something, when the previous ones were already damn good.

pnvijay · April 17, 2018, 4:08am

Amazing lectures that keep getting better every time. But I wonder how I am going to keep pace with three different concepts and get up to speed on them in a week, so that I can concentrate on the new lecture next week. Are there others who feel the same? If so, suggestions are welcome.

gdc · April 17, 2018, 4:11am

It seems DCGANs do a great job of identifying/segmenting the parts of an image to modify, at least in some cases. Could we somehow transfer this learning to improve bounding box detection, or to improve the accuracy of a classifier?

Ducky · April 17, 2018, 4:14am

I am also having trouble keeping up; it’s not just you.

ananda_seelan · April 17, 2018, 5:00am

Yes. People on my team used CGANs specifically to augment OCR data by introducing distortions, so as to make the original distribution more robust.

sjcho · April 17, 2018, 6:26am

In the paper "Are GANs Created Equal? A Large-Scale Study", the authors compared the performance of several GANs. They mention two metrics for evaluating GAN performance (plus their own metric), and the approach looks interesting: use another NN for evaluation.

Inception Score (IS): How well an image classification network (Inception Net trained on ImageNet) can classify the class of the GAN output.
Fréchet Inception Distance (FID): Compares the mean and covariance of Inception Net features for generated images against those for real images. This captures the diversity of the generated samples, so it can detect mode collapse.
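
As a rough sketch of what the two metrics compute, assuming the Inception class probabilities and feature vectors have already been extracted (the extraction step is left out here, and the function names are only illustrative):

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """probs: (N, C) class probabilities p(y|x) for N generated images.

    IS = exp(mean over images of KL(p(y|x) || p(y))).
    """
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0, keepdims=True)  # p(y) over all samples
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))


def frechet_distance(feats_real, feats_fake):
    """feats_*: (N, D) feature arrays with D >= 2.

    Distance between Gaussians fitted to each feature set:
    ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 * sqrt(S_r S_f)).
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr(sqrt(S_r S_f)) equals the sum of square roots of the eigenvalues
    # of S_r S_f; clip tiny negative values caused by round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    mean_term = ((mu_r - mu_f) ** 2).sum()
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2 * tr_sqrt)
```

A generator that collapses to one class gives every image the same predicted distribution, so the KL term goes to zero and IS drops toward 1. For FID, a collapsed generator shrinks the covariance of the fake features, which shows up in the trace term.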

DavideBoschetto · April 17, 2018, 7:30am

Woah, the result on “time-to-94%” challenge is amazing!
1cycle really is game-changing…

chunduri · April 17, 2018, 12:19pm

Where can I find the “cifar10-darknet.ipynb” notebook?
It is not available in the dl2 folder.

If we have to build it ourselves from Jeremy’s lecture, that is even better.

DavideBoschetto · April 17, 2018, 12:22pm

It will be shared tomorrow (or today, depending on your time zone!).

Quick general question regarding Jeremy's CIFAR10 comment that rotation is not ideal with such a small dataset, and that a nicer approach would be random flip + reflection padding + random/center crop. Do you think this is because CIFAR10 does not have a clear “centered” object and a clear background? If I had a dataset of 32x32 images with a single centered object and a dark background all around it, would rotation (hence, interpolation) be harmful, considering that other augmentations might be applied anyway (zoom/shifts)?
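
For reference, this is roughly what I understand by the flip + reflection padding + random crop recipe, as a NumPy sketch. The padding of 4 pixels is my assumption (a common choice for CIFAR10), and `flip_pad_crop` is a made-up name, not the fastai transform.

```python
import numpy as np


def flip_pad_crop(img, pad=4, rng=None):
    """img: (H, W, C) array. Returns an augmented (H, W, C) array.

    Random horizontal flip, then reflection padding by `pad` pixels on each
    side, then a random crop back to the original size.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    if rng.random() < 0.5:
        img = img[:, ::-1]  # horizontal flip
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    top = rng.integers(0, 2 * pad + 1)   # upper bound is exclusive
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]


img = np.arange(32 * 32 * 3).reshape(32, 32, 3)
out = flip_pad_crop(img, pad=4, rng=np.random.default_rng(0))
print(out.shape)  # (32, 32, 3)
```

Note that every output pixel is a copy of some input pixel, so unlike rotation nothing gets interpolated, which matters more when the image is only 32x32.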