Part 2 Lesson 12 wiki


(Sharwon Pius) #97

Instead of taking a random vector, does it make sense to load a pretrained vector from another model? I think this is the key to implementing transfer learning in GANs.


#98

The idea of the random vector is that you can create a random image based on the vector.
So every time you generate a new random vector, you are assured of getting a new image.
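
A minimal sketch of that mapping (layer sizes are my own assumptions, not the lesson’s exact architecture): a DCGAN-style generator turns each random latent vector into a different 32x32 image via a stack of transposed convolutions.

```python
import torch
import torch.nn as nn

# Illustrative DCGAN-style generator: latent vector z -> 32x32 RGB image.
# Sizes are a sketch, not the exact ones from the lesson.
class Generator(nn.Module):
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 4, 4, 1, 0, bias=False),      # 1x1 -> 4x4
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False), # 4x4 -> 8x8
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),     # 8x8 -> 16x16
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),          # 16x16 -> 32x32
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)

g = Generator()
z = torch.randn(2, 100, 1, 1)  # two different random vectors
imgs = g(z)
print(imgs.shape)              # torch.Size([2, 3, 32, 32])
```

Different vectors in, different images out — that is the whole contract of the latent space.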


(Ankit Goila) #113

cgan code that Jeremy’s talking about: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix


(Sharwon Pius) #114

So that means the weights in the deconvolution filters map this randomly generated noise to something close to the actual input. Do you think that substituting these weights from another model would let us learn faster/better?


#115

The weights of the deconv filters are trained to convert random noise into realistic images.

The problem is that there aren’t really models trained for this.

One thing that might work is the decoder network of an autoencoder.
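
As a concrete sketch of that idea (illustrative layer sizes, not any particular model): if the generator shares its architecture with a trained autoencoder’s decoder, the decoder’s weights can simply be copied in as an initialization instead of starting from random weights.

```python
import torch
import torch.nn as nn

# Toy decoder and generator with identical (illustrative) architectures.
decoder = nn.Sequential(
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
)
generator = nn.Sequential(
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
)

# After the autoencoder has been trained, initialize the generator with the
# decoder's weights rather than random ones.
generator.load_state_dict(decoder.state_dict())

# Sanity check: both now produce identical outputs for the same input.
z = torch.randn(1, 64, 8, 8)
assert torch.equal(generator(z), decoder(z))
```

Whether this actually speeds up GAN training is an open question — the decoder was trained to reconstruct, not to fool a discriminator — but it is a plausible starting point.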


(Thundering Typhoons) #116

Another paper that does Unsupervised Machine Translation using Cycle Consistency Loss: https://arxiv.org/pdf/1710.11041.pdf


(blake west) #117

WaveNet doesn’t actually use RNNs. RNNs (from what I saw in a WaveNet presentation) tend to max out their “memory” around 50+ steps (remember we used 70 steps for language data). But WaveNet generates from raw audio samples, which means thousands, or many thousands, of steps. Beating this challenge was part of what made it cool. And they actually use convolutions! You can see more here: https://www.youtube.com/watch?v=YyUXG-BfDbE&t=523s
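
A toy illustration of the trick (my own sketch, not DeepMind’s code): stacking dilated causal convolutions with doubling dilation makes the receptive field grow exponentially with depth, which is how WaveNet reaches thousands of samples of context with only a handful of layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# WaveNet-style dilated causal convolutions (illustrative sizes).
# Dilation doubles at each layer, so the receptive field grows exponentially.
layers, kernel = 10, 2
convs = nn.ModuleList(
    nn.Conv1d(1, 1, kernel_size=kernel, dilation=2 ** i) for i in range(layers)
)

# Receptive field of the stack: 1 + sum of (kernel-1)*dilation per layer.
rf = 1 + sum((kernel - 1) * 2 ** i for i in range(layers))
print(rf)  # 1024 samples of context from only 10 layers

x = torch.randn(1, 1, 2048)  # a fake 2048-sample waveform
for c in convs:
    d = c.dilation[0]
    x = c(F.pad(x, (d, 0)))  # left-pad so each conv stays causal
print(x.shape)               # length preserved: torch.Size([1, 1, 2048])
```

An RNN would need 1024 sequential steps to see that much context; the conv stack sees it in 10 parallelizable layers.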


(Ravi Sekar Vijayakumar) #118

Why aren’t the discriminators trained for longer than the generators in this case, as in WGANs?


(Erin Pangilinan) #119

Thank you for posting this!


#120

Fixed my post. Thanks for notifying!


(Ankit Goila) #121

Multimodal Unsupervised Image-to-Image Translation: https://arxiv.org/abs/1804.04732


(Aless Bandrabur) #122

By far Jeremy’s best lecture. :slight_smile: Thank you!
And I’m saying that when the previous ones were already damn good :sunny:


(Vijay Narayanan Parakimeethal) #123

Amazing lectures that keep getting better every time. But I wonder how I’m going to keep pace with three different concepts and be up to speed on them within a week, so that I can concentrate on the new lecture next week. Are there others who feel the same? If so, suggestions are welcome.


#124

It seems DCGANs do a great job of identifying/segmenting the parts of an image to modify, at least in some cases. Could we somehow transfer this learning to improve bounding-box detection, or the accuracy of a classifier?


(Kaitlin Duck Sherwood) #125

I am also having trouble keeping up; it’s not just you.


(Ananda Seelan) #126

Yes. People on my team used CGANs specifically to augment OCR data by introducing distortions, so as to broaden the training distribution and make the model more robust.


(Sukjae Cho) #127

In the paper Are GANs Created Equal? A Large-Scale Study, they compared the performance of several GANs. They mention two metrics for evaluating GAN performance (plus their own metric), and the approach looks interesting: use another NN for evaluation.

  1. Inception Score (IS): how well an image-classification network (an Inception net trained on ImageNet) can classify the class of the GAN output.
  2. Fréchet Inception Distance (FID): measure the mean/covariance of the Inception net’s feature space and compare them to those of the originals. This can measure the diversity of a model’s outputs, so it can detect mode collapse.
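
For anyone curious, the FID formula from the paper can be sketched in a few lines on toy feature vectors (real FID uses Inception-net activations; these Gaussian features are just my stand-ins): FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2(S_r S_g)^{1/2}).

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    """Frechet Inception Distance between two sets of feature vectors."""
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    s_r = np.cov(feats_real, rowvar=False)
    s_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(s_r @ s_g)
    if np.iscomplexobj(covmean):  # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    return float(((mu_r - mu_g) ** 2).sum() + np.trace(s_r + s_g - 2 * covmean))

rng = np.random.default_rng(0)
a = rng.normal(0, 1, (500, 8))  # stand-in for real-image features
b = rng.normal(0, 1, (500, 8))  # same distribution -> FID near 0
c = rng.normal(3, 1, (500, 8))  # shifted distribution -> much larger FID
print(fid(a, b), fid(a, c))
```

Because it compares whole distributions rather than scoring images one at a time, a generator that collapses to a few modes gets penalized even if each individual sample looks realistic.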

(Davide Boschetto) #128

Woah, the result on “time-to-94%” challenge is amazing!
1cycle really is game-changing…


(chunduri) #129

Where can I find the “cifar10-darknet.ipynb” notebook?
It is not available in the dl2 folder.

If we have to build it ourselves from Jeremy’s lecture, even better.


(Davide Boschetto) #130

It will be shared tomorrow (or today, depending on your time zone!).

Quick general question regarding Jeremy’s comment on CIFAR10, saying that rotation is not ideal with such a small dataset, and that a nicer approach is random flip + reflection padding + random/center crop. Do you think this is because CIFAR10 does not have a clear “centered” object and a clear background? If I had a dataset of 32x32 images with a single centered object and a dark background all around it, would rotation (hence, interpolation) be harmful, considering that other augmentations (zoom/shifts) might be applied anyway?