Instead of taking a random vector, does it make sense to load a pretrained vector from another model? I think this is the key part of implementing transfer learning in GANs.
The idea of the random vector is that you can create a random image based on the vector. So every time you sample a new random vector, you are assured of getting a new image.
The CGAN code that Jeremy’s talking about: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
So that means the weights in the deconvolution filter make this randomly generated noise close to the actual input. Do you think that substituting these weights from another model would let us learn faster/better?
The weights of the deconv filter are trained to convert random noise into realistic images.
The problem is that there aren’t really models trained for this.
One thing that might work would be the decoder networks of autoencoders.
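A minimal sketch of what that transfer might look like: copy a pretrained decoder's weights into a generator wherever a layer-name mapping exists, and train the rest from scratch. This is framework-agnostic pseudocode made runnable; the layer names (`deconv1`, `up1`, etc.) and the dict-based "state dicts" are assumptions for illustration, not a real API.

```python
def transfer_decoder_weights(generator_state, decoder_state, rename):
    """Copy weights from a pretrained decoder into a generator using a
    layer-name mapping; returns the names actually overwritten.
    In a real framework (e.g. a PyTorch state_dict) you would also
    check that the tensor shapes match before copying."""
    copied = []
    for dec_name, gen_name in rename.items():
        if dec_name in decoder_state and gen_name in generator_state:
            generator_state[gen_name] = decoder_state[dec_name]
            copied.append(gen_name)
    return copied

# Toy demo with made-up layer names and scalar "weights".
gen = {"deconv1.weight": [0.0], "deconv2.weight": [0.0], "out.weight": [0.0]}
dec = {"up1.weight": [1.5], "up2.weight": [2.5]}
copied = transfer_decoder_weights(
    gen, dec, {"up1.weight": "deconv1.weight", "up2.weight": "deconv2.weight"}
)
print(sorted(copied))  # ['deconv1.weight', 'deconv2.weight']
```

The untouched `out.weight` layer keeps its fresh initialization, which is usually what you want for the final output layer anyway.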
Another paper that does Unsupervised Machine Translation using Cycle Consistency Loss: https://arxiv.org/pdf/1710.11041.pdf
WaveNet doesn’t actually use RNNs. RNNs (from what I saw in a WaveNet presentation) tend to max out their “memory” around 50+ steps (remember we used 70 steps for language data). But WaveNet is generating from audio samples, which means thousands, or many thousands, of steps. Beating this challenge was part of what made it cool. And they actually use convolutions! You can see more here: https://www.youtube.com/watch?v=YyUXG-BfDbE&t=523s
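The trick that gets convolutions to thousands of steps is stacking dilated causal convolutions with exponentially growing dilation. A quick back-of-the-envelope sketch (pure-Python arithmetic; the three-block schedule below is one common WaveNet-style configuration, not the exact published one):

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of a stack of dilated causal
    convolutions: each layer adds dilation * (kernel_size - 1)."""
    return 1 + sum(d * (kernel_size - 1) for d in dilations)

# Doubling dilations 1, 2, 4, ..., 512, repeated in three blocks.
dilations = [2 ** i for i in range(10)] * 3

rf = receptive_field(kernel_size=2, dilations=dilations)
print(rf)  # 3070 samples from only 30 layers
```

With an RNN you would need thousands of sequential steps to see that far back; here depth buys context exponentially, and all positions are computed in parallel.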
Why aren’t the discriminators trained for longer than the generators in this case, like in WGANs?
Thank you for posting this!
Fixed my post. Thanks for notifying!
Multimodal Unsupervised Image-to-Image Translation: https://arxiv.org/abs/1804.04732
By far Jeremy’s best lecture. Thank you!
And that’s saying something, when the previous ones were already damn good.
Amazing lectures that keep getting better every time. But I wonder how I’m going to keep pace with three different concepts and be up to speed on them within a week, so that I can concentrate on the new lecture next week. Do others feel the same? If so, suggestions welcome.
It seems DCGANs do a great job of identifying/segmenting the parts of an image to modify, at least in some cases. Could we somehow transfer this learning to improve bounding-box detection, or the accuracy of a classifier?
I am also having trouble keeping up, it’s not just you.
Yes. People on my team used CGANs specifically to augment OCR data by introducing distortions, making models trained on the original distribution more robust.
In the paper Are GANs Created Equal? A Large-Scale Study, they compared the performance of several GANs. They mention two metrics for evaluating GAN performance (plus their own metric), and the approach is interesting: use another NN for evaluation.
- Inception Score (IS): how well an image-classification network (Inception net trained on ImageNet) can classify the class of the GAN’s output.
- Fréchet Inception Distance (FID): measure the mean/covariance of the Inception net’s feature space for generated images and compare to the originals. This captures the diversity of a model’s outputs, so it can detect mode collapse.
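Since FID reduces to the Fréchet distance between two Gaussians fitted to feature activations, it's short to sketch. This is a NumPy-only sketch under the assumption that you've already extracted feature matrices (here replaced by random stand-ins rather than real Inception activations); it computes the trace of the matrix square root via eigenvalues instead of `scipy.linalg.sqrtm`.

```python
import numpy as np

def fid(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between two Gaussians:
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrtm(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    # Eigenvalues of a product of covariance matrices are real and
    # non-negative (up to numerical noise), so Tr(sqrtm(.)) = sum of sqrts.
    eig = np.linalg.eigvals(sigma1 @ sigma2).real.clip(min=0.0)
    tr_covmean = np.sqrt(eig).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2)
                 - 2.0 * tr_covmean)

rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))            # stand-in for "real" features
fake = rng.normal(loc=0.5, size=(2000, 8))   # shifted "generated" features

mu_r, sig_r = real.mean(0), np.cov(real, rowvar=False)
mu_f, sig_f = fake.mean(0), np.cov(fake, rowvar=False)

print(fid(mu_r, sig_r, mu_r, sig_r))  # ~0 for identical statistics
print(fid(mu_r, sig_r, mu_f, sig_f))  # clearly > 0 for the shifted set
```

Lower is better: identical feature statistics give 0, and either a mean shift or a collapsed (low-covariance) set of samples pushes the score up, which is why FID can flag mode collapse where IS can't.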
Woah, the result on “time-to-94%” challenge is amazing!
1cycle really is game-changing…
Where can I find the “cifar10-darknet.ipynb” notebook? It is not available in the dl2 folder.
If we have to build it ourselves from Jeremy’s lecture, even better.
It will be shared tomorrow (or today, depending on your time zone!).
Quick general question regarding Jeremy’s CIFAR-10 comment that rotation is not ideal with such a small dataset, and that a nicer approach is random flip + reflection padding + random/center crop. Do you think this is because CIFAR-10 does not have a clear “centered” object and a clear background? If I had a dataset of 32x32 images with a single centered object and a dark background all around it, would rotation (hence, interpolation) be harmful, considering that other augmentations (zoom/shifts) might be applied anyway?