Part 2 Lesson 12 wiki

You might want to make that clear in your intro so that people who don’t have a study group don’t get discouraged and think they aren’t cut out for this. :wink:

3 Likes

Depends on the target I guess … whether motel6 or marriott :wink:

1 Like

IMHO, this was the best lesson of Part 2 so far.

I’ve been playing with GANs applied to Internet traffic generation. But I have a more general question: is there any known approach to building a binary classifier that detects whether an input (e.g., audio, video, image) is real or comes from a GAN (fake)? For instance, how would you detect that a bedroom image came from the DCGAN, or that a certain audio clip is really Obama’s voice?

Research papers or Github codes are welcome :slight_smile:

1 Like

I did say that in lesson 8 - but I’ll mention it on the lesson page too when we do the MOOC.

3 Likes

Just a standard binary classifier of the kinds we’ve used throughout the course should be fine.
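To make the idea concrete, here is a minimal sketch of such a real-vs-fake binary classifier. It is not from the lesson notebooks: it uses plain logistic regression on synthetic stand-in "pixel" data (the distribution shift between the two classes is invented for illustration). In practice you would use a convnet like the course's classifiers, with real photos labelled 1 and GAN samples labelled 0.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_batch(n, fake):
    # Hypothetical stand-in data: "fake" images get a slightly shifted
    # pixel distribution so the toy classifier has something to learn.
    return rng.normal(0.7 if fake else 0.5, 0.1, size=(n, 8 * 8))

# 1 = real, 0 = fake
X = np.vstack([make_batch(200, fake=False), make_batch(200, fake=True)])
y = np.concatenate([np.ones(200), np.zeros(200)])

# Logistic regression trained by plain gradient descent.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))  # sigmoid of the logits
    grad = p - y                        # dBCE/dlogit
    w -= 0.1 * X.T @ grad / len(y)
    b -= 0.1 * grad.mean()

pred = 1 / (1 + np.exp(-(X @ w + b))) > 0.5
acc = (pred == y).mean()
```

On real data the features would be learned by convolutional layers rather than hand-flattened pixels, but the training loop and the real/fake labelling are the same idea.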

I got the answer to yesterday’s tanh question about the last activation of the DCGAN generator.
As usual, our images have been normalized, so their pixel values no longer lie between 0 and 1. That’s why we want values ranging from -1 to 1; otherwise we wouldn’t be feeding a correct input to the discriminator.

3 Likes

If they’re normalized to have mean zero and std one, then tanh won’t be ideal, because it can’t create outputs with abs value >1. Wonder if we need to look at this again…

In the DCGAN paper they say they normalize their images to have pixel values in the range -1 to 1. I’ll try that sometime this week to see if it yields better results.

Also, I believe there’s a small bug in the WGAN notebook where we define ConvBlock and DeconvBlock: the bn parameter is never used to decide whether batchnorm should be added. Yet bn=False is passed twice (at the places indicated in the DCGAN article).

Don’t know if I should do a PR or if you plan to change this notebook soon, Jeremy.
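For what it’s worth, if the block looks roughly like the sketch below (a guess at its shape, not the notebook’s exact code), one minimal fix is to only build the BatchNorm layer when bn=True:

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv -> LeakyReLU -> (optional) BatchNorm, honouring the `bn` flag."""
    def __init__(self, ni, no, ks, stride, bn=True, pad=None):
        super().__init__()
        if pad is None:
            pad = ks // 2 // stride
        self.conv = nn.Conv2d(ni, no, ks, stride, padding=pad, bias=False)
        # The fix: only create the BatchNorm layer when bn=True.
        self.bn = nn.BatchNorm2d(no) if bn else None
        self.relu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        x = self.relu(self.conv(x))
        return self.bn(x) if self.bn is not None else x
```

The same `if bn else None` guard would apply to DeconvBlock, just with nn.ConvTranspose2d in place of nn.Conv2d.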

I’ll take a look now.

OK double-checked and I am doing that already.

Oh, yes indeed. I thought inception_stats were some numbers like the ImageNet stats or the CIFAR-10 stats, but they normalize exactly to the range -1 to 1.

I’m so biased because GANs and creative AI are my favorite, even though all things CV (which the MOOC lectures are heavy on) were the gateway drug that got me further into AI (like all of CS231n and image segmentation by Andrej Karpathy).

I don’t know if this was already posted: http://videolectures.net/deeplearning2017_courville_generative_models/
It’s a very good video for understanding WGAN and the Improved WGAN.

3 Likes

Hey, has anyone downloaded the Kaggle dataset to their cloud machine? I’m trying to use kaggle-cli but coming up empty for some reason. Specifically, I’ve run kg dataset -o jhoward -d lsun_bedroom, which according to the docs should work in this case. I’m logged in through kg’s global config, but when I run the above command it finishes almost immediately and nothing happens: no message, no files in my folder. I’m not really sure what to do. I could try to curl it and add authentication through curl, which is kind of a hassle, but how have other people done this?

Try the cliget firefox extension.

That worked, thanks!

Does anybody know why I am getting the wrong number for the trn_dl length? I am getting 293, but when I check the dataset length it is 37469, which is the correct value. The resize operation is also resizing the correct number of files.

The length of trn_dl is equal to the number of batches in one epoch. You can verify this is correct by dividing 37469 by your batch size (128) and rounding up.
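The arithmetic checks out (using the numbers from the posts above):

```python
import math

# A DataLoader's length counts batches, not images: with 37,469 images
# and a batch size of 128, one epoch is ceil(37469 / 128) batches.
n_images, batch_size = 37469, 128
n_batches = math.ceil(n_images / batch_size)
```

So 293 is the expected value, and every image is still seen once per epoch.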

Thanks, I thought when I call train(1, False) it was going to use the whole dataset. Right now it seems to be using only 293 images. So how do I train on the full dataset?