Part 2 Lesson 12 wiki

My understanding is that the 32x32 images would have been downscaled from some larger size, so any more complicated augmentations would lead to loss of image context.

I don’t think it is about the location of objects in the image, since this is not an object detection or localisation problem.

If you have good 32x32 images, maybe we can apply more transforms. It’s about what we might lose by applying transforms. My understanding rests entirely on this not being a detection/localisation problem, and hence we can apply rotations/flips on good 32x32 images.

I try to spend enough time to ‘understand’ and then implement each new thing, but my implementations are never really deep enough, because writing and debugging good code takes days or weeks, not hours. Last week, for example, I spent Mon-Fri on the translate notebook and only Sat-Mon on DeViSE. When it was suggested to use ‘beam search’ on translate, I ran out of time before I could explore it, so I’m working on that today. So far I’m spending on average 45-50 hours per week and don’t feel that I’m keeping up. This course is a firehose of information. I understand that trying to cover all this stuff in 7 weeks is at best difficult and at worst impossible, so @jeremy is doing the best he can with the limited time available. But the good news is you can always refer to the videos and the forums for help.


In case you want to download Jeremy’s 20% sample LSUN dataset from Kaggle on Google Colab and are wondering how to use the Kaggle API on Colab, this is a nice post to refer to.


Hopefully there’s enough material to keep you all amused until next year’s course :slight_smile: I don’t expect anyone to master all the material in these 7 weeks - just try to pick up the key bits each week and hopefully find some time to dig a bit more into one or two key areas that interest you now.


You might want to make that clear in your intro so that people who don’t have a study group don’t get discouraged and think they aren’t cut out for this. :wink:


Depends on the target I guess … whether Motel 6 or Marriott :wink:


IMHO, this was the best lesson of Part 2 so far.

I’ve been playing with GANs applied to Internet traffic generation. But I have a more general question: is there any known approach to building a binary classifier that detects whether an input (e.g., audio, video, image) is real or comes from a GAN (i.e., is fake)? For instance, how would you detect whether the bedroom image came from the DCGAN, or whether a certain audio clip is really Obama’s voice?

Research papers or GitHub code are welcome :slight_smile:


I did say that in lesson 8 - but I’ll mention it on the lesson page too when we do the MOOC.


Just a standard binary classifier of the kinds we’ve used throughout the course should be fine.
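To make that concrete, here is a minimal sketch of such a real-vs-fake classifier, assuming PyTorch (the class name, layer sizes, and 64x64 input are my own illustration, not anything from the course notebooks):

```python
import torch
import torch.nn as nn

class RealFakeClassifier(nn.Module):
    """Small CNN that outputs a single logit: real (>0) vs. GAN-generated (<0)."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1),  # 64 -> 32
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),           # 32 -> 16
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),          # 16 -> 8
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),                             # 8x8 -> 1x1
        )
        self.head = nn.Linear(128, 1)

    def forward(self, x):
        f = self.features(x)
        return self.head(f.flatten(1))  # raw logit; pair with BCEWithLogitsLoss

model = RealFakeClassifier()
logits = model(torch.randn(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 1])
```

You would then train it exactly like any other binary classifier, with real images labeled 1 and generator samples labeled 0.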

I got the answer to yesterday’s tanh question about the last activation of the DCGAN generator.
As usual, our images have been normalized, so their pixel values no longer lie between 0 and 1. This is why we want the generator to output values from -1 to 1; otherwise it wouldn’t produce a correct input for the discriminator.
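A quick numeric check of this (my own illustration, plain Python): normalizing with mean 0.5 and std 0.5 maps pixels in [0, 1] exactly onto [-1, 1], which is precisely the range a tanh output layer can cover:

```python
# raw pixel values in [0, 1] (i.e., after dividing by 255)
pixels = [0.0, 0.25, 0.5, 0.75, 1.0]
mean, std = 0.5, 0.5                 # stats that map [0, 1] onto [-1, 1]
normalized = [(p - mean) / std for p in pixels]
print(normalized)                    # [-1.0, -0.5, 0.0, 0.5, 1.0]
```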


If they’re normalized to have mean zero and std one, then tanh won’t be ideal, because it can’t create outputs with abs value >1. Wonder if we need to look at this again…

In the DCGAN paper they say they normalize their images to a pixel range of -1 to 1. I’ll try that sometime this week to see if it yields better results.
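A quick check of the mismatch described above (my own illustration, stdlib only): with ImageNet-style stats (mean ≈ 0.485, std ≈ 0.229 for the red channel), a pure-white or pure-black pixel normalizes to a value well outside (-1, 1), which a tanh generator can never reach:

```python
import math

# approximate ImageNet red-channel stats
imagenet_mean, imagenet_std = 0.485, 0.229

# pure-white and pure-black pixels after ImageNet normalization
white = (1.0 - imagenet_mean) / imagenet_std   # ~2.25
black = (0.0 - imagenet_mean) / imagenet_std   # ~-2.12

# tanh is bounded in (-1, 1), so these extremes are unreachable for the generator
print(round(white, 2), round(black, 2))
```

With -1-to-1 normalization (mean 0.5, std 0.5) this problem disappears, since tanh’s range then matches the data range exactly.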

Also, I believe there’s a small bug in the WGAN notebook where we define the ConvBlock and the DeconvBlock: the parameter bn is never used to decide whether batchnorm should be added. Yet bn=False is passed twice (at the places indicated in the DCGAN paper).

Don’t know if I should do a PR or if you plan to change this notebook soon, Jeremy.
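For illustration, a fixed ConvBlock that actually honors the bn flag might look like this (a sketch, assuming PyTorch; the class name mirrors the notebook, but the exact layer parameters here are my own):

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv -> (optional BatchNorm) -> LeakyReLU, honoring the bn flag."""
    def __init__(self, ni, no, ks, stride, bn=True, pad=None):
        super().__init__()
        if pad is None:
            pad = ks // 2
        layers = [nn.Conv2d(ni, no, ks, stride, padding=pad, bias=not bn)]
        if bn:                       # the original notebook ignored this flag
            layers.append(nn.BatchNorm2d(no))
        layers.append(nn.LeakyReLU(0.2, inplace=True))
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)
```

With this version, passing bn=False at the two places the DCGAN paper indicates (the first layer of the discriminator and the last layer of the generator) actually removes the batchnorm layer.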

I’ll take a look now.

OK double-checked and I am doing that already.

Oh, yes indeed. I thought inception_stats were numbers like the ImageNet stats or the CIFAR-10 stats, but they normalize exactly to the -1 to 1 range.

I’m so biased because GANs and creative AI are my favorite, even though all things CV (which the lectures in the MOOC are heavy on) were a gateway drug for me into AI (like all of CS231n and image segmentation by Andrej Karpathy).

I don’t know if this was already posted: http://videolectures.net/deeplearning2017_courville_generative_models/
It’s a very good video for understanding WGAN and the improved WGAN.


Hey, has anyone downloaded the Kaggle dataset on their cloud machine? I’m trying to use kaggle-cli but I’m coming up empty for some reason. Specifically, I’ve run kg dataset -o jhoward -d lsun_bedroom, which according to the docs should work in this case. I’m logged in through kg’s global config, but when I run the above command, it finishes almost immediately and nothing happens: no message, no files in my folder. I’m not really sure what to do. I could try to curl it, I suppose, and add authentication through curl, which is kind of a hassle, but how have other people done this?

Try the cliget firefox extension.