Part 2 Lesson 12 wiki

chunduri · April 17, 2018, 12:34pm

My understanding is that the 32x32 images were probably downscaled from larger originals, so any more complicated augmentations would lose image context.

I don’t think it is about the location of objects in the image, since this is not an object detection or localisation problem.

If you have good 32x32 images, maybe we can apply more transforms. It’s about what we might lose by applying transforms. My understanding rests entirely on the problem not being a detection/localization problem, and hence we can apply rotations/flips to good images of 32x32 size.

Interogativ · April 17, 2018, 2:52pm

I try to spend enough time to ‘understand’ then implement each new thing, but my implementations are never really deep enough, because writing and debugging good code takes days or weeks, not hours. Last week, for example, I spent Mon-Fri on the translate notebook, and only Sat-Mon on DeViSE. When it was suggested to use ‘Beam Search’ on translate, I ran out of time before I could explore it, so I’m working on that today. So far I’m spending on average 45-50 hours per week and don’t feel that I’m keeping up. This course is a firehose of information. I understand that trying to cover all this stuff in 7 weeks is at best difficult and at worst impossible, so @jeremy is doing the best he can with limited resources timewise. But the good news is you can always refer to the videos and the forums for help.

NitinP · April 17, 2018, 4:47pm

In case you want to download Jeremy’s 20% sample LSUN dataset from Kaggle on Google Colab and are wondering how to use the Kaggle API on Colab, this is a nice post to refer to.

jeremy · April 17, 2018, 5:41pm

Hopefully there’s enough material to keep you all amused until next year’s course. I don’t expect anyone to master all the material in these 7 weeks - just try to pick up the key bits each week and hopefully find some time to dig a bit more into one or two key areas that interest you now.

Ducky · April 17, 2018, 5:56pm

You might want to make that clear in your intro so that people who don’t have a study group don’t get discouraged and think they aren’t cut out for this.

Deb · April 17, 2018, 6:21pm

Depends on the target I guess … whether motel6 or marriott

stenio · April 17, 2018, 6:38pm

IMHO, this was the best lesson of Part 2 so far.

I’ve been playing with GANs in applications to Internet traffic generation. But I have a more general question. Is there any known approach to building a binary classifier that detects whether an input (e.g., audio, video, image) is real or comes from a GAN (fake)? For instance, how would you detect whether a bedroom image came from the DCGAN? Or whether a certain audio clip is really Obama’s voice?

Research papers or GitHub code are welcome.

jeremy · April 17, 2018, 6:44pm

I did say that in lesson 8 - but I’ll mention it on the lesson page too when we do the MOOC.

jeremy · April 17, 2018, 6:45pm

Just a standard binary classifier of the kinds we’ve used throughout the course should be fine.

sgugger · April 17, 2018, 8:22pm

I got the answer to yesterday’s question about tanh being the last activation of the DCGAN generator.
As usual, our images have been normalized, so their pixel values no longer lie between 0 and 1. This is why we want values going from -1 to 1; otherwise we wouldn’t be giving the discriminator a correct input.

jeremy · April 17, 2018, 8:29pm

If they’re normalized to have mean zero and std one, then tanh won’t be ideal, because it can’t create outputs with abs value >1. Wonder if we need to look at this again…
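To make the concern above concrete, here is a minimal sketch using only the standard library. The pixel values are made up for illustration; the point is just that standardising to mean zero and std one can produce values with absolute value above 1, which tanh can never output.

```python
import math
import statistics

# Hypothetical 8-bit pixel values for one channel (illustrative only).
pixels = [0, 10, 20, 30, 40, 50, 60, 255]

mean = statistics.mean(pixels)
std = statistics.pstdev(pixels)  # population standard deviation

# Standardise to mean 0, std 1 (the case discussed in the post above).
standardised = [(p - mean) / std for p in pixels]
largest = max(abs(z) for z in standardised)

# tanh is bounded by 1 in absolute value, whatever its input.
tanh_outputs = [math.tanh(z) for z in standardised]

print(f"largest |standardised value| = {largest:.2f}")
print(f"largest |tanh output|        = {max(abs(t) for t in tanh_outputs):.2f}")
```

So a generator ending in tanh could not reproduce the brightest pixel here if the real images were standardised this way.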

sgugger · April 17, 2018, 8:30pm

In the DCGAN paper they say they normalize their images so that pixel values range from -1 to 1. I’ll try that sometime this week to see if it yields better results.
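A rough sketch of that kind of scaling, assuming 8-bit input pixels. The function names `to_tanh_range` and `from_tanh_range` are hypothetical helpers for illustration, not part of fastai or the notebook.

```python
def to_tanh_range(pixel, max_value=255):
    """Map a pixel in [0, max_value] linearly onto [-1, 1]."""
    return 2.0 * pixel / max_value - 1.0


def from_tanh_range(value, max_value=255):
    """Inverse map: take a value in [-1, 1] back to [0, max_value]."""
    return (value + 1.0) / 2.0 * max_value


# The extremes of the pixel range land exactly on the extremes of tanh.
low, high = to_tanh_range(0), to_tanh_range(255)
round_trip = from_tanh_range(to_tanh_range(128))
```

For pixels already scaled to [0, 1], the same mapping is `(x - 0.5) / 0.5`, i.e. normalizing with mean 0.5 and std 0.5 per channel.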

sgugger · April 17, 2018, 8:39pm

Also, I believe there’s a small bug in the WGAN notebook where we define the ConvBlock and the DeconvBlock: the parameter bn is never used to test whether we should add the batchnorm layer or not. But then bn=False is passed twice (at the places indicated in the DCGAN article).

Don’t know if I should do a PR or if you plan to change this notebook soon, Jeremy.
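For readers following along, one way a block could honour the bn flag is sketched below in PyTorch. This is an assumption about the intended behaviour, not the notebook’s actual code; the default kernel size, stride and padding are illustrative choices.

```python
import torch
from torch import nn


class ConvBlock(nn.Module):
    """Conv -> LeakyReLU -> optional BatchNorm (illustrative sketch only)."""

    def __init__(self, ni, no, ks=4, stride=2, pad=1, bn=True):
        super().__init__()
        self.conv = nn.Conv2d(ni, no, ks, stride, padding=pad, bias=False)
        # Only create the batchnorm layer when the flag asks for it.
        self.bn = nn.BatchNorm2d(no) if bn else None
        self.relu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        x = self.relu(self.conv(x))
        # Skip batchnorm entirely when it was disabled at construction.
        return self.bn(x) if self.bn is not None else x


with_bn = ConvBlock(3, 8, bn=True)
without_bn = ConvBlock(3, 8, bn=False)
out = without_bn(torch.randn(2, 3, 32, 32))  # halves the spatial size
```

Checking `self.bn is not None` rather than relying on truthiness keeps the intent explicit.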

jeremy · April 17, 2018, 8:58pm

I’ll take a look now.

jeremy · April 17, 2018, 9:29pm

OK double-checked and I am doing that already.

sgugger · April 17, 2018, 9:42pm

Oh, yes indeed. I thought inception_stats were some numbers like the imagenet stats or the cifar10 stats, but they normalize exactly to the range -1 to 1.

erinjerri · April 17, 2018, 11:33pm

I’m so biased because GANs and creative AI are like my favorite, even though all things CV (lectures heavy on the MOOC) are like a gateway drug for me to get more into AI (like all of CS231N and image segmentation by Andrej Karpathy).

renato · April 18, 2018, 1:00am

Idk if this was already posted: http://videolectures.net/deeplearning2017_courville_generative_models/
It’s a very good video for understanding WGAN and the Improved WGAN.

blakewest · April 18, 2018, 3:36am

Hey, has anyone downloaded the kaggle dataset on their cloud machine? I’m trying to use the kaggle-cli but I’m coming up empty for some reason. Specifically, I’ve done kg dataset -o jhoward -d lsun_bedroom, which according to the docs, should work in this case. I’m logged in through kg’s global config, but when I run the above command, it pretty much immediately finishes and nothing happens. No message. No files in my folder. I’m not really sure what to do. I could try to curl it I suppose, and add authentication through curl, which is kind of a hassle, but how have other people done this?

jeremy · April 18, 2018, 4:05am

Try the cliget firefox extension.