How to autoencode your Pokémon

Hi everyone,

I’m learning a ton about deep learning from our course here, and a few other resources. This week I decided to heed @rachel’s advice and blog about it. I realized I’ve been putting this off for so long because I was afraid that my understanding of the topic was insufficient and that I might explain things wrong.

I think this might be a common problem among freshers in this area, so I was wondering if we could create a “Peer review for your articles” section where we can help each other out.

You can find my article here: https://medium.com/@niazangels/how-to-autoencode-your-pokémon-6b0f5c7b7d97

All suggestions are welcome :slight_smile:


Congratulations on blogging! I really like your post. I have a couple of minor suggestions:

  • Mention upfront that you’re a student in this course and that you’re interested in feedback. As well as hopefully getting useful feedback, you’re setting expectations appropriately and making it easier for us to promote your work.
  • In part 2 you’ll learn about GANs, which give much better results for this kind of thing. You may want to start looking into Wasserstein GANs to see how they work and what they can do.

PS: A minor nitpick, but your Mandarin characters are the “traditional” form only used in Hong Kong and Taiwan (and most expat communities); the billion+ people on the mainland have standardized on the “simplified” character set.


You should definitely check out convolutional autoencoders and GANs as Jeremy said.
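For reference, a convolutional autoencoder doesn’t need much code. Here’s a rough Keras sketch (the 64×64 input size and filter counts are just illustrative, not taken from your post):

```python
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

# Illustrative input size; adjust to whatever your images actually are.
inp = Input(shape=(64, 64, 3))

# Encoder: shrink spatially down to a small bottleneck.
x = Conv2D(32, (3, 3), activation='relu', padding='same')(inp)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2))(x)   # 16x16x16 bottleneck

# Decoder: mirror the encoder back up to the original resolution.
x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(inp, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
# autoencoder.fit(x_train, x_train, validation_data=(x_valid, x_valid), ...)
```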

My one critique is that with such a small dataset and so many parameters, you really need to have separate training and validation datasets. Your first layer has enough parameters to memorize 1024 Pokémon images, so if you didn’t split your dataset into training and validation sets, your model can basically cheat. Maybe you did this, but it wasn’t clear in the post.

You also need to make sure that you split by Pokémon and not by image (otherwise you’ll get similar images in the training and validation datasets).
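Something like this is what I mean by splitting by Pokémon (a rough sketch; `pokemon_id_for` is a stand-in for however you recover which Pokémon an image file belongs to):

```python
import random
from collections import defaultdict

def split_by_pokemon(image_paths, pokemon_id_for, valid_frac=0.2, seed=42):
    """Put whole Pokémon (not individual images) into either the
    training set or the validation set."""
    groups = defaultdict(list)
    for path in image_paths:
        groups[pokemon_id_for(path)].append(path)

    ids = sorted(groups)
    random.Random(seed).shuffle(ids)
    valid_ids = set(ids[:int(len(ids) * valid_frac)])

    train = [p for pid, paths in groups.items() if pid not in valid_ids for p in paths]
    valid = [p for pid, paths in groups.items() if pid in valid_ids for p in paths]
    return train, valid

# e.g. train_paths, valid_paths = split_by_pokemon(all_paths, lambda p: p.split('_')[0])
# (the lambda is hypothetical -- it depends on your filename convention)
```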


Thanks a lot @jeremy !
I’ll keep this in mind when I ask for feedback on my next post :slight_smile:
I’m still doing part 1, and hope to finish in time to catch up with you when part 2 is released.

P.S. Wow, I didn’t realize this. The simplified character shù sure looks complicated to me. I hope you don’t mind me sticking with the traditional version for simplicity!

That’s a wealth of information, @davecg. Thank you for sharing your time!
Here are my responses:

  1. I was under the impression it wouldn’t turn out well at all, and hence didn’t go for separate train/test/val sets. Going by your words, I clearly underestimated its capabilities. I’ll try it out with a split dataset and see how it goes.

  2. I guess I didn’t think this through, and picked a shuffled set of images instead of going by Pokémon.

  3. Will definitely be looking at GANs and convolutional autoencoders next, as both of you suggested.

This is valuable feedback!
Thank you so much!

What? It has the same stroke count (which is unusual; the simplified form normally has fewer strokes than the traditional one), but uses the much more standard “rice” radical in the top-left (which is the only difference): simplified 数 vs. traditional 數. Anyway, please use whatever you prefer, but I just figured I’d let you know. :slight_smile:

Back on topic… Regarding @davecg’s most useful caveat, you may want to look at this paper: https://arxiv.org/abs/1703.00573 . It shows how you can think about and measure the level of truly novel image generation your network is doing. It’s a bit heavy and theoretical in parts, but the key metrics and recommendations are actually fairly simple.

So with AEs or VAEs, is it possible to learn a specific feature and generate a new image with it? For example, if you train on a bunch of face images with a blond-hair attribute, could you take a new input image of someone with black hair and simply change the hair to blond without distorting the other facial features? I’ve worked on two git repos:


And in both cases, when applying, say, the “smile” vector, the rest of the face generates poorly. How can you apply just a learned vector without regenerating the entire image?
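For reference, what I’m doing is roughly the standard latent-space arithmetic below (a minimal sketch, not code from either repo; `encoder`, `decoder`, and the `predict` calls are stand-ins for whatever the trained model actually exposes):

```python
import numpy as np

# Assumed: a trained (V)AE split into `encoder` (image -> latent code)
# and `decoder` (latent code -> image). These names are placeholders,
# not the API of either repo.

def attribute_vector(encoder, images_with_attr, images_without_attr):
    """Difference of mean latent codes, e.g. mean(blond) - mean(not blond)."""
    z_with = encoder.predict(images_with_attr)
    z_without = encoder.predict(images_without_attr)
    return z_with.mean(axis=0) - z_without.mean(axis=0)

def apply_attribute(encoder, decoder, image, attr_vec, strength=1.0):
    """Encode one image, nudge its latent code along the attribute
    direction, then decode. Note the whole image is regenerated from
    the latent code, which is why unrelated features can degrade."""
    z = encoder.predict(image[None, ...])            # add a batch dimension
    return decoder.predict(z + strength * attr_vec)[0]
```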

Anyone have any advice?

Thanks so much!
Doug