Hello! I’ve just launched the public repo for a project I’ve been obsessing over for the last two months called DeOldify. Yes, the name’s stupid, but I think the tech is really cool.
Gist is this:
This is a deep-learning-based model. More specifically, what I’ve done is combine the following approaches:
- Self-Attention Generative Adversarial Network (https://arxiv.org/abs/1805.08318). Except the generator is a pretrained U-Net, and I’ve just modified it to have the spectral normalization and self-attention. It’s a pretty straightforward translation. I’ll tell you what though- it made all the difference when I switched to this after trying desperately to get a Wasserstein GAN version to work. I liked the theory of Wasserstein GANs but it just didn’t pan out in practice. But I’m in love with Self-Attention GANs.
- Training structure inspired by (but not the same as) Progressive Growing of GANs (https://arxiv.org/abs/1710.10196). The difference here is that the number of layers remains constant- I just changed the size of the input progressively and adjusted learning rates to make sure that the transitions between sizes happened successfully. It seems to have the same basic end result- training is faster and more stable, and the model generalizes better.
- Two Time-Scale Update Rule (https://arxiv.org/abs/1706.08500). This is also very straightforward- it’s just one-to-one generator/critic iterations and a higher critic learning rate.
- Generator Loss is two parts: One is a basic Perceptual Loss (or Feature Loss) based on VGG16- this basically just biases the generator model to replicate the input image. The second of course is the loss score from the critic. For the curious- Perceptual Loss isn’t sufficient by itself to produce good results. It tends to just encourage a bunch of brown/green/blue- you know, cheating to the test, basically, which neural networks are really good at doing! Key thing to realize here is that GANs essentially are learning the loss function for you- which is really one big step closer toward the ideal that we’re shooting for in machine learning. And of course you generally get much better results when you get the machine to learn something you were previously hand-coding. That’s certainly the case here.
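For anyone who wants a rough picture of how the last three bullets fit together, here’s a minimal, framework-free sketch in plain Python. It is not the DeOldify code. The names (`Stage`, `make_schedule`, `update_order`, `generator_loss`) are hypothetical, and every number (sizes, learning rates, the critic learning-rate factor, the per-stage decay, the critic weight) is an assumption made up for illustration. The sign convention in `generator_loss` assumes a hinge-style critic, as in the Self-Attention GAN paper, where a higher critic score means "looks more real".

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    """One training phase at a fixed input size (all values illustrative)."""
    image_size: int      # square input size in pixels
    generator_lr: float  # generator learning rate
    critic_lr: float     # critic learning rate (kept higher than the generator's)
    iterations: int      # number of generator/critic iteration pairs


def make_schedule(sizes: Sequence[int],
                  base_lr: float = 1e-4,
                  critic_lr_factor: float = 5.0,
                  lr_decay: float = 0.5,
                  iterations: int = 1000) -> List[Stage]:
    """Build a progressive-resizing schedule.

    The network itself never changes; only the input size grows.
    Lowering the learning rate at each new size is an assumption here,
    one plausible way to keep the transition between sizes stable.
    """
    stages = []
    gen_lr = base_lr
    for size in sizes:
        stages.append(Stage(size, gen_lr, gen_lr * critic_lr_factor, iterations))
        gen_lr *= lr_decay
    return stages


def update_order(n_iterations: int) -> Iterator[Tuple[int, str]]:
    """Yield one critic update and one generator update per iteration (1:1)."""
    for step in range(n_iterations):
        yield step, "critic"
        yield step, "generator"


def generator_loss(feature_loss: float,
                   critic_score: float,
                   critic_weight: float = 1.0) -> float:
    """Combine the two parts of the generator loss.

    feature_loss: perceptual (VGG16 feature) distance to the target image.
    critic_score: critic output on the generated image (higher = more real),
                  so the generator minimizes its negative.
    """
    return feature_loss - critic_weight * critic_score


if __name__ == "__main__":
    for stage in make_schedule([64, 128, 256]):
        print(stage)
```

In a real training loop each `Stage` would resize the data to `image_size`, set the two optimizers’ learning rates, and then follow `update_order`, with the generator step minimizing something like `generator_loss`.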
Sample result images:
There are more examples and details at the repo. I plan on continuing work on this project for the foreseeable future, with the goal of making this super easy to use, more memory efficient, and adding more models as needed to make photos even better (for example, a DeFade model I’m working on concurrently).
Basically what I aimed to do with this whole project was to take what I learned in Parts I and II of the Fast.AI course and see just how far I could go with it. And this is the result! It’s funny- quite a few times I did think “Oh maybe Jeremy wasn’t quite right” so I’d try doing something a bit different, but I wound up learning the hard way that no- he was definitely right.