@part2 : This week’s assignment to implement perceptual losses for style transfer is, in some ways, the trickiest we’ve had in any lesson! So, for those who want them, here are some tips. (And don’t forget to look at the lesson 9 wiki topic for links to more information.)
The top tip is this: start with the perceptual losses for super-resolution code from last week, which already works. This demonstrates most of the key pieces you’ll need, so try to modify this just a little at a time, and make sure it keeps working at each step.
Here are some more tips, hidden behind “spoilers” so you don’t have to look at them if you don’t need them yet.
First thing to try
One thing you could do as a next step from the working super-resolution code is to edit it so that it uses activations from multiple layers. This may well improve the super-resolution output too! Try to minimize your code changes, so that you get it working without having the chance to introduce many new bugs in the process.
For extra credit: give the different layers different weights.
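As a rough illustration of the per-layer weighting idea, here is a NumPy-only sketch of the arithmetic. The name `weighted_layer_loss` and the example weights are made up for illustration; in the notebook you would compute the same quantity with Keras backend ops on the activations of each chosen layer.

```python
import numpy as np

def weighted_layer_loss(pred_acts, targ_acts, weights):
    """Weighted sum of per-layer mean squared errors.

    pred_acts, targ_acts: lists of arrays, one per layer, each shaped
    (batch, height, width, channels). weights: one float per layer.
    Returns an array of shape (batch,), one loss value per image.
    """
    total = 0.0
    for pred, targ, w in zip(pred_acts, targ_acts, weights):
        # Mean squared difference over everything except the batch axis.
        total = total + w * np.mean((pred - targ) ** 2, axis=(1, 2, 3))
    return total

# Two hypothetical layers with different shapes and weights.
pred = [np.ones((2, 4, 4, 3)), np.zeros((2, 2, 2, 8))]
targ = [np.zeros((2, 4, 4, 3)), np.full((2, 2, 2, 8), 2.0)]
loss = weighted_layer_loss(pred, targ, [0.5, 0.25])
```

Setting all weights to the same value recovers the unweighted multi-layer loss, so you can start there and only tune the weights once everything trains.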
Avoid unnecessary complications
Don’t worry about the reflection padding or resnet-with-cropping (from the supplemental materials) until everything else is working. It will still work just fine if you use 'same' convolutions, which need no cropping and make things much easier.
Getting rid of the checkerboard effect
Here’s a block that avoids deconvolutions, and therefore gets rid of the checkerboard effect:
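    # Keras 1.x API, matching the rest of the course code.
    import keras
    from keras.layers import Convolution2D, BatchNormalization, Activation

    def up_block(x, filters, size):
        # Upsample by pixel repetition, then convolve, instead of a deconvolution.
        x = keras.layers.UpSampling2D()(x)
        x = Convolution2D(filters, size, size, border_mode='same')(x)
        x = BatchNormalization(mode=2)(x)
        return Activation('relu')(x)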
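To see what the upsampling step contributes, here is a small NumPy sketch of nearest-neighbour upsampling, which is what `UpSampling2D` does with its default settings (each pixel repeated into a 2x2 block). The function name `upsample_nearest` is just for illustration and assumes channels-last arrays.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Repeat each pixel `factor` times along height and width.

    x has shape (batch, height, width, channels).
    """
    x = np.repeat(x, factor, axis=1)     # stretch rows
    return np.repeat(x, factor, axis=2)  # stretch columns

x = np.arange(4.0).reshape(1, 2, 2, 1)
y = upsample_nearest(x)
```

Every input pixel is copied the same number of times, so the convolution that follows sees an evenly covered grid. That even coverage is the usual explanation for why this block avoids the checkerboard pattern.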
Speeding up style transfer training
You don’t want to keep your style transfer targets in RAM and copy them to the GPU every batch, but you also don’t want to recalculate them redundantly every batch. The secret to avoiding this? Pre-compute them, and then use K.variable to copy them to the GPU and keep them there:
style_targs = [K.variable(o) for o in vgg_content.predict(np.expand_dims(style,0))]
Batchwise gram matrix
You will, of course, want to be able to train a batch at a time. That means you’ll need a gram_matrix function that can handle a 4-d tensor of a batch of activations. Here’s a function that takes a batch of activations, and returns a batch of gram matrices:
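    from keras import backend as K

    def gram_matrix_b(x):
        # (batch, rows, cols, channels) -> (batch, channels, rows, cols)
        x = K.permute_dimensions(x, (0, 3, 1, 2))
        s = K.shape(x)
        # Flatten the spatial dimensions: (batch, channels, rows*cols)
        feat = K.reshape(x, (s[0], s[1], s[2]*s[3]))
        return K.batch_dot(feat, K.permute_dimensions(feat, (0, 2, 1))
                          ) / K.prod(K.cast(s[1:], K.floatx()))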
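When debugging a batchwise gram matrix, it can help to compare it against a plain NumPy version on a small random array. The sketch below is one such reference, assuming channels-last activations and the same normalisation (channels x rows x cols); `gram_matrix_np` is a hypothetical helper, not part of the course code.

```python
import numpy as np

def gram_matrix_np(x):
    """Batch of gram matrices for activations shaped (batch, h, w, channels)."""
    b, h, w, c = x.shape
    # One row per channel, one column per spatial position.
    feat = x.transpose(0, 3, 1, 2).reshape(b, c, h * w)
    # Channel-by-channel dot products, normalised by channels * h * w.
    return np.einsum('bip,bjp->bij', feat, feat) / (c * h * w)

acts = np.random.RandomState(0).rand(2, 3, 4, 5)
grams = gram_matrix_np(acts)  # one 5x5 gram matrix per image
```

Each image in the batch gets its own channels-by-channels matrix, so the result can be compared elementwise with the output of the Keras function evaluated on the same array.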