Lesson 8 Discussion

Keras in the current example uses TensorFlow graphs. When you ask Keras to output a particular variable, it uses the information in the graph to calculate the output. Say you pass a batch of dog and cat images through a network architecture and calculate the following:

  1. Predicted labels.
  2. Loss = loss_fn(predicted_labels, actual_labels)
  3. Gradients = Keras automatically calculates the gradients by which the weights have to change to reduce the loss.

In our example, we have created variables for the loss and the gradients, which contain the graph describing how to calculate them. We use K.function() to create a new function that takes inputs, applies the network, and outputs the loss and gradients of the network.


For Neural Style Transfer, I am able to get the content extraction working in Keras 2 but not the style extraction. I end up with a random image, even though the loss gets progressively reduced.

Can anyone point to any Keras 2 code that I can try? I cannot see what mistake I am making… Please advise.


# I am using VGG16 with max pooling, not average pooling. Should work, according to the forums.
vgg16_style=keras.applications.vgg16.VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(224,224,3))

style_layers=[vgg16_style.get_layer(name='block{}_conv1'.format(o)).output for o in range(1,3)] 
style_layer_model = Model(vgg16_style.input, style_layers)
style_targets=[K.variable(o) for o in style_layer_model.predict(starry.reshape(shp))]

# Same as Jeremy's code
def gram_matrix(x):
    # We want each row to be a channel, and the columns to be flattened x,y locations
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    # The dot product of this with its transpose shows the correlation
    # between each pair of channels
    return K.dot(features, K.transpose(features)) / x.get_shape().num_elements()

# Using K.mean to get a single number loss
def style_loss(x, targ):
    return K.mean(metrics.mse(gram_matrix(x), gram_matrix(targ)))

sloss = sum([style_loss(style_layer[0], style_target[0]) for style_layer, style_target in zip(style_layers, style_targets)])
sgrads = K.gradients(sloss, style_layer_model.input)
style_fn = K.function([style_layer_model.input], [sloss]+sgrads)
evaluator = Evaluator(style_fn, shp)

rand_img = lambda shape: np.random.uniform(-2.5, 2.5, shape)/1

x = rand_img(shp)

x = solve_image(evaluator, iterations, x)

We use [0] just to keep the shapes correct.

For example, the output of VGG16 block5_conv1 is (1, 14, 14, 512), whereas the loss function needs a (14, 14, 512) shape.

If you use l1 directly, the shapes won’t match and you may get an error. Instead, use l1[0], which aligns properly for the loss calculation.

Also, you may have to use K.mean because you have two layers of different dimensions, and you need to add the losses from each layer. K.mean converts the loss from each layer to a single number, so you can add them up (in that code loop).
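To make the shape argument concrete, here is a minimal NumPy sketch (not the Keras code from the thread). It assumes that metrics.mse averages over the last axis only, which is how I understand Keras 2 to behave; the layer sizes and the helper name mse_last_axis are made up for illustration.

```python
import numpy as np

def gram_matrix(x):
    # x has shape (height, width, channels): no batch axis.
    channels = x.shape[-1]
    # One row per channel, columns are the flattened x,y locations.
    features = x.transpose(2, 0, 1).reshape(channels, -1)
    return features @ features.T / x.size

def mse_last_axis(a, b):
    # Assumption: averages over the last axis only, so (C, C) in gives (C,) out.
    return np.mean((a - b) ** 2, axis=-1)

def style_loss(x, targ):
    # The outer mean collapses the per-row values to a single number.
    return np.mean(mse_last_axis(gram_matrix(x), gram_matrix(targ)))

rng = np.random.default_rng(0)
# Hypothetical stand-ins for two layer outputs with a leading batch axis of 1.
layer_outputs = [rng.normal(size=(1, 8, 8, 64)), rng.normal(size=(1, 4, 4, 128))]
layer_targets = [rng.normal(size=(1, 8, 8, 64)), rng.normal(size=(1, 4, 4, 128))]

# Without the outer mean, each layer yields a vector of a different length.
per_row = [mse_last_axis(gram_matrix(o[0]), gram_matrix(t[0]))
           for o, t in zip(layer_outputs, layer_targets)]
row_shapes = [p.shape for p in per_row]

# With the outer mean, each layer yields a scalar, so the sum is well defined.
total = sum(style_loss(o[0], t[0]) for o, t in zip(layer_outputs, layer_targets))
print(row_shapes, float(total))
```

The [0] drops the batch axis before the Gram matrix is built, and the outer mean is what lets losses from layers with 64 and 128 channels be added together.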


I am getting the same error, also with Keras 2 (TensorFlow 1.3)

Have you tried using K.mean() within the style_loss function?

return K.mean(metrics.mse(gram_matrix(x), gram_matrix(targ)))

instead of

 return metrics.mse(gram_matrix(x), gram_matrix(targ))

That should take care of the dimension incompatibility error, by reducing it to a single number


That worked, thanks! So the semantics of metrics.mse changed in Keras 2, it seems. Odd, I didn’t notice it when I read the release notes.

Do let me know if you get the style extraction part working in Keras 2.0. I am struggling with this and can’t see what I am doing wrong :frowning:

Will do! Are you getting errors?

Nope. The loss function drops from 10,000 to 20 over 10 iterations. But the random image input still looks random; I can’t see the style being extracted.

Thank you!

You can use VGG-16 directly from Keras.

Here is an example of how to use it with TensorFlow.

Please let me know if you have any questions.


Note: the complete collection of Part 2 video timelines is available in a single thread for keyword search.

I made a timeline of the Lesson 8 video, as I found timelines very practical in the Part 1 wiki.
There are many links; there was probably a lot of “happy noise” in the first class :upside_down:
I expect future timelines to be shorter.

Lesson 8 video timeline:

0:00 : Intro and review of Part 1

08:00 : moving to Python 3

10:30 : moving to Tensorflow and TF Dev Summit videos

22:15 : moving to PyTorch

27:30 : from Part 1 “best practices” to Part 2 “new directions”

31:40 : time to build your own box

36:20 : time to start reading papers

39:30 : time to start writing about your work in this course

41:30 : what we’ll study in Part 2

40:40 : artistic style (or neural style) transfer

52:10 : neural style notebook

54:15 : Mendeley Desktop

56:15 : arXiv-Sanity.com

59:00 : Jeremy on twitter.com and reddit.com/r/MachineLearning/

1:01:15 : neural style notebook (continued)

1:04:05 : broadcasting, APL as “A Programming Language”, and Jsoftware

1:07:15 : broadcasting with Keras

1:12:00 : recreate input with a VGG model

1:22:45 : optimize the loss function with a deterministic approach

1:33:25 : visualize the iterations through a short video

1:37:30 : recreate a style

1:44:05 : transfer a style


Hi all! Just wanted to point out a few issues that I had with the first part of Lesson 8.

Task: get the bird out of noise (that is, content transfer) using only TensorFlow 1.2 and its contrib.keras, without standalone Keras. (AFAIK from the TF Dev videos, standalone Keras will be deprecated some time in the future.)

I had to replace the imports in vgg_16_average.py and hand-code the loss as keras.backend.mean(keras.backend.square(content_model.output - content_target)). Other than that, most of the “porting” is straightforward and/or easily solved with a bit of doc surfing.

However, I fought a lot with a nasty bug (feature) in scikit-image. I used skimage.transform.rescale() to get my image size down. By default, this function also scales your image to [0,1]. It is a VERY nasty feature: nothing worked. I got my MSE down, but the output image was pure noise. It took me a while to track down this issue. So, use the preserve_range=True parameter.
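A small NumPy sketch of why the silent rescale hurts. It does not call scikit-image; dividing by 255 stands in for the [0,1] scaling described above. The mean values and the RGB-to-BGR flip are the usual VGG preprocessing as I understand it, so treat them as assumptions.

```python
import numpy as np

# Assumed per-channel ImageNet means (RGB order) used for VGG preprocessing.
VGG_MEAN = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def preprocess(img):
    # Subtract the channel means, then flip RGB -> BGR.
    return (img - VGG_MEAN)[..., ::-1]

rng = np.random.default_rng(0)
img_255 = rng.uniform(0, 255, size=(224, 224, 3)).astype(np.float32)
img_01 = img_255 / 255.0   # stand-in for an image silently rescaled to [0, 1]

ok = preprocess(img_255)
bad = preprocess(img_01)

# The correctly ranged image is centered around zero with a wide spread.
print("expected range:", ok.min(), ok.max())
# The rescaled image collapses to an almost constant negative value per channel.
print("rescaled range:", bad.min(), bad.max())
```

After mean subtraction the rescaled image carries almost no contrast, which would fit the symptom of a falling MSE with a noisy output.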

Lessons learned:

  • LL1: VGG is VERY sensitive both to the centering of the values (subtracting the mean) and to the “standard deviation” of the input. It expects a range of roughly -128 to 128.
  • LL2: The Evaluator class is needed so you don’t have to run the network twice (once for the loss and again for the gradients). If somebody has a more elegant (Pythonic) way of doing this, please post!
  • LL3: I accidentally noticed that the range of the initial random image does not matter much. Smaller values give smoother images, and larger values (close to the input dynamic range) yield interesting images.
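On LL2, here is a rough sketch of the caching idea, modeled loosely on the Evaluator from the lesson rather than copied from it. The function toy_loss_and_grads is a hypothetical stand-in for the K.function that returns the loss and gradients together, and the loop is a stand-in for an optimizer that asks for the two values through separate callbacks.

```python
import numpy as np

class Evaluator:
    """Caches gradients so one network evaluation serves two calls."""

    def __init__(self, fn, shape):
        self.fn = fn          # returns (loss, grads) in a single evaluation
        self.shape = shape
        self.grad_values = None

    def loss(self, x):
        loss_value, grads = self.fn(x.reshape(self.shape))
        # Keep the gradients for the grads() call that follows.
        self.grad_values = np.asarray(grads, dtype=np.float64).flatten()
        return float(loss_value)

    def grads(self, x):
        # No second evaluation: return what loss() already computed.
        return self.grad_values.copy()

calls = {"n": 0}

def toy_loss_and_grads(x):
    # Hypothetical stand-in for the K.function output: loss = sum(x^2).
    calls["n"] += 1
    return np.sum(x ** 2), 2 * x

evaluator = Evaluator(toy_loss_and_grads, (2, 3))
x = np.ones(6)
for _ in range(5):
    # The optimizer requests loss and gradient separately at the same point.
    loss_value = evaluator.loss(x)
    x = x - 0.1 * evaluator.grads(x)

print(calls["n"], loss_value)
```

With two separate functions, each step would evaluate the network twice; here the counter shows one evaluation per step.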

Huh, time for style transfer!


Why did we never preprocess our data when we used VGG16 in Part 1, and why is it necessary now?

Why did we need to make one K.function for both the loss and the gradients and then separate them?
Why couldn’t we use two K.functions, one for the loss and one for the gradients?


K.mean should be lower than K.sum because mean divides the sum by the number of elements. So the lower loss isn’t meaningful; it’s just scaled differently.
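A quick NumPy check of the scaling point, using a random array as a stand-in for the difference of two Gram matrices (the size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
diff = rng.normal(size=(64, 64))   # stand-in for a difference of Gram matrices

loss_sum = np.sum(diff ** 2)
loss_mean = np.mean(diff ** 2)

# The two losses differ only by the element count, a constant factor.
ratio = loss_sum / loss_mean
print(ratio, diff.size)
```

Because the factor is constant, the gradients point in the same direction; only their magnitude changes.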

Worked for me too, thanks!

In the Super Resolution Network part of neural-style.ipynb, it appears that a bunch of code has been added since the video. For instance, there is now an up_block defined that makes use of the Keras UpSampling2D() layer. The deconv_block is defined but no longer used in creating the up-sampled network. I cannot tell from the Keras documentation whether the UpSampling2D() layer followed by Convolution2D() layers does the same thing as a Deconvolution2D() layer.

I think it would be helpful for @jeremy to print the Keras version at the start of the notebook, so that we save time debugging issues caused by Keras version changes.
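For what it is worth, my reading of the Keras docs is that UpSampling2D() simply repeats rows and columns and has no trainable weights, so any learning happens in the convolution that follows, whereas a deconvolution layer learns the upsampling itself. That would make the two approaches similar in role but not identical. A NumPy sketch of the repeat step only (the function name upsample_nearest is mine, not a Keras API):

```python
import numpy as np

def upsample_nearest(x, size=(2, 2)):
    # x has shape (height, width, channels); repeat rows, then columns.
    return np.repeat(np.repeat(x, size[0], axis=0), size[1], axis=1)

x = np.arange(4, dtype=np.float32).reshape(2, 2, 1)
up = upsample_nearest(x)

# Each input pixel becomes a 2x2 block of identical values.
print(up[..., 0])
```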