@ericm and @justinho, I believe I solved the problem. It is most likely a Keras 2 issue: the dot multiplication was not summing over the x, y dimensions of the matrix, and calling the backend to perform an explicit sum fixed it. Here are the two updated loss functions (they are more general than the ones in the nb and can handle layer lists directly). Using the variables as in the nb, they can be called as:
```python
c_loss = content_loss(layer, targ)
s_loss = style_loss(layers, targs, style_wgts)
```
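To see why the explicit sum is needed: in Keras 2, `metrics.mse` averages only over the last axis, so on a feature map it returns a 2-D per-pixel map rather than a scalar. A minimal numpy sketch of that behavior (the `mse_last_axis` helper below just mimics what `metrics.mse` computes):

```python
import numpy as np

def mse_last_axis(y_true, y_pred):
    # What Keras 2's metrics.mse does: mean over the LAST axis only,
    # leaving all other axes intact.
    return np.mean(np.square(y_pred - y_true), axis=-1)

a = np.zeros((4, 4, 3))  # stand-in for a (height, width, channels) feature map
b = np.ones((4, 4, 3))

per_pixel = mse_last_axis(a, b)
print(per_pixel.shape)  # (4, 4) -- still a 2-D map, not a scalar
print(per_pixel.sum())  # 16.0  -- an explicit sum collapses it to a scalar loss
```

This is why the loss functions below wrap `metrics.mse` in `K.sum`: without it, the "loss" is not a scalar and the gradients are not what you expect.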
```python
from keras import backend as K
from keras import metrics

def content_loss(computed, target, weight_ls=None):
    if isinstance(computed, list):
        if not weight_ls:
            weight_ls = [1.0 for layer in computed]
        #end
        c_loss = sum([K.sum(metrics.mse(comp[0], targ[0])) * w
                      for comp, targ, w in zip(computed, target, weight_ls)])
        _, height, width, channels = K.int_shape(computed[0])
    else:
        c_loss = K.sum(metrics.mse(computed, target))
        _, height, width, channels = K.int_shape(computed)
    #end
    c_loss = c_loss #/ (height * width * channels)
    return c_loss
#end
```
```python
def style_loss(computed, target, weight_ls=None):
    if isinstance(computed, list):
        if not weight_ls:
            weight_ls = [1.0 for layer in computed]
        #end
        s_loss = sum([K.sum(metrics.mse(gram_matrix(comp[0]), gram_matrix(targ[0]))) * w
                      for comp, targ, w in zip(computed, target, weight_ls)])
        _, height, width, channels = K.int_shape(computed[0])
    else:
        s_loss = K.sum(metrics.mse(gram_matrix(computed), gram_matrix(target)))
        _, height, width, channels = K.int_shape(computed)
    #end
    s_loss = s_loss #/ (height * width * channels)
    return s_loss
#end
```
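Note that `style_loss` relies on the `gram_matrix` helper defined in the nb. For anyone reading along without it, here is a numpy sketch of the standard computation it performs (flatten the spatial dimensions, then take the channel-by-channel dot products; the Keras version does the same with `K.batch_flatten` and `K.dot`, and normalization conventions vary):

```python
import numpy as np

def gram_matrix_np(x):
    # x: a single feature map of shape (height, width, channels).
    # Each channel becomes one row of `features`, so the product below
    # gives the (channels, channels) matrix of channel correlations.
    features = x.reshape(-1, x.shape[-1]).T  # (channels, height*width)
    return features @ features.T             # (channels, channels)

x = np.random.rand(4, 4, 3)
g = gram_matrix_np(x)
print(g.shape)  # (3, 3)
```

The Gram matrix discards spatial layout and keeps only which channels co-activate, which is exactly what makes it a useful summary of "style".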
I hope this helps.
For those interested, I wrote a blog post on style transfer - with credit and links to the MOOC, of course - here.