Lesson 8 Discussion


(Romano) #82

I ran into the same problem. Additionally, when I try to run the part that reconstructs the input, the loss function returns an array whose dimensions are those of the convolutional block I am using. It looks like it evaluates the MSE across channels, but not across the actual image matrix. Are you seeing the same thing? I believe this is ultimately the reason for the dimension mismatch (the style loss tries to sum losses from different layers, and they have a different number of channels).

I suspect that it has to do with Keras 2, but I have been unable to locate the issue yet. Any input will be greatly appreciated.
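The shape behaviour described above can be reproduced without Keras. The sketch below is only an illustration: it assumes that the Keras 2 `mse` metric averages over the last axis only, and it uses made-up array sizes (64 and 128 channels) rather than the actual notebook tensors.

```python
import numpy as np

def mse_last_axis(a, b):
    # Assumed to mirror Keras 2 mean_squared_error: mean over the last axis only.
    return np.mean(np.square(a - b), axis=-1)

rng = np.random.default_rng(0)

# Hypothetical activations of shape (batch, height, width, channels).
act_a = rng.normal(size=(1, 8, 8, 64))
act_b = rng.normal(size=(1, 8, 8, 64))
per_pixel = mse_last_axis(act_a, act_b)  # one value per pixel, not a scalar

# Hypothetical Gram matrices for two layers with 64 and 128 channels.
loss_64 = mse_last_axis(rng.normal(size=(64, 64)), rng.normal(size=(64, 64)))
loss_128 = mse_last_axis(rng.normal(size=(128, 128)), rng.normal(size=(128, 128)))

try:
    loss_64 + loss_128  # shapes (64,) and (128,) cannot be broadcast together
    mismatch = False
except ValueError:
    mismatch = True

# Reducing each layer's loss to a scalar first makes the sum well defined.
total = loss_64.sum() + loss_128.sum()
```

Summing (or averaging) each per-layer loss before adding the layers together is what removes the mismatch in this toy setting.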


(Romano) #83

@ericm and @justinho, I believe I solved the problem. It is most likely a Keras 2 issue, due to the fact that the dot multiplication was not summing over the x, y of the matrix. Calling the backend and performing an explicit sum solved it. Here are the two updated loss functions (they are more general than the ones in the nb and able to handle layer lists directly). They can be called, using the variables as in the nb, as:

c_loss = content_loss(layer, targ)
s_loss = style_loss(layers, targs, style_wgts)

from keras import backend as K
from keras import metrics

def content_loss(computed, target, weight_ls=None):
    if isinstance(computed, list):
        if not weight_ls:
            weight_ls = [1.0 for layer in computed]
        #end
        # Reduce each layer's MSE to a scalar before weighting, so that
        # layers with different shapes can be added together.
        c_loss = sum([K.sum(metrics.mse(comp[0], targ[0])) * w \
                      for comp, targ, w in zip(computed, target, weight_ls)])
        _, height, width, channels = K.int_shape(computed[0])
    else:
        c_loss = K.sum(metrics.mse(computed, target))
        _, height, width, channels = K.int_shape(computed)
    #end
    c_loss = c_loss #/ (height * width * channels)
    return c_loss
#end

def style_loss(computed, target, weight_ls=None):
    if isinstance(computed, list):
        if not weight_ls:
            weight_ls = [1.0 for layer in computed]
        #end
        s_loss = sum([K.sum(metrics.mse(gram_matrix(comp[0]), gram_matrix(targ[0]))) * w \
                      for comp, targ, w in zip(computed, target, weight_ls)])
        _, height, width, channels = K.int_shape(computed[0])
    else:
        s_loss = K.sum(metrics.mse(gram_matrix(computed), gram_matrix(target)))
        _, height, width, channels = K.int_shape(computed)
    #end
    s_loss = s_loss #/ (height * width * channels)
    return s_loss
#end

I hope this helps.

For those interested, I wrote a blog post on style transfer - with credit and links to the MOOC of course - here.


(Eric Mulvihill) #84

@Romano Thank you so much! I will try this code as soon as I have some time. So this sounds like it is not actually platform related but strictly Keras version related?


(stella wu) #85

@RiB Thanks Romano! I had the same problem and your code solved it. However, with the code in your post, the optimization did not converge: the loss value stayed the same at every step. Have you encountered this at any point? Am I missing something? Thanks!


(Romano) #86

@stella Yes, I have encountered that issue from time to time. Strangely, it occurs in some circumstances when I resize images to a larger size (meaning, it converges if I set, for instance, 200x300 pixels but does not if I set 400x600, even though I use the exact same content and style images as input). I suspect it has to do with gradient initialization or something like that. I tinkered with it and found that not scaling the losses solves the problem in most cases (that is why the division by (height * width * channels) was commented out in the code). Did you put it back in? I suggest you try playing with that scaling parameter and the size of the input images and see if this helps.

I would be interested in understanding the reason for such behaviour myself, so if you think you figure out why this happens, I would be happy to hear it.


(zipp) #87

In my effort to find out how to avoid applying a style to a specific color (i.e. apply the style to all colors but ignore black) in a content image, I came across these papers, which I believe are a nice follow-up to the class:
https://arxiv.org/abs/1602.07188
https://arxiv.org/abs/1610.07629
By the way, I was not able to find out how to avoid applying a style to certain colors (other than manually reverting the area of the image to its initial color, which I did not want to do because you lose the shape).


(stella wu) #88

@RiB That makes a lot of sense, thanks a lot! I tried taking out the scaling and it works! Now I see the loss value decreasing beautifully. What was the purpose of scaling the loss function by height * width * channels? I agree with you that it is probably the gradient initialization. As we are using the L-BFGS-B algorithm, the variables have constraints on the initial approximation. My guess is that the scaled loss values are so small, and therefore so are the gradients with respect to the input, that the initialization exceeded the constraints and did not happen properly. I tried adding a Gaussian filter to the initial random image, but it did not help. I am now trying other minimization algorithms.
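One possible mechanism for a loss that never moves, offered here as an assumption that the thread does not confirm: scipy's `fmin_l_bfgs_b` stops as soon as the largest projected-gradient component is below `pgtol` (default 1e-5). If dividing the loss by height * width * channels shrinks the gradients below that threshold, the optimizer would report convergence at the starting point. The sketch below uses a toy quadratic and a hypothetical scale factor of 1e-9, not the style-transfer loss itself.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def make_problem(scale):
    # Toy stand-in for the real loss: scale * ||x - 1||^2 (hypothetical).
    def loss(x):
        return scale * float(np.sum((x - 1.0) ** 2))

    def grads(x):
        return scale * 2.0 * (x - 1.0)

    return loss, grads

x0 = np.zeros(5)

# Unscaled loss: gradients are well above pgtol, so the optimizer moves.
loss, grads = make_problem(1.0)
x_plain, f_plain, info_plain = fmin_l_bfgs_b(loss, x0, fprime=grads, maxfun=20)

# Heavily scaled loss: the initial gradient (2e-9) is already below pgtol,
# so the optimizer stops before taking a step.
loss_s, grads_s = make_problem(1e-9)
x_scaled, f_scaled, info_scaled = fmin_l_bfgs_b(loss_s, x0, fprime=grads_s, maxfun=20)

print(info_plain['nit'], x_plain)
print(info_scaled['nit'], x_scaled)
```

If this is indeed what happens in the notebook, lowering `pgtol` (and `factr`) or leaving the loss unscaled should both let the optimization proceed.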


(Romano) #89

@stella The purpose of the scaling was to have loss values that are independent of the size of the input images and that could, in principle, be compared across different use cases to see how the algorithm was learning. Evidently, that conflicts with something, and it is better to leave it out. Feel free to share the results of other minimization algorithms (check out this beautifully written post for some inspiration).


(Ravi Teja Gutta) #90

Hi @RiB, @justinho, @ericm, I encountered the same problem as well. I think I know why we are getting the error.

We are trying to add the losses from conv1_block1 and conv1_block2, which have different tensor shapes. Just using K.mean seems to work for me, as below:

def style_loss(x, targ): return K.mean(metrics.mse(gram_matrix(x), gram_matrix(targ)))

Thank You


(Ravi Teja Gutta) #91

Hi @karthik_k314, I think the random image that we start with has to be the same size as the content image, because the content loss requires both of them to have the same shape. In the case of the style loss, however, the random image need not have the same size as the style image, as long as both are taken from the same layer. This is because we are using the Gram matrix, which has dimensions num_filters x num_filters.
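The point about the Gram matrix can be checked with a small numpy sketch. The `gram_matrix_np` helper below is a hypothetical re-implementation that is only assumed to follow the same idea as the notebook's `gram_matrix` (flatten each channel, then take channel-by-channel dot products); the image sizes are made up.

```python
import numpy as np

def gram_matrix_np(x):
    # x has shape (height, width, channels); put channels first and flatten
    # each channel into a row vector.
    channels = x.shape[-1]
    features = x.transpose(2, 0, 1).reshape(channels, -1)
    # Channel-by-channel dot products, normalized by the number of elements.
    return features @ features.T / x.size

rng = np.random.default_rng(0)
small = rng.normal(size=(20, 30, 64))  # hypothetical layer output for one image
large = rng.normal(size=(40, 60, 64))  # same layer, larger input image

gram_small = gram_matrix_np(small)
gram_large = gram_matrix_np(large)
print(gram_small.shape, gram_large.shape)
```

Both results are 64 x 64, so they can be compared element-wise even though the underlying images differ in size.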


(Zao Yang) #92

I see your blog. Do you have a github where you have the source code for this published?


(Romano) #93

I did not put it on GitHub because the blog has already all the source code (well, now that I see it, I forgot to include the imports, but they are all from standard packages, perhaps I will write an update). Do you need anything in particular?


(Zao Yang) #94

I just wanted to see the entire thing running in an ipynb file. I ran into a Keras 2 issue and wanted to fix it. I would have preferred an easier integration between your code and the fast.ai code. If you can provide the code as an ipynb, it would be super helpful. It’s always easier to see the entire thing and read the blog than to copy and paste the functions without knowing why it doesn’t run.


#95

I am trying to run neural-style.ipynb but it seems that my data directory does not have all the files. First I had to create fnames.pkl (as it was not in imagenet-sample-train.tar.gz):

import glob, pickle
fnames = list(glob.iglob(path+'*/*/*.JPEG'))
with open(path+'fnames.pkl', 'wb') as f:
    pickle.dump(fnames, f)

And now I am missing

FileNotFoundError: [Errno 2] No such file or directory: [...]/imagenet/sample//results/res_at_iteration_0.png

But I can’t find those res_at_iteration_ files anywhere. Any ideas how I can get a working data dir for the notebook?


(Ravi Teja Gutta) #96

The path seems to be wrong @gai , ‘[…]/imagenet/sample//results/res_at_iteration_0.png’ should be ‘[…]/imagenet/sample/results/res_at_iteration_0.png’. An extra ‘/’ is the culprit


#97

@rteja1113 the double slash in the path wasn’t the problem (// has no special meaning on Linux, unlike Windows); rather, that sample didn’t have a results directory. I created it and now I got one step further!!!
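For anyone hitting the same FileNotFoundError, a minimal sketch of the fix described above is to create the results directory before the notebook writes into it. The temporary directory below is a stand-in for the real sample path, which is an assumption made for illustration.

```python
import os
import tempfile

# A temporary directory stands in for the notebook's sample data path.
with tempfile.TemporaryDirectory() as sample_dir:
    results_dir = os.path.join(sample_dir, 'results')

    # exist_ok=True makes this safe to run repeatedly from a notebook cell.
    os.makedirs(results_dir, exist_ok=True)
    os.makedirs(results_dir, exist_ok=True)
    created = os.path.isdir(results_dir)

    # A doubled slash still resolves to the same directory on Linux.
    doubled_ok = os.path.isdir(sample_dir + '//results')
```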


(Zao Yang) #98

Hi Romano,

I tried your code here: https://github.com/zaoyang/fast_ai_course/blob/master/redone/neural_style_transfer%20.ipynb

It didn’t work. I used your images from the blog and got something like this even with 20 or 50 iterations. Any ideas?


(Romano) #99

@zaoyang First, thanks for replicating my results, I am sorry I was out of touch, but I was travelling last week and I got back just yesterday.

Second, I know what the problem is, and it was addressed in my comments from May 22 to May 25 in response to stella. The issue is due to the gradient calculation when the loss function is defined. In particular, depending on the size of the input and on the image, gradients are rounded to zero in Keras and the procedure gets stuck at a certain loss value (in your case 6.44). The way to solve it, although it is just a workaround, is to avoid scaling the content loss value (if you look at my code from May 22, the offending line is commented out, and the reason is explained in a subsequent post). Without scaling, the function should converge within a few iterations.

I hope this helps and sorry again for not replying sooner. I am also updating the blog with the imports so that everything is reproducible.


#100

C++ guy here really confused about the Evaluator() class in lesson 8.

The optimiser fmin_l_bfgs_b() function takes the returns of the Evaluator() class’s loss() and grads() member functions. That makes sense I guess. However in the Evaluator class those functions have arguments ‘self, x’ which are not passed in:

fmin_l_bfgs_b(eval_obj.loss, x.flatten(), fprime=eval_obj.grads, maxfun=20)

The eval_obj.loss function is being used as if it was a member variable and not a function (which is what it is, right?). How are the loss(self, x) and grads(self, x) functions working if we aren’t passing anything in to them?

Shouldn’t it look something like this instead:

fmin_l_bfgs_b(eval_obj.loss(x), x.flatten(), fprime=eval_obj.grads(x), maxfun=20)

Confused :smiley:


(Romano) #101

@machinedrum If you look at the documentation of fmin_l_bfgs_b, you will see that it takes a callable as its first argument (the loss function), as well as its third (the gradient function, fprime). So it is fmin_l_bfgs_b itself that does the appropriate “calling” of the functions.

In principle, it is like doing:

import numpy as np

def foo(bar, x):
    # foo receives the function object itself and calls it with x
    return bar(x)
#end
foo(np.log, 3)
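The same idea applied to the optimizer can be sketched as follows. `QuadraticEvaluator` is a hypothetical stand-in for the notebook's Evaluator class (it minimizes a simple quadratic instead of the style-transfer loss); the point is only that the bound methods `eval_obj.loss` and `eval_obj.grads` are passed uncalled, and `fmin_l_bfgs_b` supplies `x` each time it invokes them.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

class QuadraticEvaluator:
    """Hypothetical stand-in for Evaluator: minimizes ||x - target||^2."""

    def __init__(self, target):
        self.target = np.asarray(target, dtype=np.float64)

    def loss(self, x):
        # Called by the optimizer with its current guess for x.
        return float(np.sum((x - self.target) ** 2))

    def grads(self, x):
        # Gradient of the loss at x, also called by the optimizer.
        return 2.0 * (x - self.target)

eval_obj = QuadraticEvaluator([1.0, -2.0, 3.0])
x0 = np.zeros(3)

# No parentheses after eval_obj.loss / eval_obj.grads: the methods are passed
# as objects, already bound to eval_obj, so only x is missing.
x_min, min_val, info = fmin_l_bfgs_b(eval_obj.loss, x0,
                                     fprime=eval_obj.grads, maxfun=20)
print(x_min, min_val)
```

Writing `eval_obj.loss(x)` instead would call the method once, up front, and hand the optimizer a number rather than something it can call repeatedly.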