Lesson 8 in-class

(Alex Izvorski) #121

@resdntalien I tried minimizing f_original using the original image as a starting point instead of noise (results below). It diverges a bit to loss=0.128 (why?), then goes back to much lower loss values. All of these iterations are almost visually indistinguishable from the original though.

Clearly, BFGS does not get anywhere close to a global optimum if it starts from noise… where it starts from determines where it ends up.

This got me to wondering: what happens if we start from something in between the original and pure noise? Turns out we end up in a different local minimum, much closer to the original, but still not all that close (loss=0.373). Also it takes a lot more iterations before it gets stuck in that local minimum.

Results: https://gist.github.com/aizvorski/6dee41c61376200014b92ef48480ab15
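
The dependence on the starting point can be illustrated with a toy problem. This is only a sketch: it uses plain gradient descent on a made-up 1-D non-convex function rather than BFGS on the real image loss, and `loss`, `grad`, `minimize` and `blend` are hypothetical names chosen for the illustration. The "original" sits near the global minimum, the "noise" sits far away, and `alpha` blends between them.

```python
# Toy illustration only: plain gradient descent on a 1-D non-convex loss,
# standing in for BFGS on the image loss. All names here are hypothetical.

def loss(x):
    # Two minima: a deeper one near x = -1.30 and a shallower one near x = 1.13.
    return x**4 - 3 * x**2 + x

def grad(x):
    return 4 * x**3 - 6 * x + 1

def minimize(x0, lr=0.01, steps=5000):
    """Run fixed-step gradient descent from x0 and return the final point."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def blend(original, noise, alpha):
    """alpha=1 starts at the original, alpha=0 starts at pure noise."""
    return alpha * original + (1 - alpha) * noise

original, noise = -1.3, 2.0
for alpha in (0.0, 0.5, 0.8, 1.0):
    x0 = blend(original, noise, alpha)
    x = minimize(x0)
    print(f"alpha={alpha:.1f} start={x0:+.2f} end={x:+.3f} loss={loss(x):+.3f}")
```

In this toy case, starts that are far enough from the original settle in the shallower minimum, which is the same qualitative behaviour as above, though nothing here says the image loss has the same shape.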

Food for thought:

  • How to go to loss=0 starting from pure noise or from any other initialization?
  • Can a better initialization be constructed without knowing anything about the original image?
  • Are there other images (different from the original) which have loss=0? What is the most pixel-wise-different image like that?
  • What does the family / manifold of those images look like?

(sravya8) #122

Text data is also ordered data, so one example is sentiment analysis using CNN

(Vishnu Subramanian) #123

As soon as I update the theme to grade3, collapsible headings stop working. I am using it on Mac and Chrome OS. Any idea why?

(Sourav Dey) #124

@alex_izvorski Super cool man. I think this is the nature of the problem being non-convex and we’re using an iterative “line search” solver… it will move a bit I guess.

Your question gets at the core of what the CNN is doing. I bet we could construct some weird images that have the same loss (e.g. 0), but probably very few that look “natural”. That’s why the CNN is so good… because it’s capturing the fact that natural images look a certain way.

(Jeremy Howard) #125

This suggests that trying different starting points for style transfer could be interesting. E.g. what if you start with the style image? What if you start with the content image? What if you use multiple layers in the content loss?

(Mariya) #126

Downloaded these data sets but not entirely sure if they’re the right ones to use (or how to use) with the Neural Style jupyter notebook:

  1. http://www.platform.ai/data/imagenet-sample-train.tar
  2. http://www.platform.ai/data/trn_resized_288.tar
  3. http://www.platform.ai/data/trn_resized_72.tar

Extracted imagenet-sample-train.tar to /data/imagenet/sample/train and replaced both variables path and dpath (under Setup) with '/data/imagenet/sample/train/' but getting this error:

FileNotFoundError: [Errno 2] No such file or directory: '/data/imagenet/sample/train/fnames.pkl'

The error is occurring with the open call in this line:
fnames = pickle.load(open(dpath+'fnames.pkl', 'rb'))

“data” folder is in the same parent directory as the Neural Style notebook, so in theory the path should be correct, but perhaps I’m using the wrong dataset or syntax?


In the neural style notebook, I think you only need a few images. I did this by randomly selecting a few images from the ImageNet train directory.

Case 1: 1 image as input for playing with the noise image

Case 2: 1 image for styling the input image (in the notebook there are three style images)

(Mariya) #128

Thanks, @renjithmadhavan. I think my problem is even more basic (i.e. possible syntax / parameter mismatch problem), b/c even when I just point it to a smaller folder with just a few images, I still get the File Not Found error.

Can you share with me how you set up your data paths and fnames initialization?

(Xinxin) #129

Do you mean defining the loss with multiple layers in the content loss, as we did for the style loss?

loss = sum(style_loss(l1[0], l2[0]) for l1,l2 in zip(layers, targs))
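
A minimal sketch of what a multi-layer content loss could look like, by analogy with the line above. It is an assumption, not the notebook's code: activations are represented as flat Python lists instead of Keras tensors, and `mse` and `multi_layer_content_loss` are hypothetical helper names.

```python
# Sketch only: stand-in for a multi-layer content loss, using flat lists of
# floats in place of the real layer activations. Names are hypothetical.

def mse(a, b):
    """Mean squared error between two equal-length lists of activations."""
    if len(a) != len(b):
        raise ValueError("activation lists must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def multi_layer_content_loss(layer_acts, target_acts):
    """Sum the per-layer MSE between generated and target activations."""
    return sum(mse(a, t) for a, t in zip(layer_acts, target_acts))

# Two "layers" with different sizes, generated vs. target activations.
layers = [[1.0, 2.0], [0.0, 0.0, 3.0]]
targs = [[1.0, 0.0], [0.0, 0.0, 0.0]]
print(multi_layer_content_loss(layers, targs))
```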


I put my directories like this below:

path = '/home/renjith/datascience/kaggle/input/fastai2/data/'
dpath = '/home/renjith/datascience/kaggle/input/fastai2/lesson1/'

I would say do not get confused by “fnames.pkl”. I am not sure exactly how that file was created. I guess it’s a pickle file.

However, it is just a file with the filenames of your JPEG images.

I created that file as below:

renjith@dlp20db:/lesson1/train$ find "$(pwd)" -name "*.JPEG" > …/fnames.pkl

There was a carriage return at the end of each line, so I cleaned the file like below:
f = open(dpath + 'fnames.pkl')
filenames = f.readlines()
fnames = [name.rstrip() for name in filenames]

Then I copied the first image and saved it to “dpath”.

There might be a better way to do this.

If this is not what you are looking for, at which line do you get the error?


I was playing with the notebook and tried to apply a Van Gogh self-portrait to my son’s picture. I guess it turned out pretty good.

(Matthew Kleinsmith) #132


(Jeremy Howard) #133

It’s a good idea to re-watch the video as you try to follow along with the notebooks, because I try to mention issues like this as they arise. In this case, I mentioned in the class that you need to create this file yourself, and it should simply contain a list of the filenames in the imagenet sample. (In this part of the course, I’ll be leaving more and more steps for you to do yourself! :slight_smile: )

@renjithmadhavan your approach is a good one - but just FYI generally the ‘.pkl’ suffix is used for python pickle files. Just to avoid confusion I’d suggest using a different extension such as ‘.txt’ since you’re creating a standard text file.

Note also that you can use utils.get_classes from part 1 to easily grab the file names of an image dataset that’s structured in the usual way.
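
For anyone who wants an actual pickle file that the notebook's `pickle.load(open(dpath+'fnames.pkl', 'rb'))` line can read, here is one possible sketch using only the standard library. It is an assumption about what the file should contain (a sorted list of full image paths); `build_fnames` and its parameters are hypothetical names, and the notebook may expect a different path format.

```python
import os
import pickle

def build_fnames(image_dir, out_path, extensions=(".jpeg", ".jpg")):
    """Collect image paths under image_dir and pickle them as a list.

    Hypothetical helper: assumes the notebook wants full file paths.
    """
    fnames = []
    for root, _dirs, files in os.walk(image_dir):
        for name in files:
            # Case-insensitive match, so both .JPEG and .jpg are picked up.
            if name.lower().endswith(extensions):
                fnames.append(os.path.join(root, name))
    fnames.sort()
    with open(out_path, "wb") as f:
        pickle.dump(fnames, f)
    return fnames
```

Usage would be something like `build_fnames(dpath, dpath + 'fnames.pkl')`, run once before the cell that loads the file.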

(Jeremy Howard) #134

@Matthew @renjithmadhavan these are looking good! Perhaps you can try out some experiments now such as:

  • Use multiple layers for content loss
  • Try different weights for style vs content
  • Try different weights for different layers of style loss
  • Try using the photo (maybe with noise) as a starting point
  • Or anything else you think might be interesting
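
The weighting experiments in this list could be organised roughly as follows. This is a sketch under assumptions: the per-layer style losses and the content loss are taken as already-computed numbers, and `total_loss`, `style_layer_weights`, `style_weight` and `content_weight` are hypothetical names, not the notebook's.

```python
# Sketch only: combine precomputed scalar losses with adjustable weights.
# All names and the example numbers are hypothetical.

def total_loss(style_losses, content_loss, style_layer_weights,
               style_weight=1.0, content_weight=1.0):
    """Weighted sum of per-layer style losses plus a weighted content loss."""
    if len(style_losses) != len(style_layer_weights):
        raise ValueError("need one weight per style layer")
    weighted_style = sum(w * s for w, s in zip(style_layer_weights, style_losses))
    return style_weight * weighted_style + content_weight * content_loss

# Made-up loss values, used only to show how a sweep over weights might look.
style_losses = [2.0, 4.0]
content = 3.0
for style_weight in (0.1, 1.0, 10.0):
    value = total_loss(style_losses, content, [0.5, 0.25], style_weight=style_weight)
    print(f"style_weight={style_weight}: total={value}")
```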

If you (or anyone else) get some experiment results, feel free to create a new thread to show them (and maybe even draft a longer post) so we can discuss in more detail.

(Mariya) #135

Got it. I’ll carefully rewatch the videos from now on to avoid missing these tips. Thanks for the note on utils.get_classes as well, which I wasn’t aware of.

(Cody) #136

Does anyone have a link to the recent paper that Jeremy mentioned addressing the principles behind how/why the style loss function works?

(Bulat Suleymanov) #137

Deep learning requires a large amount of data and a lot of computational resources.

Let’s look at the Kaggle competition platform: many solutions at the top of the leaderboards use gradient boosting as the main algorithm, not deep neural networks. But it may be a good idea to use neither deep NNs nor stacked XGBoost predictions in a production solution.

As you can see, the problem statement places a lot of constraints on the solution: the amount of data, its format, computational resources, the type of problem, whether it’s offline/online learning, etc.

(Aman Madaan) #138

This seems like the paper.

(Sourav Dey) #139

I just did this. I tried to do M.C. Escher painting a picture of a dog (from the redux competition). Here’s what it looks like with a random start:

Here’s what it looks like with the original image as start:

I actually like the painting with the original image better. And as @alex_izvorski said – we can try all sorts of stuff in between.

(Xinxin) #140

nice test, btw, I sense a GEB theme here :wink: