@resdntalien I tried minimizing f_original using the original image as a starting point instead of noise (results below), it diverges a bit to loss=0.128 (why?) then goes back to much lower loss values. All of these iterations are almost visually indistinguishable from the original though.
Clearly, bfgs does not find anywhere close to a global optimum if it starts from noise… where it starts from determines where it ends up.
This got me to wondering: what happens if we start from something in between the original and pure noise? Turns out we end up in a different local minimum, much closer to the original, but still not all that close (loss=0.373). Also it takes a lot more iterations before it gets stuck in that local minimum.
@alex_izvorski Super cool man. I think this is the nature of the problem being non-convex and we’re using an iterative “line search” solver… it will move a bit I guess.
Your question gets at the core of what the CNN is doing. I bet we could construct some weird images that have the same loss (e.g. 0), but probably very few that look “natural”. That’s why the CNN is so good… cause it’s capturing the the fact that natural images look a certain way.
This suggests that trying different starting points for style transfer could be interesting. e.g what if you start with the style image? What if you start with the content image? What if you use multiple layers in the content loss?
Unzipped imagenet-sample-train.tar to /data/imagenet/sample/train and replaced both variables path and dpath (under Setup) with ‘/data/imagenet/sample/train/’ but getting error:
FileNotFoundError: [Errno 2] No such file or directory: ‘/data/imagenet/sample/train/fnames.pkl’
Error is occuring with the .open function in this line:
fnames = pickle.load(open(dpath+‘fnames.pkl’, ‘rb’))
“data” folder is in the same parent directory as the Neural Style notebook, so in theory the path should be correct, but perhaps I’m using the wrong dataset or syntax?
Thanks, @renjithmadhavan. I think my problem is even more basic (i.e. possible syntax / parameter mismatch problem), b/c even when I just point it to a smaller folder with just a few images, I still get the File Not Found error.
Can you share with me how you set up your data paths and fnames initialization?
There was carriage return at the line ending so I cleaned the file like below:
f = open(dpath +‘fnames.pkl’)
filenames = f.readlines()
fnames = [name.rstrip() for name in filenames]
Then I copied the first image and saved it to “dpath”.
img=Image.open(fnames[0])
img.save(dpath+‘bird.jpg’)
img
There might be better way to do this.
If this is not what you are looking for at which line do you get the error.
It’s a good idea to re-watch the video as you try to follow along with the notebooks, because I try to mention issues like this as they arrive. In this case, I mentioned in the class that you need to create this file yourself, and it should simply contain a list of the filenames in the imagenet sample. (In this part of the course, I’ll be leaving more and more steps for you to do yourself! )
@renjithmadhavan you approach is a good one - but just FYI generally the ‘.pkl’ suffix is used for python pickle files. Just to avoid confusion I’d suggest using a different extension such as ‘.txt’ since you’re creating a standard text file.
Note also that you can use utils.get_classes from part 1 to easily grab the file names of an image dataset that’s structured in the usual way.
@Matthew@renjithmadhavan these are looking good! Perhaps you can try out some experiments now such as:
Use multiple layers for content loss
Try different weights for style vs content
Try different weights for different layers of style loss
Try using the photo (maybe with noise) as a starting point
Of anything else you think might be interesting
If you (or anyone else) get some experiment results, feel free to create a new thread to show them (and maybe even draft a longer post) so we can discuss in more detail.
Got it. I’ll carefully rewatch the videos from now on to avoid missing these tips. Thanks for the note on utils.get_classes as well, which I wasn’t aware of.
Deep Learning requires a big amount of data and a lot of computational resources.
Let’s look at Kaggle competition platform, a lot of solution on top of the leaderboards have Gradient Boosting as the main algorithm, not Deep Neural Networks. But it may be a good idea to use neither Deep NNs nor stacked XGBoost predictions in production solution.
As you can see, there are a lot of constraints for solution in dependence of the problem statement: amount of data, it’s format, computational resources, type of the problem, whether it’s offline/online learning, etc.