Lesson 10 Discussion

Hi @jeremy,
I'm working on neural-sr.ipynb and noticed that the code uses the trn_resized_72_r.bc and trn_resized_288_r.bc datasets, but on files.fast.ai/data I can only find trn_resized_72.bc and trn_resized_288.bc. I think those are different from the ones with the "_r" suffix, right?

The reason I think they are different is that when I substituted trn_resized_72.bc and trn_resized_288.bc for the "_r" versions in neural-sr.ipynb, the code raised an error:

ValueError: batch_size needs to be a multiple of X.chunklen

Then I checked X.chunklen.
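A minimal way to do the check (a sketch; it assumes the .bc array is already on disk in the working directory and bcolz is installed):

```python
import bcolz

# Open the low-res array on disk without loading it into memory
arr_lr = bcolz.open('trn_resized_72.bc', mode='r')

# bcolz stores data in chunks; the iterator used by the notebook needs
# batch_size to be a multiple of this value
print(arr_lr.chunklen)
```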

arr_lr.chunklen is 64, so it seems I have to change the batch_size passed to train() to get past this error. I set the batch size to 64 with train(64, 18000), but then I got "Resource Exhausted": my GTX 1080 Ti ran out of memory!

So I'm wondering: is your original arr_lr.chunklen 16 rather than 64?

Which dataset are you actually using, and where can I download it?

Thanks jeremy.

EDIT:
Jeremy has released the "imagenet-sample-train" dataset, and I think I can use it for the resizing step. So I re-ran imagenet_process.ipynb on "imagenet-sample-train" (19,439 images), resizing them to 288 x 288 for high resolution and 72 x 72 for low resolution, with pretty good results.
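The resizing step I used looks roughly like this (a simplified sketch of my reimplementation; the folder name, file-extension pattern, dtype and helper are my own choices, not necessarily the original notebook's):

```python
import os
from glob import glob

import bcolz
import numpy as np
from PIL import Image

def resize_to_bcolz(src_dir, dst_path, size):
    """Resize every image under src_dir to size x size and append it to a bcolz array on disk."""
    # '**' also matches files directly under src_dir; adjust the extension to the actual files
    fnames = sorted(glob(os.path.join(src_dir, '**', '*.JPEG'), recursive=True))
    arr = bcolz.carray(np.empty((0, size, size, 3), dtype=np.float32),
                       rootdir=dst_path, mode='w')
    for fname in fnames:
        img = Image.open(fname).convert('RGB').resize((size, size))
        arr.append(np.expand_dims(np.asarray(img, dtype=np.float32), 0))
    arr.flush()

# High-res (288x288) and low-res (72x72) versions of the imagenet-sample-train images
resize_to_bcolz('imagenet-sample-train', 'trn_resized_288_r.bc', 288)
resize_to_bcolz('imagenet-sample-train', 'trn_resized_72_r.bc', 72)
```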

My CPU is an i7-6850K, and the processing time is:

Pretty fast! The resulting "trn_resized_72_r.bc" is 342 MB and "trn_resized_288_r.bc" is 5.0 GB.

I think I can use these datasets for super-resolution training!

But I still can't figure out whether you used the whole ImageNet dataset to create the high-res and low-res arrays. The whole ImageNet dataset is about 134 GB; with files that huge, the resized .bc arrays might reach 500 GB or more! @jeremy

A little question here: what is get_slice() for? I assume we don't want the resized images to have black borders, but I didn't see any black border in the resized images anyway.

How about an image hash? opencv-contrib has a module called img_hash; maybe it can help you. I always use image hashing plus a VP-tree to filter similar images collected from the web.
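For example, something like this with opencv-contrib (a minimal sketch; the file names are just placeholders):

```python
import cv2

# pHash from the opencv-contrib img_hash module
hasher = cv2.img_hash.PHash_create()

hash_a = hasher.compute(cv2.imread('face_a.jpg'))
hash_b = hasher.compute(cv2.imread('face_b.jpg'))

# Smaller value = more similar; treat pairs below some threshold as duplicates
print(hasher.compare(hash_a, hash_b))
```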

By the way, I found decent results from a cat generator trained with WGAN; you can find them here. The author used 10,000 cat images to train the network, though most of the generated images look like ice cream to my eye :slight_smile:

The magic here is actually bcolz, not threading.local :slight_smile:

Does anyone use WGAN to generate training data?

What I intend to do: generate faces showing the disgust emotion.

1: Use 2000~3000 disgust faces to train the WGAN.
2: Use the trained network to generate several thousand more disgust faces.
3: Remove duplicate images (by image hash).
4: Go back to step 1.

Is this idea feasible? Is there any paper that discusses this kind of approach? 2000~3000 images may be too few, but disgust-emotion faces (in color) are hard to collect.

By the way, has anyone tried to build a stunning comics generator with WGAN?

Thanks again for the amazing lessons.

Yes, thank you @jeremy, I just realized that

I have issues similar to those @sravya8 and @rkoppula faced: some of the validation images work and some do not, and some images look more 'yellow' after upscaling.

Here is my notebook; the things I did differently are:

1: I changed the input shape from a fixed size to Input(shape=(None,None,3)), so the model now accepts input images of arbitrary size (see the sketch after this list).
2: I trained the network with 80,000 images from ImageNet, center-cropped and without black borders.
3: Some variable names are different.
4: I added the code and a link for histogram matching.
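For point 1, the change is essentially this (a minimal sketch; the layers shown are illustrative, not the full super-resolution network):

```python
from keras.layers import Input, Conv2D
from keras.models import Model

# A fully convolutional network does not need a fixed spatial size,
# so the height and width can be left as None
inp = Input(shape=(None, None, 3))
x = Conv2D(64, (9, 9), activation='relu', padding='same')(inp)
# ... residual blocks and upsampling layers go here ...
out = Conv2D(3, (9, 9), activation='tanh', padding='same')(x)
model = Model(inp, out)

# The same weights can now be applied to images of arbitrary size at inference time
```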

The following are the experiments (trained for 2 epochs):

Working example (72x72 image from ImageNet, center crop, no black border):

Works but looks yellow (random image from a Google search, 256x256, no black border, downsampled from a 1024x1024 image); this can be fixed by histogram matching:

I put the low-resolution image of the car here.

Failing example (downloaded from Google, 256x256, no black border, downsampled from a 1024x1024 image):

Testing with the Set5 and Set14 images, they work very well, as the paper shows. From left to right: image upsampled by bicubic interpolation, image upsampled by super resolution, and image upsampled by super resolution with histogram-matching post-processing.

I guess the implementation is correct, but this algorithm does not work for every type of image. Maybe RAISR would work for more images? I do not know.
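For reference, the histogram-matching post-processing can be done per channel with plain numpy, along these lines (a self-contained sketch; the version linked from my notebook may differ in details):

```python
import numpy as np

def match_histograms(source, reference):
    """Map each channel of `source` so its intensity distribution matches `reference`."""
    matched = np.empty_like(source)
    for c in range(source.shape[2]):
        src = source[..., c].ravel()
        ref = reference[..., c].ravel()
        s_values, s_idx, s_counts = np.unique(src, return_inverse=True, return_counts=True)
        r_values, r_counts = np.unique(ref, return_counts=True)
        # Cumulative distributions of both images
        s_cdf = np.cumsum(s_counts).astype(np.float64) / src.size
        r_cdf = np.cumsum(r_counts).astype(np.float64) / ref.size
        # For each source quantile, look up the reference value at the same quantile
        matched[..., c] = np.interp(s_cdf, r_cdf, r_values)[s_idx].reshape(source.shape[:2])
    return matched

# e.g. corrected = match_histograms(sr_output, bicubic_upscaled_input)
```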

Enhance! Super Resolution From Google | Two Minute Papers #124

Edit: I wrote a blog post about perceptual losses for super resolution.

Hi all, I just realized that in the wgan notebook Jeremy creates a variable
input = Var(bs, 3, sz, nz)
where bs = batch_size (64), sz = size_of_image (64), and nz = size_of_noise (100).

In the training loop he resizes the input variable to the real image size. Is there any particular reason to do this?

Wouldn't it be the same to create the input as
input = Var(bs, 3, sz, sz)
and remove the resizing part in the training loop?
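To make the question concrete, the pattern I'm asking about looks roughly like this (a sketch from memory in old-style PyTorch, not the notebook's exact code; dataloader is assumed):

```python
import torch
from torch.autograd import Variable

bs, sz, nz = 64, 64, 100

# Allocated once with nz as the last dimension...
input = Variable(torch.FloatTensor(bs, 3, sz, nz).cuda())

for real, _ in dataloader:
    real = real.cuda()
    # ...then reshaped to whatever the current real batch looks like
    input.data.resize_(real.size()).copy_(real)
    # ... discriminator step on `input` ...
```

Since resize_ reallocates the tensor to the real batch's shape anyway, the initial last dimension seems irrelevant; the resize in the loop does at least handle the final, possibly smaller, batch.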

Note: the complete collection of Part 2 video timelines is available in a single thread for keyword search.
Part 2: complete collection of video timelines

Lesson 10 video timeline:

00:00:10 Picking an optimizer for Style Transfer (student post on Medium)
Plus other student posts and tips on class project.

00:07:30 Use Excel to understand Deep Learning concepts

00:09:20 ImageNet Processing (continued from Lesson 9)
& Tips to speed up your model (simd & parallel processing)

00:26:45 Adding Preprocessing to Keras ResNet50

00:28:30 Transfer Learning with ResNet in Keras: difficulty #1

00:33:40 Transfer Learning with ResNet in Keras: difficulty #2

00:38:00 Use batches to overcome RAM “Out of Memory”

00:42:00 Final layers to our ResNet model

00:47:00 Nearest Neighbors to look at examples

00:55:00 Fine-Tuning our models and more “Out of Memory” fixes

01:03:00 Find images similar to a word or phrase &
Find images similar to an image !

01:08:15 Homework discussion

01:16:45 How to: multi-input models on large datasets

01:23:15 Generative Adversarial Networks (GAN) in Keras

01:32:00 Multi-Layer-Perceptron (MLP)

01:37:10 Deep Convolutional GAN (DCGAN)

01:40:15 Wasserstein GAN in Pytorch

01:46:30 Introduction to Pytorch

01:55:20 Wasserstein GAN in Pytorch (cont.)
& LSUN dataset

02:05:00 Examples of generated images

02:09:15 Lesson 10 conclusion and assignments for Lesson 11

For anyone interested, I wrote a short blog post about my experiment training a WGAN to generate images of bicycles. I found it helpful to use a higher learning rate for the generator than for the discriminator (at least initially): https://science4performance.com/2017/08/20/deep-learning-faking-it/
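In code terms the tweak is just two optimizers built with different lr values, something like this (a sketch, not the blog's exact code; netG and netD are the generator and critic modules from the WGAN notebook, and the values shown are illustrative):

```python
import torch.optim as optim

# WGAN commonly uses RMSprop; give the generator a higher learning rate than the critic
optimizerD = optim.RMSprop(netD.parameters(), lr=5e-5)
optimizerG = optim.RMSprop(netG.parameters(), lr=2e-4)
```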

I wrote some code that lets you reuse the discriminator of the GAN to do image similarity. I worked on that before WGAN was invented and the generated images look like crap but the image similarity part is ok. Maybe someone will find it interesting.

Struggling to download the imagenet_class_index.json file. Has it been moved?

I can’t get it to load correctly.

from

fpath = get_file('imagenet_class_index.json', 
             'http://www.platform.ai/models/imagenet_class_index.json', 
             cache_subdir='models')
class_dict = json.load(open(fpath))

in

https://github.com/fastai/courses/blob/master/deeplearning2/imagenet_process.ipynb

Output:
--------------------------------------------------------------------------

----> 4 class_dict = json.load(open(fpath)) 
/home/kevin/anaconda2/envs/ai/lib/python3.6/json/__init__.py in load(fp, cls, object_hook, 
parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
297         cls=cls, object_hook=object_hook,
298         parse_float=parse_float, parse_int=parse_int,
--> 299         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)

JSONDecodeError: Expecting value: line 2 column 1 (char 1)

Doesn’t appear to be a JSON file:

f=open(fpath)
f.readlines()

['\n',
 '<!DOCTYPE html>\n',
 '<html lang="en">\n',
 '  <head>\n',
 '  <meta charset="utf-8">\n',
 '  <meta content="width=300, initial-scale=1" name="viewport">\n',
 '  <meta name="google-site-verification" content="LrdTUW9psUAMbh4Ia074-BPEVmcpBxF6Gwf0MSgQXZs">\n',
 '  <title>Sign in - Google Accounts</title>\n',
 '  <style>\n',
 '  @font-face {\n',
 "  font-family: 'Open Sans';\n",
....

Solved: the files moved to files.fast.ai.

fpath = get_file('imagenet_class_index.json', 
                 'http://files.fast.ai/models/imagenet_class_index.json', 
                 cache_subdir='models')

Note that you'll have to clear the cache before trying again; see https://keras.io/utils/#get_file

This worked for me:

rm ~/.keras/models/imagenet_class_index.json

Add

```
netG = torch.nn.DataParallel(netG, device_ids=range(torch.cuda.device_count()))
cudnn.benchmark = True
```

after

```
netG = DCGAN_G(sz, nz, 3, 64, 1, 1).cuda()
```

Make the same change to netD, and the time cost should be cut in half.

Thanks @jeremy & everyone!

A few months ago I founded a boutique AI consulting company working with Fortune 500 companies to implement AI. One of our biggest challenges is helping clients figure out what they can do with AI.

So we’re starting to create “AI recipes” to give people examples.

We just released the first one today on Medium:
Boost Ecommerce Sales by 9% with AI Image Enhancement

I drew on many techniques from this lesson and came up with a few new ones. I link to my Jupyter notebook if you want to copy this approach.

Two questions concerning the multi-threading:

  1. In the notebook showing the multi-threading solution for image resizing, I was under the impression that the appends to the bcolz array could happen out of order (due to the multi-threading). If that were the case, fnames would be out of order with respect to "arr", the bcolz array of resized images; the labels "vecs" generated from fnames would then also be out of order, and when we run predict there would be a mismatch between x and y.
  2. It does not appear that bcolz is "thread safe", from some quick googling. Are we just lucky that it works?

I think I can answer my own question: map returns its results in the same order as the inputs, and I had not noticed that the bcolz append call happens in the main thread, so it's fine.
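For anyone else who was confused, the pattern is roughly the following (a simplified sketch, not the notebook's exact code; fnames is a placeholder list of image paths). The worker threads only do the resizing, and the single main thread consumes the results in order and does all the bcolz appends.

```python
from concurrent.futures import ThreadPoolExecutor

import bcolz
import numpy as np
from PIL import Image

def resize_img(fname):
    # Runs in a worker thread: pure computation, no shared state
    img = Image.open(fname).convert('RGB').resize((288, 288))
    return np.asarray(img, dtype=np.float32)

arr = bcolz.carray(np.empty((0, 288, 288, 3), dtype=np.float32),
                   rootdir='trn_resized_288_r.bc', mode='w')

with ThreadPoolExecutor(max_workers=16) as ex:
    # map() yields results in the order of fnames, regardless of which worker finishes first
    for img in ex.map(resize_img, fnames):
        arr.append(np.expand_dims(img, 0))   # appends happen only in the main thread
arr.flush()
```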

Hello,

Does anybody know how to set up Jupyter notebooks so that sections can be collapsed and expanded, the way it was done in the videos?

Hi,

You can actually install an extension for Jupyter notebooks:

The collapsible, expandable sections can then be achieved by enabling the Collapsible Headings extension.

(While installing the above package, you can also install https://github.com/Jupyter-contrib/jupyter_nbextensions_configurator, which gives a nice UI where you can simply use checkboxes to enable and disable extensions.)

Has anyone run into this kind of error with the imagenet_process notebook? I ran the notebook in two different places: one on my local machine (no GPU) and the other on a machine with a GPU.

The code worked fine on the machine without a GPU, whereas I get an error on the one with a GPU:

rn_top_avg = Sequential([rn_top, AveragePooling2D((7,7))])

ValueError: Variable bn_conv1/moving_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

File “C:\Users\EnY\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py”, line 1228, in init
self._traceback = _extract_stack()

I noticed that I have the same problem everywhere Sequential is used in that particular model. Any idea why this is happening?