Lesson 10 Discussion

The magic here is actually bcolz, not threading.local :slight_smile:

Does anyone use WGAN to generate training data?

What I intend to do: generate images of the disgust emotion.

1 : Use 2000~3000 disgust faces to train the WGAN
2 : Use the trained network to generate several thousand disgust faces
3 : Remove duplicate images (by image hash)
4 : Go back to step 1
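
Step 3 above could be sketched roughly as follows. This is only a minimal average-hash sketch in plain Python, not a tested pipeline: `average_hash`, `hamming` and `dedupe` are hypothetical helper names, images are assumed to be grayscale grids (lists of rows) whose sides are multiples of 8, and `max_distance` is a threshold you would have to tune yourself.

```python
def average_hash(gray, hash_size=8):
    """Return a hash_size*hash_size-bit average hash of a grayscale grid.

    `gray` is a list of rows of pixel intensities; its height and width
    are assumed to be multiples of `hash_size`.
    """
    height, width = len(gray), len(gray[0])
    bh, bw = height // hash_size, width // hash_size
    cells = []
    # Downsample by averaging each (bh x bw) block of pixels.
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [gray[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    for cell in cells:
        bits = (bits << 1) | int(cell > mean)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def dedupe(images, max_distance=0):
    """Keep an image only if its hash is farther than `max_distance`
    from the hash of every image kept so far."""
    kept, hashes = [], []
    for img in images:
        h = average_hash(img)
        if all(hamming(h, k) > max_distance for k in hashes):
            kept.append(img)
            hashes.append(h)
    return kept
```

With `max_distance=0` only images with identical hashes are dropped; a larger value also drops near-duplicates, at the risk of discarding faces that are merely similar.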

Is this idea feasible? Does any paper discuss this kind of approach? Maybe 2000~3000 images is a small set, but disgust emotion images (with color info) are hard to collect.

By the way, has anyone tried to build a stunning comics generator with a WGAN?

Thanks again for the amazing lessons.

Yes, thank you @jeremy, I just realized that

I have issues similar to those @sravya8 and @rkoppula faced. Some of the validation images work and some do not. Some images look more ‘yellow’ after upscaling.

Here is my notebook. The differences are:

1 : I changed the input shape from a fixed size to Input(shape=(None,None,3)), so the model can now accept input images of arbitrary size.
2 : I trained the network with 80000 images from ImageNet, center-cropped and without black borders
3 : Some variable names are different
4 : I added the code and a link for histogram matching
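
For readers unfamiliar with histogram matching, here is a minimal single-channel sketch in plain Python. It is not the code from the notebook: `match_histogram` is a hypothetical name, pixel values are assumed to be integers in `[0, levels)`, and for an RGB image you would presumably apply it to each channel separately.

```python
from bisect import bisect_left


def match_histogram(source, reference, levels=256):
    """Remap the values in `source` so that their distribution follows
    that of `reference`. Both are flat lists of ints in [0, levels)."""

    def cdf(values):
        # Cumulative distribution: fraction of pixels <= each level.
        counts = [0] * levels
        for v in values:
            counts[v] += 1
        out, running = [], 0
        for c in counts:
            running += c
            out.append(running / len(values))
        return out

    src_cdf, ref_cdf = cdf(source), cdf(reference)
    # For each source level, pick the smallest reference level whose
    # cumulative frequency is at least as large.
    lookup = [min(bisect_left(ref_cdf, p), levels - 1) for p in src_cdf]
    return [lookup[v] for v in source]
```

Here `source` would be a channel of the upscaled output and `reference` the same channel of the low-resolution input, so that the colors of the result are pulled back toward those of the original.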

The following are the experiments (trained for 2 epochs).

Working example (72*72 image from ImageNet, center-cropped, no black border):

Works but looks yellow (random image from a Google search, 256*256, no black border, downsampled from a 1024*1024 image); this could be fixed by histogram matching:

I put the low-resolution image of the car here.

Failing example (downloaded from Google, 256*256, no black border, downsampled from a 1024*1024 image):

I also tested with the Set5 and Set14 images; they work very well, as the paper shows. From left to right: image upsampled by bicubic interpolation, image upsampled by super resolution, and image upsampled by super resolution and post-processed with histogram matching.

I guess the implementation is correct, but this algorithm does not work for every type of image. Maybe RAISR could work for more images? I do not know.

Enhance! Super Resolution From Google | Two Minute Papers #124

Edit: I wrote a blog post about perceptual losses for super resolution.

Hi all, I just noticed that in the WGAN notebook, Jeremy creates a variable
input = Var(bs, 3, sz, nz)
where bs = batch size (64), sz = image size (64), nz = noise size (100).

In the training loop he resizes the input variable to the real image size. Is there any particular reason to do this?

Wouldn’t it be the same to create input as
input = Var(bs, 3, sz, sz)
and remove the resizing part in the training loop?

Note: the complete collection of Part 2 video timelines is available in a single thread for keyword search.
Part 2: complete collection of video timelines

Lesson 10 video timeline:

00:00:10 Picking an optimizer for Style Transfer (student post on Medium)
Plus other student posts and tips on class project.

00:07:30 Use Excel to understand Deep Learning concepts

00:09:20 ImageNet Processing (continued from Lesson 9)
& Tips to speed up your model (simd & parallel processing)

00:26:45 Adding Preprocessing to Keras ResNet50

00:28:30 Transfer Learning with ResNet in Keras: difficulty #1

00:33:40 Transfer Learning with ResNet in Keras: difficulty #2

00:38:00 Use batches to overcome RAM “Out of Memory”

00:42:00 Final layers to our ResNet model

00:47:00 Nearest Neighbors to look at examples

00:55:00 Fine-Tuning our models and more “Out of Memory” fixes

01:03:00 Find images similar to a word or phrase &
Find images similar to an image !

01:08:15 Homework discussion

01:16:45 How to: multi-input models on large datasets

01:23:15 Generative Adversarial Networks (GAN) in Keras

01:32:00 Multi-Layer-Perceptron (MLP)

01:37:10 Deep Convolutional GAN (DCGAN)

01:40:15 Wasserstein GAN in Pytorch

01:46:30 Introduction to Pytorch

01:55:20 Wasserstein GAN in Pytorch (cont.)
& LSUN dataset

02:05:00 Examples of generated images

02:09:15 Lesson 10 conclusion and assignments for Lesson 11

For anyone interested, I wrote a short blog about my experiment training a WGAN to generate images of bicycles. I found it helpful to use a higher learning rate for the generator than the discriminator (at least initially): https://science4performance.com/2017/08/20/deep-learning-faking-it/

I wrote some code that lets you reuse the discriminator of the GAN to do image similarity. I worked on that before WGAN was invented and the generated images look like crap but the image similarity part is ok. Maybe someone will find it interesting.

Struggling to download the imagenet_class_index.json file. Has it been moved?

I can’t get it to load correctly.


fpath = get_file('imagenet_class_index.json', 
class_dict = json.load(open(fpath))




----> 4 class_dict = json.load(open(fpath)) 
/home/kevin/anaconda2/envs/ai/lib/python3.6/json/__init__.py in load(fp, cls, object_hook, 
parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
297         cls=cls, object_hook=object_hook,
298         parse_float=parse_float, parse_int=parse_int,
--> 299         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)

JSONDecodeError: Expecting value: line 2 column 1 (char 1)

Doesn’t appear to be a JSON file:


 '<!DOCTYPE html>\n',
 '<html lang="en">\n',
 '  <head>\n',
 '  <meta charset="utf-8">\n',
 '  <meta content="width=300, initial-scale=1" name="viewport">\n',
 '  <meta name="google-site-verification" content="LrdTUW9psUAMbh4Ia074-BPEVmcpBxF6Gwf0MSgQXZs">\n',
 '  <title>Sign in - Google Accounts</title>\n',
 '  <style>\n',
 '  @font-face {\n',
 "  font-family: 'Open Sans';\n",

Solved: the files moved to files.fast.ai.

fpath = get_file('imagenet_class_index.json', 

Note that you’ll have to clear the cache before trying again. See https://keras.io/utils/#get_file

This worked for me:

rm ~/.keras/models/imagenet_class_index.json
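
If you would rather check the cached file before deleting it, something like the following sketch might help. It uses only the standard library; `remove_if_not_json` is a hypothetical helper name, and the path you pass in is assumed to be wherever get_file cached the download (the `rm` command above suggests `~/.keras/models/`).

```python
import json
import os


def remove_if_not_json(path):
    """Delete `path` if it exists but does not parse as JSON.

    Returns True if the file was removed, False otherwise.
    """
    if not os.path.exists(path):
        return False
    try:
        with open(path) as f:
            json.load(f)
        return False  # valid JSON, keep the cached copy
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError; an HTML
        # sign-in page saved under a .json name ends up here.
        os.remove(path)
        return True


# Example call (path assumed from the rm command above):
# remove_if_not_json(os.path.expanduser("~/.keras/models/imagenet_class_index.json"))
```

After a bad copy has been removed, re-running the get_file cell should download the file again.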


```
netG, device_ids=range(torch.cuda.device_count())
cudnn.benchmark = True
```
`netG = DCGAN_G(sz, nz, 3, 64, 1, 1).cuda()`
Apply the same operation to netD; the time cost should then be cut in half.

Thanks @jeremy & everyone!

A few months ago I founded a boutique AI consulting company working with Fortune 500 companies to implement AI. One of our biggest challenges is helping clients figure out what they can do with AI.

So we’re starting to create “AI recipes” to give people examples.

We just released the first one today on Medium:
Boost Ecommerce Sales by 9% with AI Image Enhancement

I draw on many techniques in this lesson and came up with a few new ones. I link to my Jupyter Notebook if you want to copy this approach.

Two questions concerning the multi-threading:

  1. In the notebook showing the multi-threading solution for image resize – I was under the impression that the appends to the bcolz array can be out of order (due to the multi-threading). Then if that is the case, fnames is out of order with respect to “arr”, the bcolz array of resized images. Then the labels “vecs” generated from fnames will be out of order, and when we run predict, there will be a mismatch between x and y.
  2. It does not appear that bcolz is “thread safe”, from some quick googling. Are we lucky that it works?

I think I can answer my own question – the map returns the results in the right order and I did not notice that the bcolz append call happens in the main thread, so it’s fine.
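
To illustrate the point, here is a small standard-library sketch, assuming the notebook uses something like `concurrent.futures.ThreadPoolExecutor`. The `resize` function and the plain list `arr` are stand-ins for the real image resize and the bcolz array.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def resize(fname):
    # Stand-in for the real image resize. The random sleep makes the
    # worker threads finish in an unpredictable order.
    time.sleep(random.random() / 100)
    return fname.upper()


fnames = ["img_%d.jpg" % i for i in range(20)]
arr = []  # stand-in for the bcolz array

with ThreadPoolExecutor(max_workers=8) as pool:
    # map() yields results in the order of `fnames`, regardless of
    # which thread finishes first.
    for result in pool.map(resize, fnames):
        arr.append(result)  # runs in the main thread only
```

Because the append happens in the loop that consumes `map`, only the main thread ever touches `arr`, and row i of `arr` always corresponds to `fnames[i]`.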


Does anybody know how to set up the Jupyter notebook so that sections can be collapsed and expanded, as was done in the videos?


You can install an extension package for Jupyter notebooks.

The collapsing and expanding of sections can then be done by enabling the Collapsible Headings extension.

(While installing the above package, you can also install “https://github.com/Jupyter-contrib/jupyter_nbextensions_configurator”, which gives a nice UI where you can simply use checkboxes to enable and disable extensions.)

Has anyone run into this kind of error with the “imagenet_process” notebook? I ran the notebook on two different machines: my local machine (no GPU) and a machine with a GPU.

The code worked fine on the machine without a GPU, whereas I get an error on the one with a GPU:

rn_top_avg = Sequential([rn_top, AveragePooling2D((7,7))])

ValueError: Variable bn_conv1/moving_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

File "C:\Users\EnY\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1228, in __init__
self._traceback = _extract_stack()

I noticed that I am having the same problem in all places Sequential is used in that particular model. Any idea why this is happening?

In the imagenet_process notebook we have this line:

features_mid = bcolz.open(path+'results/features_mid_1c_r.bc')

yet this folder was not created previously. Does anyone know how to create this folder? I assume it is the output of the rn_top_avg model?

I think that the performance of pillow-simd is limited by the read times from the hard disk.
I have seen this happen with CUDA as well (the GPU is starved of data at times in spite of having multiple num_workers in play).
Jeremy is using an SSD for his data, so the read times are super fast and hence pillow-simd is effective.
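
One rough way to check whether the disk is the bottleneck is to time the raw file reads on their own, with no image decoding or resizing. The sketch below uses only the standard library; `time_raw_reads` is a hypothetical helper name, and OS file caching can make a second run look much faster than the first.

```python
import time


def time_raw_reads(paths):
    """Read every file fully and return (total_bytes, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    for p in paths:
        with open(p, "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start
```

If reading the files alone takes about as long as the full resize loop, a faster decoder such as pillow-simd is unlikely to help much on that machine.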

I think it would be interesting to try this idea on mathematics! I don’t think this has been done yet, but if I am wrong I would love to read about it. A language model capable of completing maths could be useful for many things, such as classifying correct and incorrect reasoning in human problem solving, for example in math questions or accounting. I am sure there would be many other creative applications of such a model, and to me it seems extremely similar to creating any other language model.