Part 2 Lesson 13 Wiki

This thread is a wiki - please add any links etc that you think may be useful.

<<< Wiki: Lesson 12 | Wiki: Lesson 14 >>>

Miscellaneous:

GANs

AI and Ethics

Timeline

  • (0:00:01) Image enhancement
  • (0:00:40) Deep painterly harmonization paper - Style transfer
  • (0:01:10) Stochastic weight averaging (William Horton)
  • (0:02:05) Training phase API
  • (0:03:35) Training phase API explanation
  • (0:03:41) Picture of iterations - step learning rate decay
  • (0:04:30) Training Phases explanation
  • (0:05:50) LR decay examples
  • (0:07:52) Adding your own schedulers - example: SGDR
  • (0:08:22) Example of doing 1cycle
  • (0:08:58) Discriminative learning rates
  • (0:09:23) LARS paper - form of discriminative learning rates
  • (0:10:05) Customized LR finders
  • (0:11:10) Change the optimizer
  • (0:11:50) Change the data during training
  • (0:12:50) DAWNBench competition for ImageNet
  • (0:15:16) CIFAR result on DAWNBench
  • (0:17:05) Conv architecture gap - Inception-ResNet
  • (0:19:35) Concat in Inception
  • (0:22:43) Basic idea of Inception networks
  • (0:23:20) Instead of an AxA convolution, use Ax1 followed by 1xA - lower-rank approximation
  • (0:27:00) Factored convolutions
  • (0:27:30) Stem in backbone
  • (0:30:00) Image enhancement paper - Progressive GANs
  • (0:30:40) Inner network - irrelevant
  • (0:31:10) Progressive GAN - increase image size
  • (0:34:02) 1024x1024 images
  • (0:34:30) Obama fake video
  • (0:35:30) Questions and Ethics in AI
  • (0:36:55) Face recognition from various companies
  • (0:38:40) Women vs. men bias
  • (0:40:08) Google Translate men vs. women
  • (0:40:40) Machine learning can amplify bias
  • (0:42:15) Facebook examples
  • (0:45:15) Face detection
  • (0:46:15) meetup.com example - men attending tech meetups more (feedback loops)
  • (0:47:50) Bias black vs white
  • (0:52:46) Responsibilities in hiring
  • (0:54:07) IBM’s impact on Nazi Germany
  • (0:56:50) Dropout patent
  • (0:57:19) Artistic style transfer - Patent
  • (1:02:08) Style transfer code
  • (1:07:35) Content loss and style loss
  • (1:11:20) Compare activations - perceptual loss
  • (1:13:15) Style transfer code
  • (1:15:25) Random starting image
  • (1:17:22) Using mid layer activations
  • (1:19:05) Optimizer
  • (1:20:25) LBFGS optimizer
  • (1:21:15) LBFGS algorithm works well
  • (1:21:40) Limited-memory optimizer
  • (1:22:30) Diagram - explanation of how the optimizer works
  • (1:25:05) Keeping track of every step takes a lot of memory, so keep only the last few gradients
  • (1:26:52) Code for the optimizer
  • (1:28:16) Content loss
  • (1:29:32) PyTorch hooks - forward hooks
  • (1:31:41) VGG activations
  • (1:36:42) Single-precision floating point, half precision
  • (1:38:22) Pictures from paper
  • (1:38:50) Grab activations of some layer
  • (1:39:35) Create style loss
  • (1:40:35) Look at painting from wikipedia
  • (1:41:15) Comparing activations - throw away spatial information
  • (1:43:00) Dot product of channels - intuition (see the sketch after this timeline)
  • (1:52:00) Save features for all blocks
  • (1:57:16) Style transfer combined
  • (2:00:05) Google magenta - music project
  • (2:01:25) Putting a shield in
  • (2:02:35) Probabilistic programming
  • (2:05:00) Pre-training for generic style transfer
  • (2:05:40) Pictures from paper
  • (2:06:45) Maths in the paper
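
The style-loss part of the timeline (forward hooks, comparing activations, dot product of channels) boils down to computing Gram matrices of intermediate VGG activations. Below is a minimal PyTorch sketch of that idea, not the lesson notebook itself; the layer indices and normalization are placeholder assumptions.

    import torch
    import torch.nn.functional as F
    from torchvision.models import vgg16

    # Pretrained VGG used only as a fixed feature extractor.
    vgg = vgg16(pretrained=True).features.eval()
    for p in vgg.parameters():
        p.requires_grad_(False)

    # Forward hooks save the activations of a few blocks (indices are placeholders).
    saved = {}
    def save_hook(name):
        def hook(module, inp, out):
            saved[name] = out
        return hook

    layer_ids = [5, 12, 22]
    for i in layer_ids:
        vgg[i].register_forward_hook(save_hook(i))

    def gram(x):
        # Flatten the spatial dims, then take the dot product of every pair of
        # channels -- this throws away spatial information, keeping only style.
        b, c, h, w = x.shape
        f = x.view(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)

    def style_loss(input_img, style_img):
        vgg(style_img)
        style_grams = [gram(saved[i]).detach() for i in layer_ids]
        vgg(input_img)
        return sum(F.mse_loss(gram(saved[i]), g) for i, g in zip(layer_ids, style_grams))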

Other tips and resources

For the CycleGAN notebook

  • data source: !wget https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/horse2zebra.zip
  • modify the following code to get started:
opt = TrainOptions().parse(['--dataroot', '/data0/datasets/cyclegan/horse2zebra', '--nThreads', '8', '--no_dropout', '--niter', '100', '--niter_decay', '100', '--name', 'nodrop', '--gpu_ids', '2'])
  • '--dataroot', '/data0/datasets/cyclegan/horse2zebra': path where horse2zebra.zip was unpacked
  • '--nThreads', '8': lower the number of threads if kernels die
  • '--gpu_ids', '2': set to '0' if you only have one GPU
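
The same call again, split across lines with the flag explanations above as comments (the paths and GPU id are whatever matches your own setup):

    opt = TrainOptions().parse([
        '--dataroot', '/data0/datasets/cyclegan/horse2zebra',  # where horse2zebra.zip was unpacked
        '--nThreads', '8',        # lower this if your kernels die
        '--no_dropout',
        '--niter', '100',
        '--niter_decay', '100',
        '--name', 'nodrop',
        '--gpu_ids', '2',         # set to '0' if you only have one GPU
    ])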

Data for style-transfer notebook, ImageNet Object Detection Challenge


Can you please discuss intuitions for when you’d use one kind of learning schedule vs. another?

The customized learning rate finder is in a pull request right now, so you’ll have to wait a bit to use that specific feature.


Can you please explain the best practice for finding the best clr_div and cut_div in CLR?


Great work on those features, that will be awesome to be able to explore.


You’re welcome! ^^


Does anyone have a reference for the “concat pooling” trick from the 2nd-place CIFAR-10 competition entry?

If I’m right, details of “concat pooling” can be found in Jeremy’s paper.

Which paper? Thanks in advance.

Fine-tuned Language Models for Text Classification

Basically it means that instead of using the final hidden vector from an RNN, you just concatenate the Max & Avg pool of all the hidden states along with the final hidden vector to be passed on to the next layer.
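
A rough PyTorch sketch of that idea (illustrative only, not the fastai implementation):

    import torch

    def concat_pooling(hidden_states):
        # hidden_states: (seq_len, batch, hidden) -- all outputs of the RNN
        last = hidden_states[-1]                    # final hidden vector
        max_pool = hidden_states.max(dim=0).values  # max over time steps
        avg_pool = hidden_states.mean(dim=0)        # average over time steps
        # Concatenate along the feature dimension -> (batch, 3 * hidden)
        return torch.cat([last, max_pool, avg_pool], dim=1)

    # usage: outputs, _ = rnn(x); features = concat_pooling(outputs)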


A 1x1 conv is usually called a “network within a network” in the literature; what is the intuition behind that name?


Throwback post by Chris Olah I was just re-reading this weekend that talks about dimensionality reduction: http://colah.github.io/posts/2014-10-Visualizing-MNIST/

I appreciate Jeremy’s attention to visualizing the convnets so that we can understand dimensionality reduction with his drawings/visual aids. So helpful. =)

Here you go.


@sgugger, do you have a link to the notebook Jeremy went through?


@sgugger’s work here: https://github.com/sgugger/Deep-Learning/blob/master/Understanding%20the%20new%20fastai%20API%20for%20scheduling%20training.ipynb


Is it safe to say nn.Embedding is the same as a “low-rank approximation”?


Interesting. The 5x1 followed by 1x5 factorization seems like the opposite of the mixture-of-softmaxes operation, where we’re trying to increase the rank of the matrices.

I’m surprised that the rank 1 representation is rich enough to represent the problem without degrading performance significantly.
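
To make the low-rank idea concrete, here is a small PyTorch sketch comparing a full 5x5 convolution with the factored 5x1-then-1x5 version from the lecture (the channel count is arbitrary):

    import torch.nn as nn

    c = 64  # arbitrary number of channels

    full = nn.Conv2d(c, c, kernel_size=5, padding=2)
    factored = nn.Sequential(
        nn.Conv2d(c, c, kernel_size=(5, 1), padding=(2, 0)),
        nn.Conv2d(c, c, kernel_size=(1, 5), padding=(0, 2)),
    )

    def n_params(m):
        return sum(p.numel() for p in m.parameters())

    print(n_params(full), n_params(factored))  # 102464 vs. 41088 parameters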

Progressive GANs paper:

Progressive Growing of GANs for Improved Quality, Stability, and Variation


Some interesting Progressive GANs links:

http://research.nvidia.com/publication/2017-10_Progressive-Growing-of


Why is there such a big variation in face type/structure (and hair!!!) when the faces are being GANed?