Part 2 Lesson 13 Wiki


(Jeremy Howard (Admin)) #1

This thread is a wiki - please add any links etc that you think may be useful.

<<< Wiki: Lesson 12 | Wiki: Lesson 14 >>>

Miscellaneous:

GANs

AI and Ethics

Timeline

  • (0:00:01) Image enhancement
  • (0:00:40) Deep painterly harmonization paper - Style transfer
  • (0:01:10) Stochastic weight averaging (William Horton)
  • (0:02:05) Train Phase API
  • (0:03:35) Training phase API explanation
  • (0:03:41) Picture of iterations - step learning rate decay
  • (0:04:30) Training Phases explanation
  • (0:05:50) LR decay examples
  • (0:07:52) Adding your own schedulers - example SGDR
  • (0:08:22) Example of doing 1cycle
  • (0:08:58) Discriminative learning rates
  • (0:09:23) LARS paper - form of discriminative learning rates
  • (0:10:05) Customized LR finders
  • (0:11:10) Change the optimizer
  • (0:11:50) Change the data during training
  • (0:12:50) DAWNBench competition for ImageNet
  • (0:15:16) CIFAR result on DAWNBench
  • (0:17:05) Conv architecture gap - Inception-ResNet
  • (0:19:35) Concat in Inception
  • (0:22:43) Basic idea of Inception networks
  • (0:23:20) Instead of an AxA conv, use Ax1 followed by 1xA - lower-rank approximation
  • (0:27:00) factored convolutions
  • (0:27:30) Stem in backbone
  • (0:30:00) Image enhancement paper - Progressive GANs
  • (0:30:40) Inner network - irrelevant
  • (0:31:10) Progressive GAN - increase image size
  • (0:34:02) 1024x1024 images
  • (0:34:30) Obama fake video
  • (0:35:30) Questions and Ethics in AI
  • (0:36:55) Face recognition from various companies
  • (0:38:40) Women vs. men bias
  • (0:40:08) Google Translate men vs. women
  • (0:40:40) Machine learning can amplify bias
  • (0:42:15) Facebook examples
  • (0:45:15) Face detection
  • (0:46:15) meetup.com example - more men attending
  • (0:47:50) Bias black vs white
  • (0:52:46) Responsibilities in hiring
  • (0:54:07) IBM’s impact on Nazi Germany
  • (0:56:50) Dropout patent
  • (0:57:19) Artistic style transfer - Patent
  • (1:02:08) Code style transfer
  • (1:07:35) content loss and style loss
  • (1:11:20) Compare activations - perceptual loss
  • (1:13:15) Code style transfer
  • (1:15:25) random image
  • (1:17:22) Using mid layer activations
  • (1:19:05) optimizer
  • (1:20:25) LBFGS optimizer
  • (1:21:15) LBFGS algorithm works well
  • (1:21:40) Limited memory optimizer
  • (1:22:30) Diagram - optimizer explanation how it works
  • (1:25:05) Keeping track of every step takes a lot of memory, so keep only the last few gradients
  • (1:26:52) Code for optimizer
  • (1:28:16) content loss
  • (1:29:32) PyTorch hooks - forward hooks
  • (1:31:41) VGG activations
  • (1:36:42) Single-precision floating point, half precision
  • (1:38:22) Pictures from paper
  • (1:38:50) Grab activations of some layer
  • (1:39:35) Create style loss
  • (1:40:35) Look at painting from wikipedia
  • (1:41:15) Comparing activations - throw away spatial information
  • (1:43:00) Dot product of channels - intuition
  • (1:52:00) save features for all blocks
  • (1:57:16) Style transfer combined
  • (2:00:05) Google magenta - music project
  • (2:01:25) Putting shield in
  • (2:02:35) probabilistic programming
  • (2:05:00) Pre-training for generic style transfer
  • (2:05:40) Pictures from paper
  • (2:06:45) Maths in the paper

Other tips and resources

For cyclegan notebook

  • Data source: !wget https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/horse2zebra.zip
  • Modify the following code to get started:
opt = TrainOptions().parse(['--dataroot', '/data0/datasets/cyclegan/horse2zebra', '--nThreads', '8', '--no_dropout', '--niter', '100', '--niter_decay', '100', '--name', 'nodrop', '--gpu_ids', '2'])
  • '--dataroot', '/data0/datasets/cyclegan/horse2zebra': path to the unzipped horse2zebra data
  • '--nThreads', '8': lower the number of threads if kernels die
  • '--gpu_ids', '2': set '0' if you only have one GPU
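For example, on a single-GPU machine with the data unzipped to a hypothetical ./data/horse2zebra directory, the same call could look like this (the path and thread count below are placeholders, not from the lesson):

opt = TrainOptions().parse([
    '--dataroot', './data/horse2zebra',  # placeholder - point at your unzipped data
    '--nThreads', '4',                   # lower if kernels die
    '--no_dropout',
    '--niter', '100',
    '--niter_decay', '100',
    '--name', 'nodrop',
    '--gpu_ids', '0',                    # single GPU
])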

Data for style-transfer notebook, ImageNet Object Detection Challenge


(blake west) #13

Can you please discuss intuitions for when you’d use one kind of learning schedule vs. another?


#16

The customized learning rate finder is in a pull request right now, so you’ll have to wait a bit to use that specific feature.


(YangLu) #17

Can you please explain the best practice for finding the best clr_div,cut_div in clr?


(Kevin Bird) #18

Great work on those features; it will be awesome to be able to explore them.


#19

You’re welcome! ^^


#20

Does anyone have a reference for the “concat pooling” trick from the 2nd place CIFAR-10 competition entry?


(Ananda Seelan) #21

If I’m right, details of “concat pooling” can be found in Jeremy’s paper.


#22

which paper? thanks in advance.


(Ananda Seelan) #23

Fine-tuned Language Models for Text Classification

Basically, instead of using only the final hidden state from an RNN, you concatenate the max pool and average pool of all the hidden states along with the final hidden state, and pass that on to the next layer.
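A minimal PyTorch sketch of that idea (the function name and tensor shapes are illustrative, not taken from the fastai source):

import torch

def concat_pool(hidden_states):
    # hidden_states: (seq_len, batch, hidden_dim) outputs from an RNN
    last = hidden_states[-1]                  # final hidden state
    avg = hidden_states.mean(dim=0)           # average pool over time
    mx = hidden_states.max(dim=0)[0]          # max pool over time
    return torch.cat([last, mx, avg], dim=1)  # (batch, 3 * hidden_dim)

h = torch.randn(10, 4, 8)                     # 10 timesteps, batch 4, hidden size 8
print(concat_pool(h).shape)                   # torch.Size([4, 24])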


#24

A 1x1 conv is usually called a “network within network” in the literature; what is the intuition behind that name?


(Erin Pangilinan) #25

Throwback post by Chris Olah I was just re-reading this weekend that talks about dimensionality reduction: http://colah.github.io/posts/2014-10-Visualizing-MNIST/

I appreciate Jeremy’s attention to visualizing the convnets so that we can understand dimensionality reduction with his drawings/visual aids. So helpful. =)
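To make the “network within network” intuition from the question above concrete: a 1x1 conv is just a tiny fully connected layer across channels, applied independently at every spatial position, and it is also how Inception-style blocks reduce channel dimensionality. A hedged PyTorch illustration (the shapes are made up, not from the lesson notebook):

import torch
import torch.nn as nn

x = torch.randn(1, 256, 32, 32)             # (batch, channels, height, width)
reduce = nn.Conv2d(256, 64, kernel_size=1)  # mixes the 256 channels at each pixel
print(reduce(x).shape)                       # torch.Size([1, 64, 32, 32])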


(Phani Srikanth) #45

Here you go.


(Kevin Bird) #46

@sgugger, do you have a link to the notebook Jeremy went through?


(Phani Srikanth) #48

@sgugger’s work here: https://github.com/sgugger/Deep-Learning/blob/master/Understanding%20the%20new%20fastai%20API%20for%20scheduling%20training.ipynb


#49

Is it safe to say nn.Embedding is the same as a “low-rank approximation”?


(Even Oldridge) #50

Interesting. The 5x1 and 1x5 transform seems like the opposite operation from mixture of softmaxes, where we’re trying to increase the rank of the matrices.

I’m surprised that the rank 1 representation is rich enough to represent the problem without degrading performance significantly.
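A rough sketch of the factored convolution being discussed, replacing a full 5x5 conv with a 5x1 followed by a 1x5 (channel sizes are made up; this is not the Inception code):

import torch
import torch.nn as nn

full = nn.Conv2d(64, 64, kernel_size=5, padding=2)
factored = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(5, 1), padding=(2, 0)),
    nn.Conv2d(64, 64, kernel_size=(1, 5), padding=(0, 2)),
)

x = torch.randn(1, 64, 32, 32)
print(full(x).shape, factored(x).shape)       # both torch.Size([1, 64, 32, 32])

n_full = sum(p.numel() for p in full.parameters())      # 5*5*64*64 + 64
n_fact = sum(p.numel() for p in factored.parameters())  # 2*(5*64*64 + 64)
print(n_full, n_fact)                         # roughly a 2.5x parameter reduction

The stacked pair covers the same 5x5 receptive field with far fewer parameters, which is presumably why the lower-rank form does not degrade performance much.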


(Pavel Surmenok) #53

Progressive GANs paper:

Progressive Growing of GANs for Improved Quality, Stability, and Variation


(Britt Selvitelle) #54

Some interesting Progressive GANs links:

http://research.nvidia.com/publication/2017-10_Progressive-Growing-of


(YangLu) #55

Why is there such a big variation in face type/structure (and hair!!!) when the faces are being GANzed?