I have to leave now, but I wanted to post another reminder not to share the new research publicly until the MOOC is released.
Could we learn the mom parameter as well? Even use (learn) a separate mom for each measure? Might be hard to keep reasonable/comparable, I guess.
Could this running batch norm be modified so it can also properly implement batchnorm with the accumulate gradients callback?
The stuff shown now looks complex, for me at least, and is hard to follow…
Any suggestions for getting a grasp of it offline?
FYI
The latest 07_batchnorm on GitHub has an issue: ScriptModule does not get imported.
Also, there is a TypeError when running with Hooks:
v = x.var((0,2,3), keepdim=True)
TypeError: var(): argument 'dim' (position 1) must be int, not tuple
That may be an issue with pytorch-nightly. I did a git pull, ran conda update fastai, and installed the nightly.
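In case it is useful while you are stuck on that build, here is a minimal workaround sketch, assuming the only problem is that Tensor.var in that torch version does not accept a tuple of dims (the variable names are mine, not the notebook's):

import torch

x = torch.randn(64, 32, 28, 28)  # (batch, channels, height, width)

# Newer torch accepts a tuple of dims directly:
# v = x.var((0, 2, 3), keepdim=True)

# Older torch only accepts a single int dim, so collapse the reduction dims first:
v = (x.permute(1, 0, 2, 3)        # (channels, batch, h, w)
       .reshape(x.shape[1], -1)   # (channels, batch*h*w)
       .var(1)                    # per-channel variance
       .view(1, -1, 1, 1))        # broadcastable, like keepdim=True would give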
Thanks for the great lessons! Lots of things to mull over once again
No, variance is defined for any distribution.
The version of nightly is 1.0.0.dev20190403 py3.6_cuda10.0.130_cudnn7.4.2.0 pytorch
I had a situation where the pytorch release version was also installed; it happened when I did
conda update -c fastai fastai
So I removed it, but now 07 or 06 won't import torch.nn.functional with fastai=1.0.51=1.
I removed the 1.0.51 build 1 and replaced it with fastai=1.0.50.post1.
That clears the with Hooks TypeError (var(): argument 'dim' (position 1) must be int, not tuple, on v = x.var((0,2,3), keepdim=True)) and the import torch.nn.functional issue, but not the missing ScriptModule.
There is a cell:
from torch.jit import ScriptModule, script_method, script
from typing import *
which was missing from the version I pulled at 18:30 PDT.
Now that I have changed back to the previous version of fastai and added the cell to 07_batchnorm.ipynb, the notebook is running, but it has paused at cell 19, get_learn_run.
6.33am: I am back to bed. Thanks for your help.
RE ScriptModule: replace it with nn.Module, as per the later post from jph00.
Weirdly there are actually distributions that have an undefined variance (and mean for that matter) https://en.wikipedia.org/wiki/Cauchy_distribution#Explanation_of_undefined_moments
No, the purpose of mom (and eps) is to make training more stable, rather than to decrease the loss for a particular batch. So their gradients don’t help with the task they’re there for!
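Roughly what that means in code (an illustrative sketch of the running-stats part only, not the exact notebook code): mom only controls how fast the running buffers move, so it never appears in the loss for the current batch.

import torch
from torch import nn

class ToyRunningStats(nn.Module):
    # Sketch: just the running-statistics half of a batchnorm-style layer
    def __init__(self, nf, mom=0.1, eps=1e-5):
        super().__init__()
        self.mom, self.eps = mom, eps
        # Buffers, not Parameters: no gradients are ever computed for them
        self.register_buffer('means', torch.zeros(1, nf, 1, 1))
        self.register_buffer('vars',  torch.ones (1, nf, 1, 1))

    def update_stats(self, x):
        m = x.mean((0, 2, 3), keepdim=True)
        v = x.var ((0, 2, 3), keepdim=True)
        # mom sets how quickly the running stats track the batch stats,
        # i.e. it is about stability, not about this batch's loss
        self.means.lerp_(m, self.mom)
        self.vars .lerp_(v, self.mom)

    def forward(self, x):
        if self.training:
            with torch.no_grad(): self.update_stats(x)
        return (x - self.means) / (self.vars + self.eps).sqrt()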
It’s still just the filter dimension. Remember that all layers of a neural net have a number of “channels” or “filters” - it doesn’t matter what type of data was in the input.
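In other words, whatever the input data was, the stats get reduced over everything except the filter dimension. A tiny sketch with made-up shapes:

import torch

# Activations from any layer: (batch, filters, height, width)
acts = torch.randn(64, 32, 28, 28)

# One statistic per filter: reduce over batch and spatial dims, keep dim 1
mean_per_filter = acts.mean((0, 2, 3))   # shape (32,)
var_per_filter  = acts.var ((0, 2, 3))   # shape (32,)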
Apologies - it should have said nn.Module. I've fixed it in the repo now.
No, because ‘mults’ scales the overall activations - which was the purpose of the init scaling. So we init ‘mults’ to 1.0.
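Concretely, something like this (a sketch following the lesson's mults/adds naming; the exact shapes here are an assumption on my part):

import torch
from torch import nn

nf = 32  # number of filters in the layer
# Init mults to 1 and adds to 0 so the layer starts out leaving the
# (already well-scaled, thanks to the init) activations unchanged
mults = nn.Parameter(torch.ones (nf, 1, 1))
adds  = nn.Parameter(torch.zeros(nf, 1, 1))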
You should probably use our shifted init (i.e. the GeneralRelu defaults shown in the lesson), or else the very similar ELU.
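For reference, a sketch of the kind of shifted ReLU I mean (the defaults are left as None here; use the values shown in the lesson notebook):

import torch.nn.functional as F
from torch import nn

class GeneralRelu(nn.Module):
    # Leaky ReLU shifted down by `sub` so the post-activation mean stays
    # closer to 0, optionally clamped at `maxv`
    def __init__(self, leak=None, sub=None, maxv=None):
        super().__init__()
        self.leak, self.sub, self.maxv = leak, sub, maxv

    def forward(self, x):
        x = F.leaky_relu(x, self.leak) if self.leak is not None else F.relu(x)
        if self.sub  is not None: x = x - self.sub
        if self.maxv is not None: x = x.clamp_max(self.maxv)
        return x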
Ah you caught me! No we didn’t.
But… we did show that conv is just a matrix multiply, with some tied weights and zeros, and we’ve already done that from scratch; so I figured we don’t gain much doing conv from scratch too. And it would be soooooo slooooow.
But for folks still feeling a little unsure about what a conv does - you absolutely should write it yourself!
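If you do want to try it, here is one hedged way to see the "conv is just a matmul" point for yourself, using unfold to build the patch matrix (slow and memory-hungry, which is exactly why the real thing is implemented differently):

import torch
import torch.nn.functional as F

x = torch.randn(2, 3, 8, 8)      # (batch, in_channels, h, w)
w = torch.randn(16, 3, 3, 3)     # (out_channels, in_channels, kh, kw)

# Every 3x3 patch becomes a column: (batch, in_channels*3*3, n_patches)
patches = F.unfold(x, kernel_size=3)

# The convolution is then just a matrix multiply with the flattened filters
out = w.view(16, -1) @ patches   # (batch, out_channels, n_patches)
out = out.view(2, 16, 6, 6)      # fold the patch positions back into a grid

# Sanity check against the built-in conv (no padding, stride 1, no bias)
print(torch.allclose(out, F.conv2d(x, w), atol=1e-5))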
It’s fine to have a negative class for a binary problem (NLP, vision, or anything else) since it’s simply sigmoid activation and we don’t have this same issue.
But we don’t have a negative class for multi-class NLP problems IIRC…
Yes, if you know you have one and exactly one class represented in each data item, then softmax is best, since you’re helping the model by giving it one less thing to learn.
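To make that concrete (a toy sketch, not from the notebooks): sigmoid treats each class as an independent yes/no, while softmax makes the classes compete because the probabilities must sum to 1.

import torch

logits = torch.tensor([2.0, -1.0, 0.5])   # raw scores for 3 classes

# Per-class sigmoid: independent probabilities, can all be high or all be low
print(torch.sigmoid(logits))              # ≈ tensor([0.88, 0.27, 0.62])

# Softmax: exactly-one-class assumption, probabilities sum to 1
print(torch.softmax(logits, dim=0))       # ≈ tensor([0.79, 0.04, 0.18])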
nano doesn't really do enough to be useful. I wouldn't suggest spending time learning it. Use vim or emacs. Emacs is a little easier to get started with, although vim is better for manipulating datasets (though there are emacs extensions to help there).
Yes that’s what I was using. It’s pretty basic but it’s ok.