Lesson 12 (2019) discussion and wiki

Lesson resources

Software requirements

  • Some notebooks in this lesson also currently require pytorch-nightly; if yours got uninstalled, see this.
  • Notebook 10c (and subsequent notebooks) requires the NVIDIA apex Python library. In the environment you created for fastai, go to the fastai directory and run pip install git+https://github.com/NVIDIA/apex.

Papers

Blogs

Notes and other resources

AMA about Swift at the end of class

Post your questions in this separate thread:

12 Likes

Kind of a tangential question, but since Jeremy mentioned it: do you have any tips on debugging deep learning models?

9 Likes

Has mixup been successfully used in NLP yet?

Not that I know of, but you should definitely try :wink:

2 Likes

The Audio module won’t be covered today?

1 Like

Last week we shared an updated schedule. The audio module will be covered in an extra session that will be livestreamed once the course ends. We had more material than will fit in the 7 weeks of the course.

Edited to add: the dates of the extra sessions have not been set yet.

8 Likes

Has anyone tried mixup and normal augmentation like rotation/zoom?

Seems like they’d be different augmentations that could be used together.

1 Like

I tried playing around with mixup on NLP embeddings this past week, and from early experiments it seems to work well (maybe someone else has already spent more time on it!).

13 Likes

We tried, yes; in our experiments it performs the same as mixup without normal augmentation.

2 Likes

Do you have an intuition for why that might be?

How broadly can we apply mixup? Could you use it on an image regression problem?

Mixup is a much more powerful form of data augmentation, so it more or less erases the effect of everything else.

5 Likes

On Mac you can just press Ctrl-Cmd-Space and type the name of the letter (e.g. ‘gamma’).

1 Like

As long as there is a way to mix up your labels, you can try. It has been widely experimented with in single-label classification, not so much in other areas.
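To make that concrete, here is a minimal sketch of the core mixup operation on a pair of examples, assuming continuous (e.g. one-hot) label vectors so the labels can be blended the same way as the inputs. The function name and signature are mine for illustration, not fastai's API:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two examples and their labels with the same random weight.

    lam is drawn from Beta(alpha, alpha), as in the mixup paper;
    small alpha keeps most mixes close to one of the two originals.
    """
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2  # blend inputs
    y = lam * y1 + (1 - lam) * y2  # blend labels with the same weight
    return x, y, lam
```

This is why it extends beyond single-label classification in principle: for regression the targets are already continuous, so the same linear blend applies directly; anything whose labels can't be meaningfully interpolated is where it gets harder.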

It seems mixup forces the model to behave linearly between training classes; why is this desirable?

I’d reverse it: why is this not desirable?

3 Likes

Proofing against adversarial attacks, I guess.

What about backprop with the new loss?

Some researchers actually found it helps against adversarial attacks (for a generalized version of Mixup): here.

5 Likes