Why do we need
torch.cuda.synchronize()? Is it some kind of lock to synchronize CUDA threads or something?
For timing, it’s needed because some of the operations are asynchronous: without synchronizing, you measure when the work was launched, not when it finished.
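For example, a minimal timing sketch (`time_forward`, `model`, and `inp` are placeholder names, not library APIs):

```python
import time
import torch

def time_forward(model, inp):
    # Wait for any pending GPU work before starting the clock.
    torch.cuda.synchronize()
    start = time.perf_counter()
    out = model(inp)
    # Kernels launch asynchronously, so wait for the forward pass
    # itself to finish before stopping the clock.
    torch.cuda.synchronize()
    return out, time.perf_counter() - start
```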
Maybe we could use LLVM to compile Python to C++ instead of the PyTorch JIT…
Why would you drop weights or embeddings? What are the advantages?
Try without and see
It’s another form of regularization. For the impact it has, you can look at the ablation table in the paper.
I should know better: try first, then ask.
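For anyone curious, here is a minimal sketch of embedding dropout in plain PyTorch (the function name and details are illustrative, not fastai’s exact implementation). Zeroing whole rows of the embedding matrix means a dropped word stays dropped everywhere it appears in the batch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def embedding_dropout(embed: nn.Embedding, words: torch.Tensor, p: float = 0.1):
    # Drop entire rows of the embedding matrix, so each dropped word
    # disappears consistently across the whole batch; rescale the rest.
    if not embed.training or p == 0.0:
        return embed(words)
    mask = embed.weight.new_empty((embed.num_embeddings, 1)).bernoulli_(1 - p) / (1 - p)
    return F.embedding(words, embed.weight * mask)
```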
Don’t we have to implement gradient clipping before we can use the PyTorch version?
Eh eh, he should have, but it’s late.
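In the meantime, a minimal sketch of PyTorch’s built-in clipping inside a training step (`training_step` and its arguments are placeholders):

```python
import torch
from torch import nn

def training_step(model: nn.Module, opt: torch.optim.Optimizer,
                  xb: torch.Tensor, yb: torch.Tensor, clip: float = 0.1):
    # Forward and backward as usual, then clip the global gradient
    # norm in place before the optimizer step.
    loss = nn.functional.cross_entropy(model(xb), yb)
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip)
    opt.step()
    opt.zero_grad()
    return loss.item()
```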
Did you do any gradual unfreezing of layers when you fit the IMDb data using the wiki model?
Yes, it is in the notebooks.
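As a rough sketch, gradual unfreezing just toggles `requires_grad` group by group (`layer_groups` here is an assumed list of `nn.Module`s ordered from input to output, not a fastai API):

```python
from torch import nn

def freeze_to(layer_groups: list, n: int):
    # Freeze all parameter groups before index n; leave the rest trainable.
    for i, group in enumerate(layer_groups):
        for p in group.parameters():
            p.requires_grad = i >= n

# Typical schedule: train only the head first, then unfreeze one more
# group per phase until the whole model is trainable.
```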
What’s the best way to start learning Swift?
They just answered that question. There is a Swift tour online, but if you have an iPad or a Mac, download it in Playgrounds to read it interactively.
Is Swift not feasible on older Mac systems?
Feel better soon Jeremy and Rachel! (And welcome Chris!)
Thank you, fastai team, for this lesson! Jeremy, we could see your health made it harder than usual, so thank you for pushing through!
Great class as always! Thank you to all who are making this possible.
Couldn’t agree more. The ability to integrate machine learning with applications seamlessly is one of the biggest opportunities of S4TF, and the reason I am interested in learning Swift/S4TF.
Want to know about Mixup? Look here: https://www.inference.vc/mixup-data-dependent-data-augmentation/
We take pairs of datapoints (x₁, y₁) and (x₂, y₂), then choose a random mixing proportion λ from a Beta distribution, and create an artificial training example (λx₁ + (1−λ)x₂, λy₁ + (1−λ)y₂). We train the network by minimizing the loss on the mixed-up datapoints. This is all.
This is a better explanation of Mixup data augmentation.
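A minimal sketch of that recipe in PyTorch (`mixup_batch` is an illustrative name, not a library function):

```python
import torch

def mixup_batch(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.4):
    # Draw the mixing proportion lambda from Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    # Pair each example with a random partner from the same batch.
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[perm]
    # Train on x_mixed; mix the two losses (or one-hot targets) with the same lambda:
    #   loss = lam * criterion(out, y) + (1 - lam) * criterion(out, y[perm])
    return x_mixed, y, y[perm], lam
```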