Is there a simple way to do gradient accumulation in fastai? Or would the best way be to write a custom fit loop?
Wouldn't the super-resolution example that Jeremy just mentioned still work if you combined the step and the zero into a single function? Does anyone else have another example where it wouldn't work?
There is a Callback for that
You never know when someone might not want to zero the grads every time, and the fastai training loop zeroes them for you, so it's not a real hassle
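For context, the mechanics being discussed (accumulate gradients across mini-batches, step the optimizer once, and only then zero the grads) can be sketched without fastai or PyTorch at all. This is a toy illustration with a 1-D linear model and hand-computed gradients; `accum_steps` and the data are made up here, not fastai's API:

```python
# Minimal sketch of gradient accumulation: a 1-D linear model y = w*x
# with hand-computed gradients, stepping once every accum_steps batches.
def grad(w, x, y):
    # d/dw of the squared error (w*x - y)**2
    return 2 * (w * x - y) * x

w = 0.0
lr = 0.1
accum_steps = 4
acc = 0.0          # accumulated gradient
steps_taken = 0

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x

for i, (x, y) in enumerate(data, start=1):
    acc += grad(w, x, y) / accum_steps   # accumulate instead of stepping
    if i % accum_steps == 0:
        w -= lr * acc                    # one step per accum_steps batches
        acc = 0.0                        # zero grads only after stepping
        steps_taken += 1

print(steps_taken)  # 1 optimizer step for 4 mini-batches
```

The point of the scaling by `accum_steps` is that the accumulated gradient averages over the batches, so the effective batch size grows without changing the learning rate.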
Do you have a good place to learn about callbacks? They seem so powerful, but I haven't quite gotten my head wrapped around them.
I think I gave a talk about that recently
haha … nice meme btw!
It was a great talk. I enjoyed it very much.
I did some benchmarking of FP16 using fastai on a 2080 Ti vs a 1080 Ti (with help from @Ekami), with a gentle intro to mixed-precision training.
Not sure if it's good enough for the wiki, so I'll leave it here
if (i % 2) == 0 -> add callback
A callback is something that runs at specific points in the training loop; those points are methods on Callback:
- on epoch begin
- on batch begin, on batch end, etc.
To make a custom callback you implement a new class that subclasses Callback and overrides the methods above:
class MyCallback(Callback): ...
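The pattern itself is easy to sketch without fastai. Everything below (the hook names, the toy `fit`, the `EveryOtherBatch` class) is illustrative only, not fastai's actual API:

```python
class Callback:
    # Base class: the training loop calls these hooks; defaults are no-ops.
    def on_epoch_begin(self, epoch): pass
    def on_batch_end(self, batch): pass

class EveryOtherBatch(Callback):
    # Custom callback: subclass and override only the hooks you care about.
    def __init__(self):
        self.seen = []
    def on_batch_end(self, batch):
        if batch % 2 == 0:          # do something every other batch
            self.seen.append(batch)

def fit(n_epochs, n_batches, callbacks):
    # Toy training loop that fires the hooks at the right points.
    for epoch in range(n_epochs):
        for cb in callbacks: cb.on_epoch_begin(epoch)
        for batch in range(n_batches):
            # ... forward / backward / step would go here ...
            for cb in callbacks: cb.on_batch_end(batch)

cb = EveryOtherBatch()
fit(1, 4, [cb])
print(cb.seen)  # [0, 2]
```

The training loop never needs to know what the callback does, which is why new behavior (gradient accumulation, LR scheduling, logging) can be added without touching the loop.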
Did you guys create the callbacks, or are they native to PyTorch?
No, they're our own. PyTorch has hooks, but no callbacks.
Watching the DataLoader implementation reminded me of a question about the DataLoader API design.
DataLoaders (and Datasets) in PyTorch return one element at a time before they are put together into a mini-batch. I'm training a large model (similar to FAIR's StarSpace) and it's mostly embedding lookups. The inputs to the model are embedding indices, and I suspect the overhead of iterating over examples one at a time in Python is so high that PyTorch can't feed the GPU with data fast enough. I get low GPU utilization as a result.
Is this DataLoader overhead (not being vectorized) something others have run into?
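One common workaround is to have the Dataset hand back a whole mini-batch per call (e.g. via a batch sampler) instead of one example per call, so the per-example Python overhead disappears. A stdlib-only sketch of the difference (`get_item`/`get_batch` are hypothetical names, just to show the two access patterns):

```python
data = list(range(10))

# Per-example access: one Python call per element, which is what a
# default Dataset does before the loader collates a mini-batch.
def get_item(i):
    return data[i]

batch_a = [get_item(i) for i in range(4)]   # 4 separate Python calls

# Batched access: one call fetches the whole mini-batch at once
# (the idea behind indexing with a list of indices / a batch sampler).
def get_batch(idxs):
    return [data[i] for i in idxs]          # in practice one vectorized lookup

batch_b = get_batch(range(4))

print(batch_a == batch_b)  # same result, far fewer per-item Python calls
```

With real tensors the batched version becomes a single fancy-indexing operation, which is usually what keeps the GPU fed for embedding-lookup-heavy models.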
cc @jeremy (tried asking this q during the break but it was too long)
Was the GANModule in Part 1? Does anyone remember the lecture? I might have missed it.
Are fastai callbacks synchronous or async?
sg,
what is the diff between LearnerCb and TrackerCb?
Most likely sync
Not presented, no. It was used behind the scenes.
Synchronous.