Is there a simple way to do gradient accumulation in fastai? Or would the best way be to write a custom fit loop?
Wouldn't the super-resolution example that Jeremy just mentioned still work if you combined the step and the zero into a single function? Does anyone else have another example where it wouldn't work?
There is a Callback for that
You never know when someone might not want to zero the grads every time, and the fastai training loop zeroes them for you, so it's not a real hassle
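For context, the mechanics being discussed (accumulate gradients across mini-batches, step the optimizer once, and only then zero the grads) can be sketched without fastai or PyTorch at all. This is a toy illustration with a 1-D linear model and hand-computed gradients; `accum_steps` and the data are made up here, not fastai's API:

```python
# Minimal sketch of gradient accumulation: a 1-D linear model y = w*x
# with hand-computed gradients, stepping once every accum_steps batches.
def grad(w, x, y):
    # d/dw of the squared error (w*x - y)**2
    return 2 * (w * x - y) * x

w = 0.0
lr = 0.1
accum_steps = 4
acc = 0.0          # accumulated gradient
steps_taken = 0

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x

for i, (x, y) in enumerate(data, start=1):
    acc += grad(w, x, y) / accum_steps   # accumulate instead of stepping
    if i % accum_steps == 0:
        w -= lr * acc                    # one step per accum_steps batches
        acc = 0.0                        # zero grads only after stepping
        steps_taken += 1

print(steps_taken)  # 1 optimizer step for 4 mini-batches
```

The point of the scaling by `accum_steps` is that the accumulated gradient averages over the batches, so the effective batch size grows without changing the learning rate.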
Do you have a good place to learn about callbacks? They seem so powerful, but I haven't quite gotten my head wrapped around them.
I think I gave a talk about that recently
haha … nice meme btw!
It was a great talk. I enjoyed it very much.
I did some benchmarking of FP16 using fastai on a 2080 Ti vs a 1080 Ti (with help from @Ekami), with a gentle intro to mixed-precision training.
Not sure if it's good enough for the wiki, so I'll leave it here
if (i % 2) == 0 -> add callback
A callback is something that runs at specific points in the training loop; those points are methods on Callback:
- on epoch begin
- on batch begin, on batch end, etc.
To make a custom callback you implement a new class that subclasses Callback and overrides the methods above:
class MyCallback(Callback): ...
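The pattern itself is easy to sketch without fastai. Everything below (the hook names, the toy `fit`, the `EveryOtherBatch` class) is illustrative only, not fastai's actual API:

```python
class Callback:
    # Base class: the training loop calls these hooks; defaults are no-ops.
    def on_epoch_begin(self, epoch): pass
    def on_batch_end(self, batch): pass

class EveryOtherBatch(Callback):
    # Custom callback: subclass and override only the hooks you care about.
    def __init__(self):
        self.seen = []
    def on_batch_end(self, batch):
        if batch % 2 == 0:          # do something every other batch
            self.seen.append(batch)

def fit(n_epochs, n_batches, callbacks):
    # Toy training loop that fires the hooks at the right points.
    for epoch in range(n_epochs):
        for cb in callbacks: cb.on_epoch_begin(epoch)
        for batch in range(n_batches):
            # ... forward / backward / step would go here ...
            for cb in callbacks: cb.on_batch_end(batch)

cb = EveryOtherBatch()
fit(1, 4, [cb])
print(cb.seen)  # [0, 2]
```

The training loop never needs to know what the callback does, which is why new behavior (gradient accumulation, LR scheduling, logging) can be added without touching the loop.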
Did you guys create the callbacks, or are they native to PyTorch?
No, they're our own. PyTorch has hooks, but no callbacks.
Watching the DataLoader implementation reminded me of a question about the DataLoader API design.
DataLoaders (and Datasets) in PyTorch return one element at a time before they are put together into a mini-batch. I'm training a large model (similar to FAIR's StarSpace) and it's mostly embedding lookups. The inputs to the model are embedding indices, and I suspect the overhead of iterating over examples one at a time in Python is so high that PyTorch can't feed the GPU with data fast enough. I get low GPU utilization as a result.
Is this DataLoader overhead (not being vectorized) something others have run into?
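One common workaround is to have the Dataset hand back a whole mini-batch per call (e.g. via a batch sampler) instead of one example per call, so the per-example Python overhead disappears. A stdlib-only sketch of the difference (`get_item`/`get_batch` are hypothetical names, just to show the two access patterns):

```python
data = list(range(10))

# Per-example access: one Python call per element, which is what a
# default Dataset does before the loader collates a mini-batch.
def get_item(i):
    return data[i]

batch_a = [get_item(i) for i in range(4)]   # 4 separate Python calls

# Batched access: one call fetches the whole mini-batch at once
# (the idea behind indexing with a list of indices / a batch sampler).
def get_batch(idxs):
    return [data[i] for i in idxs]          # in practice one vectorized lookup

batch_b = get_batch(range(4))

print(batch_a == batch_b)  # same result, far fewer per-item Python calls
```

With real tensors the batched version becomes a single fancy-indexing operation, which is usually what keeps the GPU fed for embedding-lookup-heavy models.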
cc @jeremy (tried asking this q during the break but it was too long)
Was the GANModule in Part 1? Does anyone remember the lecture? I might have missed it.
Are fastai callbacks synchronous or async?
sg,
what is the diff between LearnerCb and TrackerCb?
Most likely sync
Not presented, no. It was used behind the scenes.
Synchronous.