Interesting article on NVIDIA DALI: Data Augmentation Library
There’s a little starter for using DALI in the course repo BTW. It is just enough to give you a sense of how to get started writing your own data-blocks-style API using DALI. I’ll probably come back to it and flesh it out in the coming weeks.
In sgd_step we say p.data.add_(-lr, p.grad.data). Why do we use two arguments instead of multiplying?
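For context, the two-argument form is the older "scaled add" spelling: add_(value, other) computes a fused p = p + value * other in one in-place operation, without materializing a temporary -lr * p.grad tensor (recent PyTorch prefers the explicit p.data.add_(p.grad.data, alpha=-lr)). A pure-Python sketch of the semantics, not the actual PyTorch kernel:

```python
# Sketch of what p.data.add_(-lr, p.grad.data) computes:
# a fused "scaled add", p <- p + (-lr) * grad, done in place.
def sgd_step(p, grad, lr):
    for i in range(len(p)):
        p[i] += -lr * grad[i]  # in-place, like the trailing underscore in add_
    return p

p = [1.0, 2.0]
grad = [0.5, 0.5]
sgd_step(p, grad, lr=0.1)  # p is now [0.95, 1.95]
```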
I’d probably not use type, and would use to(device=..., dtype=...) instead.
Best regards
Thomas
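A small sketch of the advice above: .to() handles device and dtype in one call (and is a no-op when the tensor already matches), whereas chaining .type(...) and .cuda() does two separate conversions. CPU-only here so it runs anywhere:

```python
import torch

t = torch.zeros(3)
# one call moves device and casts dtype together
t64 = t.to(device='cpu', dtype=torch.float64)
```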
So
%timeit -n 10 grid = F.affine_grid(theta.cuda(), x.size())
would become the (more verbose, unfortunately)
theta_cuda = theta.cuda()
def time_fn():
    grid = F.affine_grid(theta_cuda, x.size())
    torch.cuda.synchronize()
time_fn() # mini warm-up and synchronize
%timeit -n 10 time_fn()
The warm-up is generally worth doing anyway, and it gives us a torch.cuda.synchronize(), so everything queued before our function is done by the time we call it.
Then the synchronize inside time_fn() makes sure we don’t read off the time before the kernel is actually done.
I guess one could make a %cuda_timeit
magic to get back to the nice, short way of calling it.
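A sketch of what such a helper could look like (hypothetical name; the sync callable is a stand-in for torch.cuda.synchronize, so the pattern can also be exercised on CPU):

```python
import time

def cuda_timeit(fn, n=10, sync=lambda: None):
    """Average runtime of fn over n runs; pass sync=torch.cuda.synchronize on GPU."""
    fn()       # warm-up run
    sync()     # make sure prior kernels have finished before we start the clock
    start = time.perf_counter()
    for _ in range(n):
        fn()
    sync()     # don't read the clock before the last kernel is actually done
    return (time.perf_counter() - start) / n

avg = cuda_timeit(lambda: sum(range(1000)))
```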
Would it simplify the code in Learner’s init and remove the need for cb_funcs if our callbacks were just passed to Learner through the cbs kwarg as a list of class instances, e.g. cbs = [Recorder(), AvgStatsCallback(accuracy)] instead of cbfs = [Recorder, partial(AvgStatsCallback, accuracy)]?
Not if the callback is a LearnerCallback subclass that requires learn at instantiation; e.g. see the examples here: https://docs.fast.ai/metrics.html#Creating-your-own-metric
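A minimal sketch (simplified names, not the actual course code) of why both forms exist: plain callbacks can be instantiated up front, but a callback that needs learn in its __init__ has to be passed as a factory and constructed inside Learner, once self exists:

```python
class Callback: pass

class Recorder(Callback):
    def __init__(self): self.lrs = []          # needs nothing at init

class LearnerCallback(Callback):
    def __init__(self, learn): self.learn = learn  # needs the Learner itself

class Learner:
    def __init__(self, cbs=None, cb_funcs=None):
        self.cbs = list(cbs or [])
        for cbf in (cb_funcs or []):
            self.cbs.append(cbf(self))          # can only be built once self exists

learn = Learner(cbs=[Recorder()], cb_funcs=[LearnerCallback])
```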
OK, thank you Jeremy.
I didn’t know, because it isn’t mentioned in the official Callback documentation. I will try it.
They start by defining beta and zeta values (in the model definition).
Then, by applying the annealing algorithm to these values (parabolic and exponential annealing), they reduce the loss during training.
It cannot yet be applied in the model definition in a single step, the way dropout can, but I am happy to find this implementation. Congratulations to the authors!
I discovered the delta rule last year during my PhD in AI at BIU.
Thanks for sharing @Kaspar!
Does it make sense to normalize images after data augmentation, in case that step introduces overly large distortions, especially to colors?
Usually (e.g. for ImageNet) the normalization is fixed based on statistics of the entire training dataset, not per-image statistics. (And that makes sense; consider Jeremy’s fog vs. sunny thought experiment from the other day.) As such, the normalization doesn’t depend on the augmentation, so it doesn’t matter as much.
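To make the distinction concrete, here is a sketch using the widely quoted ImageNet channel statistics: they are dataset-wide constants, computed once over the training set and applied identically to every image, augmented or not:

```python
import numpy as np

# channel means/stds computed once over the whole training set
# (these are the commonly used ImageNet values)
mean = np.array([0.485, 0.456, 0.406])
std  = np.array([0.229, 0.224, 0.225])

def normalize(img):
    """img: float array of shape (H, W, 3) with values in [0, 1]."""
    return (img - mean) / std  # same constants for every image

img = np.full((2, 2, 3), 0.5)
out = normalize(img)
```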
Thank you Stas!
All about magic methods in Python:
http://minhhh.github.io/posts/a-guide-to-pythons-magic-methods
Check the docs for add and tell us what you find.
None of the things we’ve built in this course are in the docs. We’re building everything from scratch, remember!
No, because the point of such augmentation would be lost if you then normalized it out again!
I’m getting an assertion error for test_eq(setify('aa'), {'aa'}). It looks like 'aa' is getting listified as ['a', 'a'], which then turns into the set {'a'}. My listify is coming from nb04; any suggestions?
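For reference, the usual fix is to special-case strings in listify so they are kept whole rather than iterated character by character. A sketch in the spirit of the course’s listify (not necessarily the exact nb04 code):

```python
from collections.abc import Iterable

def listify(o):
    if o is None: return []
    if isinstance(o, list): return o
    if isinstance(o, str): return [o]           # keep strings whole
    if isinstance(o, Iterable): return list(o)  # other iterables expand
    return [o]

def setify(o):
    return o if isinstance(o, set) else set(listify(o))

setify('aa')  # -> {'aa'}, not {'a'}
```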
OK, I got it Jeremy. Thank you!
I don’t understand why you shouldn’t normalize after augmentation. Augmentation extends the training set with “new” images, exactly as if we had gathered such images in the field as part of our data set. In the latter case, we would normalize the whole data set, after its collection. Why wouldn’t we treat the data set containing the augmented images the same way?
You might need to git pull the course repo.