Not sure if this matters in this domain, but (speaking as a relativity guy), in relativity the Einstein summation convention doesn't apply just because an index letter is repeated; it applies when the index appears repeated in different positions (once upper, once lower).
Yes, AFAIK it's not allowed to be repeated in the same position.
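For what it's worth, here's a quick sketch of how the convention carries over to einsum notation (shown with NumPy here; `torch.einsum` takes the same strings):

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)
x = np.arange(3.0)

# The repeated index j is summed over: y_i = sum_j A_ij x_j
y = np.einsum('ij,j->i', A, x)
assert np.allclose(y, A @ x)

# An index repeated within a single operand walks the diagonal:
# 'ii->' sums A_ii, i.e. the trace
t = np.einsum('ii->', A)
assert np.isclose(t, np.trace(A))
```

This is just an illustration of the summation rule, not anything from the lesson notebook itself.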
Interesting that we're using PyTorch for doing the "real" C execution. Kind of contrary to the "from basics" spirit. I'm assuming this is mostly to set us up to understand why the prospect of S4TF is so appealing, because we wouldn't have to hit this barrier between Python and what's "actually" happening.
Edit: haha that’s kind of what he’s saying right now.
Totally. I've used them with languages like Scala and R. Your code execution speeds up too, for the work I've done so far. Thanks for raising it.
That’s also because there is no other way if you want to be fast with python (apart from numpy).
Einsum seems interesting but I don’t quite follow why there is a performance improvement by using it in this case.
Why is einsum faster? Didn't quite understand that part.
As Dennis mentioned above, Numpy/C is also taking advantage of vectorization/SIMD (single instruction-multiple data) to speed things up.
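To make the vectorization point concrete, here's a rough timing sketch (array sizes and timings are arbitrary, just for illustration): the same elementwise add done in a Python loop versus as a single NumPy call that runs in C and can use SIMD.

```python
import time
import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Pure Python: one interpreted iteration per element
t0 = time.perf_counter()
out_loop = [a[i] + b[i] for i in range(len(a))]
loop_time = time.perf_counter() - t0

# Vectorized: one C call over the whole array, SIMD-friendly
t0 = time.perf_counter()
out_vec = a + b
vec_time = time.perf_counter() - t0

assert np.allclose(out_loop, out_vec)
print(f'loop: {loop_time:.3f}s  vectorized: {vec_time:.4f}s')
```

On my machine the vectorized version is orders of magnitude faster, but the exact ratio will vary with hardware.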
Because it has been optimized in PyTorch and lets us get rid of the last for loop in Python (did we tell you it's a slow language? I can't quite remember).
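A small sketch of what "getting rid of the loops" means (my own toy version, not the notebook's code): the all-Python matmul runs every multiply-add in the interpreter, while the einsum call hands the whole contraction to optimized C/CUDA code.

```python
import torch

a = torch.randn(8, 16)
b = torch.randn(16, 4)

def matmul_loops(a, b):
    # Three nested Python loops: every single multiply-add is interpreted
    ar, ac = a.shape
    br, bc = b.shape
    c = torch.zeros(ar, bc)
    for i in range(ar):
        for j in range(bc):
            for k in range(ac):
                c[i, j] += a[i, k] * b[k, j]
    return c

# One einsum call: the repeated index k is contracted entirely in C
c = torch.einsum('ik,kj->ij', a, b)
assert torch.allclose(c, matmul_loops(a, b), atol=1e-4)
```

Same numbers out, but the Python version scales terribly as the matrices grow.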
So if an index letter is repeated in the same position it will just leave that dimension alone, right? For example in 'bik,bkj->bij', b is repeated but also appears in the output, so it's not summed over in the multiplication?
Exactly.
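You can check this yourself; here's a minimal sketch (tensor shapes are made up): b is carried through as a batch dimension because it appears in the output, while k is summed because it's repeated and dropped from the output.

```python
import torch

a = torch.randn(5, 3, 4)  # (b, i, k)
b = torch.randn(5, 4, 2)  # (b, k, j)

# k: repeated and absent from the output -> summed over.
# b: appears in the output -> kept, so each batch element
#    gets its own independent matrix multiplication.
c = torch.einsum('bik,bkj->bij', a, b)
assert c.shape == (5, 3, 2)
assert torch.allclose(c, torch.bmm(a, b), atol=1e-6)
```

`torch.bmm` is the built-in batched matmul, so the two agreeing confirms the reading above.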
Is "@" a PyTorch shorthand?
We cover some similar themes to today’s lesson as part of lesson 1 of the fast.ai computational linear algebra course:
If you speak math, here's the Wikipedia article on the Einstein convention. Basically it allows you to drop summation signs, which was its original purpose: physicists got sick of having to write them all the time.
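For example, under the convention the matrix product loses its summation sign entirely:

```latex
C_{ik} = A_{ij} B_{jk} \quad \equiv \quad C_{ik} = \sum_{j} A_{ij} B_{jk}
```

The repeated index j on the right-hand side is what signals the implicit sum, which is exactly the rule einsum strings follow.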
No, @ is matrix multiplication in Python (introduced with Python 3.5).
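To be precise, Python itself only defines the operator: `a @ b` dispatches to `a.__matmul__(b)`, and any class can implement it. PyTorch tensors implement it as `torch.matmul`, which is why the two spellings agree. A quick sketch (the `Mat` toy class is made up for illustration):

```python
import torch

a = torch.randn(3, 4)
b = torch.randn(4, 2)

# On torch tensors, @ dispatches to torch.matmul, so results are identical
assert torch.equal(a @ b, torch.matmul(a, b))

# Plain Python objects can opt in too by defining __matmul__
class Mat:
    def __init__(self, v):
        self.v = v
    def __matmul__(self, other):
        return Mat(self.v * other.v)

assert (Mat(3) @ Mat(4)).v == 12
```

So the operator comes from Python, but the fast implementation behind it comes from whatever library defines `__matmul__`.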
But wait. We don't have matrix multiplication in Python. We just learned that we have to use PyTorch's matmul. Or did I miss something there?
You have to use PyTorch to be fast, especially if you want to use the GPU. We didn't profile PyTorch against Python's @, but they are probably the same speed on CPU.
Roger that. Just making sure Jeremy didn't somehow recreate a matmul function in that notebook, which he then exported, and I missed it. We now know exactly why the matmul/@ operation is the only way to go.
I use this practice for transformations. Please let me know if this complies with what Jeremy said about train/valid normalization:
```python
data = (src.transform((trn_tfms, trn_tfms), size=324, resize_method=ResizeMethod.SQUISH)
           .databunch(bs=40, num_workers=0)
           .normalize(imagenet_stats))
```
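For reference, here's roughly what that final `.normalize(imagenet_stats)` boils down to: the same fixed per-channel mean/std (the ImageNet statistics) applied to every batch, train and valid alike, rather than statistics recomputed per set. The batch tensors below are made up for illustration.

```python
import torch

# ImageNet per-channel statistics, reshaped to broadcast over (N, C, H, W)
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def normalize(x):
    return (x - mean) / std

train_batch = torch.rand(4, 3, 324, 324)
valid_batch = torch.rand(4, 3, 324, 324)

# Same statistics for both sets; never recompute them on the validation set
x_t, x_v = normalize(train_batch), normalize(valid_batch)
assert x_t.shape == train_batch.shape and x_v.shape == valid_batch.shape
```

The key point from the lesson is that the validation set must be normalized with the training (or pretrained-model) statistics, which using a single fixed set of stats for both satisfies.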