This is a wiki post - feel free to edit to add links from the lesson or other useful info.
Thanks for a great lesson! There is a lot to digest in this one; I think it might take me a while, but it's great to see it all coming together.
There is one downside to using `einops` and `torch.einsum`: they have historically had performance issues compared to using the transpose method.
Out of the box, using `einops` with PyTorch 2.0 and `torch.compile` will also decrease performance, since `torch._dynamo` doesn't recognize `einops` code as traceable. This causes a break in the model's compiled graph, kicking the computation for `rearrange` back to the slower eager mode. This may be fixed in future PyTorch versions.

There is a workaround using `torch._dynamo.allow_in_graph`, which I have reproduced from chilli in the EleutherAI Discord:
```python
from einops import rearrange
from torch._dynamo import allow_in_graph

allow_in_graph(rearrange)
```
This tells dynamo that `rearrange` is traceable, allowing PyTorch to compile `rearrange` into the model graph. I believe this will work for most, if not all, `einops` methods.

`allow_in_graph` is not required for `torch.einsum`, since dynamo is already aware of it.
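For anyone who wants to see the workaround end to end, here is a minimal sketch of how the registration fits together with `torch.compile`. The `split_heads` function and the tensor shapes are just illustrative assumptions, not code from the lesson notebook:

```python
import torch
from einops import rearrange
from torch._dynamo import allow_in_graph

# Register rearrange so dynamo traces it instead of breaking the graph
allow_in_graph(rearrange)

def split_heads(x, heads=8):
    # Split the embedding dimension into attention heads
    return rearrange(x, "b s (h d) -> b h s d", h=heads)

compiled_fn = torch.compile(split_heads)
out = compiled_fn(torch.randn(2, 16, 64))
print(out.shape)  # torch.Size([2, 8, 16, 8])
```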
I noticed that we have an `imports.py` in the miniai folder. Is this manually created, or was it autogenerated by nbdev? I am not so familiar with nbdev, and in my own version of the repo I have been avoiding manually changing things in this folder.
I manually created `imports.py`.
FYI, there was an error in this video (h/t @ste for spotting) where I accidentally drew the heads in multi-headed attention on the wrong axis of the matrix. I’ve uploaded a new video where I’ve added some text to explain this now.
@jeremy btw the video embedded in Practical Deep Learning for Coders - 24: Attention & transformers doesn't work. My guess is that the embedded link is wrong. Currently it is:

`src="https://www.youtube-nocookie.com/embed/https://www.youtube.com/watch?v=DH5bp6zTPB4?modestbranding=1"`

It should be:

`src="https://www.youtube-nocookie.com/embed/DH5bp6zTPB4?modestbranding=1"`
Update on `rearrange` and `torch.compile`:

einops 0.6.1 added `torch.compile` support. If you use an einops layer (`Rearrange`), it will work out of the box. If you use an einops function (`rearrange`), as Jeremy does in the lesson, then you still need to register the function to prevent it from breaking the graph. However, einops now has a function to do this automatically:
```python
from einops._torch_specific import allow_ops_in_compiled_graph

allow_ops_in_compiled_graph()
```
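As a quick illustration of the layer form that needs no registration at all, here is a minimal sketch (the model and shapes are made-up examples, assuming einops >= 0.6.1 and PyTorch 2.x):

```python
import torch
from torch import nn
from einops.layers.torch import Rearrange

# The layer form (Rearrange) compiles out of the box with einops >= 0.6.1
model = nn.Sequential(
    nn.Linear(64, 64),
    Rearrange("b s (h d) -> b h s d", h=8),
)
compiled_model = torch.compile(model)
print(compiled_model(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 8, 16, 8])
```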
For anyone that got this error: `ImportError: cannot import name 'AttentionBlock' from 'diffusers.models.attention'`

Apparently `AttentionBlock` has been replaced by just `Attention` (source). Try this:

```python
from diffusers.models.attention import Attention as AttentionBlock
```

But the results won't be the same, because they seem to have changed how the class works.