Let’s try to break this down together. Here’s the equation (which I typed in using LaTeX markup, hopefully copied correctly):
h_g = \tanh \left( \sum_{v \in V} \sigma \left( i(h_v^{(T)}, x_v) \right) \odot \tanh \left( j(h_v^{(T)}, x_v) \right) \right)
I’m rather rusty at reading heavy math equations these days (university was a long time ago), so I’ll try my best. Please do correct me if my understanding is wrong.
In plain English, reading from the outside in:
- the tanh operation
- applied to the result of
- the summation of
- the ⊙ operation (assuming this is elementwise multiplication, like you said) on
- two tensors (indexed with v),
- with the summation ranging over the elements v in V
The above doesn’t really feel intuitive, so I think it’s better to start from the inside of the equation and work our way out.
There are two tensors at the innermost part, and both have elements indexed with v. Since the summation ranges over V (indexed with v), you can get rid of the confusion that the summation only applies to the first tensor. It’s applied to the result of the ⊙ operation (otherwise the v in the second portion wouldn’t make sense).
So basically, you have the following two tensors (with v ranging over V).
I’m using A and B only for simplification; I don’t know if this representation is strictly mathematically correct.
A_v = \sigma \left( i(h_v^{(T)}, x_v) \right)
and
B_v = \tanh \left( j(h_v^{(T)}, x_v) \right)
Then you perform the ⊙ operation between them, sum the results up (over the vs), and apply tanh to that sum.
h_g = \tanh \left( \sum_{v \in V} A_v \odot B_v \right)
(Of course, in PyTorch that’s not the order in which the code executes; I only wrote it down here in inside → out order to help motivate the explanation.)
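If it helps, here’s a minimal PyTorch sketch of that computation. I’m assuming here that i and j are learned linear layers applied to the concatenation of h_v^{(T)} and x_v, and that the per-node states are stacked into [num_nodes, ...] tensors; the names (GraphReadout, hidden_dim, feat_dim, out_dim) are just made up for illustration, so your actual model may define i and j differently.

```python
import torch
import torch.nn as nn

class GraphReadout(nn.Module):
    """Sketch of h_g = tanh( sum_v sigma(i(h_v, x_v)) ⊙ tanh(j(h_v, x_v)) ).

    Assumption: i and j are linear layers applied to the concatenation
    [h_v^{(T)}, x_v]. Adjust to however your model actually defines them.
    """

    def __init__(self, hidden_dim, feat_dim, out_dim):
        super().__init__()
        self.i = nn.Linear(hidden_dim + feat_dim, out_dim)  # gating branch
        self.j = nn.Linear(hidden_dim + feat_dim, out_dim)  # content branch

    def forward(self, h_T, x):
        # h_T: [num_nodes, hidden_dim]  -- final node states h_v^{(T)}
        # x:   [num_nodes, feat_dim]    -- node features x_v
        hx = torch.cat([h_T, x], dim=-1)       # the pair (h_v^{(T)}, x_v)
        a = torch.sigmoid(self.i(hx))          # A_v = sigma(i(...))
        b = torch.tanh(self.j(hx))             # B_v = tanh(j(...))
        return torch.tanh((a * b).sum(dim=0))  # h_g = tanh(sum_v A_v ⊙ B_v)

# Example: 10 nodes, 64-dim hidden states, 16-dim features, 32-dim graph vector
readout = GraphReadout(hidden_dim=64, feat_dim=16, out_dim=32)
h_g = readout(torch.randn(10, 64), torch.randn(10, 16))  # shape: [32]
```

Note that the sum over v collapses the node dimension, so h_g no longer depends on the number of nodes, which is the whole point of this readout.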
Hopefully, that helps to some extent.
And even more hopefully, this explanation is correct. 