This is super cool.
I briefly tried to implement this in PyTorch, but found that `torch.eig()` doesn't have a derivative implemented. Then I tried to implement the closed-form calculation in the referenced paper:
`M1` tends to have negative values, so taking its square root leads to complex values. I assume this is why an eigenvalue approach was used in the first place. Clamping the values of `M1` to a minimum of 0 results in the final term of the expression becoming overwhelmingly negative relative to the other terms, so the entire expression evaluates to the square root of a negative number. I suppose it makes sense that clamping the matrix values causes issues, as it fundamentally changes the math.
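To make the problem concrete, here is a small NumPy sketch (with a made-up `M1`; the actual matrix comes from the paper's expression): an element-wise square root turns negative entries into NaN, and clamping avoids the NaN only by silently changing the matrix being rooted:

```python
import numpy as np

# Hypothetical stand-in for M1 with some negative entries
M1 = np.array([[4.0, -1.0],
               [-1.0, 2.0]])

# Element-wise square root: negative entries become NaN
# (complex values in exact arithmetic)
with np.errstate(invalid="ignore"):
    print(np.sqrt(M1))

# Clamping to a minimum of 0 removes the NaNs, but the result is
# the square root of a *different* matrix than the one in the formula
print(np.sqrt(np.clip(M1, 0.0, None)))
```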
Also on this subject: I'm assuming the square root operation is element-wise. Am I correct in thinking this? Or does it refer to the matrix square root, i.e. decomposing `M1` into a matrix `X` such that `X @ X = M1`?
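The two readings genuinely differ, which is easy to check numerically. A quick NumPy sketch (using a hypothetical symmetric positive-definite `M1`; for a symmetric matrix the matrix square root can be built from the eigendecomposition, `X = V diag(sqrt(w)) Vᵀ`):

```python
import numpy as np

# Hypothetical symmetric positive-definite matrix
M1 = np.array([[2.0, 1.0],
               [1.0, 2.0]])

# Element-wise square root
elem = np.sqrt(M1)

# Matrix square root via eigendecomposition:
# X = V diag(sqrt(w)) V^T satisfies X @ X = M1
w, V = np.linalg.eigh(M1)
X = V @ np.diag(np.sqrt(w)) @ V.T

print(np.allclose(X @ X, M1))        # True: X is a matrix square root
print(np.allclose(elem @ elem, M1))  # False: element-wise sqrt is not
```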
Can anyone think of a way forward for doing this in PyTorch other than waiting for `torch.eig()` to get a derivative implemented?
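One possible avenue, if the matrix square root interpretation is the right one: the Newton–Schulz iteration approximates the matrix square root of an SPD matrix using only matrix multiplications and scalar ops, all of which already have derivatives in PyTorch, so autograd can differentiate through it. A NumPy sketch of the iteration (the function name and iteration count are my own choices, not from the paper):

```python
import numpy as np

def newton_schulz_sqrt(A, num_iters=20):
    """Approximate matrix square root of an SPD matrix A using the
    Newton-Schulz iteration: only matmuls and scalar ops, so the same
    code written with torch tensors would be autograd-friendly."""
    n = A.shape[0]
    norm = np.linalg.norm(A)   # Frobenius norm, used to normalize A
    Y = A / norm               # iterate converging to sqrt(A / norm)
    Z = np.eye(n)              # iterate converging to inverse sqrt
    I = np.eye(n)
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    return Y * np.sqrt(norm)   # undo the normalization

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
X = newton_schulz_sqrt(A)
print(np.allclose(X @ X, A, atol=1e-5))  # True
```

This is only a sketch under the assumption that `M1` is symmetric positive-definite (the iteration is not guaranteed to converge otherwise), so it may not apply if `M1` really does have negative eigenvalues.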