@jeremy So based on your suggestion I have tried to break it down. Here is what I’ve observed:
Next I created a second bias tensor, bias1, of rank 2 and tried to add it to the product of input_ and weight. And this works!!
If we look at the sizes of bias and bias1, we can see that bias is a rank-1 tensor of size 5, whereas bias1 is a rank-2 tensor of size 5x1. Following the broadcasting rules it is clear why bias cannot work: dimensions are compared starting from the trailing end, and each trailing dimension has to be either 1 or match the other tensor's. For bias that comparison is 5 against 3, so the addition fails; for bias1 it is 1 against 3, which broadcasts (a minimal reproduction follows the shapes below).
input_ @ weight : 5x3
bias            : 5
bias1           : 5x1
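To double-check the rule, here is a minimal reproduction of the failure and the fix. The inner dimension 4 is my own assumption, chosen just so that input_ @ weight comes out as 5x3:

import torch

input_ = torch.randn(5, 4)        # 4 is an assumed inner dim; only the 5x3 product matters
weight = torch.randn(4, 3)
prod = input_ @ weight            # shape 5x3

bias = torch.randn(5)             # rank 1, size 5
bias1 = torch.randn(5, 1)         # rank 2, size 5x1

print((prod + bias1).shape)       # torch.Size([5, 3]): trailing dim 1 broadcasts against 3
try:
    prod + bias                   # fails: trailing dims are 3 and 5, and neither is 1
except RuntimeError as e:
    print(e)                      # "The size of tensor a (3) must match the size of tensor b (5) ..."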
But the question still remains: why does it work in the custom network we are using below?
import torch
import torch.nn as nn

def get_weights(*dims): return nn.Parameter(torch.randn(dims)/dims[0])  # /dims raises a TypeError on a tuple; presumably dims[0]
def softmax(x): return torch.exp(x)/(torch.exp(x).sum(dim=1)[:,None])

class SimpleNet(nn.Module):  # class name is my placeholder; only the weights and forward are from the original
    def __init__(self):
        super().__init__()
        self.l1_w = get_weights(28*28, 10)  # Layer 1 weights
        self.l1_b = get_weights(10)         # Layer 1 bias
    def forward(self, x):
        x = x.view(x.size(0), -1)           # flatten to (batch, 784)
        x = (x @ self.l1_w) + self.l1_b     # Linear layer: (batch, 10) + (10,)
        x = torch.log(softmax(x))           # Non-linear (LogSoftmax) layer
        return x
Because when I create a network from this class and look at the size of the bias, it is still a rank-1 tensor, yet the addition goes through. Am I missing something here?
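For what it's worth, this is how I checked (SimpleNet is just the placeholder name I gave the class above):

net = SimpleNet()
print(net.l1_b.shape)             # torch.Size([10]): the bias really is rank 1

x = torch.randn(64, 28*28)        # a dummy batch of 64 flattened images
out = net(x)                      # (64x10) + (10,) goes through without complaint
print(out.shape)                  # torch.Size([64, 10])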