Torch Linear weight

YY1401 · September 10, 2023, 5:30pm

If we do

from torch import nn
nn.Linear(10, 20).weight.shape

we can see the shape of the weights is [20, 10]. But I thought that for an input of shape [batch, 10], the input is multiplied by the weights to get the output. However, weights of shape [20, 10] can't be multiplied by an input of shape [batch, 10] because the “inner dimensions” are not equal. What’s going on?

BobMcDear · September 11, 2023, 12:10pm

Hello,

As described in the PyTorch docs, nn.Linear computes y = xW^T + b, that is, the input is multiplied by the transpose of the weight matrix. Thus, in your example, the weights are first transposed, yielding shape [10, 20], which is indeed compatible with an input of shape [batch, 10] and produces an output of shape [batch, 20]. Regarding the efficiency of this method, please refer to this GitHub issue.
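
If it helps, here is a minimal sketch that checks this numerically. The batch size of 4, the seed, and the variable names are arbitrary choices for illustration, and the manual expression is only meant to mirror the formula from the docs, not how PyTorch implements the layer internally.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

layer = nn.Linear(10, 20)  # in_features=10, out_features=20
x = torch.randn(4, 10)     # assumed batch size of 4

# Weight is stored as [out_features, in_features].
print(layer.weight.shape)  # torch.Size([20, 10])

# Manual version of y = x W^T + b: [4, 10] @ [10, 20] -> [4, 20]
manual = x @ layer.weight.T + layer.bias
out = layer(x)

print(out.shape)  # torch.Size([4, 20])

# The two results should agree up to floating-point tolerance.
matches = torch.allclose(out, manual, atol=1e-5)
print(matches)
```

So the stored shape [20, 10] is just a storage convention; the transpose is what makes the inner dimensions line up.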