We can see that the shape of the weights is [20, 10]. But I thought that for an input of shape [batch, 10], the input is multiplied by the weights to get the output. However, a weight of shape [20, 10] can't be multiplied by an input of shape [batch, 10] because the "inner dimensions" don't match. What's going on?

As described in the PyTorch docs, the input is multiplied by the *transpose* of the weight matrix: `y = x @ W.T + b`. In your example, the weight is transposed to shape [10, 20], which is compatible with the [batch, 10] input and produces an output of shape [batch, 20]. Regarding the efficiency of storing the weight in this transposed layout, please refer to this GitHub issue.
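Here is a short sketch illustrating this (the layer sizes 10 and 20 are taken from your example; the batch size of 3 is arbitrary):

```python
import torch
import torch.nn as nn

# A Linear layer with 10 input features and 20 output features,
# as in the question.
layer = nn.Linear(10, 20)
print(layer.weight.shape)  # torch.Size([20, 10])

x = torch.randn(3, 10)     # a batch of 3 inputs
y = layer(x)
print(y.shape)             # torch.Size([3, 20])

# Equivalent manual computation: the weight is transposed before
# the matrix multiply, so the inner dimensions line up:
# [3, 10] @ [10, 20] -> [3, 20]
manual = x @ layer.weight.T + layer.bias
print(torch.allclose(y, manual))  # True
```

So the stored shape [20, 10] is simply [out_features, in_features]; the transpose happens inside the forward pass.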