The code for the Swift matmul in the matmul_01 notebook was:
// a and b are the flattened array elements, aDims/bDims are the #rows/columns of the arrays.
func swiftMatmul(a: [Float], b: [Float], aDims: (Int,Int), bDims: (Int,Int)) -> [Float] {
    assert(aDims.1 == bDims.0, "matmul shape mismatch")

    var res = Array(repeating: Float(0.0), count: aDims.0 * bDims.1)
    for i in 0 ..< aDims.0 {
        for j in 0 ..< bDims.1 {
            for k in 0 ..< aDims.1 {
                res[i+aDims.0*j] += a[i+aDims.0*k] * b[k+bDims.0*j]
            }
        }
    }
    return res
}
I ended up writing this:
func myMatmul(a: [Float], b: [Float], aDims: (Int,Int), bDims: (Int,Int)) -> ([Float],(Int,Int)) {
    var res = Array(repeating: Float(0.0), count: aDims.0 * bDims.1)
    for i in 0 ..< aDims.0 {
        for j in 0 ..< bDims.1 {
            for k in 0 ..< aDims.1 {
                res[i * bDims.1 + j] += a[i * aDims.1 + k] * b[k * bDims.1 + j]
            }
        }
    }
    return (res, (aDims.0, bDims.1))
}
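To make the difference between the two indexing expressions concrete, here is a minimal sketch. The helper names rowMajorIndex and colMajorIndex are hypothetical (they are not in the notebook); they only isolate the formula each function uses to find element (i, j) in a flat array.

```swift
// Hypothetical helpers, not from the notebook: they isolate the two index formulas.
func rowMajorIndex(_ i: Int, _ j: Int, dims: (Int, Int)) -> Int {
    return i * dims.1 + j   // formula used in myMatmul
}

func colMajorIndex(_ i: Int, _ j: Int, dims: (Int, Int)) -> Int {
    return i + dims.0 * j   // formula used in swiftMatmul
}

let dims = (2, 3)                       // 2 rows, 3 columns
let flat: [Float] = [1, 2, 3, 4, 5, 6]  // the same flat buffer for both lookups

// Element at (row 0, column 1) under each formula
let rowMajorValue = flat[rowMajorIndex(0, 1, dims: dims)]  // flat[1]
let colMajorValue = flat[colMajorIndex(0, 1, dims: dims)]  // flat[2]
print(rowMajorValue, colMajorValue)  // prints "2.0 3.0"
```

So the same flat buffer is read as a different matrix depending on which formula is applied, even though the (rows, columns) tuple is identical.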
Obviously these are different, and the difference seems to be that we are switching rows and columns. So I multiplied by an array of all ones to check my hypothesis that this comes down to how rows and columns are interpreted:
let (aDims,bDims) = ((5, 784), (784, 10))
let flatA = xTrain[0..<5].scalars // Array(repeating: Float(1.0), count: aDims.0 * aDims.1)
let flatB = Array(repeating: Float(1.0), count: bDims.0 * bDims.1) // weights.scalars
print(myMatmul(a: flatA, b: flatB, aDims: aDims, bDims: bDims))
print(swiftMatmul(a: flatA, b: flatB, aDims: aDims, bDims: bDims))
Here are the results:
([25.925554, 25.925554, 25.925554, 25.925554, 25.925554, 25.925554, 25.925554, 25.925554, 25.925554, 25.925554, 38.106697, 38.106697, 38.106697, 38.106697, 38.106697, 38.106697, 38.106697, 38.106697, 38.106697, 38.106697, -33.34089, -33.34089, -33.34089, -33.34089, -33.34089, -33.34089, -33.34089, -33.34089, -33.34089, -33.34089, 2.2452376, 2.2452376, 2.2452376, 2.2452376, 2.2452376, 2.2452376, 2.2452376, 2.2452376, 2.2452376, 2.2452376, -0.38203776, -0.38203776, -0.38203776, -0.38203776, -0.38203776, -0.38203776, -0.38203776, -0.38203776, -0.38203776, -0.38203776], (5, 10))
[13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931, 13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931]
Notice that my results come in 5 groups of 10 identical values, while the notebook's results repeat the same 5 different values 10 times: [13.410015, 25.3511, 6.9443583, 4.9880714, -18.138931]
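The same two patterns can be reproduced on a case small enough to check by hand. This is only a sketch: rowMajorMatmul and colMajorMatmul are hypothetical names that copy the indexing of myMatmul and swiftMatmul respectively, and the 2x3 input is made up for illustration. It suggests the difference may come from how the flat array is laid out (row by row versus column by column) rather than from the shapes themselves.

```swift
// Same indexing as myMatmul: element (i, j) lives at i * columns + j.
func rowMajorMatmul(a: [Float], b: [Float], aDims: (Int, Int), bDims: (Int, Int)) -> [Float] {
    var res = Array(repeating: Float(0.0), count: aDims.0 * bDims.1)
    for i in 0..<aDims.0 {
        for j in 0..<bDims.1 {
            for k in 0..<aDims.1 {
                res[i * bDims.1 + j] += a[i * aDims.1 + k] * b[k * bDims.1 + j]
            }
        }
    }
    return res
}

// Same indexing as swiftMatmul: element (i, j) lives at i + rows * j.
func colMajorMatmul(a: [Float], b: [Float], aDims: (Int, Int), bDims: (Int, Int)) -> [Float] {
    var res = Array(repeating: Float(0.0), count: aDims.0 * bDims.1)
    for i in 0..<aDims.0 {
        for j in 0..<bDims.1 {
            for k in 0..<aDims.1 {
                res[i + aDims.0 * j] += a[i + aDims.0 * k] * b[k + bDims.0 * j]
            }
        }
    }
    return res
}

let aDims = (2, 3), bDims = (3, 4)
let a: [Float] = [1, 2, 3, 4, 5, 6]                             // made-up input
let b = Array(repeating: Float(1.0), count: bDims.0 * bDims.1)  // all ones

let rowMajor = rowMajorMatmul(a: a, b: b, aDims: aDims, bDims: bDims)
let colMajor = colMajorMatmul(a: a, b: b, aDims: aDims, bDims: bDims)
print(rowMajor)  // [6.0, 6.0, 6.0, 6.0, 15.0, 15.0, 15.0, 15.0]
print(colMajor)  // [9.0, 12.0, 9.0, 12.0, 9.0, 12.0, 9.0, 12.0]
```

The first output is 2 groups of 4 identical values (sums 1+2+3 and 4+5+6), and the second repeats the same 2 values 4 times (sums 1+3+5 and 2+4+6), which mirrors the 5-groups-of-10 versus 5-values-repeated-10-times patterns above.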
Admittedly I have gotten a bit confused about what the proper row/column configuration is here. I thought the shapes were (aDims:(row:5,column:784),bDims:(row:784,column:10)) = results:(row:5,column:10)
To me the class notebook seems to have switched the dimensions of the rows/columns for A and B; otherwise multiplying by an array of 1s should have created the pattern you see in myMatmul. I could of course have gotten my understanding of the dimensions mixed up as well, but I think I am right?