Lesson 8 notes

You can normalize the data while creating the dataloaders. You can refer to the documentation to see how it's done!
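For example, here is a minimal sketch of the idea in plain PyTorch (not the fastai API): compute the normalization statistics on the training split only and reuse them for the validation split, which is what the dataloaders do for you. The tensors below are just stand-in data.

import torch
from torch.utils.data import TensorDataset, DataLoader

x_train, y_train = torch.randn(800, 10), torch.randn(800, 1)   # stand-in training data
x_valid, y_valid = torch.randn(200, 10), torch.randn(200, 1)   # stand-in validation data

mean, std = x_train.mean(0), x_train.std(0)      # stats from the training split only
x_train = (x_train - mean) / std
x_valid = (x_valid - mean) / std                 # reuse the training stats

train_dl = DataLoader(TensorDataset(x_train, y_train), batch_size=64, shuffle=True)
valid_dl = DataLoader(TensorDataset(x_valid, y_valid), batch_size=64)

Regards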

Palaash, thank you for your response, but I have struggled with this for a while, so please forgive my skepticism. I know that if I normalize the data my validation loss is lower, but I can attribute that to the smaller range of the data. When I denormalize the data, the loss between the denormalized predictions and the original dataset is still large. I will try it again though, since it has been a while since I last did that.

I recommend you create a new topic for your issue.
For now, can you please elaborate on your project? Is it a classification or a regression problem?
Either way, outputs aren't normalized, only the inputs are! So, I can't understand why you would need to denormalize the data. If you've done part 1 of this course, you will remember a similar problem, where the outputs are passed through a scaled sigmoid function.
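Here is a minimal sketch of that scaled-sigmoid idea, assuming the targets lie in a known range (lo, hi): the raw model output is squashed into that range instead of the targets being normalized.

import torch

def sigmoid_range(x, lo, hi):
    # squash raw activations into the open interval (lo, hi)
    return torch.sigmoid(x) * (hi - lo) + lo

raw = torch.tensor([-3.0, 0.0, 3.0])       # example raw model outputs
print(sigmoid_range(raw, 0.5, 5.5))        # all values land inside (0.5, 5.5)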
Regards

I've thought about this for a bit and can now explain why I was trying to denormalize the data. I wanted to analyze the results. If I load pre-normalized data into a model, train it, and get a low validation loss, I want to denormalize the predictions to see how far they lie from the original data. I have found with my dataset that I can get a high Cohen's d coefficient between the original and predicted values, but the r-squared is close to zero. Basically, I'm trying to see if the loss function is valid for this dataset. I finished Lesson 8 and see that if I can build an algorithm from scratch, I can swap out the loss function and compare results.
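For reference, this is roughly the kind of check I mean (a sketch only; y_mean, y_std, preds_norm and y_orig are hypothetical names for the target stats, the normalized-scale predictions and the original targets):

import numpy as np

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

def cohens_d(y_true, y_pred):
    # effect size between the two distributions, using a pooled standard deviation
    pooled_std = np.sqrt((y_true.var(ddof=1) + y_pred.var(ddof=1)) / 2)
    return (y_true.mean() - y_pred.mean()) / pooled_std

# preds = preds_norm * y_std + y_mean      # denormalize with the training stats
# print(r_squared(y_orig, preds), cohens_d(y_orig, preds))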
As a side note, and to address your last reply directly: how do I create a topic that gets viewed and replied to? I have submitted a couple of queries but don't get good responses.

There should be a button that says “+ New Topic” at the top of the home page of these forums. Cheers

I have implemented broadcasting in Python, just to understand how it works. It might be useful for someone starting to learn this stuff.

from numbers import Number
from itertools import cycle

import numpy as np


def broadcast(a, b, op):
    # Two scalars: apply the operation directly.
    if isinstance(a, Number) and isinstance(b, Number):
        return op(a, b)

    result = []
    if a.ndim == b.ndim:
        # Same rank: a leading dimension of size 1 gets repeated to match the other.
        if a.shape[0] != b.shape[0]:
            if a.shape[0] == 1:
                a = cycle(a)
            elif b.shape[0] == 1:
                b = cycle(b)
            else:
                raise ValueError(
                    f"Could not broadcast together with shapes {a.shape} {b.shape}")
    elif a.ndim < b.ndim:
        # Different rank: the lower-rank array is repeated along the missing leading dimension.
        a = cycle([a])
    else:
        b = cycle([b])

    # Recurse one dimension at a time; zip stops at the shorter (non-cycled) operand.
    for a_in, b_in in zip(a, b):
        result.append(broadcast(a_in, b_in, op))

    return np.array(result)
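A quick check of the helper (assuming numpy is installed, and comparing against numpy's own broadcasting):

import operator
import numpy as np

a = np.array([[1., 2., 3.],
              [4., 5., 6.]])
b = np.array([10., 20., 30.])

print(broadcast(a, b, operator.add))
# [[11. 22. 33.]
#  [14. 25. 36.]]
print(np.array_equal(broadcast(a, b, operator.add), a + b))  # True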

You can find the whole code plus some tests here.

Hi there,

I just started with part 2 of the 2019 course, since I heard from @muellerzr that it's one of the best courses in his opinion.

After watching Lesson 8 and implementing the matmul versions myself, I noted that we go from 3 loops, to 2 loops, to 1 loop, and then use Einstein summation for the final improvement.

So, I wondered whether we can also get rid of the last loop without resorting to Einstein summation. And in fact we can:

def matmul(a, b):
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br

    # a[:,:,None] has shape [ar, ac, 1]; broadcasting it against b (shape [ac, bc])
    # gives an [ar, ac, bc] tensor, which is then summed over the shared dimension.
    out = (a[:,:,None] * b).sum(dim=1)
    return out

For matrix a of shape [4,3] and b of shape [3,2] we get:

> a[:,:,None].shape
torch.Size([4, 3, 1])
> b.shape
torch.Size([3, 2])

So the multiplication results in a [4,3,2] tensor according to the broadcasting rules. Next we sum over the middle dimension to arrive at a [4,2] tensor, which is what we would also expect based on the dimensions of a and b.
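As a quick sanity check (a sketch, with a and b as random tensors of the shapes above), the result matches both the @ operator and einsum:

import torch

a = torch.randn(4, 3)
b = torch.randn(3, 2)
assert torch.allclose(matmul(a, b), a @ b, atol=1e-6)
assert torch.allclose(matmul(a, b), torch.einsum('ij,jk->ik', a, b), atol=1e-6)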

I think this is pretty cool, especially because it runs in about the same time as Einstein summation:

> %timeit -n 100 matmul(a,b)
24.5 µs ± 10.4 µs
> %timeit -n 100 torch.einsum('ij,jk->ik', a,b)
22.6 µs ± 8.01 µs