Lesson 8 notes

You can normalize the data while creating the dataloaders. You can refer to the documentation to see how it's done!
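For example, here is a minimal sketch of the idea in plain PyTorch (not the fastai API): compute the normalization statistics on the training split only and reuse them for the validation split, which is what the dataloaders do for you. The tensors below are just stand-in data.

import torch
from torch.utils.data import TensorDataset, DataLoader

x_train, y_train = torch.randn(800, 10), torch.randn(800, 1)   # stand-in training data
x_valid, y_valid = torch.randn(200, 10), torch.randn(200, 1)   # stand-in validation data

mean, std = x_train.mean(0), x_train.std(0)      # stats from the training split only
x_train = (x_train - mean) / std
x_valid = (x_valid - mean) / std                 # reuse the training stats

train_dl = DataLoader(TensorDataset(x_train, y_train), batch_size=64, shuffle=True)
valid_dl = DataLoader(TensorDataset(x_valid, y_valid), batch_size=64)

Regards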

Palaash, thank you for your response, but I have struggled with this for a while, so please forgive my skepticism. I know that if I normalize the data my validation loss is lower, but I can attribute that to the smaller range of the data. When I denormalize the data, the loss between the denormalized predictions and the original dataset is still large. I will try it again though, since it has been a while since I last did that.

I recommend you create a new topic for your issue.
For now, can you please elaborate on your project? Is it a classification or a regression problem?
Either way, outputs aren't normalized, only the inputs are! So, I can't understand why you would need to denormalize the data. If you've done part 1 of this course, you will remember a similar problem, where the outputs are passed through a scaled sigmoid function.
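Here is a minimal sketch of that scaled-sigmoid idea, assuming the targets lie in a known range (lo, hi): the raw model output is squashed into that range instead of the targets being normalized.

import torch

def sigmoid_range(x, lo, hi):
    # squash raw activations into the open interval (lo, hi)
    return torch.sigmoid(x) * (hi - lo) + lo

raw = torch.tensor([-3.0, 0.0, 3.0])       # example raw model outputs
print(sigmoid_range(raw, 0.5, 5.5))        # all values land inside (0.5, 5.5)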
Regards

I've thought about this for a bit and can now explain why I was trying to denormalize the data. I wanted to analyze the results. If I load pre-normalized data into a model, train it, and get a low validation loss, I want to denormalize the predictions to see how far they lie from the original data. I have found with my dataset that I can get a high Cohen's d coefficient between the original and predicted values, but the r-squared is close to zero. Basically, I'm trying to see if the loss function is valid for this dataset. I finished Lesson 8 and see that if I can build an algorithm from scratch, I can swap out the loss function and compare results.
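For reference, this is roughly the kind of check I mean (a sketch only; y_mean, y_std, preds_norm and y_orig are hypothetical names for the target stats, the normalized-scale predictions and the original targets):

import numpy as np

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

def cohens_d(y_true, y_pred):
    # effect size between the two distributions, using a pooled standard deviation
    pooled_std = np.sqrt((y_true.var(ddof=1) + y_pred.var(ddof=1)) / 2)
    return (y_true.mean() - y_pred.mean()) / pooled_std

# preds = preds_norm * y_std + y_mean      # denormalize with the training stats
# print(r_squared(y_orig, preds), cohens_d(y_orig, preds))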
As a side note, and to address your last reply directly: how do I create a topic that gets viewed and replied to? I have submitted a couple of queries but don't get good responses.

There should be a button that says “+ New Topic” at the top of the home page of these forums. Cheers

I have implemented broadcasting in Python, just to understand how it works. It might be useful for someone starting to learn this stuff.

from numbers import Number
from itertools import cycle

import numpy as np


def broadcast(a, b, op):
    # Two scalars: apply the operation directly.
    if isinstance(a, Number) and isinstance(b, Number):
        return op(a, b)

    result = []
    if a.ndim == b.ndim:
        # Same rank: a leading dimension of size 1 gets repeated to match the other.
        if a.shape[0] != b.shape[0]:
            if a.shape[0] == 1:
                a = cycle(a)
            elif b.shape[0] == 1:
                b = cycle(b)
            else:
                raise ValueError(
                    f"Could not broadcast together with shapes {a.shape} {b.shape}")
    elif a.ndim < b.ndim:
        # Different rank: the lower-rank array is repeated along the missing leading dimension.
        a = cycle([a])
    else:
        b = cycle([b])

    # Recurse one dimension at a time; zip stops at the shorter (non-cycled) operand.
    for a_in, b_in in zip(a, b):
        result.append(broadcast(a_in, b_in, op))

    return np.array(result)
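A quick check of the helper (assuming numpy is installed, and comparing against numpy's own broadcasting):

import operator
import numpy as np

a = np.array([[1., 2., 3.],
              [4., 5., 6.]])
b = np.array([10., 20., 30.])

print(broadcast(a, b, operator.add))
# [[11. 22. 33.]
#  [14. 25. 36.]]
print(np.array_equal(broadcast(a, b, operator.add), a + b))  # True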

You can find the whole code plus some tests here.

Hi there,

I just started with part 2 of the 2019 course, since I heard from @muellerzr that it's one of the best courses in his opinion.

After watching Lesson 8 and implementing the matmul versions myself, I noted that we go from 3 loops, to 2 loops, to 1 loop, and then use Einstein summation for the final improvement.

So, I wondered whether we can also get rid of the last loop without resorting to Einstein summation. And in fact we can:

def matmul(a, b):
    ar, ac = a.shape
    br, bc = b.shape
    assert ac == br

    # a[:,:,None] has shape [ar, ac, 1]; broadcasting it against b (shape [ac, bc])
    # gives an [ar, ac, bc] tensor, which is then summed over the shared dimension.
    out = (a[:,:,None] * b).sum(dim=1)
    return out

For matrix a of shape [4,3] and b of shape [3,2] we get:

> a[:,:,None].shape
torch.Size([4, 3, 1])
> b.shape
torch.Size([3, 2])

So the multiplication results in a [4,3,2] tensor according to the broadcasting rules. Next we sum over the middle dimension to arrive at a [4,2] tensor, which is what we would also expect based on the dimensions of a and b.
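As a quick sanity check (a sketch, with a and b as random tensors of the shapes above), the result matches both the @ operator and einsum:

import torch

a = torch.randn(4, 3)
b = torch.randn(3, 2)
assert torch.allclose(matmul(a, b), a @ b, atol=1e-6)
assert torch.allclose(matmul(a, b), torch.einsum('ij,jk->ik', a, b), atol=1e-6)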

I think this is pretty cool, especially because it runs in about the same time as Einstein summation:

> %timeit -n 100 matmul(a,b)
24.5 µs ± 10.4 µs
> %timeit -n 100 torch.einsum('ij,jk->ik', a,b)
22.6 µs ± 8.01 µs