You are combining two cosine functions smoothly, and the condition is that you want the schedule to start at 0.3, go up to 0.6 over the first 30%, and then back down to 0.2 over the remaining 70%. (These are the y values on the graph.)
So you basically have those three points, but you don't know where the corresponding x values for these points are. That is controlled by what percentage you allocate to each of the schedulers.
pcts = [0.3, 0.7] - first 30% sched_cos(0.3, 0.6), last 70% sched_cos(0.6, 0.2)
After the cumsum this becomes pcts = [0.0, 0.3, 1.0]
starting point - sp; ending point - ep
for scheduler1: sp =0, ep =0.3
for scheduler2: sp =0.3, ep =1.0
actual_pos = (x - starting_point) / (ending_point - starting_point)
Hope that helps.
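The steps above can be sketched in plain Python (a minimal sketch; these `sched_cos`/`combine_scheds` are simplified stand-ins for the notebook's versions, not the exact code):

```python
import math

def sched_cos(start, end):
    # maps a position in [0, 1] to a value on a half-cosine from start to end
    def _inner(pos):
        return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2
    return _inner

def combine_scheds(pcts, scheds):
    bounds = [0.0]                # cumulative boundaries, e.g. [0.0, 0.3, 1.0]
    for p in pcts:
        bounds.append(bounds[-1] + p)
    def _inner(x):                # x is the global position in [0, 1]
        # pick the segment x falls into, then rescale x to that segment's
        # own [0, 1] range: (x - sp) / (ep - sp)
        idx = max(i for i, b in enumerate(bounds[:-1]) if x >= b)
        sp, ep = bounds[idx], bounds[idx + 1]
        actual_pos = (x - sp) / (ep - sp)
        return scheds[idx](actual_pos)
    return _inner

sched = combine_scheds([0.3, 0.7], [sched_cos(0.3, 0.6), sched_cos(0.6, 0.2)])
sched(0.0)   # ≈ 0.3, the starting y value
sched(0.3)   # ≈ 0.6, the peak at the 30% mark
sched(1.0)   # ≈ 0.2, the final y value
```

The rescaling is exactly the actual_pos formula above: each scheduler only ever sees a position between 0 and 1, no matter which slice of training it owns.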
I have a question on the last iteration of the loop:
for i,o in enumerate(p):
    idx = (o >= pcts).nonzero().max()
    actual_pos = (o - pcts[idx]) / (pcts[idx+1] - pcts[idx])
    print(i, o, idx, actual_pos)
On the last iteration it should return idx = 2, which should break the code as pcts[3] doesn't exist. I don't get how it works?
I feel it is the equality comparison of floating-point numbers that is causing it (but that feels odd; I think I'm wrong and missing something!!!)
Any help is appreciated.
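I can't see your notebook, but one thing worth checking is whether the last o actually reaches 1.0. Here is a torch-free sketch of the same indexing (an assumption about your setup, not your exact code): if the positions come from something like i/n, they stop at (n-1)/n < 1.0, so idx never hits the last boundary; only a position of exactly 1.0 (or above) would produce idx = 2.

```python
pcts = [0.0, 0.3, 1.0]

def sched_idx(o):
    # mimics idx = (o >= pcts).nonzero().max()
    return max(i for i, p in enumerate(pcts) if o >= p)

n = 100
positions = [i / n for i in range(n)]   # 0.00, 0.01, ..., 0.99
print(sched_idx(positions[-1]))         # 1 -> pcts[idx + 1] still exists
print(sched_idx(1.0))                   # 2 -> pcts[3] would indeed fail
```

So printing the actual last value of o in your loop should tell you which case you are in.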
@jeremy Here is a nitpick. I noticed that in the video at 33 minutes, the Excel version of NLL is using log10. PyTorch uses the natural log. I tried to replicate the functions from the spreadsheet and they didn't match.
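For anyone comparing the two, the mismatch is just a constant factor of ln(10), since log10(x) = ln(x) / ln(10). A quick check with a toy probability (not the spreadsheet's actual numbers):

```python
import math

p = 0.7                                  # some predicted probability
print(-math.log(p))                      # ≈ 0.3567, natural log (PyTorch)
print(-math.log10(p))                    # ≈ 0.1549, log10 (the spreadsheet)
print(-math.log10(p) * math.log(10))     # ≈ 0.3567, rescaled to match
```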
Hi
I was going through lesson 9. I wanted to know how the number of iterations, i.e. (n-1)//bs + 1, was derived. The expression is correct in all cases, but I am trying to understand how it came about in the first place. It holds true for both even and odd numbers.
> for epoch in range(epochs):
>     for i in range((n-1)//bs + 1):
>         start_i = i*bs
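(n-1)//bs + 1 is the standard integer ceiling-division trick: if n is a multiple of bs you need exactly n/bs batches, otherwise you need floor(n/bs) full batches plus one partial one. Subtracting 1 before the floor division handles both cases at once, so it equals ceil(n/bs) for any positive n:

```python
import math

def n_batches(n, bs):
    # number of batches of size bs needed to cover n samples
    return (n - 1) // bs + 1

print(n_batches(10, 5))   # 2 -> 10 divides evenly into two batches
print(n_batches(11, 5))   # 3 -> two full batches plus one of size 1
print(n_batches(1, 5))    # 1 -> a single partial batch

# same as math.ceil(n / bs) for every positive n
assert all(n_batches(n, 5) == math.ceil(n / 5) for n in range(1, 1000))
```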
I'm used to thinking of "likelihood" as being the probability of the data given the parameters, yet the second line uses the true target for the calculation. Why is this? Can someone help clarify what's going on here?
Loss functions for classification problems need the target labels and the predicted probabilities as inputs, because their computation compares the actual vs. predicted distribution of labels.
Here, the nll (negative log likelihood) loss function takes as inputs sm_pred (the log-softmax predictions) and y_train (the target labels).
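To make that concrete, here is a toy pure-Python version of what nll computes (an illustration, not the notebook's code): for each sample, look up the log-probability the model assigned to the true class, negate it, and average.

```python
import math

def nll(log_probs, targets):
    # pick each row's entry at the true label, negate, and average
    return -sum(row[t] for row, t in zip(log_probs, targets)) / len(targets)

# two samples, three classes; rows are log-softmax outputs
log_probs = [[math.log(0.7), math.log(0.2), math.log(0.1)],
             [math.log(0.1), math.log(0.8), math.log(0.1)]]
targets = [0, 1]                 # the true class of each sample
print(nll(log_probs, targets))   # ≈ 0.2899 = -(log 0.7 + log 0.8) / 2
```

So the targets are only used to select which predicted probabilities enter the loss, which is why both inputs are needed.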
Hi @cbenett, could you please post a snippet showing the code you are referring to? In general, super() refers to the parent class, so the code is calling the __setattr__ method of whichever class DummyModule() inherits from. But if DummyModule() doesn't explicitly inherit from another class, I'm as confused as you, and I second your question!
Hi @cbenett. The super().__setattr__(k, v) call sets all attributes of the DummyModule() object to their corresponding values (as given in __init__). If you comment out that line and then try to create a DummyModule() object, it will throw an error!
So here, super() refers to Python's base class object. To make it clearer you can declare your DummyModule class as class DummyModule(object), which is the same as class DummyModule().
We can't say self.__setattr__(k, v), since this would lead to infinite recursion.
Now, you may be wondering why the __init__ method doesn't do the job of registering the values to their respective attributes. In this case it doesn't because, whenever a __setattr__ method is explicitly defined in a class, it is called instead of the normal mechanism of setting attribute values.
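A minimal sketch of that behaviour (hypothetical and simplified, not the notebook's exact DummyModule):

```python
class DummyModule:
    def __init__(self, n_in, n_out):
        self._modules = {}      # this assignment also goes through __setattr__
        self.n_in = n_in        # registered in _modules AND stored normally
        self.n_out = n_out

    def __setattr__(self, k, v):
        if not k.startswith('_'):
            self._modules[k] = v     # our extra bookkeeping
        super().__setattr__(k, v)    # object.__setattr__ does the real storing
        # using self.__setattr__(k, v) or setattr(self, k, v) here
        # would call this method again -> infinite recursion

d = DummyModule(3, 4)
print(d._modules)   # {'n_in': 3, 'n_out': 4}
print(d.n_in)       # 3
```

Comment out the super().__setattr__(k, v) line and d.n_in raises AttributeError, because nothing ever actually stored the attribute.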
In this lesson we implemented negative log likelihood (nll), but I wonder how backpropagation works through our function. In it we just did an array lookup and took the mean of those values, so how is the gradient calculated on this function, and how does PyTorch handle it given that we just gave it an array lookup?
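For intuition, here is a sketch of the gradient such a lookup produces (assuming loss = -mean of the picked log-probabilities): it is -1/N at each picked entry and 0 everywhere else, so backprop through the lookup just scatters those values back into a zero-filled tensor, which is what PyTorch's backward for indexing does.

```python
N, C = 2, 3               # two samples, three classes
targets = [0, 2]          # the indices the lookup picked
# gradient of loss = -mean(log_probs[i, targets[i]]) w.r.t. log_probs
grad = [[-1.0 / N if j == t else 0.0 for j in range(C)]
        for t in targets]
print(grad)   # [[-0.5, 0.0, 0.0], [0.0, 0.0, -0.5]]
```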
Have you modified something up the chain? When I look at y_train in my notebook it comes up as a torch.LongTensor, while in your case it is a FloatTensor, which is why you need to convert it to a LongTensor before passing it to the mean function.
Your previous post had y_train coming up as a FloatTensor, so has something changed between that part of the code and the code shared in your latest post? I cannot see all the code, so I can't comment accurately.