Full Batch Gradient Descent Issue

I was able to solve this issue. My problem was a misunderstanding of how everything was implemented: I still needed to do the backward computation in one_batch, but move the step and zero_grad calls up into _do_epoch. Here is the code that I believe is working now:

def _do_epoch_BGD(self):
    # Run the training loop over all batches; each batch only accumulates
    # gradients (see _do_one_batch_BGD below).
    self._do_epoch_train()
    # Take a single optimizer step on the accumulated (full-batch) gradients,
    # then clear them before validation and the next epoch.
    self._step()
    self('after_step')
    self.opt.zero_grad()
    self._do_epoch_validate()

def _do_one_batch_BGD(self):
    self.pred = self.model(*self.xb)
    self('after_pred')
    if len(self.yb): self.loss = self.loss_func(self.pred, *self.yb)
    self('after_loss')
    # Skip the backward pass during validation or when there are no targets.
    if not self.training or not len(self.yb): return
    self('before_backward')
    # Backward runs on every batch so gradients accumulate across the epoch;
    # the step and zero_grad happen once per epoch in _do_epoch_BGD.
    self._backward()
    self('after_backward')
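
For anyone who wants to see the same idea outside of the fastai Learner, here is a minimal plain-PyTorch sketch of the pattern (model, opt, loss_func, train_dl, and n_epochs are hypothetical placeholders, and it assumes roughly equal batch sizes): backward runs on every mini-batch so the gradients accumulate, and the optimizer step and zero_grad happen once per epoch, which amounts to one full-batch update per epoch.

n_batches = len(train_dl)
for epoch in range(n_epochs):
    model.train()
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_func(pred, yb)
        # Scale so the accumulated gradient approximates the mean over the
        # full dataset rather than the sum of per-batch means.
        (loss / n_batches).backward()
    opt.step()        # one parameter update per epoch = one full-batch step
    opt.zero_grad()   # clear accumulated gradients before the next epoch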

Here is what my result looks like now:

[image: training results]
