New optimizer, training loop and callbacks



I was really hoping that fastaiV1 would provide support for observational weights. Based on what’s above, I can’t see how this would be easily integrated, since only the feature and target batches are requested from data.train_dl. Would it make sense to have the API always accept observational weights and, if the user doesn’t provide any, just pass through an array of 1.0s to indicate that every observation carries equal information?




I’m not entirely familiar with observational weights, but from what I see in your PR on fastai, it’s entirely doable in a callback without changing the API:

  • the weights should be added as parameters of the model so that they get trained by the current API
  • in on_batch_begin you can change the inputs by using those weights

Another solution: you can stack the fastai_v1 model (coming later) on top of a custom pre_model that deals with the observational weights and changes the input.



Observational weights just tell your loss function that certain observations matter more than others (i.e. should be “up-weighted” relative to other observations) because those observations contain more information than the rest. They are not trainable parameters. They are artifacts of the data and the practitioner’s prior beliefs.
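To make that concrete, here is a minimal sketch in plain Python of how fixed weights could enter a loss. The function name weighted_mean_loss and the choice to normalize by the sum of the weights are my own assumptions for illustration, not anything taken from the fastai code. In PyTorch the per-observation losses would typically come from a loss function called with reduction='none'.

```python
def weighted_mean_loss(per_obs_losses, weights=None):
    """Weighted average of per-observation losses (hypothetical helper).

    `weights` are fixed, user-supplied numbers and are never trained.
    Passing None treats every observation equally (a weight of 1.0 each).
    """
    if weights is None:
        weights = [1.0] * len(per_obs_losses)
    if len(weights) != len(per_obs_losses):
        raise ValueError("need exactly one weight per observation")
    total = sum(w * loss for w, loss in zip(weights, per_obs_losses))
    # Normalizing by the sum of weights is one possible convention (an assumption).
    return total / sum(weights)


losses = [0.5, 2.0, 1.0]
unweighted = weighted_mean_loss(losses)                    # plain mean
upweighted = weighted_mean_loss(losses, [1.0, 3.0, 1.0])   # second observation counts 3x
```

With all weights equal to 1.0 this reduces to the ordinary mean, which is why defaulting to an array of 1.0s would leave existing behaviour unchanged.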

I admittedly do not understand the callbacks. I looked at the repo for an EyeOfSauron.ipynb to understand better but didn’t see it anywhere. Did I miss it?



Oh if they aren’t trainable then it’s even easier. You just have to create a custom dataloader for your training set and that’s pretty much it.
The callbacks are all in the notebook of the same name (004_callbacks.ipynb).
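As a rough sketch of the custom-dataset idea, assuming a map-style dataset (the __len__/__getitem__ protocol that PyTorch's DataLoader accepts): the class name WeightedDataset and the (x, y, weight) item layout are hypothetical, and the training loop and loss function would have to be adapted to unpack and use the third element.

```python
class WeightedDataset:
    """Hypothetical wrapper: item i becomes (x, y, weight)."""

    def __init__(self, xs, ys, weights=None):
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        self.xs, self.ys = xs, ys
        # No weights supplied: every observation gets the same weight of 1.0.
        self.weights = [1.0] * len(xs) if weights is None else list(weights)
        if len(self.weights) != len(xs):
            raise ValueError("need exactly one weight per observation")

    def __len__(self):
        return len(self.xs)

    def __getitem__(self, i):
        return self.xs[i], self.ys[i], self.weights[i]


ds_default = WeightedDataset([10, 20], [0, 1])
ds_custom = WeightedDataset([10, 20], [0, 1], weights=[0.5, 2.0])
```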


(Arka Sadhu) #26

@sgugger Just wondering, shouldn’t the line be

for (*xb, yb) in progress_bar(data.trn_dl, parent=pbar):

*xb would allow the dataset to return multiple inputs per item. Take the example of siamese nets, where you need to pass two images instead of one. At least, this is the line in the fastai/fastai repo.
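A small plain-Python illustration of what the starred target does. The string placeholders stand in for image tensors, and split_batches is a made-up helper used only for this example:

```python
def split_batches(batches):
    """Yield (inputs, target) pairs; inputs is a list of any length."""
    for (*xb, yb) in batches:
        # xb collects everything except the last element, yb is the last one.
        yield xb, yb


# One input per item (the usual case).
single = [("img_a", 0), ("img_b", 1)]
# Two inputs per item, as a siamese network would need.
siamese = [("img_a1", "img_a2", 1), ("img_b1", "img_b2", 0)]

single_out = list(split_batches(single))
siamese_out = list(split_batches(siamese))
```

Since xb is always a list, the model would then be called as model(*xb), which works for one input or several.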


(Jeremy Howard (Admin)) #27

Yup probably - we haven’t got to that bit yet.


(Stephen Johnson) #28

I believe that torch.argmax could be used instead of torch.max. Then there is no need to index with [1], which simplifies it just a bit.


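import torch

# Current version: torch.max returns (values, indices), so [1] selects the indices.
def accuracy(out, yb):
    preds = torch.max(out, dim=1)[1]
    return (preds==yb).float().mean()

# Proposed version: torch.argmax returns the indices directly.
def accuracy(out, yb):
    preds = torch.argmax(out, dim=1)
    return (preds==yb).float().mean()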


Indeed! Feel free to suggest a PR to change this.


(Stephen Johnson) #30

PR has been submitted.


(Fred Guth) #31

Very interesting. I tried using your code, but there were a lot of warnings and eventually an OverflowError.