New optimizer, training loop and callbacks

Patrick · September 10, 2018, 4:26pm

Hi,

I was really hoping that fastaiV1 would provide support for observational weights. Based on what’s above, I can’t see how this would be easily integrated, since only the feature and target batches are being requested from data.train_dl. Would it make sense to have the API always accept observational weights and, if the user doesn’t provide any, just pass through an array of 1.0’s so that every observation counts equally?

Cheers,
Patrick

sgugger · September 10, 2018, 4:51pm

I’m not entirely familiar with observational weights but from what I see in your PR on fastai it’s entirely doable in a callback without changing the API:

the weights should be added as parameters of the model so that they get trained by the current API
in on_batch_begin you can change the inputs by using those weights

Another solution: you can stack the fastai_v1 model (coming later) on top of a custom pre_model that deals with the observational weights and changes the input.

Patrick · September 10, 2018, 8:45pm

Observational weights just tell your loss function that certain observations matter more than others (i.e. should be “up-weighted” relative to other observations) because they contain more information. They are not trainable parameters. They are artifacts of the data and the practitioner’s prior beliefs.
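To make the idea concrete, here is a minimal sketch of a loss that takes per-observation weights. The function name weighted_cross_entropy and the argument wb are made up for illustration and are not part of fastai; the choice of cross entropy and of normalizing by the sum of the weights is also an assumption.

```python
import torch
import torch.nn.functional as F

def weighted_cross_entropy(out, yb, wb):
    """Hypothetical helper: cross entropy where each observation has its own weight."""
    # One loss value per observation (no averaging yet)
    losses = F.cross_entropy(out, yb, reduction="none")
    # Weighted mean: observations with larger weights contribute more
    return (losses * wb).sum() / wb.sum()

out = torch.tensor([[2.0, 0.5], [0.1, 1.5], [1.0, 1.0]])  # toy model outputs
yb = torch.tensor([0, 1, 0])                               # toy targets

# With all weights equal to 1.0 this reduces to the ordinary mean loss
unweighted = weighted_cross_entropy(out, yb, torch.ones(3))

# Up-weighting the third observation pulls the loss toward its value
upweighted = weighted_cross_entropy(out, yb, torch.tensor([1.0, 1.0, 5.0]))
```

Passing an array of 1.0’s gives the same result as the usual unweighted loss, which is why defaulting to ones would not change behaviour for users who don’t supply weights.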

I admittedly do not understand the callbacks. I looked at the repo for an EyeOfSauron.ipynb to understand better but didn’t see it anywhere. Did I miss it?

sgugger · September 10, 2018, 10:04pm

Oh if they aren’t trainable then it’s even easier. You just have to create a custom dataloader for your training set and that’s pretty much it.
The callbacks are all in the notebook of the same name (004_callbacks.ipynb).
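As a rough sketch of the custom dataloader idea, assuming plain PyTorch rather than any fastai_v1 class: the weights can simply travel with each batch as an extra tensor. The names x, w, y and the loop body are placeholders for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

x = torch.randn(8, 3)            # toy features
y = torch.randint(0, 2, (8,))    # toy targets
w = torch.ones(8)                # hypothetical observation weights (1.0 = all equal)

# Each item of the dataset is (features, weight, target)
train_dl = DataLoader(TensorDataset(x, w, y), batch_size=4)

batch_shapes = []
for xb, wb, yb in train_dl:
    # A training loop would compute per-observation losses here and scale them by wb
    batch_shapes.append((tuple(xb.shape), tuple(wb.shape), tuple(yb.shape)))
```

A training loop that only unpacks two values per batch would need adapting to consume the extra tensor.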

TheShadow29 · September 11, 2018, 12:38am

@sgugger Just wondering, shouldn’t the line on https://github.com/fastai/fastai_v1/blob/master/fastai/basic_train.py#L37 be

for (*xb, yb) in progress_bar(data.trn_dl, parent=pbar):

*xb would allow the dataset to return multiple inputs. Take the example of siamese nets, where you need to pass two images instead of one. At least, this is how the line reads in the fastai/fastai repo.
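For anyone unfamiliar with the starred form, this small pure-Python sketch shows what the unpacking does. The batch contents are placeholder strings, not real tensors, and split_batches is a made-up helper.

```python
# Hypothetical batches: the last element is the target,
# everything before it is an input to the model.
batches = [
    ("img_a", "img_b", 1),  # siamese-style: two inputs, one target
    ("img_c", 0),           # ordinary case: one input, one target
]

def split_batches(batches):
    """Collect (inputs, target) pairs using starred unpacking."""
    result = []
    for (*xb, yb) in batches:
        # xb is always a list, even when there is only one input
        result.append((xb, yb))
    return result

pairs = split_batches(batches)
```

Because xb is a list, the loop body would then call the model as model(*xb), which works for one input or several.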

jeremy · September 11, 2018, 4:36pm

Yup probably - we haven’t got to that bit yet.

stephenjohnson · September 14, 2018, 11:11pm

I believe that torch.argmax could be used instead of torch.max. Then there is no need to index the result with [1], which simplifies it just a bit.

Change

def accuracy(out, yb):
   preds = torch.max(out, dim=1)[1]
   return (preds==yb).float().mean()

To

def accuracy(out, yb):
   preds = torch.argmax(out, dim=1)
   return (preds==yb).float().mean()
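A quick check on toy data (values invented for illustration) that the two forms give the same predictions:

```python
import torch

out = torch.tensor([[0.1, 2.0, 0.3],
                    [1.5, 0.2, 0.1],
                    [0.0, 0.1, 3.0]])  # toy scores, no ties within a row
yb = torch.tensor([1, 0, 1])           # toy targets

# torch.max with dim returns (values, indices); [1] picks the indices
preds_max = torch.max(out, dim=1)[1]
# torch.argmax returns the indices directly
preds_argmax = torch.argmax(out, dim=1)

same = torch.equal(preds_max, preds_argmax)
acc = (preds_argmax == yb).float().mean()
```

Here two of the three predictions match the targets, so acc comes out at about 0.667.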

sgugger · September 14, 2018, 11:21pm

Indeed! Feel free to suggest a PR to change this.

stephenjohnson · September 15, 2018, 1:07am

PR has been submitted.

fredguth · October 11, 2018, 3:29pm

Very interesting. I tried using your code, but there were a lot of warnings and eventually an OverflowError.