Allow for more than one output for loss and metric

Thanks — (almost) just what I needed.

I took the example a bit further: I wanted a more flexible way of specifying, in the loss class itself, which components to track, and whether to track both the training and validation losses or only the validation loss.
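A minimal sketch of what that could look like — a combined loss that stores its components and carries tracking flags a callback can read. All names here (DualLoss, track_train, track_valid) are hypothetical, and plain floats stand in for tensor losses:

```python
class DualLoss:
    """Hypothetical combined loss that exposes its components.

    A callback can read `loss1`/`loss2` after each batch and use the
    `track_train`/`track_valid` flags to decide what to record.
    """
    def __init__(self, w1=1.0, w2=1.0, track_train=True, track_valid=True):
        self.w1, self.w2 = w1, w2
        self.track_train, self.track_valid = track_train, track_valid
        self.loss1 = self.loss2 = 0.0

    def __call__(self, pred1, targ1, pred2, targ2):
        # toy component losses; real code would call e.g. two nn losses here
        self.loss1 = abs(pred1 - targ1)
        self.loss2 = abs(pred2 - targ2)
        return self.w1 * self.loss1 + self.w2 * self.loss2
```

The weighted sum is what the optimizer sees; the stored components are only for reporting.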

Hello,

I have more or less the same issue as everyone in this topic.

I have one input with 4 channels and a segmentation problem, but I apply a Spatial Transformer Network to the 4th channel of the image and return both the displacement matrix and x, so I now have two outputs to optimize. I more or less copy/pasted the code from sgugger's last post with the notebook, but I now get this error:

/pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-41-cdff26b0dafb>", line 5, in <module>
    learn.fit_one_cycle(2, 3e-3, wd=0.4, div_factor=10, pct_start=0.8)
  File "/usr/local/lib/python3.6/dist-packages/fastai/train.py", line 22, in fit_one_cycle
    learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
  File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 202, in fit
    fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
  File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 101, in fit
    loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
  File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 26, in loss_batch
    out = model(*xb)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "<ipython-input-31-ac0d26232299>", line 13, in forward
    alpha, theta = self.stn_model(alpha)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "<ipython-input-29-36dfe738bfd8>", line 52, in forward
    x, theta = self.stn(x) # shape (2, 10, 3, 3)
  File "<ipython-input-29-36dfe738bfd8>", line 46, in stn
    x = F.grid_sample(x, grid)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2656, in grid_sample
    return torch.grid_sampler(input, grid, mode_enum, padding_mode_enum)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-41-cdff26b0dafb> in <module>()
      3 # fastai.callback.CallbackHandler.on_loss_begin = custom_on_loss_begin
      4 learn.callback_fns.append(HandleDualLoss)
----> 5 learn.fit_one_cycle(2, 3e-3, wd=0.4, div_factor=10, pct_start=0.8)

/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     91     Variable._execution_engine.run_backward(
     92         tensors, grad_tensors, retain_graph, create_graph,
---> 93         allow_unreachable=True)  # allow_unreachable flag
     94 
     95 

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8, 1, 256, 256]] is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
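For anyone hitting the same RuntimeError: it usually means the forward pass wrote into a tensor (for example an in-place slice assignment inside stn) that autograd had saved for the backward pass. A minimal repro and fix, independent of fastai — cloning before the write is one common workaround, not necessarily the only one:

```python
import torch

# Reproduce the error: modify a tensor that backward still needs.
inp = torch.randn(2, 4, 8, 8, requires_grad=True)
x = inp * 1.0        # non-leaf tensor, like an activation inside forward()
y = x ** 2           # backward of ** needs x's *original* values
x[:, 3] = 0.0        # in-place write bumps x's version counter
try:
    y.sum().backward()
except RuntimeError as e:
    print(e)         # "...modified by an inplace operation..."

# Fix: write into a clone so the tensor saved for backward stays untouched.
inp2 = torch.randn(2, 4, 8, 8, requires_grad=True)
x2 = inp2 * 1.0
y2 = x2 ** 2
x2 = x2.clone()      # the in-place write now targets a fresh tensor
x2[:, 3] = 0.0
(y2.sum() + x2.sum()).backward()   # succeeds
```

In the STN code above, that would mean cloning x (or the 4th channel) before any in-place modification in forward.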

The example sgugger posted in Oct '18 doesn’t work with the current version of fastai, so I modified it:

class HandleDualLoss(LearnerCallback):
    _order = -20  # needs to run before the Recorder

    def on_train_begin(self, **kwargs):
        self.learn.recorder.add_metric_names(['train_loss1', 'train_loss2', 'val_loss1', 'val_loss2'])

    def on_epoch_begin(self, **kwargs):
        # running sums, weighted by batch size, reset every epoch
        self.train_loss1, self.train_loss2, self.train_nums = 0., 0., 0
        self.val_loss1, self.val_loss2, self.val_nums = 0., 0., 0

    def on_batch_end(self, last_target, train, **kwargs):
        bs = last_target.size(0)
        # read the components through self.learn, not a global learn,
        # so the callback works regardless of the variable name outside
        loss1 = self.learn.loss_func.loss1.detach()
        loss2 = self.learn.loss_func.loss2.detach()
        if train:
            self.train_loss1 += bs * loss1
            self.train_loss2 += bs * loss2
            self.train_nums += bs
        else:
            self.val_loss1 += bs * loss1
            self.val_loss2 += bs * loss2
            self.val_nums += bs

    def on_epoch_end(self, last_metrics, **kwargs):
        # add_metrics comes from fastai.callback
        return add_metrics(last_metrics,
                           [self.train_loss1/self.train_nums, self.train_loss2/self.train_nums,
                            self.val_loss1/self.val_nums, self.val_loss2/self.val_nums])
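The per-epoch numbers the callback reports are just batch-size-weighted means. For clarity, here is that bookkeeping on its own, in plain Python with made-up loss values (no fastai required):

```python
class RunningLossPair:
    """Batch-size-weighted running mean of two loss components."""
    def __init__(self):
        self.loss1 = self.loss2 = 0.0
        self.nums = 0

    def update(self, bs, l1, l2):
        # accumulate sums weighted by batch size, as in on_batch_end
        self.loss1 += bs * l1
        self.loss2 += bs * l2
        self.nums += bs

    def averages(self):
        # divide by total sample count, as in on_epoch_end
        return self.loss1 / self.nums, self.loss2 / self.nums

r = RunningLossPair()
r.update(8, 0.5, 1.0)   # batch of 8
r.update(4, 1.1, 0.4)   # batch of 4
a1, a2 = r.averages()   # (8*0.5 + 4*1.1)/12 and (8*1.0 + 4*0.4)/12
```

Weighting by batch size matters because the last batch of an epoch is often smaller; an unweighted mean of per-batch losses would over-count it.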