Thanks, (almost) just what I needed.
I took the example a bit further. I wanted a more flexible way of specifying in the Loss class what to track, and whether to track both train and validation losses or only validation.
Hello,
I have more or less the same issue as everyone else in this topic.
I have one input with 4 channels and a segmentation problem, but I apply a Spatial Transformer Network to the 4th channel of the image and return both the displacement matrix and x, so I now have two outputs to optimize. I more or less copied the code from sgugger's last post with the notebook, but I now get this error:
/pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
self._handle_recv()
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-41-cdff26b0dafb>", line 5, in <module>
learn.fit_one_cycle(2, 3e-3, wd=0.4, div_factor=10, pct_start=0.8)
File "/usr/local/lib/python3.6/dist-packages/fastai/train.py", line 22, in fit_one_cycle
learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 202, in fit
fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 101, in fit
loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 26, in loss_batch
out = model(*xb)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "<ipython-input-31-ac0d26232299>", line 13, in forward
alpha, theta = self.stn_model(alpha)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "<ipython-input-29-36dfe738bfd8>", line 52, in forward
x, theta = self.stn(x) # shape (2, 10, 3, 3)
File "<ipython-input-29-36dfe738bfd8>", line 46, in stn
x = F.grid_sample(x, grid)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2656, in grid_sample
return torch.grid_sampler(input, grid, mode_enum, padding_mode_enum)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-41-cdff26b0dafb> in <module>()
3 # fastai.callback.CallbackHandler.on_loss_begin = custom_on_loss_begin
4 learn.callback_fns.append(HandleDualLoss)
----> 5 learn.fit_one_cycle(2, 3e-3, wd=0.4, div_factor=10, pct_start=0.8)
5 frames
/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
91 Variable._execution_engine.run_backward(
92 tensors, grad_tensors, retain_graph, create_graph,
---> 93 allow_unreachable=True) # allow_unreachable flag
94
95
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8, 1, 256, 256]] is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
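This error means some tensor that autograd saved for the backward pass was edited in place (e.g. `x += ...`, `x.copy_(...)`, or an `inplace=True` activation) between the forward call and `backward()`, which bumps its version counter. A minimal sketch of the failure mode and the usual fix, unrelated to the STN code above, just illustrating the mechanism:

```python
import torch

# exp()'s backward reuses its own output, so autograd saves y.
# Editing y in place bumps its version counter, and backward() then
# finds version 1 where it expected version 0 -- the same error as above.
x = torch.ones(3, requires_grad=True)
y = x.exp()
y += 1                 # in-place modification of a saved tensor
try:
    y.sum().backward()
except RuntimeError as e:
    print("backward failed:", e)

# Fix: use the out-of-place form so the saved tensor stays untouched.
x = torch.ones(3, requires_grad=True)
y = x.exp()
y = y + 1              # allocates a new tensor instead of editing y
y.sum().backward()     # works; dy/dx = exp(x)
print(x.grad)
```

The hint in the message points at the traceback above it: the frame shown last (`F.grid_sample` inside `stn`) produced the tensor that was later modified in place, so look for in-place ops applied to `x` (or to the `[8, 1, 256, 256]` tensor) after that call.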
The example sgugger posted in Oct '18 doesn’t work with the current version of fast.ai, so I modified it to work with the current version:
class HandleDualLoss(LearnerCallback):
    _order = -20  # needs to run before the Recorder

    def on_train_begin(self, **kwargs):
        self.learn.recorder.add_metric_names(
            ['train_loss1', 'train_loss2', 'val_loss1', 'val_loss2'])

    def on_epoch_begin(self, **kwargs):
        self.train_loss1, self.train_loss2, self.train_nums = 0., 0., 0
        self.val_loss1, self.val_loss2, self.val_nums = 0., 0., 0

    def on_batch_end(self, last_target, train, **kwargs):
        # Weight each batch's partial losses by its size so the epoch
        # average is correct even when the last batch is smaller.
        bs = last_target.size(0)
        if train:
            self.train_loss1 += bs * self.learn.loss_func.loss1.detach()
            self.train_loss2 += bs * self.learn.loss_func.loss2.detach()
            self.train_nums += bs
        else:
            self.val_loss1 += bs * self.learn.loss_func.loss1.detach()
            self.val_loss2 += bs * self.learn.loss_func.loss2.detach()
            self.val_nums += bs

    def on_epoch_end(self, last_metrics, **kwargs):
        return add_metrics(last_metrics,
                           [self.train_loss1/self.train_nums, self.train_loss2/self.train_nums,
                            self.val_loss1/self.val_nums, self.val_loss2/self.val_nums])
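The bookkeeping in `on_batch_end`/`on_epoch_end` is just a size-weighted running average. A plain-Python sketch of the same pattern, with made-up batch losses and sizes:

```python
# Accumulate bs * loss per batch, then divide by the total sample count.
# The (loss, batch_size) pairs here are hypothetical.
total, n = 0.0, 0
for batch_loss, bs in [(0.5, 8), (0.3, 8), (0.2, 4)]:
    total += bs * batch_loss   # weight each batch by its size
    n += bs
epoch_loss = total / n         # 7.2 / 20 = 0.36
print(epoch_loss)
```

A simple mean of the per-batch losses would give (0.5 + 0.3 + 0.2) / 3 ≈ 0.333 instead, which over-weights the small final batch; the size-weighted form is why the callback tracks `train_nums`/`val_nums` separately.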