Learner.show_results with fp16

My learner was trained with fp16. When I run

learner.show_results(rows=3, figsize=(8,9))

I get this error:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

I think this is just a rough edge, since fp16 is relatively new. The trick I’ve found to get around this is:

  1. Save your learner’s trained model (while still in fp16 mode): learner.save('my_learner')
  2. Re-instantiate your learner exactly as you did for training, but without the to_fp16() method call
  3. Load the trained weights into the new learner: learner.load('my_learner')

Then something like show_results() should work.
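
The three steps above can be sketched roughly as follows. This is only a sketch under assumptions: reload_as_fp32 and data are hypothetical names, and the resnet34 model and error_rate metric are placeholders for whatever your own learner uses. It relies on fastai v1 calls (create_cnn, to_fp16, save, load) and needs your own DataBunch to run.

```python
def reload_as_fp32(data, name="my_learner"):
    """Train in fp16, save the weights, then reload them into an fp32 learner.

    `data` is assumed to be a fastai v1 DataBunch; `name` is the checkpoint name.
    """
    # Imported lazily so that defining this helper does not require fastai.
    from fastai.vision import create_cnn, models
    from fastai.metrics import error_rate

    # Step 1: train in mixed precision and save the model while still in fp16.
    learn = create_cnn(data, models.resnet34, metrics=error_rate).to_fp16()
    learn.fit_one_cycle(4)
    learn.save(name)

    # Step 2: rebuild the learner the same way, but without to_fp16().
    learn_fp32 = create_cnn(data, models.resnet34, metrics=error_rate)

    # Step 3: load the saved weights into the fp32 learner.
    learn_fp32.load(name)
    return learn_fp32
```

After that, something like reload_as_fp32(data).show_results(rows=3, figsize=(8,9)) should run on float32 weights.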

This shouldn’t happen anymore on master by the way.

thanks

thanks @edave

Hi sgugger.

It still happens, I got a “RuntimeError: expected type torch.FloatTensor but got torch.HalfTensor” while running “lesson3-camvid.ipynb”.

:confused: Doesn’t work for me. Still got the same error.

But the save-and-reload method above works.

Please give us a minimal example that reproduces the bug; it’s hard to fix without one.

I ran into the same error message … but the method above worked! Cheers Jeremy :grin:

Hi! Sorry for the late reply. Here is a minimal example that reproduces it:

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai.vision import *
from fastai.metrics import error_rate

bs = 64

path = untar_data(URLs.PETS)

path_anno = path/'annotations'
path_img = path/'images'

fnames = get_image_files(path_img)

pat = r'/([^/]+)_\d+.jpg$'

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs
                                  ).normalize(imagenet_stats)

learn = create_cnn(data, models.resnet34, metrics=error_rate).to_fp16()

learn.fit_one_cycle(4)

learn.show_results(rows=3, figsize=(8,9))

raised:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-24-a59cd8a428af> in <module>
----> 1 learn.show_results(rows=3, figsize=(8,9))

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in show_results(self, ds_type, rows, **kwargs)
    338         norm = getattr(self.data,'norm',False)
    339         if norm:
--> 340             x = self.data.denorm(x)
    341             if norm.keywords.get('do_y',False):
    342                 y     = self.data.denorm(y, do_x=True)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/data.py in denormalize(x, mean, std, do_x)
     59 def denormalize(x:TensorImage, mean:FloatTensor,std:FloatTensor, do_x:bool=True)->TensorImage:
     60     "Denormalize `x` with `mean` and `std`."
---> 61     return x.cpu()*std[...,None,None] + mean[...,None,None] if do_x else x.cpu()
     62 
     63 def _normalize_batch(b:Tuple[Tensor,Tensor], mean:FloatTensor, std:FloatTensor, do_x:bool=True, do_y:bool=False)->Tuple[Tensor,Tensor]:

RuntimeError: expected type torch.FloatTensor but got torch.HalfTensor

Hope this helps.

It does. It also has been fixed in master yesterday :wink:
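
For anyone stuck on the released version: going by the traceback, the failure is in denormalize, where a half-precision batch meets float32 mean and std. Below is a minimal sketch of the kind of cast that avoids the mismatch. denormalize_fp32 is a hypothetical name, the tensors are random stand-ins, and this is not necessarily how master fixed it.

```python
import torch

def denormalize_fp32(x, mean, std):
    """Denormalize `x` after moving it to the CPU and casting it to float32."""
    # Casting first means the multiply and add only ever see float32 tensors.
    x = x.cpu().float()
    # mean/std have shape (C,); [..., None, None] makes them (C, 1, 1) so they
    # broadcast over the height and width of each image.
    return x * std[..., None, None] + mean[..., None, None]

# Stand-in for a batch coming out of an fp16 learner: 2 RGB images of 4x4.
batch = torch.rand(2, 3, 4, 4).half()
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

out = denormalize_fp32(batch, mean, std)
```

The result is a float32 tensor with the same shape as the input batch, which is what the plotting code expects.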

Great, thanks! I just tried conda install fastai -c fastai -c pytorch, but nothing was updated. I just need to wait a few days for the fix to be pushed to conda, right?

You’ll have to wait for the next release (probably sometime next week). Otherwise you have to do a dev install:

pip install -e .[dev]

inside the fastai cloned repo.

Got it, thanks!

Is it possible to save the show_results plot to disk? I’m training through a docker container rather than directly in jupyter.
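
One approach that may work, assuming show_results draws with matplotlib and leaves its figure as the current one (I have not verified this against the fastai source): select a headless backend, then call plt.savefig right after the plot is drawn. In the sketch below, save_current_figure is a hypothetical helper and the plt.subplots call only stands in for learn.show_results(...).

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend: no display needed inside a container
import matplotlib.pyplot as plt

def save_current_figure(path):
    """Write the current matplotlib figure to `path`, then free its memory."""
    plt.savefig(path, bbox_inches="tight")
    plt.close("all")
    return path

# Stand-in for learn.show_results(rows=3, figsize=(8,9)).
fig, axes = plt.subplots(3, 2, figsize=(8, 9))

saved = save_current_figure(os.path.join(tempfile.gettempdir(), "show_results.png"))
```

Inside a container you would point the path at a mounted volume so the image is visible from the host.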