FastAI throwing a runtime error when using custom train & test sets

Hi.

I'm working on the Food-101 dataset, and as you may know, the dataset comes with both train and test parts. Because the dataset could no longer be found at the ETH Zurich link, I had to divide the parts into partitions of less than 1 GB each, clone them into Colab, and reassemble them. It's very tedious work, but I got it working. I will omit the Python code, but the file structure looks like this:

Food-101
      images
            train
               ...75750 train images
            test
               ...25250 test images
      meta
            classes.txt
            labels.txt
            test.json
            test.txt
            train.json
            train.txt
      README.txt
      license_agreement.txt

The following code is what's throwing the runtime error:

from fastai.vision import *   # fastai v1 imports (Path, get_image_files, ImageDataBunch, ...)

train_image_path = Path('images/train/')
test_image_path = Path('images/test/')
path = Path('../Food-101')

food_names = get_image_files(train_image_path)

# filename pattern (not actually used by from_folder below)
file_parse = r'/([^/]+)_\d+\.(png|jpg|jpeg)$'

data = ImageDataBunch.from_folder(train_image_path, test_image_path, valid_pct=0.2,
                                  ds_tfms=get_transforms(), size=224)
data.normalize(imagenet_stats)

My guess is that ImageDataBunch.from_folder() is what's throwing the error, but I don't know why it's getting caught up on the data types, as (I don't think) I'm supplying it with any data that has a specific type.

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:2854: UserWarning: The default behavior for interpolate/upsample with float scale_factor will change in 1.6.0 to align with other frameworks/libraries, and use scale_factor directly, instead of relying on the computed output size. If you wish to keep the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details. 
  warnings.warn("The default behavior for interpolate/upsample with float scale_factor will change "
(the same UserWarning is repeated several more times)
You can deactivate this warning by passing `no_check=True`.
/usr/local/lib/python3.6/dist-packages/fastai/basic_data.py:262: UserWarning: There seems to be something wrong with your dataset, for example, in the first batch can't access these elements in self.train_ds: 9600,37233,16116,38249,1826...
  warn(warn_msg)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/IPython/core/formatters.py in __call__(self, obj)
    697                 type_pprinters=self.type_printers,
    698                 deferred_pprinters=self.deferred_printers)
--> 699             printer.pretty(obj)
    700             printer.flush()
    701             return stream.getvalue()

11 frames
/usr/local/lib/python3.6/dist-packages/fastai/vision/image.py in affine(self, func, *args, **kwargs)
    181         "Equivalent to `image.affine_mat = image.affine_mat @ func()`."
    182         m = tensor(func(*args, **kwargs)).to(self.device)
--> 183         self.affine_mat = self.affine_mat @ m
    184         return self
    185 

RuntimeError: Expected object of scalar type Float but got scalar type Double for argument #3 'mat2' in call to _th_addmm_out

I'm running the lesson 1 notebook in Colab and got the same error. It looks like the torch version running on Colab may not be compatible with fastai? Appreciate the help!

https://colab.research.google.com/github/fastai/course-v3/blob/master/nbs/dl1/lesson1-pets.ipynb#scrollTo=SEMlDH19duLT&line=2&uniqifier=1

Running the cell with:
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs).normalize(imagenet_stats)

The error:
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:2854: UserWarning: The default behavior for interpolate/upsample with float scale_factor will change in 1.6.0 to align with other frameworks/libraries, and use scale_factor directly, instead of relying on the computed output size. If you wish to keep the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
warnings.warn("The default behavior for interpolate/upsample with float scale_factor will change "
/usr/local/lib/python3.6/dist-packages/fastai/basic_data.py:262: UserWarning: There seems to be something wrong with your dataset, for example, in the first batch can't access these elements in self.train_ds: 5478,1218,2792,5389,3016...
  warn(warn_msg)

It seems there are some issues with the torch version used in Colab.

Try installing a specific version of torch in your Colab notebook before running the fastai code:

!pip install "torch==1.4" "torchvision==0.5.0"

Thank you! That did the trick!

So, in this case, it will be an old version of pytorch, won't it?

That seems to be the case.
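
A quick way to confirm which versions are active after the pinned install (assuming the command above):

import torch, torchvision
# after the pin, these should report 1.4.0 and 0.5.0
print(torch.__version__, torchvision.__version__)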

The problem seems to be with a single function from pytorch, nn.Upsample, used by the RatioResize transform in fastai2's batch image augmenter, which changed its default behaviour between pytorch versions 1.4.0 and 1.5.0. Luckily, you can get pytorch 1.5.0 to behave like 1.4.0 by simply passing an additional parameter, recompute_scale_factor=True, when you call it. In practical terms this means updating the fastai file augment.py (found in fastai2/vision) to add this option. On my system, I did that by uninstalling the pip version of fastai2 (pip3 uninstall fastai2), checking out an editable version with git clone https://github.com/fastai/fastai2, editing line 289 of the file ~/fastai2/fastai2/vision/augment.py from x = F.interpolate(x, scale_factor=1/d, mode='area') to x = F.interpolate(x, scale_factor=1/d, mode='area', recompute_scale_factor=True), and installing the patched fastai2 with pip install -e ".[dev]".
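
In other words, the one-line patch in fastai2/vision/augment.py (inside _grid_sample) is:

# fastai2/vision/augment.py, line ~289, inside _grid_sample() -- before:
x = F.interpolate(x, scale_factor=1/d, mode='area')
# after (restores the pre-1.5.0 behaviour and silences the warning):
x = F.interpolate(x, scale_factor=1/d, mode='area', recompute_scale_factor=True)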


I was able to apply this same fix to fastai (v1) by making this edit to the file …/fastai/fastai/vision/image.py on line 540.
It worked a charm… Thanks!
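
If the fastai v1 file has the same F.interpolate call at that spot (I haven't verified line 540 myself), the patched line would be analogous:

# fastai/vision/image.py (fastai v1), around line 540 -- assumed to be the same call:
x = F.interpolate(x, scale_factor=1/d, mode='area', recompute_scale_factor=True)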


Hi, Onur.

Thanks for this. I have a question.

Will I have to execute this line to install torch 1.4 whenever I open and run the notebook?

If you're using Colab, then yes.
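
One way to make that less tedious (a sketch, assuming the same pinned versions as above) is to guard the install in the notebook's first cell so it only runs when Colab's preinstalled torch is too new:

import torch

# first Colab cell: reinstall only when the preinstalled torch is not 1.4.x
if not torch.__version__.startswith("1.4"):
    !pip install "torch==1.4" "torchvision==0.5.0"
    # a runtime restart may be needed before the new version is picked up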

Thanks, Onur. Is there any workaround, e.g. using any other Jupyter notebook?

I haven’t tried tbh. I’ve only been using Colab.


It worked for me. Thank you!


Thank you!! This seems to have worked for me!! Fingers crossed!

@streicher would you be willing to put a PR in for this? (if not I can go ahead but you found the solution first :slight_smile: )

Hi @muellerzr

Thank you for your consideration. I would be happy to!

Streicher.


Hi Zachary. Thanks again for this opportunity to contribute. I created https://github.com/fastai/fastai2/pull/400 with the fix to nbs/09_vision.augment.ipynb. I also regenerated fastai2/vision/augment.py with the nbdev_build_lib script. I have tested the rebuilt library against pytorch 1.5.0 and it works, but I am less sure how well it works on pytorch 1.4.0. I created a test environment with torch 1.4.0 and it does not throw any errors, but something seems to go wrong when the dataloader uses aug_transforms, which triggers the patched code. Is it possible to check for the pytorch version in augment.py, and then either include or exclude the patch?


I’d check out how we took care of the Pillow version issue in vision.core (I think?)

I added a debug trace in augment.py and as expected, the root of the problem with pytorch < 1.5 is the unexpected argument:

>>> torch.__version__
'1.4.0'
> /home/linuxbrew/.linuxbrew/Cellar/python/3.7.7_1/lib/python3.7/site-packages/fastai2/vision/augment.py(289)_grid_sample()
-> x = F.interpolate(x, scale_factor=1/d, mode='area', recompute_scale_factor=True)
(Pdb) s
TypeError: interpolate() got an unexpected keyword argument 'recompute_scale_factor'

I have also run several regression tests, and can confirm that the fastai2 code works reliably as-is on pytorch 1.4.0, and warns (but works) consistently on 1.5.0, even in Colab. This suggests not merging PR 400 unless some way can be found to differentiate between pytorch versions at runtime.
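
A minimal sketch of such a runtime check (hypothetical; the names here are mine, and the Pillow-style approach mentioned earlier may be cleaner) could look like:

import torch
import torch.nn.functional as F

# only pass recompute_scale_factor on torch >= 1.5, where the argument exists
_TORCH_GE_15 = tuple(int(p) for p in torch.__version__.split('.')[:2]) >= (1, 5)
_INTERP_KWARGS = {'recompute_scale_factor': True} if _TORCH_GE_15 else {}

def _area_interpolate(x, d):
    "Version-safe wrapper for the F.interpolate call in augment.py."
    return F.interpolate(x, scale_factor=1/d, mode='area', **_INTERP_KWARGS)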


I have also had a look at the code that generates the warning (torch/nn/functional.py, line 2989), and it seems to do so irrespective of whether there is a problem or not. As far as I can see, the only thing it checks is whether you are using a float scale factor:

if recompute_scale_factor is None:
    # only warn when the scales have floating values since
    # the result for ints is the same with/without recompute_scale_factor

    is_float_scale_factor = False
    for scale in scale_factors:
        is_float_scale_factor = math.floor(scale) != scale
        if is_float_scale_factor:
            break

    if is_float_scale_factor:
        warnings.warn("The default behavior for interpolate/upsample with float scale_factor will change "
                      "in 1.6.0 to align with other frameworks/libraries, and use scale_factor directly, "
                      "instead of relying on the computed output size. "
                      "If you wish to keep the old behavior, please set recompute_scale_factor=True. "
                      "See the documentation of nn.Upsample for details. ")

The documentation for torch.nn.functional.interpolate (aka nn.Upsample) elaborates as follows:

recompute_scale_factor (bool, optional) – recompute the scale_factor for use in the interpolation calculation. When scale_factor is passed as a parameter, it is used to compute the output_size. If recompute_scale_factor is True or not specified, a new scale_factor will be computed based on the output and input sizes for use in the interpolation computation (i.e. the computation will be identical to if the computed output_size were passed-in explicitly). Otherwise, the passed-in scale_factor will be used in the interpolation computation. Note that when scale_factor is floating-point, the recomputed scale_factor may differ from the one passed in due to rounding and precision issues.

Another way to avoid the warning could be to pass torch.nn.functional.interpolate the desired output size instead of a float scale factor. The "size" argument of torch.nn.functional.interpolate stays constant across pytorch versions. Looking at the code around lines 280 to 288 of augment.py, I think it should be doable, but I do not understand the RatioResize routine well enough to work out how to do it.
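
As a rough illustration of that alternative (a sketch only; fitting it into RatioResize/_grid_sample is the part I haven't worked out):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 224, 224)  # example batch
d = 1.5                          # example ratio, standing in for the `d` in augment.py

# instead of the float scale_factor that triggers the warning...
# y = F.interpolate(x, scale_factor=1/d, mode='area')

# ...pass the target size explicitly; `size` behaves the same across versions
out_size = [int(s / d) for s in x.shape[-2:]]
y = F.interpolate(x, size=out_size, mode='area')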