Tensors are on separate devices?

Hey there,

I’m trying to train my first GAN with the fastai vision GAN API.

Now that I finally have PyTorch and fastai running together (seemingly), I’m getting the following error when I attempt to create my dataloader (full traceback at the bottom):

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I’ll post the code below (above the traceback), up to the point where it errors out completely. Any thoughts?

from fastai.vision import *
from fastai.vision.gan import *
from fastai.vision.all import *
import pathlib

###### Inputs ######
datafolder = r'F:\GAN\ripme\rips\FullFolder'
size = 1024
bs = 16

###### Setting Up The datafolder ######
path = pathlib.Path(datafolder)

###### Setting up our DataBlock #####
# This includes transforms for generating noise and cropping images down to the size we want.

dblock = DataBlock(blocks=(TransformBlock, ImageBlock),
                   get_x=generate_noise,
                   get_items=get_image_files,
                   splitter=IndexSplitter([]),
                   item_tfms=Resize(size, method=ResizeMethod.Crop),
                   batch_tfms=Normalize.from_stats(torch.tensor([0.5,0.5,0.5]), torch.tensor([0.5,0.5,0.5])))

dls = dblock.dataloaders(source=path, path=path, bs=bs, num_workers = 0)
dls.show_batch(max_n=16)

Could not do one pass in your dataloader, there is something wrong in it


RuntimeError                              Traceback (most recent call last)
in
      1 dls = dblock.dataloaders(source=path, path=path, bs=bs, num_workers = 0)
----> 2 dls.show_batch(max_n=16)

~\anaconda3\envs\GANfastai\lib\site-packages\fastai\data\core.py in show_batch(self, b, max_n, ctxs, show, unique, **kwargs)
    100         old_get_idxs = self.get_idxs
    101         self.get_idxs = lambda: Inf.zeros
--> 102         if b is None: b = self.one_batch()
    103         if not show: return self._pre_show_batch(b, max_n=max_n)
    104         show_batch(*self._pre_show_batch(b, max_n=max_n), ctxs=ctxs, max_n=max_n, **kwargs)

~\anaconda3\envs\GANfastai\lib\site-packages\fastai\data\load.py in one_batch(self)
    148     def one_batch(self):
    149         if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
--> 150         with self.fake_l.no_multiproc(): res = first(self)
    151         if hasattr(self, 'it'): delattr(self, 'it')
    152         return res

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\basics.py in first(x, f, negate, **kwargs)
    545     x = iter(x)
    546     if f: x = filter_ex(x, f=f, negate=negate, gen=True, **kwargs)
--> 547     return next(x, None)
    548
    549 # Cell

~\anaconda3\envs\GANfastai\lib\site-packages\fastai\data\load.py in __iter__(self)
    111             if self.device is not None and multiprocessing.get_start_method().lower() == "fork":
    112                 b = to_device(b, self.device)
--> 113             yield self.after_batch(b)
    114         self.after_iter()
    115         if hasattr(self, 'it'): del(self.it)

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in __call__(self, o)
    196         self.fs.append(t)
    197
--> 198     def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
    199     def __repr__(self): return f"Pipeline: {' -> '.join([f.name for f in self.fs if f.name != 'noop'])}"
    200     def __getitem__(self,i): return self.fs[i]

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in compose_tfms(x, tfms, is_enc, reverse, **kwargs)
    148     for f in tfms:
    149         if not is_enc: f = f.decode
--> 150         x = f(x, **kwargs)
    151     return x
    152

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in __call__(self, x, **kwargs)
     71     @property
     72     def name(self): return getattr(self, '_name', _get_name(self))
---> 73     def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
     74     def decode (self, x, **kwargs): return self._call('decodes', x, **kwargs)
     75     def __repr__(self): return f'{self.name}:\nencodes: {self.encodes}decodes: {self.decodes}'

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in _call(self, fn, x, split_idx, **kwargs)
     81     def _call(self, fn, x, split_idx=None, **kwargs):
     82         if split_idx!=self.split_idx and self.split_idx is not None: return x
---> 83         return self._do_call(getattr(self, fn), x, **kwargs)
     84
     85     def _do_call(self, f, x, **kwargs):

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in _do_call(self, f, x, **kwargs)
     88             ret = f.returns(x) if hasattr(f,'returns') else None
     89             return retain_type(f(x, **kwargs), x, ret)
---> 90         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     91         return retain_type(res, x)
     92

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in <genexpr>(.0)
     88             ret = f.returns(x) if hasattr(f,'returns') else None
     89             return retain_type(f(x, **kwargs), x, ret)
---> 90         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     91         return retain_type(res, x)
     92

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\transform.py in _do_call(self, f, x, **kwargs)
     87         if f is None: return x
     88         ret = f.returns(x) if hasattr(f,'returns') else None
---> 89         return retain_type(f(x, **kwargs), x, ret)
     90         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     91         return retain_type(res, x)

~\anaconda3\envs\GANfastai\lib\site-packages\fastcore\dispatch.py in __call__(self, *args, **kwargs)
    116         elif self.inst is not None: f = MethodType(f, self.inst)
    117         elif self.owner is not None: f = MethodType(f, self.owner)
--> 118         return f(*args, **kwargs)
    119
    120     def __get__(self, inst, owner):

~\anaconda3\envs\GANfastai\lib\site-packages\fastai\data\transforms.py in encodes(self, x)
    360             self.mean,self.std = x.mean(self.axes, keepdim=True),x.std(self.axes, keepdim=True)+1e-7
    361
--> 362     def encodes(self, x:TensorImage): return (x-self.mean) / self.std
    363     def decodes(self, x:TensorImage):
    364         f = to_cpu if x.device.type=='cpu' else noop

~\anaconda3\envs\GANfastai\lib\site-packages\fastai\torch_core.py in __torch_function__(self, func, types, args, kwargs)
    327         convert=False
    328         if _torch_handled(args, self._opt, func): convert,types = type(self),(torch.Tensor,)
--> 329         res = super().__torch_function__(func, types, args=args, kwargs=kwargs)
    330         if convert: res = convert(res)
    331         if isinstance(res, TensorBase): res.set_meta(self, as_copy=True)

~\anaconda3\envs\GANfastai\lib\site-packages\torch\tensor.py in __torch_function__(cls, func, types, args, kwargs)
    960
    961         with _C.DisableTorchFunction():
--> 962             ret = func(*args, **kwargs)
    963             return _convert(ret, cls)
    964

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Try running dblock.summary(), which might provide more information about the error.
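
summary() takes the same source you hand to dataloaders, so it would look something like this (bs is optional):

# Builds the datasets and steps through one item and one batch,
# printing each transform as it's applied -- the failing transform
# is usually the last thing printed before the error.
dblock.summary(source=path, bs=bs)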

Pretty sure the issue here is that you’re passing tensors in for the stats on Normalize; try passing in just two arrays:

Normalize.from_stats([0.5,0.5,0.5], [0.5,0.5,0.5])

What’s happening here is that fastai runs a very basic division when normalizing and assumes plain numbers are being passed in, while your tensors here start out on the CPU. Hence the GPU/CPU mismatch. You could also just make sure those torch.tensors are on CUDA (i.e. torch.tensor([0.5...]).cuda()).
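
Spelled out, the two options would look something like this (untested sketch, reusing the 0.5 stats from your DataBlock):

# Option 1: plain lists -- fastai builds and places the tensors itself
batch_tfms = Normalize.from_stats([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

# Option 2: keep explicit tensors, but create them on the GPU up front
mean = torch.tensor([0.5, 0.5, 0.5]).cuda()
std  = torch.tensor([0.5, 0.5, 0.5]).cuda()
batch_tfms = Normalize.from_stats(mean, std)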

Oddly enough, the fastai version 2.1.1 that I downgraded to doesn’t run into the mixed-tensor issue anymore.

Now I run into a different issue while attempting to train, but at least it isn’t this tensor issue anymore. If I end up grabbing the current version of fastai again and hit the tensor issue, I’ll give this a shot.

As a side note, most of my code was based on the old GAN tutorial.

I was able to fix my problem by modifying the fastai code (see Part1 2020 : 02_Production - found at least two devices, cuda:0 and cpu). I’m on Windows 10, so maybe that was the reason. Regardless, I still think something is bugged here.
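
For anyone who lands here later: my read of the traceback is that DataLoader.__iter__ only moves the batch to the device when the multiprocessing start method is "fork" (lines 111-112 in the traceback above), and Windows uses "spawn", so the batch stays on the CPU while the Normalize stats sit on cuda:0. A sketch of the kind of edit involved (this is my reconstruction from the linked thread, not an official patch; check your own fastai/data/load.py before touching anything):

# Stock code in DataLoader.__iter__ -- the "fork" test is never true
# on Windows, so the batch is never moved to the GPU:
if self.device is not None and multiprocessing.get_start_method().lower() == "fork":
    b = to_device(b, self.device)

# Dropping the start-method test lets the batch follow the transform
# stats onto the device:
if self.device is not None:
    b = to_device(b, self.device)

Later fastai releases reworked this device handling, which would explain why different versions behave differently here.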