Lesson 2 amazon planet - can't create numpy array from CUDA tensor


(p wills) #1

I have had no problems until now. However, I cannot run the “amazon planet” example in lesson 2 image_models (I believe this example may have been created in late 2017 - the dates on the website confuse me a bit…). When I run learn.fit, I get the error messages shown below - basically, somewhere in the f2 calculation a CUDA tensor is handed to numpy, which cannot convert it to an array.

I am running on my own 1080 Ti and followed the GitHub instructions for the install, which has worked so far.

I see no mention of this error on the forum, so I created this thread in case someone has seen it and solved it. Failing that, someone might be able to give me general advice on how to handle this type of message in PyTorch.


~/anaconda3/envs/fastai/lib/python3.7/site-packages/sklearn/utils/multiclass.py in type_of_target(y)
247 raise ValueError(“y cannot be class ‘SparseSeries’.”)
248
–> 249 if is_multilabel(y):
250 return ‘multilabel-indicator’
251

~/anaconda3/envs/fastai/lib/python3.7/site-packages/sklearn/utils/multiclass.py in is_multilabel(y)
138 “”"
139 if hasattr(y, '__array__'):
–> 140 y = np.asarray(y)
141 if not (hasattr(y, “shape”) and y.ndim == 2 and y.shape[1] > 1):
142 return False

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
499
500 “”"
–> 501 return array(a, dtype, copy=False, order=order)
502
503

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/tensor.py in __array__(self, dtype)
441 def __array__(self, dtype=None):
442 if dtype is None:
–> 443 return self.numpy()
444 else:
445 return self.numpy().astype(dtype, copy=False)

TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.



#2

Hi Pete

Similar problem on my system, on lr_find() in same lesson.

John


#3

Pete

Update your environment - that fixed it for me.

John


(p wills) #4

John,

Will do, thanks!

I made a temporary fix yesterday - just put the .cpu() in the flagged line in numeric.py. But it seems out of place to fix a pytorch problem in core numpy.


#5

Hi Pete

From what I have read, numpy can only work with data in host memory, so a GPU tensor has to be copied to the CPU with .cpu() before it can be converted to a numpy array.
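As a generic way to handle this message, the conversion can be sketched roughly like this. It is only a minimal sketch: the names to_host_array and demo are made up for illustration and are not part of fastai or PyTorch, and the helper checks for tensor-like objects by duck typing so that it only needs numpy itself.

```python
import numpy as np


def to_host_array(x):
    """Return x as a NumPy array, copying it off the GPU first if needed.

    Hypothetical helper for illustration; not a fastai or PyTorch function.
    """
    # Duck-typed check: torch tensors have both detach() and cpu().
    if hasattr(x, "detach") and hasattr(x, "cpu"):
        # detach() drops autograd tracking; cpu() copies the data to host
        # memory and returns the same tensor if it is already on the CPU.
        x = x.detach().cpu().numpy()
    # Lists, tuples and existing arrays go through np.asarray unchanged.
    return np.asarray(x)


def demo():
    """Build a tensor on the GPU when one is available and convert it."""
    import torch  # imported here so the helper above only needs NumPy

    device = "cuda" if torch.cuda.is_available() else "cpu"
    t = torch.ones(2, 3, device=device)
    # Calling np.asarray(t) directly on a CUDA tensor raises the TypeError
    # quoted in this thread; going through the helper avoids it.
    return to_host_array(t)
```

The point of doing it this way is that the .cpu() call lives in your own code, at the place where the tensor is handed to numpy or sklearn, rather than inside an installed library.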

Cool - glad you're running. My error was on a different line, but I guess it is related - I’ll keep your fix in the tool shed for next time!

Kind regards

John


(Simron Thapa) #6

Hi John,

I am having the same issue. How did you fix that? Please let me know.

Best regards,
Simron

TypeError Traceback (most recent call last)
in
----> 1 lrf=learn.lr_find()
2 learn.sched.plot()

/media/ivlab/FAST-FROG/fastai/courses/dl1/fastai/learner.py in lr_find(self, start_lr, end_lr, wds, linear, **kwargs)
343 layer_opt = self.get_layer_opt(start_lr, wds)
344 self.sched = LR_Finder(layer_opt, len(self.data.trn_dl), end_lr, linear=linear)
–> 345 self.fit_gen(self.model, self.data, layer_opt, 1, **kwargs)
346 self.load(‘tmp’)
347

/media/ivlab/FAST-FROG/fastai/courses/dl1/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
247 metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
248 swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
–> 249 swa_eval_freq=swa_eval_freq, **kwargs)
250
251 def get_layer_groups(self): return self.models.get_layer_groups()

/media/ivlab/FAST-FROG/fastai/courses/dl1/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, visualize, **kwargs)
160
161 if not all_val:
–> 162 vals = validate(model_stepper, cur_data.val_dl, metrics, epoch, seq_first=seq_first, validate_skip = validate_skip)
163 stop=False
164 for cb in callbacks: stop = stop or cb.on_epoch_end(vals)

/media/ivlab/FAST-FROG/fastai/courses/dl1/fastai/model.py in validate(stepper, dl, metrics, epoch, seq_first, validate_skip)
239 batch_cnts.append(batch_sz(x, seq_first=seq_first))
240 loss.append(to_np(l))
–> 241 res.append([to_np(f(datafy(preds), datafy(y))) for f in metrics])
242 return [np.average(loss, 0, weights=batch_cnts)] + list(np.average(np.stack(res), 0, weights=batch_cnts))
243

/media/ivlab/FAST-FROG/fastai/courses/dl1/fastai/model.py in <listcomp>(.0)
239 batch_cnts.append(batch_sz(x, seq_first=seq_first))
240 loss.append(to_np(l))
–> 241 res.append([to_np(f(datafy(preds), datafy(y))) for f in metrics])
242 return [np.average(loss, 0, weights=batch_cnts)] + list(np.average(np.stack(res), 0, weights=batch_cnts))
243

/media/ivlab/FAST-FROG/fastai/courses/dl1/planet.py in f2(preds, targs, start, end, step)
9 warnings.simplefilter(“ignore”)
10 return max([fbeta_score(targs, (preds>th), 2, average=‘samples’)
—> 11 for th in np.arange(start,end,step)])
12
13 def opt_th(preds, targs, start=0.17, end=0.24, step=0.01):

/media/ivlab/FAST-FROG/fastai/courses/dl1/planet.py in <listcomp>(.0)
9 warnings.simplefilter(“ignore”)
10 return max([fbeta_score(targs, (preds>th), 2, average=‘samples’)
—> 11 for th in np.arange(start,end,step)])
12
13 def opt_th(preds, targs, start=0.17, end=0.24, step=0.01):

~/anaconda2/envs/fastai/lib/python3.7/site-packages/sklearn/metrics/classification.py in fbeta_score(y_true, y_pred, beta, labels, pos_label, average, sample_weight)
832 average=average,
833 warn_for=(‘f-score’,),
–> 834 sample_weight=sample_weight)
835 return f
836

~/anaconda2/envs/fastai/lib/python3.7/site-packages/sklearn/metrics/classification.py in precision_recall_fscore_support(y_true, y_pred, beta, labels, pos_label, average, warn_for, sample_weight)
1029 raise ValueError(“beta should be >0 in the F-beta score”)
1030
-> 1031 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
1032 check_consistent_length(y_true, y_pred, sample_weight)
1033 present_labels = unique_labels(y_true, y_pred)

~/anaconda2/envs/fastai/lib/python3.7/site-packages/sklearn/metrics/classification.py in _check_targets(y_true, y_pred)
70 “”"
71 check_consistent_length(y_true, y_pred)
—> 72 type_true = type_of_target(y_true)
73 type_pred = type_of_target(y_pred)
74

~/anaconda2/envs/fastai/lib/python3.7/site-packages/sklearn/utils/multiclass.py in type_of_target(y)
247 raise ValueError(“y cannot be class ‘SparseSeries’.”)
248
–> 249 if is_multilabel(y):
250 return ‘multilabel-indicator’
251

~/anaconda2/envs/fastai/lib/python3.7/site-packages/sklearn/utils/multiclass.py in is_multilabel(y)
138 “”"
139 if hasattr(y, '__array__'):
–> 140 y = np.asarray(y)
141 if not (hasattr(y, “shape”) and y.ndim == 2 and y.shape[1] > 1):
142 return False

~/anaconda2/envs/fastai/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
499
500 “”"
–> 501 return array(a, dtype, copy=False, order=order)
502
503

~/anaconda2/envs/fastai/lib/python3.7/site-packages/torch/tensor.py in __array__(self, dtype)
448 def __array__(self, dtype=None):
449 if dtype is None:
–> 450 return self.numpy()
451 else:
452 return self.numpy().astype(dtype, copy=False)

TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.


(Simron Thapa) #7

Hi John,

I updated my environment with "conda update --all", but that did not fix the error. I also did a git pull to get the latest code, but nothing seems to work. Please let me know if there is another way, or if I did something wrong. It would be a great help.

Best regards,
Simron


(p wills) #8

Simron,

I looked at it for a while, trying complicated things, but the fix for me turned out to be very simple. I went to the last file mentioned in the error (tensor.py, which the path in the traceback shows is part of the installed torch package) and replaced self.numpy() with self.cpu().numpy(). Your calls will automatically pick up this change.

It is not a nice fix for at least two reasons:

  • the fix is in core library code (tensor.py belongs to the installed torch package) - too deep for this problem
  • it will get overwritten by any update

I am sticking with it - fixing at each update if necessary because I don’t have the time (or skills) to make a robust fix.
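A less invasive alternative might be to leave the installed packages alone and wrap the metric instead, so that it only ever sees numpy arrays. This is just a sketch under assumptions: on_host and thresholded_accuracy are made-up names, and the stand-in metric below is not the notebook's f2 (which calls sklearn's fbeta_score).

```python
import functools

import numpy as np


def on_host(metric):
    """Wrap a metric so tensor arguments are copied to host memory first.

    Hypothetical decorator for illustration; not part of fastai.
    """
    def _to_numpy(x):
        # torch tensors expose detach() and cpu(); anything else is handed
        # to np.asarray as-is.
        if hasattr(x, "detach") and hasattr(x, "cpu"):
            return x.detach().cpu().numpy()
        return np.asarray(x)

    @functools.wraps(metric)
    def wrapper(preds, targs, *args, **kwargs):
        return metric(_to_numpy(preds), _to_numpy(targs), *args, **kwargs)

    return wrapper


@on_host
def thresholded_accuracy(preds, targs, th=0.2):
    # Stand-in metric: fraction of labels where the thresholded prediction
    # agrees with the target. Used here only to exercise the wrapper.
    return float(((preds > th) == (targs > 0.5)).mean())
```

Assuming the learner takes a list of metric functions (the traceback's "for f in metrics" suggests it does), this would amount to passing on_host(f2) in place of f2. It survives package updates because nothing under site-packages is edited.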


#9

Hi Simron

Sorry for the delay; I’ve had a weekend off.

It just worked after an update, but Pete has saved the day.

Kind regards and good luck :grinning:

John


(Denis Shvetsov) #10

Hi,

Can you please give the update instructions?
I ran:
conda update --all
conda install -f -c pytorch pytorch-nightly cuda92
conda install -f -c fastai torchvision-nightly
conda install -f -c fastai fastai
but the problem is still there.


(Raul Sierra Alcocer) #11

Hi,

I had the same problem, and this worked for me:

git pull
conda env update
conda update --all

Hope this helps someone.

Best,
Raúl