TypeError on the first example in Lesson 1: No loop matching the specified signature and casting

Hi,

I’m trying to run the Lesson 1 Jupyter notebook on my personal computer (with an NVIDIA GPU). All the requirements installed without problems, but I can’t figure out what this error message in the first example means:

TypeError: No loop matching the specified signature and casting
was found for ufunc add

It comes from learn.fit(0.01, 2), but the end of the traceback points into NumPy, which is hard for me to follow as a beginner.

Has anyone run into something similar, or does anyone know what could cause this error?

Thank you for your help.

Here is the complete error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-e6c87b20ce86> in <module>()
  2 data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
  3 learn = ConvLearner.pretrained(arch, data, precompute=True)
----> 4 learn.fit(0.01, 2)

~/fastai/courses/dl1/fastai/learner.py in fit(self, lrs, n_cycle, wds, **kwargs)
300         self.sched = None
301         layer_opt = self.get_layer_opt(lrs, wds)
--> 302         return self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
303 
304     def warm_up(self, lr, wds=None):

~/fastai/courses/dl1/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
247             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
248             swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
--> 249             swa_eval_freq=swa_eval_freq, **kwargs)
250 
251     def get_layer_groups(self): return self.models.get_layer_groups()

~/fastai/courses/dl1/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, visualize, **kwargs)
160 
161         if not all_val:
--> 162             vals = validate(model_stepper, cur_data.val_dl, metrics, epoch, seq_first=seq_first, validate_skip = validate_skip)
163             stop=False
164             for cb in callbacks: stop = stop or cb.on_epoch_end(vals)

~/fastai/courses/dl1/fastai/model.py in validate(stepper, dl, metrics, epoch, seq_first, validate_skip)
240             loss.append(to_np(l))
241             res.append([f(datafy(preds), datafy(y)) for f in metrics])
--> 242     return [np.average(loss, 0, weights=batch_cnts)] + list(np.average(np.stack(res), 0, weights=batch_cnts))
243 
244 def get_prediction(x):

~/miniconda3/envs/fastai/lib/python3.6/site-packages/numpy/lib/function_base.py in average(a, axis, weights, returned)
381             wgt = wgt.swapaxes(-1, axis)
382 
--> 383         scl = wgt.sum(axis=axis, dtype=result_dtype)
384         if np.any(scl == 0.0):
385             raise ZeroDivisionError(

~/miniconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/_methods.py in _sum(a, axis, dtype, out, keepdims, initial)
 34 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
 35          initial=_NoValue):
---> 36     return umr_sum(a, axis, dtype, out, keepdims, initial)
 37 
 38 def _prod(a, axis=None, dtype=None, out=None, keepdims=False,

TypeError: No loop matching the specified signature and casting
was found for ufunc add

I would look at your data. Can you see what your data looks like and post a sample of that? I could maybe recreate it on my end if I had that information. It looks like you are grabbing images from a certain location so just a few of those images would be enough to recreate the issue potentially.

Thank you, Kevin. I’m actually only trying to run the Lesson 1 Jupyter notebook with the cats and dogs data. It is set up exactly as explained for Paperspace, but on my personal computer: I have a symbolic link in the courses/dl1 folder to my data. Looking at the first parts of the notebook, the path seems to be set up correctly.

@vctrd Yes I am also facing the same problem.


Maybe this can be of some use, however I am still not able to solve the problem.

There’s a bug in the type conversions in the accuracy metric. Try setting learn.metrics = [] before calling learn.fit.
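
For context, the line that fails in the traceback above is the weighted average that validate computes over the per-batch results. Here is a minimal NumPy-only sketch of that step. The helper name average_metrics and the numbers are made up for illustration and are not fastai code; it only shows the averaging working when every metric value is a plain float.

```python
import numpy as np

def average_metrics(per_batch_metrics, batch_counts):
    """Weighted mean of per-batch metric values (hypothetical helper, not fastai code).

    per_batch_metrics: one list of metric values per batch
    batch_counts: number of samples in each batch, used as weights
    """
    # Coerce every value to float so np.average only sees a numeric float64 array.
    res = np.array([[float(m) for m in batch] for batch in per_batch_metrics],
                   dtype=np.float64)
    # Average over the batch axis, weighting each batch by its sample count.
    return list(np.average(res, 0, weights=batch_counts))

# Made-up accuracy values for three batches of 64, 64 and 32 samples.
print(average_metrics([[0.98], [1.00], [0.95]], [64, 64, 32]))
```

If the metric values arrive as something other than plain numbers, this is the step where the dtype handling can go wrong.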


I tried it and it doesn’t work.

Thanks! I had the same issue on a fresh local install (Ubuntu 18.04, CUDA 9.2) and this seems to have worked for me.

I updated the repo and did a fresh install. learn.metrics = [] works now.

I had the same problem, and I wasn’t happy going without my accuracy metrics…

It turned out not to be that hard to solve (in my case).

I had a newer version of PyTorch installed:

$ pip freeze | grep torch
torch==1.0.0.dev20180928
torchtext==0.2.3
torchvision==0.2.1

I suppose that when the videos were recorded the torch version was 0.4, so I just executed

conda install pytorch=0.4 -c pytorch

and everything worked.

Hope it helps somebody.
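
To check quickly whether an installed version matches the one assumed above, something like the following sketch could be used. The helper names major_minor and matches_course_version are hypothetical, and the expected version (0, 4) is taken from the post above as an assumption, not from any official requirement.

```python
import re

def major_minor(version):
    """Extract (major, minor) from a version string such as '1.0.0.dev20180928'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))

def matches_course_version(version, expected=(0, 4)):
    # expected=(0, 4) is an assumption based on the post above.
    return major_minor(version) == expected

print(matches_course_version("1.0.0.dev20180928"))  # False
print(matches_course_version("0.4.1"))              # True
```

In a notebook the string to pass in would be torch.__version__.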

Excellent tip, thanks so much! I am running the Paperspace Jupyter notebook setup and also experienced this error. I’m used to coding locally in PyCharm, so not being able to open a debugger and dig into the libraries is frustrating!

I’ve pasted the full code for the section “Our first model: Quick Start” below, in case anyone else is having trouble figuring out where to put the recommended line:

arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.metrics = []
learn.fit(0.01, 2)

Thanks it worked!

But it doesn’t display the accuracy column as expected. The expected output is:
epoch  trn_loss  val_loss  accuracy
0      0.038843  0.024146  0.99
1      0.040339  0.022008  0.992
My output is:
epoch  trn_loss  val_loss
0      0.053091  0.027982
1      0.041086  0.02792

Any quick fix?

Hi there,
res.append([to_np(f(datafy(preds), datafy(y))) for f in metrics]) is part of the validate function in model.py (fast 0.3).
It loops over the metric functions in metrics. When to_np() is applied to the values returned by the different metric functions, their shapes do not match: one may be () and another (1,). I changed to_np in core.py: where it had np.array(v), I changed it to np.array([v]). You may want something along these lines.
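
The shape difference described above can be seen in isolation with plain NumPy. This is only a sketch of the mismatch between np.array(v) and np.array([v]) using a made-up metric value; it does not reproduce the exact TypeError from the traceback, and it does not use fastai.

```python
import numpy as np

scalar = 0.99                  # stand-in for a value returned by a metric function

zero_dim = np.array(scalar)    # wrapping the bare value gives shape ()
one_dim = np.array([scalar])   # wrapping it in a list gives shape (1,)
print(zero_dim.shape, one_dim.shape)

# Arrays with different shapes cannot be stacked together.
try:
    np.stack([zero_dim, one_dim])
    stacked_ok = True
except ValueError:
    stacked_ok = False
print(stacked_ok)

# Wrapping every value the same way gives a regular 2-D array.
consistent = np.stack([np.array([0.99]), np.array([0.992])])
print(consistent.shape)
```

The point of the np.array([v]) change is that every metric value ends up with the same shape before the results are stacked and averaged.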


If we set the metrics to an empty list, how would we know the accuracy when running learn.fit()?

But I want the model to reduce Dice loss, so I have to include it in learn.metrics, and when I do that I end up with this error:
TypeError: No loop matching the specified signature and casting
was found for ufunc add
How can I resolve this error?
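
One thing that may be worth checking, following the shape explanation earlier in the thread, is whether the metric returns a plain Python float. Below is a rough sketch of a Dice coefficient written that way. The function name dice_coefficient, the 0.5 threshold and the eps term are assumptions for illustration, it expects inputs already converted to NumPy arrays, and it has not been tested inside fastai, so it may not resolve the error on its own.

```python
import numpy as np

def dice_coefficient(preds, targs, threshold=0.5, eps=1e-8):
    """Dice coefficient for binary masks, returned as a plain Python float.

    Hypothetical helper: assumes preds holds probabilities and targs holds 0/1 labels.
    """
    # Binarise the predictions at the (assumed) threshold.
    preds = (np.asarray(preds, dtype=np.float64) > threshold).astype(np.float64)
    targs = np.asarray(targs, dtype=np.float64)
    intersection = (preds * targs).sum()
    # eps avoids division by zero when both masks are empty.
    return float(2.0 * intersection / (preds.sum() + targs.sum() + eps))

print(dice_coefficient([0.9, 0.8, 0.1, 0.2], [1, 0, 0, 1]))
```

Returning float(...) rather than an array or tensor keeps every metric value the same kind of object when the per-batch results are averaged.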