I’ll try looking again at this today…this is seriously strange
I saw it was training slower for me too with CUDA 10.2, but it was fast with CUDA 10.1. I didn’t look into it more, though, so it could just be chance. CamVid was taking around 1 min with Zach’s notebook. Not sure if this helps you.
@muellerzr
Everything seems to be working fine today with vanilla Colab Pro + fastai updated to 2.1.7
By the way everyone, I plan to start watching and working through the Walk with fastai series (videos and notebooks) this new year. I’m guessing the time will be mornings, 8 to 9 Central Time (possibly evenings instead, depending on whether the people who want to join prefer mornings or evenings). On Tuesdays we will watch a video, and on Thursdays we will run the notebooks and chat. It’s about 12 hours of video, so roughly 12 weeks, from January until March (the timing may change). Starting Jan 5.
This will be on Discord, I guess.
The entirety of this course is now available on my website:
The notebooks are a bit more fleshed out now, and they will be copied over to the Practical Deep Learning for Coders repo later today as well.
While testing out notebook 7 for Super-Resolution, I found out that unet_config
is deprecated.
So I changed unet_learner
to
unet_learner(dls_gen, arch=resnet34, loss_func=MSELossFlat, blur=True, norm_type=NormType.Weight, self_attention=True, y_range=(-3.,3.), n_out=3)
replacing all the params from config and adding them in directly.
But I can’t get it to work. When I run learn_gen.fit_one_cycle(2, pct_start=0.8, wd=WeightDecay)
I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-76-c661610c2951> in <module>()
----> 1 learn_gen.fit_one_cycle(2, pct_start=0.8, wd=WeightDecay)
2 # learn_gen.fit_one_cycle()
8 frames
/usr/local/lib/python3.6/dist-packages/fastai/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
110 scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
111 'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 112 self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
113
114 # Cell
/usr/local/lib/python3.6/dist-packages/fastai/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
198
199 def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):
--> 200 with self.added_cbs(cbs):
201 if reset_opt or not self.opt: self.create_opt()
202 if wd is None: wd = self.wd
/usr/lib/python3.6/contextlib.py in __enter__(self)
79 def __enter__(self):
80 try:
---> 81 return next(self.gen)
82 except StopIteration:
83 raise RuntimeError("generator didn't yield") from None
/usr/local/lib/python3.6/dist-packages/fastai/learner.py in added_cbs(self, cbs)
119 @contextmanager
120 def added_cbs(self, cbs):
--> 121 self.add_cbs(cbs)
122 try: yield
123 finally: self.remove_cbs(cbs)
/usr/local/lib/python3.6/dist-packages/fastai/learner.py in add_cbs(self, cbs)
100
101 def _grab_cbs(self, cb_cls): return L(cb for cb in self.cbs if isinstance(cb, cb_cls))
--> 102 def add_cbs(self, cbs): L(cbs).map(self.add_cb)
103 def remove_cbs(self, cbs): L(cbs).map(self.remove_cb)
104 def add_cb(self, cb):
/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, gen, *args, **kwargs)
152 def range(cls, a, b=None, step=None): return cls(range_of(a, b=b, step=step))
153
--> 154 def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
155 def argwhere(self, f, negate=False, **kwargs): return self._new(argwhere(self, f, negate, **kwargs))
156 def filter(self, f=noop, negate=False, gen=False, **kwargs):
/usr/local/lib/python3.6/dist-packages/fastcore/basics.py in map_ex(iterable, f, gen, *args, **kwargs)
654 res = map(g, iterable)
655 if gen: return res
--> 656 return list(res)
657
658 # Cell
/usr/local/lib/python3.6/dist-packages/fastcore/basics.py in __call__(self, *args, **kwargs)
644 if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
645 fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 646 return self.func(*fargs, **kwargs)
647
648 # Cell
/usr/local/lib/python3.6/dist-packages/fastai/learner.py in add_cb(self, cb)
107 cb.learn = self
108 setattr(self, cb.name, cb)
--> 109 self.cbs.append(cb)
110 return self
111
AttributeError: 'NoneType' object has no attribute 'append'
Do you have any solution for this? I don’t understand what I did wrong.
This is running on:
Name: fastai
Version: 2.1.10
Name: fastcore
Version: 1.3.16
I downgraded fastcore and still get the same error:
Name: fastcore
Version: 1.3.13
EDIT:
Based on others in the thread, I have downgraded the rest to this:
!pip uninstall torch -y
!pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
!pip install fastai==2.0.19
!pip install fastcore==1.3.1
Now I don’t get the same NoneType
error; instead I get this:
epoch train_loss valid_loss time
0 0.000000 00:00
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-21-c661610c2951> in <module>()
----> 1 learn_gen.fit_one_cycle(2, pct_start=0.8, wd=WeightDecay)
2 # learn_gen.fit_one_cycle()
20 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py in legacy_get_string(size_average, reduce, emit_warning)
35 reduce = True
36
---> 37 if size_average and reduce:
38 ret = 'mean'
39 elif reduce:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
For now, use the notebooks in the walkwithfastai repo: https://github.com/walkwithfastai/walkwithfastai.github.io/tree/master/nbs/course2020
I’ll update the original repo notebooks soon. However, keep an eye on the installed pip versions: some parts are using the dev version of the library.
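For reference, a quick way to check the installed versions from a notebook cell (a minimal sketch; pip show fastai fastcore tells you the same thing):
import fastai, fastcore, torch
print(fastai.__version__, fastcore.__version__, torch.__version__)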
I’ll make sure to keep an eye on that. At the moment, training each epoch of the generator is taking me close to 90 minutes. I’ll mess around with the library versions and see if anything helps out. Thank you so much!
Did you enable GPU?
Yes, GPU is enabled.
This is the generator, epochs 3 to 5. The previous two also took 90 minutes each.
This is the Critic training right now, somewhat faster than I expected.
What does GenL.dls.device
give you? That definitely looks like it’s running CPU-bound.
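A quick way to check (a minimal sketch, assuming the learner is the GenL referenced above):
import torch
print(torch.cuda.is_available())   # is a GPU visible to PyTorch at all?
print(GenL.dls.device)             # should be a cuda device if the DataLoaders live on the GPU
GenL.dls.cuda()                    # if it reports cpu, this moves the DataLoaders back to the GPU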
I’ll re-run it and check. But I’m 90% sure it was running on the GPU. I think some other APIs straight-up fail on the CPU, I’m not sure. I trained this yesterday and saved the model.
Running the reloaded model, I get this:
I think you’re missing the parentheses on loss_func=MSELossFlat().
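For reference, the corrected call from the earlier post would then look something like this (same arguments, just instantiating the loss):
learn_gen = unet_learner(dls_gen, arch=resnet34, loss_func=MSELossFlat(), blur=True, norm_type=NormType.Weight, self_attention=True, y_range=(-3.,3.), n_out=3)
Passing the class MSELossFlat instead of an instance means fastai ends up calling the class itself on your predictions and targets, so they land in nn.MSELoss’s constructor as size_average and reduce, which is exactly what produces the Boolean value of Tensor with more than one value is ambiguous error above.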
I’d like to join this. How much have you covered so far? And do I need to have plenty of experience to join? I don’t want to slow down the group.
No, there is no need for prior experience with fastai; there are probably people who haven’t even set up their fastai environment yet.
We have only covered the first video of the Walk with fastai series this Tuesday; consider running the pets notebook on your own if you want to sync with what we did on Thursday.
Also, for anyone reading here: some people are thinking of opening a study hour at EU time, and I wonder if there are more people in the forum interested in that.
That’s awesome. I started doing Walk with fastai many months ago but had to stop. I’ll join your group and restart the course now. Thanks.
Sorry about the late reply. I think you’re right. I’d fixed that mistake but never re-ran my function.
However, when I run this:
learn.fit(10, lr, wd=wd)
I get the following error log:
/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py:50: UserWarning: You are shadowing an attribute (generator) that exists in the learner. Use `self.learn.generator` to avoid this
warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use `self.learn.{name}` to avoid this")
/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py:50: UserWarning: You are shadowing an attribute (critic) that exists in the learner. Use `self.learn.critic` to avoid this
warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use `self.learn.{name}` to avoid this")
/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py:50: UserWarning: You are shadowing an attribute (gen_mode) that exists in the learner. Use `self.learn.gen_mode` to avoid this
warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use `self.learn.{name}` to avoid this")
0.00% [0/10 00:00<00:00]
epoch train_loss valid_loss gen_loss crit_loss time
0.00% [0/184 00:00<00:00]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-54-2d5f936dd8a8> in <module>()
----> 1 learn.fit(10, lr, wd=wd)
13 frames
/usr/local/lib/python3.6/dist-packages/fastai/vision/gan.py in critic(self, real_pred, input)
108 def critic(self, real_pred, input):
109 "Create some `fake_pred` with the generator from `input` and compare them to `real_pred` in `self.crit_loss_func`."
--> 110 fake = self.gan_model.generator(input).requires_grad_(False)
111 fake_pred = self.gan_model.critic(fake)
112 self.crit_loss = self.crit_loss_func(real_pred, fake_pred)
TypeError: requires_grad_() takes 1 positional argument but 2 were given
I’ll rerun it to see if I missed something. I was downloading a saved critic and running the GAN at the same time; maybe that’s what caused some issues?
Hi all!
I’m trying a semantic segmentation task and running into a problem.
I followed this course (https://walkwithfastai.com/Segmentation) and understood the workflow.
In the CamVid dataset, all pixel values of the annotation images range from 0 to 31, and each pixel value represents a label (category), not a color.
But in my dataset, I have full-color RGB annotation images and a mapping document that says red is “person” as label 1, blue is “road” as label 2, and green is “car” as label 3.
Without preprocessing, I can’t start training; I get RuntimeError: CUDA error: device-side assert triggered.
Is there a nice way to handle this type of annotation images? Thanks a lot.
You should follow the Binary Segmentation tutorial; it discusses overriding mask values to make them linear, which gets you out of that error:
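As a rough sketch of that idea (the exact RGB values here are assumptions; take the real ones from your mapping document), you can collapse the RGB annotations into single-channel index masks as a one-off preprocessing step:
import numpy as np
from PIL import Image

# Assumed color -> class-index mapping (background = 0):
# red = person (1), blue = road (2), green = car (3)
COLOR2IDX = {(255, 0, 0): 1, (0, 0, 255): 2, (0, 255, 0): 3}

def rgb_to_index_mask(fn):
    "Convert a full-color RGB annotation into a single-channel index mask."
    rgb = np.array(Image.open(fn).convert('RGB'))
    mask = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for color, idx in COLOR2IDX.items():
        mask[(rgb == color).all(axis=-1)] = idx
    return Image.fromarray(mask)
Once every mask holds only the consecutive integers 0..n_classes-1, the loss gets valid class indices; out-of-range values in the target are what trigger that device-side assert.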