A walk with fastai2 - Vision - Study Group and Online Lectures Megathread

I’ll try looking again at this today…this is seriously strange

1 Like

I saw it was training slower for me too with CUDA 10.2, but it was fast with CUDA 10.1. I didn’t look into it more, though, so it could just be chance. CamVid was taking around 1 min with Zach’s notebook. Not sure if this helps you 🙂

@muellerzr
Everything seems to be working fine today with vanilla Colab Pro + fastai updated to 2.1.7

1 Like

By the way, everyone: I plan to start watching the Walk with fastai videos and working through the notebooks this new year. I expect the time will be 8 to 9 a.m. Central Time (possibly evenings instead, depending on whether those who want to join prefer mornings or evenings). On Tuesdays we’ll watch the video, and on Thursdays we’ll run the notebooks and chat. It’s about 12 hours of video, so roughly 12 weeks, from January until March (the timing may change). Starting Jan 5.

This will be on Discord, I guess.

2 Likes

The entirety of this course is now available on my website:

The notebooks are a bit more fleshed out now, and these will be copied over to the Practical Deep Learning for Coders repo as well later today

3 Likes

While testing out notebook 7 for Super-Resolution, I found out that unet_config is deprecated.
So I changed the unet_learner call to
unet_learner(dls_gen, arch=resnet34, loss_func=MSELossFlat, blur=True, norm_type=NormType.Weight, self_attention=True, y_range=(-3.,3.), n_out=3), removing the config and passing all of its parameters in directly.

But I can’t get it to work. When I run learn_gen.fit_one_cycle(2, pct_start=0.8, wd=WeightDecay)

I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-76-c661610c2951> in <module>()
----> 1 learn_gen.fit_one_cycle(2, pct_start=0.8, wd=WeightDecay)
      2 # learn_gen.fit_one_cycle()

8 frames
/usr/local/lib/python3.6/dist-packages/fastai/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    110     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    111               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 112     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    113 
    114 # Cell

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    198 
    199     def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):
--> 200         with self.added_cbs(cbs):
    201             if reset_opt or not self.opt: self.create_opt()
    202             if wd is None: wd = self.wd

/usr/lib/python3.6/contextlib.py in __enter__(self)
     79     def __enter__(self):
     80         try:
---> 81             return next(self.gen)
     82         except StopIteration:
     83             raise RuntimeError("generator didn't yield") from None

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in added_cbs(self, cbs)
    119     @contextmanager
    120     def added_cbs(self, cbs):
--> 121         self.add_cbs(cbs)
    122         try: yield
    123         finally: self.remove_cbs(cbs)

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in add_cbs(self, cbs)
    100 
    101     def _grab_cbs(self, cb_cls): return L(cb for cb in self.cbs if isinstance(cb, cb_cls))
--> 102     def add_cbs(self, cbs): L(cbs).map(self.add_cb)
    103     def remove_cbs(self, cbs): L(cbs).map(self.remove_cb)
    104     def add_cb(self, cb):

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, gen, *args, **kwargs)
    152     def range(cls, a, b=None, step=None): return cls(range_of(a, b=b, step=step))
    153 
--> 154     def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
    155     def argwhere(self, f, negate=False, **kwargs): return self._new(argwhere(self, f, negate, **kwargs))
    156     def filter(self, f=noop, negate=False, gen=False, **kwargs):

/usr/local/lib/python3.6/dist-packages/fastcore/basics.py in map_ex(iterable, f, gen, *args, **kwargs)
    654     res = map(g, iterable)
    655     if gen: return res
--> 656     return list(res)
    657 
    658 # Cell

/usr/local/lib/python3.6/dist-packages/fastcore/basics.py in __call__(self, *args, **kwargs)
    644             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    645         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 646         return self.func(*fargs, **kwargs)
    647 
    648 # Cell

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in add_cb(self, cb)
    107         cb.learn = self
    108         setattr(self, cb.name, cb)
--> 109         self.cbs.append(cb)
    110         return self
    111 

AttributeError: 'NoneType' object has no attribute 'append' 

Do you have any solution for this? I don’t understand what I did wrong.

This is running on:

Name: fastai
Version: 2.1.10

Name: fastcore
Version: 1.3.16

I downgraded fastcore and still get the same error.

Name: fastcore
Version: 1.3.13

EDIT:
So based on others in the thread, I have downgraded the rest to this:

!pip uninstall torch -y
!pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
!pip install fastai==2.0.19
!pip install fastcore==1.3.1 

now I don’t get the same NoneType error, I instead get this:

epoch	train_loss	valid_loss	time
0	0.000000	00:00
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-21-c661610c2951> in <module>()
----> 1 learn_gen.fit_one_cycle(2, pct_start=0.8, wd=WeightDecay)
      2 # learn_gen.fit_one_cycle()

20 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py in legacy_get_string(size_average, reduce, emit_warning)
     35         reduce = True
     36 
---> 37     if size_average and reduce:
     38         ret = 'mean'
     39     elif reduce:

RuntimeError: Boolean value of Tensor with more than one value is ambiguous

For now use the notebooks in the walkwithfastai repo: https://github.com/walkwithfastai/walkwithfastai.github.io/tree/master/nbs/course2020

I’ll update the original repo’s notebooks soon. In the meantime, keep an eye on the installed pip versions; some parts use the dev version of the library.

I’ll make sure to keep an eye on that. At the moment, each epoch of the Generator training is taking close to 90 minutes. I’ll mess around with the library versions and see if anything helps. Thank you so much!

Did you enable GPU?

Yes, GPU is enabled.
This is the generator training, epochs 3 to 5. The previous two epochs also took 90 minutes each.

This is the Critic training right now, somewhat faster than I expected.

What does GenL.dls.device give you? That definitely looks like it’s running CPU bound
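A quick way to check which device PyTorch will actually use (the learner name in the comments is just the one from the notebook, shown for illustration):

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

# In the notebook you would then inspect the learner's DataLoaders, e.g.:
#   print(learn_gen.dls.device)  # should say cuda when training on GPU
#   learn_gen.dls.to(device)     # move the DataLoaders over if it said cpu
```

If this prints cpu inside Colab, the runtime type is not set to GPU, which would explain the 90-minute epochs.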

I’ll re-run it and check. But I’m 90% sure it was running on the GPU. I think some other APIs straight-up fail on the CPU, I’m not sure. I trained this yesterday and saved the model.
Running the reloaded model, I get this:
[screenshot]

I think you’re missing the parentheses: it should be loss_func=MSELossFlat(), not loss_func=MSELossFlat.
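To see why the parentheses matter, here is a tiny illustrative mock (FakeMSELossFlat is a made-up stand-in, not the real fastai class): if you pass the class itself, then when the training loop later calls loss_func(pred, targ), the tensors land in the constructor instead of the loss computation, which is exactly how you end up with errors like the size_average/reduce one in the traceback above.

```python
# Illustrative mock of a fastai-style loss class (NOT the real MSELossFlat)
class FakeMSELossFlat:
    def __init__(self, reduction="mean", axis=-1):   # constructor takes config
        self.reduction, self.axis = reduction, axis
    def __call__(self, pred, targ):                  # the actual loss computation
        return sum((p - t) ** 2 for p, t in zip(pred, targ)) / len(pred)

pred, targ = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]

good = FakeMSELossFlat()            # loss_func=MSELossFlat() -> an instance
print(good(pred, targ))             # a scalar loss, as expected

oops = FakeMSELossFlat(pred, targ)  # loss_func=MSELossFlat -> the framework's
                                    # call hits the constructor, so the tensors
                                    # silently become "reduction" and "axis"
print(oops.reduction)               # [1.0, 2.0, 3.0] -- a tensor where a
                                    # config string belongs
```

With the real torch losses, those misplaced tensors end up in size_average/reduce, and `if size_average and reduce:` then raises “Boolean value of Tensor with more than one value is ambiguous”.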

2 Likes

I’d like to join this. How much have you covered so far? And do I need to have plenty of experience to join? I don’t want to slow down the group.

No, there is no need for prior experience with fastai; there are probably people joining who haven’t even set up their fastai environment yet.

We have only covered the first video of the Walk with fastai series, this past Tuesday. Consider running the pets notebook on your own if you want to sync up with what we did on Thursday.


Also, for anyone reading here: some people are thinking of opening a study hour at an EU-friendly time. I wonder if there are more people on the forum interested in that.

3 Likes

That’s awesome. I started Walk with fastai many months ago but had to stop. I’ll join your group and restart the course now. Thanks!

1 Like

Sorry about the late reply. I think you’re right; I’d fixed that mistake but never re-ran my function.

However, when I run this:
learn.fit(10, lr, wd=wd)
I get the following error log:

/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py:50: UserWarning: You are shadowing an attribute (generator) that exists in the learner. Use `self.learn.generator` to avoid this
  warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use `self.learn.{name}` to avoid this")
/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py:50: UserWarning: You are shadowing an attribute (critic) that exists in the learner. Use `self.learn.critic` to avoid this
  warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use `self.learn.{name}` to avoid this")
/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py:50: UserWarning: You are shadowing an attribute (gen_mode) that exists in the learner. Use `self.learn.gen_mode` to avoid this
  warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use `self.learn.{name}` to avoid this")

 0.00% [0/10 00:00<00:00]
epoch	train_loss	valid_loss	gen_loss	crit_loss	time

 0.00% [0/184 00:00<00:00]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-54-2d5f936dd8a8> in <module>()
----> 1 learn.fit(10, lr, wd=wd)

13 frames
/usr/local/lib/python3.6/dist-packages/fastai/vision/gan.py in critic(self, real_pred, input)
    108     def critic(self, real_pred, input):
    109         "Create some `fake_pred` with the generator from `input` and compare them to `real_pred` in `self.crit_loss_func`."
--> 110         fake = self.gan_model.generator(input).requires_grad_(False)
    111         fake_pred = self.gan_model.critic(fake)
    112         self.crit_loss = self.crit_loss_func(real_pred, fake_pred)

TypeError: requires_grad_() takes 1 positional argument but 2 were given

I’ll rerun it to see if I missed something. I was downloading a saved critic and running the GAN at the same time; maybe that’s what caused the issues?

2 Likes

Hi all!

I’m trying a semantic segmentation task and running into a problem.
I followed this course (https://walkwithfastai.com/Segmentation) and understood the workflow.

In the CamVid dataset, all pixel values of the annotation images range from 0 to 31; each pixel value represents a label (category), not a color.

But in my dataset, I have full-color RGB annotation images and a mapping document that says red is “person” (label 1), blue is “road” (label 2), and green is “car” (label 3).

Without preprocessing, I can’t start training: I get RuntimeError: CUDA error: device-side assert triggered.

Is there a nice way to handle this type of annotation images? Thanks a lot.

1 Like

You should follow the Binary Segmentation tutorial; it discusses overriding the mask values to make them linear, which is how you get out of that error:
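In the meantime, here is a minimal sketch of that preprocessing, assuming the colours really are pure red/blue/green as in your mapping document (COLOR2CLASS and rgb_mask_to_index are hypothetical names; adjust the RGB values to your actual palette). You would run this once over your masks, save the results, and point the DataBlock at the converted files:

```python
import numpy as np

# Map each annotation colour to its integer class id, per the mapping document
COLOR2CLASS = {
    (255, 0, 0): 1,  # red   -> person
    (0, 0, 255): 2,  # blue  -> road
    (0, 255, 0): 3,  # green -> car
}

def rgb_mask_to_index(mask_rgb: np.ndarray) -> np.ndarray:
    "Convert an (H, W, 3) RGB mask into an (H, W) array of class ids."
    out = np.zeros(mask_rgb.shape[:2], dtype=np.uint8)  # 0 = background
    for color, cls in COLOR2CLASS.items():
        out[np.all(mask_rgb == color, axis=-1)] = cls
    return out

# Tiny 1x3 example mask: one red, one blue, one green pixel
mask = np.array([[[255, 0, 0], [0, 0, 255], [0, 255, 0]]], dtype=np.uint8)
print(rgb_mask_to_index(mask))  # [[1 2 3]]
```

The device-side assert usually means a mask value is outside the range of classes the model outputs, so once every pixel is a small contiguous class index like this, that error should go away.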