Lesson 2 - Official Topic

I am facing this issue now. Have you got any idea how to fix this?

Hi,
I am a course-v3 student and have a few questions for lesson 2,

  1. Should we be running pip install fastai --upgrade in our notebooks? (Considering there’s a new version of fastai now that is incompatible with course-v3 fastai.)
  2. How can train and valid loss be > 1? Isn’t 1 == 100% loss? What does a loss greater than 100% mean?
  3. I noticed in the notebook learn.export() and load_learner(...) is used; how is this different from learn.save(...) and learn.load(...) ?

Thank you.

Regards,
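
On question 2, a small worked example may help. It assumes the loss reported in the notebook is cross-entropy, which is a common default for single-label classification but is not stated in this thread. Under that assumption the loss is the negative log of the probability the model gave to the correct class. It is not a percentage, so nothing caps it at 1.

```python
import math

def cross_entropy(prob_of_correct_class):
    """Cross-entropy loss for one sample: -ln(p) of the true class."""
    return -math.log(prob_of_correct_class)

confident_right = cross_entropy(0.9)   # model is confident and correct
unsure = cross_entropy(0.5)            # model is undecided
confident_wrong = cross_entropy(0.1)   # model gave the true class only 10%

for name, value in [("p=0.9", confident_right),
                    ("p=0.5", unsure),
                    ("p=0.1", confident_wrong)]:
    print(f"{name}: loss = {value:.3f}")
```

A loss above 1 therefore just means that, on average, the model assigned a fairly low probability to the correct classes.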

I am going through Chapter 2’s notebook and remaking the example bear classifier. I get this error when running learn.fine_tune(4):

RuntimeError: DataLoader worker (pid 19862) is killed by signal: Killed. 

I am using Gradient. Any help will be appreciated! Thank you.

The full error message is:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-12-672a4803901c> in <module>
      1 learn = cnn_learner(dls, resnet18, metrics=error_rate)
----> 2 learn.fine_tune(4)

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    470         init_args.update(log)
    471         setattr(inst, 'init_args', init_args)
--> 472         return inst if to_return else f(*args, **kwargs)
    473     return _f
    474 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/callback/schedule.py in fine_tune(self, epochs, base_lr, freeze_epochs, lr_mult, pct_start, div, **kwargs)
    159     "Fine tune with `freeze` for `freeze_epochs` then with `unfreeze` from `epochs` using discriminative LR"
    160     self.freeze()
--> 161     self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    162     base_lr /= 2
    163     self.unfreeze()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    470         init_args.update(log)
    471         setattr(inst, 'init_args', init_args)
--> 472         return inst if to_return else f(*args, **kwargs)
    473     return _f
    474 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    111     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    112               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 113     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    114 
    115 # Cell

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in _f(*args, **kwargs)
    470         init_args.update(log)
    471         setattr(inst, 'init_args', init_args)
--> 472         return inst if to_return else f(*args, **kwargs)
    473     return _f
    474 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    205             self.opt.set_hypers(lr=self.lr if lr is None else lr)
    206             self.n_epoch,self.loss = n_epoch,tensor(0.)
--> 207             self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
    208 
    209     def _end_cleanup(self): self.dl,self.xb,self.yb,self.pred,self.loss = None,(None,),(None,),None,None

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_fit(self)
    195         for epoch in range(self.n_epoch):
    196             self.epoch=epoch
--> 197             self._with_events(self._do_epoch, 'epoch', CancelEpochException)
    198 
    199     @log_args(but='cbs')

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_epoch(self)
    189 
    190     def _do_epoch(self):
--> 191         self._do_epoch_train()
    192         self._do_epoch_validate()
    193 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_epoch_train(self)
    181     def _do_epoch_train(self):
    182         self.dl = self.dls.train
--> 183         self._with_events(self.all_batches, 'train', CancelTrainException)
    184 
    185     def _do_epoch_validate(self, ds_idx=1, dl=None):

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in all_batches(self)
    159     def all_batches(self):
    160         self.n_iter = len(self.dl)
--> 161         for o in enumerate(self.dl): self.one_batch(*o)
    162 
    163     def _do_one_batch(self):

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in one_batch(self, i, b)
    177         self.iter = i
    178         self._split(b)
--> 179         self._with_events(self._do_one_batch, 'batch', CancelBatchException)
    180 
    181     def _do_epoch_train(self):

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/learner.py in _do_one_batch(self)
    162 
    163     def _do_one_batch(self):
--> 164         self.pred = self.model(*self.xb)
    165         self('after_pred')
    166         if len(self.yb): self.loss = self.loss_func(self.pred, *self.yb)

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torchvision/models/resnet.py in forward(self, x)
     61         out = self.relu(out)
     62 
---> 63         out = self.conv2(out)
     64         out = self.bn2(out)
     65 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/conv.py in forward(self, input)
    417 
    418     def forward(self, input: Tensor) -> Tensor:
--> 419         return self._conv_forward(input, self.weight)
    420 
    421 class Conv3d(_ConvNd):

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    413                             weight, self.bias, self.stride,
    414                             _pair(0), self.dilation, self.groups)
--> 415         return F.conv2d(input, weight, self.bias, self.stride,
    416                         self.padding, self.dilation, self.groups)
    417 

/opt/conda/envs/fastai/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py in handler(signum, frame)
     64         # This following call uses `waitid` with WNOHANG from C side. Therefore,
     65         # Python can still get and update the process status successfully.
---> 66         _error_if_any_worker_fails()
     67         if previous_handler is not None:
     68             previous_handler(signum, frame)

RuntimeError: DataLoader worker (pid 19862) is killed by signal: Killed. 

Managed to get my first production model! Felt like it took far too long to get it all to work, but I got there in the end…

For the first question, I would recommend using conda environments (or another way of isolating the install) for fastai v2. Yes, its API is not backward compatible, but much of the knowledge carries over from one version to the other; mostly things were renamed and reorganized.

And yes, if you are not going to use the old version, you should run the install with --upgrade.

This is typically a memory problem. How much RAM do you have? Try reducing the number of concurrent worker processes; it might help.
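
As a rough sketch of what "reducing the number of processes" could look like: the helper below is hypothetical (it is not part of fastai or PyTorch) and simply caps the worker count by a RAM budget. The per-worker RAM figure is a guess you would need to tune for your own data.

```python
import os

def pick_num_workers(available_ram_gb, ram_per_worker_gb=1.0, cpu_count=None):
    """Hypothetical helper: cap worker processes by CPU count and RAM budget."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    by_ram = int(available_ram_gb // ram_per_worker_gb)
    # 0 means "load data in the main process", so no worker processes are spawned.
    return max(0, min(cpu_count, by_ram))

num_workers = pick_num_workers(available_ram_gb=2.0, ram_per_worker_gb=1.0, cpu_count=8)
print(num_workers)
# Assumption: the value is then passed when building the DataLoaders, e.g.
#   dls = bears.dataloaders(path, num_workers=num_workers)
# where `bears` and `path` are the DataBlock and image folder from the notebook.
```

Setting the worker count to 0 is the most conservative choice; a smaller batch size is another thing to try if memory is still tight.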

Thank you. I want to continue using fastai-v1; is there a safe way to upgrade to the latest version of fastai-v1 without corrupting the environment with fastai-v2?
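
One possible approach, sketched here rather than confirmed by anyone in the thread, is to give pip a version specifier so the upgrade stays below 2.0. The sketch assumes every fastai v1 release has a 1.x version number; the function name is made up for illustration, and the command is only built, not run.

```python
import sys

def pip_upgrade_command(package="fastai", below_major=2):
    """Build (but do not run) a pip command that upgrades without crossing a major version."""
    requirement = f"{package}<{below_major}"
    return [sys.executable, "-m", "pip", "install", "--upgrade", requirement]

cmd = pip_upgrade_command()
print(" ".join(cmd[1:]))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Doing this inside a dedicated conda or virtual environment keeps the v1 install separate from any v2 install.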

I think I know the mistake now. All the free GPU at Gradient was used up and so I was unknowingly using CPU.
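
A quick way to check for this before training is to ask PyTorch whether it can see a GPU. This is a minimal sketch; the function name is made up, and it returns a message instead of raising if torch is not installed.

```python
def describe_device():
    """Report which device PyTorch would use, without failing if torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    return "cuda" if torch.cuda.is_available() else "cpu"

print(describe_device())
```

If this prints "cpu" on a machine that is supposed to have a GPU, training will be much slower than expected.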

If you sign up for an account and use the discount code, you’ll magically get the free GPUs on a more regular basis than trying to get by using a free account.

Hi,

dls.valid.show_batch and dls.train.show_batch do not show any images when I run them. Screenshot attached. The same happens with the confusion matrix.

Is any specific package required? How can I resolve the issue?

I am having the same problem on a Paperspace free instance too.

Sorry, misspelled…
In my case on Paperspace (I don’t know why) the notebook code had a longer dash symbol (:face_with_monocle:), so replacing it with two regular hyphens (--) did the trick. Maybe it helps you too, if that’s also the case?

Does Jeremy recommend we go through the course lesson and then the related book chapter? Or the other way around?

I am not sure what Jeremy recommends, so someone who does know can jump in here.
I think Jeremy wrote the book to be used in a hands-on manner: I would read the book and, as you go, run the notebooks in parallel.

Hi,

Is there a way to ensure that the validation set in the datablock has an equal number of samples from each class, assuming balanced classes for classification?

Regards
Ganesh Bhat

Yes, you can use the stratify option in the TrainTestSplitter. Under the hood it uses the train_test_split function of scikit-learn.
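
To illustrate what stratification does, here is a small pure-Python sketch. It is not fastai's or scikit-learn's implementation, and the function name and the bear labels are placeholders. In fastai itself, my understanding is that you would pass something like TrainTestSplitter(test_size=0.2, stratify=labels) as the DataBlock splitter.

```python
import random
from collections import defaultdict

def stratified_split(labels, valid_pct=0.2, seed=42):
    """Return (train_idxs, valid_idxs) with each class represented proportionally."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, valid = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)                        # shuffle within each class
        n_valid = round(len(idxs) * valid_pct)   # same fraction taken from every class
        valid.extend(idxs[:n_valid])
        train.extend(idxs[n_valid:])
    return sorted(train), sorted(valid)

labels = ["black"] * 10 + ["grizzly"] * 10 + ["teddy"] * 10
train_idxs, valid_idxs = stratified_split(labels, valid_pct=0.2)
print([labels[i] for i in valid_idxs])
```

With balanced classes, taking the same fraction from each class gives a validation set with the same number of samples per class.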

Thanks.

Hi Everyone,

I have been trying to launch a bear classifier on Binder for the last 3 days, and I am not sure where I am getting stuck.

I have been able to launch the web app locally with Voila, but it is not working on Binder.

Github repository link: https://github.com/Avinash1419/Voila_bear

The app when launched locally on voila looks like this:

It works perfectly fine for basic options as expected.

On Binder, if I leave the file section on mybinder empty, it loads as a Jupyter notebook hosted online. But when I enter the URL required for Voila, with the necessary form fields filled in, it always comes up with a 404 error, as shown below:

The form fields filled are as shown in the figure here:

Please suggest ways to resolve this issue!!

Things I felt might have been the issue, but I couldn’t resolve them:

  1. I listed the required packages in requirements.txt after finding out which packages the Jupyter notebook uses locally. The packages are listed here:

voila==0.2.2
torchvision==0.7.0
torch==1.6.0
sentencepiece==0.1.86
scipy==1.4.1
requests==2.22.0
Pillow==7.2.0
path==13.1.0
pandas==1.0.1
numpy==1.18.1
msrest==0.6.19
matplotlib==3.1.3
ipywidgets==7.5.1
fastprogress==1.0.0
fastcore==1.0.9
fastai==2.0.13

The list was generated by a short piece of code (included in the last cell of the main .ipynb file). Voila was added manually (the code did not generate it). But I am not sure whether any other packages need to be listed.

  2. When running the first part of the code, which imports the fastai modules, I hit a small issue, as shown in the image:

It didn’t cause a problem when launching Voila locally, but I am not sure whether it could be a problem on Binder.

  3. When entering the URL on the Binder site, I wasn’t sure of its format. I used the one shown in the mybinder image above, with and without ‘/’ at the beginning, i.e. before “voila”. Neither worked. Let me know if I should try some other way.

  4. I have commented out these two commands in the .ipynb file.
    !pip install voila
    !jupyter serverextension enable voila --sys-prefix

Basically, since all the requirements of installation are being given through requirements.txt, I thought these aren’t needed in the .ipynb file.

This is the basic context of how I tried to work on the problem, and concerns I felt were relevant. Please let me know if this can be resolved one way or the other!
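
On the URL format question, the sketch below shows how a Binder link for a Voila app is usually put together, as far as I understand the mybinder.org form. The function name, user, repository and notebook names are placeholders, not taken from the repository above. The path entered in the form is /voila/render/ followed by the notebook name, and the branch segment has to name a branch that actually exists in the repository.

```python
from urllib.parse import quote

def binder_voila_url(user, repo, notebook, branch="master"):
    """Build a mybinder.org link that opens `notebook` through Voila."""
    # The URL path is percent-encoded because it travels as a query parameter.
    urlpath = quote(f"/voila/render/{notebook}", safe="")
    return f"https://mybinder.org/v2/gh/{user}/{repo}/{branch}?urlpath={urlpath}"

url = binder_voila_url("some-user", "some-repo", "bear_classifier.ipynb", branch="main")
print(url)
```

A 404 can come from a wrong notebook name in the path as well as from a branch name that does not exist, so both are worth double-checking.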

Greetings All,

I am currently unable to deploy my drone classifier notebook onto Binder. I get the error below (please refer to the screenshot). I was wondering if anyone could point out where I might have gone wrong.

Link to my github for the drone classifier code: https://github.com/SantoshYadaw/drone_class

Thank you in advance :slight_smile:

You’ve told it to use a branch called master, but you don’t have one; your only branch is called main.