Lesson 1 In-Class Discussion ✅

perceptron · December 26, 2019, 2:20pm

@kdorichev Thank you so much! I would love to have a look and get back to you if I don’t get anything.

Roszko · December 29, 2019, 9:57am

Hi, the Lesson 1 files do not work for me out of the box. I followed all the instructions with a fresh install and did not change anything (on Gradient / Paperspace). I am really stuck and would appreciate help. Is it possible that something is outdated?

On step: interp.plot_top_losses(9, figsize=(15,11))

I get an error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-26-1b2e75ee979a> in <module>
      1 #interp.plot_top_losses(9, figsize=(7,6))
----> 2 interp.plot_top_losses(6, figsize=(15,11))

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/vision/learner.py in _cl_int_plot_top_losses(self, k, largest, figsize, heatmap, heatmap_thresh, alpha, cmap, show_text, return_fig)
    174     if show_text: fig.suptitle('Prediction/Actual/Loss/Probability', weight='bold', size=14)
    175     for i,idx in enumerate(tl_idx):
--> 176         im,cl = self.data.dl(self.ds_type).dataset[idx]
    177         cl = int(cl)
    178         title = f'{classes[self.pred_class[idx]]}/{classes[cl]} / {self.losses[idx]:.2f} / {self.preds[idx][cl]:.2f}' if show_text else None

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py in __getitem__(self, idxs)
    647     def __getitem__(self,idxs:Union[int,np.ndarray])->'LabelList':
    648         "return a single (x, y) if `idxs` is an integer or a new `LabelList` object if `idxs` is a range."
--> 649         idxs = try_int(idxs)
    650         if isinstance(idxs, Integral):
    651             if self.item is None: x,y = self.x[idxs],self.y[idxs]

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/torch_core.py in try_int(o)
    365     "Try to convert `o` to int, default to `o` if not possible."
    366     # NB: single-item rank-1 array/tensor can be converted to int, but we don't want to do this
--> 367     if isinstance(o, (np.ndarray,Tensor)): return o if o.ndim else int(o)
    368     if isinstance(o, collections.Sized) or getattr(o,'__array_interface__',False): return o
    369     try: return int(o)

AttributeError: 'Tensor' object has no attribute 'ndim'

kdorichev · December 29, 2019, 4:34pm

This was fixed in PyTorch 1.2. Please upgrade PyTorch.
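
For anyone who wants to check whether their install is affected, here is a minimal sketch of a version comparison. The helper names `parse_version` and `needs_upgrade` are made up for illustration and are not part of fastai or PyTorch; the 1.2 threshold is taken from the reply above.

```python
import re

def parse_version(version: str) -> tuple:
    """Turn '1.0.0' or '1.3.1+cu100' into a tuple of ints such as (1, 3, 1)."""
    # Keep only the leading dotted-number part and ignore suffixes like '+cu100'.
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))

def needs_upgrade(installed: str, minimum: str = "1.2") -> bool:
    """Return True when `installed` is older than `minimum` (hypothetical helper)."""
    return parse_version(installed) < parse_version(minimum)

print(needs_upgrade("1.0.0"))  # the version reported on the free Gradient notebook
print(needs_upgrade("1.3.1"))  # the version reported on Colab
```

In a notebook you would pass `torch.__version__` as `installed`.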

ameerkat · December 29, 2019, 8:12pm

I am trying to apply the resnet34 model from lesson one to the old Kaggle competition for humpback whale tail identification. Since I don’t have a lot of background here, I was wondering about a few things:

There are 4000+ classes as each whale is individually labeled. Would the resnet network be suited for that large a number of classes? Does the number of classes make a difference here? When you have a large number of classes what kind of things would you do differently compared to having 30 or so like the pet example?
All unidentified whales are labeled as “new_whale”. It seems like having such a class would cause issues as all the new whales are not necessarily related and would cause the class to become over eager. Should I throw out the new whales or is there a way to effectively use this data?

Thanks for any help!

Tom3 · December 29, 2019, 8:41pm

I’ve encountered exactly the same problem on Paperspace/Gradient with a Free-GPU notebook, but so far I have not found a way to upgrade PyTorch from version 1.0.0 (“quota exceeded…”). Thanks for any help!

aquietlife · December 30, 2019, 4:12am

I had the same problem, specifically on one of Paperspace’s Core Compute machines (not just a Gradient notebook).

The short story is, upgrade PyTorch specifically like this:

conda install pytorch torchvision cudatoolkit=10.0 -c pytorch

The long story is, if you just try to upgrade PyTorch, it will also pull in a newer CUDA toolkit that is incompatible with the NVIDIA driver on their machines, which stops the GPUs from being used in your notebooks. (I was getting this error when trying to debug why training was going extremely slowly: The NVIDIA driver on your system is too old (found version 9010))

You can upgrade PyTorch, but make sure to flag it to use cudatoolkit=10.0, not 10.1.
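
After the upgrade, a quick sanity check can confirm that the GPU is still visible. This is a minimal sketch: the function name `describe_torch_env` is an illustrative assumption, and it only relies on `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`.

```python
def describe_torch_env() -> dict:
    """Collect the PyTorch version, its CUDA toolkit version, and GPU visibility."""
    try:
        import torch  # imported lazily so the function also runs where torch is missing
    except ImportError:
        return {"torch": None, "cuda_toolkit": None, "gpu_available": False}
    return {
        "torch": torch.__version__,
        "cuda_toolkit": torch.version.cuda,  # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),
    }

info = describe_torch_env()
print(info)
```

If `gpu_available` comes back as `False` on a GPU machine, training falls back to the CPU, which would match the very slow training described above.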

Roszko · December 30, 2019, 9:51am

Thx, sounds like a solution. On free Paperspace/Gradient, however, I cannot install anything as it exceeds the 500 MB quota. Still working on it, but I appreciate your help.

UPDATE: Tom3 - if you choose a lower-tier free notebook on Paperspace/Gradient (I took P4000), you can run aquietlife’s command (it seems they might have a different storage quota):
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
(from: Lesson 1 Discussion ✅)
Then everything runs smoothly. Problem solved.

aquietlife - thanks once again

Roszko · December 30, 2019, 9:52am

We should probably let someone know about this - the defaults from the tutorial are not working. The installation package from the tutorial should be modified, I guess: https://course.fast.ai/start_gradient.html.

Does anyone know who to contact about this?

kdorichev · December 30, 2019, 12:47pm

As an alternative, you may want to try Google Colab. It has fairly new fastai and pytorch installed:

import fastai
fastai.__version__
'1.0.59'
import torch
torch.__version__
'1.3.1'

aquietlife · December 30, 2019, 1:37pm

Glad it helped!!

Tom3 · December 31, 2019, 2:49pm

I contacted the support of Paperspace. Waiting for an answer. Thanks to all for help!

Tom3 · December 31, 2019, 2:50pm

Thanks a lot! Will try.

dtpdx · December 31, 2019, 8:03pm

Hi Jamesnixon94. I am a newbie myself and just got a dataset trained. Here’s what I did. It was a somewhat painful process of clicking around to figure it all out…

I uploaded a tarball of my images, then I opened up a terminal and ran the ‘tar -xvzf yourfilename.tgz’ command to decompress it.
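
The same extraction can also be done from a notebook cell with Python's standard `tarfile` module. This is only a sketch: `extract_tarball` is a hypothetical helper name, and the file and folder names in the usage comment are placeholders.

```python
import tarfile
from pathlib import Path

def extract_tarball(archive: Path, dest: Path) -> list:
    """Extract a gzip-compressed tarball into `dest` and return the member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)  # only extract archives you trust
    return names

# Example call (placeholder paths):
# extract_tarball(Path("yourfilename.tgz"), Path("data/images"))
```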

These instructions assume some familiarity with navigating directories from the command line. It’s also not trivial to get all your images into a directory in the assumed label_imagenumber format…
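
One way to get to the label_imagenumber naming is sketched below. It assumes the images start out in one folder per label (for example `raw/cat/` and `raw/dog/`); that layout, the helper name `flatten_to_label_names` and the list of suffixes are assumptions for illustration, not something the lesson prescribes.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def flatten_to_label_names(src: Path, dest: Path) -> list:
    """Copy src/<label>/<name>.<ext> to dest/<label>_<n>.<ext> and return the new names."""
    dest.mkdir(parents=True, exist_ok=True)
    created = []
    for label_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        # Skip anything that is not an image, such as notes or hidden files.
        images = sorted(f for f in label_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES)
        for number, image in enumerate(images, start=1):
            target = dest / f"{label_dir.name}_{number}{image.suffix.lower()}"
            shutil.copy2(image, target)  # copy, so the originals stay untouched
            created.append(target.name)
    return created

# Example call (placeholder paths):
# flatten_to_label_names(Path("raw"), Path("images"))
```

Copying rather than moving keeps the original folders intact in case the naming needs another pass.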

manoj.dmc · January 4, 2020, 11:58am

How exactly do we interpret train_loss, valid_loss and error_rate? Is there a link where I can find definitions for these?

moulika · January 5, 2020, 7:30pm

I am using Jupyter notebooks and haven’t set up a GPU. In lesson 1, when I run learn.fit_one_cycle(4), it keeps loading forever. Any input on this?

kdorichev · January 6, 2020, 11:06am

For performance, you do need a GPU. There are quite a few online services that provide GPUs.
The easiest free one to start with that I have found is Google Colab.

Find more in Course Documents, refer to “Returning to work” section.
Welcome, @moulika, and good luck!

aanghosh17 · January 6, 2020, 12:49pm

Apart from the accuracy of the model, I don’t think you will have any issues with using the resnet34 model.
I did a cursory search for competition notebooks that used fastai.
Here is a Kaggle notebook using resnet50
and here is one that uses resnet18
Assuming you remove the class ‘new_whale’, you will have all real samples of new whales spread out and misidentified as one of the other 3k/4k+ whales. I can’t immediately tell how the accuracy of your model will change based on this decision, but I suppose you can experiment and make a decision.
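
To run that experiment, it helps to build both variants of the label list and look at how the class counts change. The sketch below uses a handful of made-up (filename, label) rows, not the real competition data, and `drop_label` is a hypothetical helper.

```python
from collections import Counter

# Toy rows standing in for the competition's training labels (invented values).
rows = [
    ("img_001.jpg", "w_aaa"),
    ("img_002.jpg", "new_whale"),
    ("img_003.jpg", "w_bbb"),
    ("img_004.jpg", "new_whale"),
    ("img_005.jpg", "w_aaa"),
    ("img_006.jpg", "new_whale"),
]

def drop_label(rows, label="new_whale"):
    """Return the rows whose label differs from `label`."""
    return [(name, lab) for name, lab in rows if lab != label]

with_new = Counter(lab for _, lab in rows)
without_new = Counter(lab for _, lab in drop_label(rows))

print(with_new.most_common(1))          # the catch-all class is the largest one here
print(len(with_new), len(without_new))  # number of classes in each variant
```

Training once on each variant and comparing validation results would show which choice works better for this data.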

dtpdx · January 7, 2020, 4:11am

I believe train_loss is the value of the loss function on the training data set and valid_loss is the value of the loss function on the validation data set. I’m not exactly sure about error_rate, other than the obvious: the rate at which the model is wrong, i.e. the number of wrong predictions divided by the number of data points (here, the number of images). I’m a newbie, so someone else please chime in!
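
To make those definitions concrete, here is a small worked example in plain Python. It assumes the loss is cross-entropy, which is the usual choice for classification (I have not checked what the lesson notebook uses), and the probabilities are made up. As far as I know, fastai computes error_rate on the validation set as 1 minus accuracy.

```python
import math

def cross_entropy(probs, targets):
    """Mean negative log-probability assigned to the correct class."""
    return sum(-math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

def error_rate(probs, targets):
    """Fraction of items whose highest-probability class is not the target."""
    wrong = sum(1 for p, t in zip(probs, targets) if p.index(max(p)) != t)
    return wrong / len(targets)

# Made-up predicted probabilities for 4 images over 3 classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]
targets = [0, 1, 2, 1]  # the last image is misclassified (predicted 0, actual 1)

print(error_rate(probs, targets))  # 1 wrong out of 4 -> 0.25
print(round(cross_entropy(probs, targets), 3))
```

The loss looks at how confident the model was in the right answer, while error_rate only counts whether the top guess was right, so the two can move differently from epoch to epoch.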

moulika · January 8, 2020, 8:28am

Thanks Konstantin