Lesson 2 official topic

You can still do that for additional distros, or I think you can even start the install fresh from the Windows Store with your first distro. You can then start each from the Start menu, or you will see them added to the drop-down of different ‘terminals’ to open in Windows Terminal.
image

Quite possibly - I haven’t tried it in JupyterLab.

1 Like

I’ve annotated the image a bit, starting from the lowest point for validation loss (I’m assuming), where it seems like training loss and validation loss have diverged.

The error rate (once again, I’m assuming) also increases again towards the end. With this information, I would assume that if training continued on the same trajectory as in the image above, the model would definitely get worse overall (it’s currently overfitting, which started at epoch 9, and the metrics also got worse from epoch 18).

The learner object also has a recorder that can plot losses after each fit session. You can get a plot via learn.recorder.plot_loss() to interpret the results better. The image below is taken from Chapter 5 of the book. And as the book mentions, in the end what matters is your metrics, not really the losses.
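If you want to try it yourself, here is a minimal sketch; the dataset and model here are just the cats-vs-dogs example from earlier in the book as stand-in data, not the setup behind the plot above.

```python
from fastai.vision.all import *

# Stand-in data: the Oxford-IIIT Pets cats-vs-dogs example from the book.
path = untar_data(URLs.PETS) / 'images'

def is_cat(f): return f[0].isupper()  # in this dataset, cat filenames start with a capital letter

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(3)

# The recorder keeps the per-batch training loss and per-epoch validation
# loss from every fit, so you can plot both curves after training.
learn.recorder.plot_loss()
```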

So, I’d say that you might have to train a bit further to see the error_rate get worse with certainty before you can conclude that the model is indeed getting much worse.

2 Likes

I’m having trouble reconciling these two comments:

WSL is “through Linux”.

Now I suspected this would be the case for `which code`.

My vague memory from last week is that this remained after I uninstalled VSCode from Linux. This indicates “code” was not fully uninstalled, and it is being executed in preference to the “code.exe” located on the Windows side. I think I just deleted the file, but I’m not sure - keep a backup of it.

Yup, that file is safe to delete. If it exists, it means you’re running the WSL VSCode, which isn’t what you want.

1 Like

Has anyone had any luck setting up a rig to use fast.ai with ROCm? I’ll be attempting this by working with the official AMD documentation and looking at this topic, though it has aged somewhat: Fastai on AMD GPUs - Working dockerfile. Using Docker seems to be the best-supported way.

Yeah, it is not working in JupyterLab, which is the default when using Jarvislabs.ai. The simplest way to switch to the classic Jupyter Notebook interface is to replace the word lab with tree in the URL.

I will explore whether it can be run in JupyterLab.

1 Like

Great! Let us know what you find. It’s just using regular ipywidgets.

2 Likes

Question:

Hi all, whenever we run fine_tune on a particular pre-trained model we get this output.

What does the highlighted row represent? How is it different from the epoch 0 row below it?

image

1 Like

So basically you have a pretrained model, right? This is a model trained on a very large image classification dataset, almost always ImageNet. The idea behind fine-tuning is that many features learned to classify on ImageNet will transfer well to other datasets. So you simply have to fine-tune the pretrained model for it to do well on the target dataset.

The process fastai follows is that we first replace the last few layers of the model and train them from scratch. This is because those last few layers are specific to performing well on the ImageNet dataset (for example, the last layer outputs 1000 values for the 1000 classes in ImageNet, but your dataset will have a different number of classes). So we train just those last few layers, in this case for one epoch. That is where the first “epoch 0” is coming from. Next, we “unfreeze” the whole model and continue training all of it, but you don’t have to train much because the features are already pretty close; that’s what the remaining epochs 0-2 are for.
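For reference, here is a simplified sketch of what fine_tune does internally (the real implementation also picks a base learning rate and applies discriminative learning rates with one-cycle scheduling, so treat this as an illustration of where the two tables come from, not the exact fastai source):

```python
# Simplified illustration of learn.fine_tune(epochs) -- not the exact fastai source.
def fine_tune_sketch(learn, epochs, freeze_epochs=1):
    learn.freeze()                      # only the newly added head is trainable
    learn.fit_one_cycle(freeze_epochs)  # -> the first "epoch 0" table
    learn.unfreeze()                    # make the whole network trainable
    learn.fit_one_cycle(epochs)         # -> the second table (epochs 0 to epochs-1)
```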

Does this clarify stuff a little bit? A lot of this will be discussed in more detail later in the course and the book.

8 Likes

Thanks, looking forward to more new things!

1 Like

Really thankful for detailed explanations like this, and for pointing out where in the book it’s covered as well :pray:.

If you look at the Kaggle notebook, the metrics stop improving after around 20-30 epochs. So it’s indeed overfitting.

2 Likes

Thanks, I didn’t know that existed for Jupyter notebooks too :slight_smile:.

I noticed jupyter-contrib-nbextensions also doesn’t work in JupyterLab for collapsible headings.

Hi,
After git pushing the app.py file to Spaces, I am running into a runtime error:

Traceback (most recent call last):
  File "app.py", line 6, in <module>
    from fastai.vision.all import *
ModuleNotFoundError: No module named 'fastai'

Did I miss any steps?

Oh, I missed requirements.txt.
What should this file contain?
I only have fastai == 2.5.2 in this file. Is anything else needed?

It works now. All I had to do was update the JupyterLab version from 2.3 to 3.4. Starting from version 3, ipywidgets are supported in JupyterLab :blush:.

@kurianbenoy Now both the widgets and collapsible headings work in JupyterLab itself.

4 Likes

Wow, that’s great. Thanks for updating :slight_smile:

1 Like

Thanks to @kurianbenoy, @suvash, @brismith, @mike.moloch, and @tapashettisr for your guesses!!! I’m really grateful to be part of such an engaged and thoughtful community.

My own guess was probably closest to (a) originally: I imagined that the things the learner had picked up to indicate dogs and snakes would be absent in houses, and therefore those two classes would be relatively suppressed. I find @suvash’s suggestion most compelling: that one of the easiest and most beneficial heuristics for the model to learn was effectively “Never guess ‘OTHER’!”

In the end, I am delighted to share the results of the experiment (run in Colab), which converged, but not on any of the exact options I provided!

I ran fine_tune(20), and here are the first few rows to show that the loss and error rate decrease over training.

epoch train_loss valid_loss error_rate time
0 1.520274 1.346103 0.547945 00:21
epoch train_loss valid_loss error_rate time
0 0.339380 0.362611 0.123288 00:21
1 0.287089 0.089451 0.027397 00:20
2 0.198841 0.015010 0.000000 00:20
3 0.146187 0.003257 0.000000 00:21
4 0.112802 0.001330 0.000000 00:21
5 0.090765 0.000901 0.000000 00:21

I then tested this trained model on 200 images of houses (thanks, DuckDuckGo!), and took the mean of the 200 prediction tensors to see which category [“OTHER”, “Dog”, “Snake”] the inferences tended towards:

OTHER Dog Snake
1.14% 80.50% 18.35%
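In case it’s useful, here’s roughly how that averaging can be done; the folder name and variables below are illustrative, not my exact code:

```python
from fastai.vision.all import *

# Assumes `learn` is the trained OTHER/Dog/Snake learner and the 200
# downloaded house images live in a local folder called 'houses'.
house_files = get_image_files('houses')
dl = learn.dls.test_dl(house_files)

probs, _ = learn.get_preds(dl=dl)  # one probability tensor per image
mean_probs = probs.mean(dim=0)     # average over all 200 images

for label, p in zip(learn.dls.vocab, mean_probs):
    print(f'{label}: {float(p):.2%}')
```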

So I think that, in the end, the prize has to go to @brismith for best answer (though points to anyone who guessed something like ‘c’)!
In the spirit of his answer, I’ll note that, in retrospect, it seems like there may be significantly more overlap between dogs and houses than between snakes and houses, simply by nature of where these animals are most often photographed. This seems like a great example of unexpected bias in the dataset. Perhaps it would have been more prudent to test on 200 images of grayscale static instead?

Finally, I’m curious whether anyone can come up with a solution that accomplishes this goal: dog images get correctly labeled, snake images get correctly labeled, but every other type of image gets labeled “other”, whether it is a picture of an airplane, a shark, a Jackson Pollock, or white noise.
Or is it impossible?

Thanks again everyone!

5 Likes

And a quick update: the same fine_tune(20) model, evaluated on 200 ‘static noise’ images:

OTHER Dog Snake
0.65% 8.88% 90.47%

So, you can color me confused! I did not expect to see such a drastic asymmetry between Dog and Snake. What’s your best guess? Is it perhaps that the training dataset contains a black-and-white picture of a snake, but none of a dog? Or is it that snakes are photographed in noisier environments? I realize this is hard for anyone to guess without access to the specific photos I downloaded from DDG, but I just wanted to share.

2 Likes