Lesson 2 official topic

Reworking lesson 2, let me share my learnings from putting my Bear Detector on HuggingFace

  • Interestingly, some things had changed a little in the meantime: the gradio API and working with nbdev
  • Working on my local machine, I had to install gradio, nbdev and Git Large File Storage (LFS)

For the complete summary and the source code, check out my GitHub.

3 Likes

See…
https://medium.com/mlearning-ai/approaching-multi-label-image-classification-using-fastai-515a4fd52c8c

but note that cnn_learner() is now named vision_learner()
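
For reference, here is a minimal sketch of the renamed call, loosely based on the book's cat/dog example; resnet18, the PETS dataset, and the hyperparameters are just illustrative choices, not the article's code:

from fastai.vision.all import *

def is_cat(x): return x[0].isupper()   # label by filename convention, as in the book

path = untar_data(URLs.PETS)/'images'
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(192))
learn = vision_learner(dls, resnet18, metrics=error_rate)   # formerly cnn_learner(...)
learn.fine_tune(1)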

1 Like

What is the difference between the search_images function used in ch. 1 of the book and search_images_bing, which is used on p. 66 of the book?
Why is it necessary to use the Bing function when search_images seemingly worked fine?

Can’t seem to find any explanation in the documentation

You can just use whichever works; it doesn’t matter. I believe the current recommendation is to use search_images_ddg, which uses the DuckDuckGo search engine.
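
For example, a hedged sketch using the fastbook helper (installed with pip install fastbook; the exact keyword arguments may differ between versions):

from fastbook import search_images_ddg
urls = search_images_ddg('grizzly bear')   # returns a list of image URLs from DuckDuckGo
print(len(urls), urls[0])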

1 Like

Ha ha. Yes, after lesson 1, I heard about that cat/dog and thought it would be fun to use an image of it to see what the Cat/Dog classifier model thought. Turns out the model predicted cat - with a high degree of confidence! :smile:

I guess that’s to be expected given the confusion it caused in the human community.

Was strange watching Lesson 2 and seeing @jeremy discuss the very same dog!

Thank you for the link, quite helpful.

1 Like

In the Lesson 2: Practical Deep Learning for Coders 2022 video, when Jeremy shows the “Dogs v Cats” app.ipynb notebook, he says that “def is_cat(x): return x[0].isupper()” needs to be included, but I don’t get why. As I understand it, we used this function to label the training data; if so, why would we need it now, since our model is already trained?

The learner you saved contains the transform pipeline used for training, which uses the is_cat function for labeling. When you load that learner, pickle (that’s the library used for saving/loading) doesn’t know which parts you are going to need and which you won’t, and it fails if something it expects isn’t available.
You could make an effort and remove everything that uses is_cat from the learner before you save it, but it’s easier to just redefine the labeling function at inference :slight_smile:.
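
A minimal sketch of the inference side ('export.pkl' and 'some_image.jpg' are placeholder names, assuming you saved the model with learn.export()):

from fastai.vision.all import *

def is_cat(x): return x[0].isupper()   # same labeling rule used at training time;
                                       # without it, load_learner fails because pickle can't resolve is_cat

learn = load_learner('export.pkl')                           # now loads cleanly
pred, idx, probs = learn.predict(PILImage.create('some_image.jpg'))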

3 Likes

Hello there :wave:

I was wondering: once you use the ImageClassifierCleaner object to remove or relabel wrongly classified images, the learn variable you export at the end takes these changes into account, right?

Thanks :wink:

thank you!! :slight_smile:

1 Like

Hey @ThomasAlibert,
Well, not quite. I’ll try not to go too deep; just say if you need more details :slight_smile:.

The thingy in a learner that holds the data is the DataLoaders object at learn.dls. It consists of data sources (the paths to your files, for example) and the transforms to get from the source to a preprocessed tensor (x) as well as the targets (y). When you save the learner with learn.export() it empties the data source, leaving only the transforms, so your exported learner has no idea about the data it trained on before (besides the trained weights it learned from it, obviously).

What happens with the ImageClassifierCleaner is that you extract lists of filenames that should be deleted or relabeled, and in the cell below you actually delete those files from your system / copy them to another path. Then, the next time you create your DataLoaders, those files are either gone or reside at a path the loader can derive the correct labels from.
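
Roughly, the cleanup cells from the book look like this (assuming learn is your trained learner and path is the root folder of your image dataset):

import shutil
from fastai.vision.widgets import ImageClassifierCleaner

cleaner = ImageClassifierCleaner(learn)
cleaner   # interact with the dropdowns in the notebook, then run:

for idx in cleaner.delete(): cleaner.fns[idx].unlink()                           # remove mislabeled files
for idx, cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)   # move to the correct class folder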

Also when I tried this I had to run the cell to delete/change the instances each time before I changed one of the label or split options in the two dropdowns, otherwise the lists got reset. I don’t know if that is intended, but that’s something you should be aware of. (The first time I wasn’t and had to do it a second time :laughing:)

Hope this helps!

2 Likes

Helloo again, sooo
When I run the first cell in the notebook:

#|export
from fastai.vision.all import *
import gradio as gr
def is_cat(x): return x[0].isupper()

I get the following:

home/dim/mambaforge/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libc10_cuda.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")

The code still runs, as far as I understand but do you know if it’s an issue with my pc or…?

Hey,
I had the same issue and fixed it by downgrading torchvision.

mamba install torchvision==0.13

The version that raised the warning was 0.13.1; 0.13.0 works without complaints :slight_smile: Also, this pytorch discussion might be helpful.
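
If you want to double-check which version you have before downgrading:

import torchvision
print(torchvision.__version__)   # 0.13.1 raised the warning here, 0.13.0 did not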

1 Like

Lecture 2 Summary
Google Doc View

2 Likes

Thanks a lot for your reply Ben :wink:

I think I get it.

Once you’ve gotten rid of the bad pictures or changed their labels, you have to run the DataLoaders, data augmentation, and fine-tuning cells again before exporting.

Is that correct?

1 Like

I managed to solve an error that caused me a bit of pain. Here is the solution in case someone’s stuck as I was.

From my understanding, nbdev has been upgraded since the lesson was recorded.

Instead of from nbdev.export import notebook2script we now have to use from nbdev import nbdev_export.

But now I still have another error :thinking:

I tried to add a cell with #|default_exp core at the top without understanding what it does.
It solved the message from the first screenshot but not the second.

Does anyone know what it’s all about? Thanks :slightly_smiling_face:

2 Likes

Yes you should do it again.
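
Roughly, the cells to re-run look like this, assuming a DataBlock named bears as in the book’s bear-classifier chapter:

bears = bears.new(item_tfms=RandomResizedCrop(128, min_scale=0.3), batch_tfms=aug_transforms())
dls = bears.dataloaders(path)                                # rebuild DataLoaders from the cleaned files
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
learn.export('export.pkl')                                   # re-export the retrained model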

2 Likes

Thank you again!

1 Like

In the end you shouldn’t use either:

from nbdev.export import notebook2script
or
from nbdev import nbdev_export.

But instead this worked for me :sweat_smile::

from nbdev.export import nb_export
nb_export('THE_NAME_OF_YOUR_NOTEBOOK.ipynb', '.')
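
Putting the pieces together, a sketch of how that fits with the export directive (nbdev 2.x; 'app.ipynb' is a placeholder notebook name):

# in the notebook's first cell, this directive names the target module (app.py):
#|default_exp app

# then export all #|export cells into that module:
from nbdev.export import nb_export
nb_export('app.ipynb', '.')    # '.' is the output folder
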
3 Likes

Does anyone know if the ‘Dogs v Cats’ app.ipynb file that Jeremy uses in the video is available anywhere? I’d like to replicate what he did.