Lesson 2 official topic

Hey everyone. I have just completed the lesson and trained my own model. I had some problems, because some of the code in the notebook doesn't seem to work anymore and I had to implement those parts differently than presented in the lecture.
I've written a blog post about it. Maybe it will help someone.

Hi,

Here’s my app for classifying hand signs.

I don't get why we have to define the is_cat function. I mean, I read the Learner.export documentation and it states that external functions are not exported, and I understand that. But, if I'm not mistaken, is_cat is supposed to be used to label images at training/validation time, not at inference time… so why do we need it when creating the Gradio web app?

I think it's because the model that we exported contains a reference to our dataloaders, which in turn references our label_func. If we try to run the web app with code that references something that doesn't exist, it will crash. My guess as to why it still references our dataloaders is to let us keep training it later with the same setup.
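For what it's worth, here is a minimal sketch of what the Gradio app ends up looking like (model.pkl and the label names are placeholders I made up, and classify_image is an arbitrary name): is_cat is never called at inference time, but load_learner cannot unpickle the exported Learner unless a function with that name exists in the module.

from fastai.vision.all import *
import gradio as gr

def is_cat(x): return x[0].isupper()        # same labelling rule used at training time

learn = load_learner('model.pkl')           # placeholder file name for the exported Learner
categories = ('Dog', 'Cat')                 # dls.vocab is (False, True), so map to readable names

def classify_image(img):
    pred, idx, probs = learn.predict(img)   # probs line up with learn.dls.vocab
    return dict(zip(categories, map(float, probs)))

gr.Interface(fn=classify_image, inputs=gr.Image(), outputs=gr.Label()).launch()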


Hi,
Here's my extended version of the bear classifier app:

I added red and regular pandas + polar bears to complicate the task.

Key takeaways:

  • There was an initial hiccup when differentiating between pandas and red pandas. Data cleanup worked perfectly. Gotcha: after applying changes to your images via the cleaner, you need to rerun the cells that read and load images into the dataloaders for the changes to be reflected in the next fine-tune (see the first sketch after this list).
  • T4 GPU is so much faster than CPU, highly highly recommend
  • Was stuck for quite a while because the categories defined in the Jupyter notebook seemed to be out of order, i.e. loading a black bear showed 100% for grizzly while I knew the model was correct. It turned out the actual sequence of categories is taken from the directory order under the parent bears/ dir, and not from the array we feed to the image loader. Figured it out through trial and error (see the second sketch after this list).
  • Gradio is cool. Local setup is relatively easy. A couple of gotchas (I did not know that Gradio's hot-reload mode expects the app object to be named “demo” by default, which was stopping hot reloading). I essentially had a fully working app locally, and uploaded to HF just to experiment and share with this group.
  • Had an issue with PyTorch/fastai/NumPy/etc. compatibility. I used Poetry by default, then switched to plain pip; nothing helped for a proper local setup. Finally tried Miniconda (never used it before) and voilà, everything works out of the box. So I guess I'm staying with conda.
  • Had to add git-lfs to be able to upload the ~40MB models to HF, even though their stated limit is higher. After installing LFS everything was fine; the .pkl extension is tracked by default. One gotcha is that if you already had a .pkl model in your repo, you'll need to run an additional command, I think it was
    git lfs migrate import --include="*.pkl" (please look up the “import” command).
  • To increase quality I tried adding more epochs and adding more images. It seems like adding more images works better (I had to override the default 30-image limit in the downloading function), while after 3 epochs the error rate was not dropping any more, so I just scaled the number of epochs down to 3.
  • And yeah, had to use a different image downloader, from DDG (rough sketch after this list).
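A quick sketch of the cleaner step mentioned above (this assumes the cleaner object and path variable from the notebook): after marking images in ImageClassifierCleaner you apply the changes on disk, then rerun the DataBlock/dataloaders cells so the next fine-tune actually sees them.

# apply the ImageClassifierCleaner selections on disk; assumes `cleaner` and `path` from earlier cells
import shutil
for idx in cleaner.delete(): cleaner.fns[idx].unlink()                          # delete images marked for removal
for idx, cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)  # move re-labelled images
# then rerun the cells that build the DataBlock / dataloaders before fine-tuning again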
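On the category-order gotcha: the order lives in dls.vocab (it is built from the folder names under bears/, not from your search-term list), so it is safer to read it from there. A rough sketch, assuming learn, dls and a test image img from earlier cells:

print(dls.vocab)                                      # e.g. ['black', 'grizzly', 'panda', ...] in folder order
pred, idx, probs = learn.predict(img)                 # probs line up with learn.dls.vocab
print(dict(zip(learn.dls.vocab, map(float, probs))))  # readable label -> probability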
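And roughly the shape of the DDG download I ended up with (the duckduckgo_search API differs between versions, so treat this as a sketch; the search_images helper and the image counts are just what I used):

from duckduckgo_search import DDGS
from fastai.vision.all import *

def search_images(term, max_images=100):               # this is where I overrode the default 30-image limit
    with DDGS() as ddgs:
        return [r['image'] for r in ddgs.images(term, max_results=max_images)]

dest = Path('bears')/'grizzly'
dest.mkdir(exist_ok=True, parents=True)
download_images(dest, urls=search_images('grizzly bear', max_images=150))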

Cheers
D


I will be happy if someone can send me a link to the Dogs v Cats notebook
(41:32 in the lecture).

There’s a link to the notebooks under the resource section of that lesson video.

Hello @wsaujanya,

The thread here is quite useful for troubleshooting issues with the ImageClassifierCleaner, and yes, your browser may be an issue.

What it does not mention, and what helped me when my ImageClassifierCleaner wasn't working, is that it might be a compatibility issue between the ImageClassifierCleaner module and Torch, which you can get around by installing an older version of PyTorch:

!pip uninstall -y torch torchvision torchaudio
!pip install torch==2.3.1 torchvision==0.18.1
import torch
print(torch.__version__)

2.3.1 was the latest version I found that worked. If you don't install matching versions of torch and torchvision it will warn you about their interdependency, but these two versions worked together.

Hope this helps.

any help with this issue?
error log: “Error displaying widget: model not found”

I believe I was receiving a similar error when I was running mine and installing an older version of Torch cleared it. Has doing that not worked for you?

Thanks…

Wasn’t able to clean data after running the model at all.

Will revisit the same topic once again on a different browser.

Hi all,
Posting my very basic image classifier. Very simple compared to the other classifiers posted here, but it's a start.
I got stuck trying to use Git Bash and then GitHub Desktop. I then realised I could bypass these by using the file-upload functionality in the Hugging Face Spaces GUI. Much easier. I will learn git in the future.

Wow, congratulations @oharlem, this works really well. I have a lot to learn. Would you mind sharing your source code please? Also - it sounds like you have found a switch to use a GPU - is this a setting in Colab or Kaggle?
Thanks

hey @diggooddog
thanks

> Would you mind sharing your source code please?

Sure thing. The repo is actually in the Hugging Face Files area of the app.
You can
git clone git@hf.co:spaces/oharlem/bear-classifier


>sounds like you have found a switch to use GPU - is this a setting in Colab, Kaggle?
Yep, used Colab. There you can change the runtime type to select a GPU.

Best
D.

Many thanks - sorry, I should have been clearer - I was actually after the code used to create and train the model (pre-.pkl), if you're willing to share. I couldn't see it in the Hugging Face files.

Ah, sure, try this: fastai-lesson2-bear-classifier2.ipynb · GitHub

Apologies if this was already asked and answered (I've searched through this topic to no avail). In the lesson Jeremy sets a seed for the random number generator to get the very same validation set each time, to be able to test the effect of all the levers (e.g. number of epochs or resizing method) on the final model. Does this random seed setting also affect the random augmentations being applied in the bear classifier example? In other words, are the augmentations exactly the same each time the notebook is rerun?
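For reference, this is the setup I mean, roughly as in the lesson notebook (path is the bears folder): the seed goes to RandomSplitter, while the augmentations come from RandomResizedCrop and aug_transforms.

from fastai.vision.all import *

path = Path('bears')
bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),   # the seed fixes which images form the validation set
    get_y=parent_label,
    item_tfms=RandomResizedCrop(224, min_scale=0.5),   # random crop per item
    batch_tfms=aug_transforms())                       # random flips/rotations/warps per batch
dls = bears.dataloaders(path)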


Great, thanks!

Hi! I want to share my progress in the second module: a brain tumor classifier.

Fastai brain tumor MRI (kaggle.com)

Hi, I am going through Module 2 and am confused by something.
My understanding of item_tfms is that the transformation is applied on a per-item basis, while with batch_tfms the transformation is applied on a per-batch basis.
My doubt arises from this example:

bears = bears.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2)) # if batch tfms apply per batch, shouldn't the output be the same, since we use the same image? Or are the random parameters chosen for each item and not for the whole batch?
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=8,nrows=2,unique=True)

As we are using unique=True, the same item is passed through the batch transformation. My expectation is that all items are subjected to the same transformation and the output would be the same (8 bears of the same type), but the output I got, and the one in the notebook, shows different transformations being applied to the same image.

So I thought OK, maybe the randomness is different for each item, but from a processing point of view the transformations take place in parallel.

To test this I tried the following code sample.

bears = bears.new(batch_tfms=RandomResizedCrop(128, min_scale=0.3))
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=6, nrows=1, unique=True) # unique=True means we pull n copies of the same item to see the transforms applied to it

Alas, the output is 6 bears of the same type.
Is RandomResizedCrop not random?

Can someone please help me clear up this doubt?
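For comparison, the notebook's own example puts RandomResizedCrop in item_tfms rather than batch_tfms; in the book's output that version does show different random crops of the same image. A sketch of that variant, assuming the bears DataBlock and path from above:

bears = bears.new(item_tfms=RandomResizedCrop(128, min_scale=0.3))
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=6, nrows=1, unique=True) # shows varied random crops of the same image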