Lesson 1 In-Class Discussion ✅

Haha, I guess it's an issue with Google Colab. I run mine on Paperspace and it works fine.
Anyway, the matrix is fine now, right?

Hi, I ran the Lesson 1 notebook and everything worked fine; I even ran the same code with MNIST_SAMPLE, which also worked. However, I can't create a DataBunch when I try it with the full MNIST dataset, the original one with folders from 0 to 9.

Link to notebook:
https://colab.research.google.com/drive/1I29SPsFmVrntP-4MjayCv4t1mZovS1o6#scrollTo=rZ0np9evL1CP

@akshitx10 post your data structure here. The error says it cannot detect a training set. There are only two possible causes:

  1. Your path variable does not point to the correct location.
  2. Your data is not structured in the proper format.

Type doc(ImageDataBunch.from_folder) to see the documentation. The docs will show you the proper format :slight_smile:
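For anyone following along, here is a minimal sketch of what that format looks like in practice (the path and class names below are placeholders, not the actual dataset):

```python
from fastai.vision import *

# Opens the documentation pane for the method (inside a Jupyter/Colab notebook)
doc(ImageDataBunch.from_folder)

# from_folder expects a layout roughly like this (hypothetical path):
# data/my_dataset/
#   train/
#     class_a/  ...images...
#     class_b/  ...images...
#   valid/
#     class_a/  ...images...
#     class_b/  ...images...
path = Path('data/my_dataset')                     # placeholder path
data = ImageDataBunch.from_folder(path, size=224)  # folder names become the labels
```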

@swastikmohapatra Here is how I proceeded.

https://forums.fast.ai/uploads/default/original/3X/e/7/e7bc327070cb5a06e3f21acca419e22123020bd5.png

Yup, it looks as expected now. I think I may have initially run the original document without copying it to my personal drive first, so maybe it was a multi-tenancy issue? Anyway, it works now, thanks for double checking.


Hi guys, I'm a newbie and therefore might be posting in the wrong thread - sorry if that is the case.
I registered on Paperspace Gradient, but when I run the Jupyter notebook, the files directory seems to include the course-v4 materials rather than v3. Did I do something wrong?

@Namm

It looks like when you first go to the Notebooks section in Gradient, the default 'Recommended' tab shows the V4 container.

If you switch to the ‘All Containers’ tab, V3 is available too.

Hope that helps


Solved! Thank you, @nickkb!

Hey @akshitx10

The training folder has to be named "train" and the test folder "test".

You have named it "training", which is why it cannot detect the training folder.
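Alternatively, if you would rather not rename the folders, from_folder also accepts the folder names as arguments. A rough sketch, assuming the full MNIST layout with 'training' and 'testing' folders:

```python
from fastai.vision import *

# Full MNIST from fastai's dataset URLs; it ships with 'training' and 'testing' folders
path = untar_data(URLs.MNIST)

# Point from_folder at the non-default folder names instead of renaming them
data = ImageDataBunch.from_folder(path, train='training', valid='testing', size=28)
```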

Hey everyone. For a Lesson 1 project I tried to build a classifier for breast cancer pathology slides. I used the Kaggle dataset here. I actually have two questions. One: when creating my ImageDataBunch, what should I set size to? The images are 50x50, so does that mean my size should be 2500? Two: has anyone ever seen a learning rate plot that looks like this? For reference, I just ran learn.fit_one_cycle(4) followed by learn.recorder.plot().

[learning rate plot]


Hi @chkchk12

First question:
The assumption here is that the images are all square. Jeremy mentions in the first lecture that we are dealing with square images, so when we say the size of an image is 224, it is actually 224 x 224.
Similarly, your input images are 50 x 50, so set your size to 50, or a bit less if there is noise around the edges that you want to crop out.
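As a concrete sketch of how the call might look (the path, class names, and split below are placeholders; I have not run this on that dataset):

```python
from fastai.vision import *

# Assumed layout (hypothetical): one subfolder per class under train/
# data/breast_histopathology/
#   train/
#     benign/     ...50x50 images...
#     malignant/  ...50x50 images...
path = Path('data/breast_histopathology')  # placeholder path

data = ImageDataBunch.from_folder(
    path,
    valid_pct=0.2,             # hold out 20% of the images for validation
    ds_tfms=get_transforms(),  # default augmentations
    size=50,                   # side length: the images are 50x50, so size=50, not 2500
    bs=64,
).normalize(imagenet_stats)
```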

Second question:
I am equally stumped by the learning rate plot and have never seen one like this. I guess some experts on the forum can help you out with it :slight_smile:

Regards,
Swastik

@swastikmohapatra, great, thank you so much for the answer to the first question!

I think I might have discovered the issue: I forgot to run learn.lr_find(). This is what I am getting now:
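For anyone else who hits this: the usual order is lr_find first, then the plot, then training. A sketch using MNIST_SAMPLE so it runs end to end:

```python
from fastai.vision import *

# Small example dataset so the snippet is self-contained
path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path, size=28)

learn = cnn_learner(data, models.resnet34, metrics=accuracy)

learn.lr_find()          # run the learning rate range test first...
learn.recorder.plot()    # ...then plot loss against learning rate

learn.fit_one_cycle(4)   # train after picking a learning rate from the plot
```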


I want to know what can be concluded about the model architecture and the data if lr_find produces a loss-versus-learning-rate graph like the one attached.

Questions:
I have run into some problems with my code.
I uploaded my dataset to Google Drive and mounted it in Colab.
[screenshot]
My dataset has this structure:
[screenshot]
I add the path to the data using this code:
[screenshot]
However, when I try to load the data with the model, an error occurs:
[screenshot]
What should I do to get through this?
Thanks.
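Without the screenshots it is hard to say exactly what is going wrong, but a common pattern for Colab plus Drive looks roughly like this (the dataset folder name below is just a placeholder):

```python
# Mount Google Drive inside Colab
from google.colab import drive
drive.mount('/content/drive')

from fastai.vision import *

# Placeholder path: adjust to wherever the dataset actually lives in your Drive
path = Path('/content/drive/My Drive/my_dataset')

# Images should sit in one subfolder per class; valid_pct splits off a random validation set
data = ImageDataBunch.from_folder(path, valid_pct=0.2, size=224)
```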

Hi all, I'm trying to do some practice of my own. I see a lot of examples of creating image datasets. My confusion lies in the fact that downloading or scraping without the right labelling will not work, right? We also need to label the images correctly; after all, rubbish in, rubbish out, right?

The only practical example I've seen is the one person who posted an experiment using photos from their iPhone (assuming that the facial recognition labels were correct).

Am I missing something here?


Hi Kevin

Yes, we need to correctly label the images, and this is where fastai makes our job easier. Try out the following experiment:

  1. Go to the Chrome Web Store and install the Fatkun Batch Downloader extension.
  2. Search for "cats" on Google and use the Fatkun downloader to save the images to a folder named cats.
  3. Repeat the process for dogs and keep those images in a folder named dogs.
  4. Place both the cats and dogs folders in a folder named train.
  5. This is your training set. Now you can use ImageDataBunch.from_folder and it will pick up the folder names as labels, thus correctly classifying the images (see the sketch below).
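A rough sketch of what step 5 ends up looking like in code (the path below is hypothetical):

```python
from fastai.vision import *

# Expected layout after the download steps (hypothetical path):
# data/pets/
#   train/
#     cats/  ...downloaded cat images...
#     dogs/  ...downloaded dog images...
path = Path('data/pets')

# With only a train/ folder, valid_pct splits off a random validation set;
# the folder names 'cats' and 'dogs' become the labels automatically.
data = ImageDataBunch.from_folder(path, valid_pct=0.2, size=224).normalize(imagenet_stats)

learn = cnn_learner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(4)
```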

Hope this helps :slight_smile:


Thanks. So the better we frame the search query to get "cleaner" results from Google, the better, right?

Right now, "cats" as a search query gives me images from the musical. :slight_smile:

But got it. Thanks!

Yes, the better you frame your query, the cleaner the results.
Also, I typically spend some time cleaning my downloaded images manually by checking them in a grid view on my computer. This might sound tedious, but it usually takes less than 30 minutes, and then I can be sure that my input data is clean :slight_smile:

Happy learning