Lesson 1 In-Class Discussion ✅

  1. get_transforms() is imported from here - https://docs.fast.ai/vision.transform.html#get_transforms
  2. an argument is optional if it is assigned a default value, as seen here:

from_name_re(path:PathOrStr, fnames:FilePathList, pat:str, valid_pct:float=0.2, **kwargs)

valid_pct has a default value (0.2) assigned to it

  1. Check out what **kwargs are here - *args and **kwargs — Python Tips 0.1 documentation

  2. You’re probably gonna need to understand Python well
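The two points above can be sketched in plain Python; from_name_re_demo below is a hypothetical stand-in for fastai's from_name_re, just to show how a default value and **kwargs behave:

```python
# Hypothetical stand-in for from_name_re, illustrating default
# arguments and **kwargs (the real fastai function does much more).
def from_name_re_demo(path, fnames, pat, valid_pct=0.2, **kwargs):
    # valid_pct falls back to 0.2 when the caller omits it; any extra
    # keyword arguments are collected into the kwargs dict.
    return valid_pct, kwargs

# size and bs are not named parameters, so they land in **kwargs:
result = from_name_re_demo('.', [], r'(.+)_\d+.jpg$', size=224, bs=64)
print(result)  # (0.2, {'size': 224, 'bs': 64})
```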

I am playing around with Caltech Birds Dataset.

At first I tried a copy-paste approach. It wasn’t good enough, so I made bigger classes based on general bird titles and used stratification (so every class is present in both the train and validation datasets at a given proportion); maybe ImageDataBunch does this itself, I don’t know. I used my own script for it.
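For reference, a minimal stratified split can be done in plain Python. This is a sketch of the idea, not the poster's actual script, and as far as I know ImageDataBunch's default random split is not stratified:

```python
import random

def stratified_split(filenames_by_class, valid_pct=0.2, seed=42):
    """Shuffle and split each class separately, so every class shows up
    in both train and validation at roughly the given proportion."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls, files in filenames_by_class.items():
        files = list(files)
        rng.shuffle(files)
        # Keep at least one validation item per class.
        n_valid = max(1, int(len(files) * valid_pct))
        valid += [(cls, f) for f in files[:n_valid]]
        train += [(cls, f) for f in files[n_valid:]]
    return train, valid
```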

Now I want to explore the bounding_boxes.txt annotations for better classification. I cropped the images with another script, and it works perfectly in my local tests.
But when I look at my dataset with show_batch() I get this:


I assume there are three possible causes:

  1. My cropping is incorrect
  2. The ImageDataBunch constructor does some additional cropping on my images
  3. The show_batch() method shows only a region of each image, not the full image (and all images in the dataset are correct)

Any ideas what is going on and how can I check it?
ImageDataBunch creation code

local_data = ImageDataBunch.from_folder(Path('./images'), ds_tfms=get_transforms(), valid='validation', size=299, bs=32).normalize(imagenet_stats)
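Two things worth knowing here: size=299 makes fastai crop/resize every image to 299x299, and get_transforms() adds random crops and flips, so show_batch() showing partial views does not necessarily mean the files on disk are wrong. To rule out cause 1, you can sanity-check the crop boxes themselves; the sketch below assumes the CUB-200 bounding_boxes.txt line format of "<image_id> <x> <y> <width> <height>" (verify against your copy of the file):

```python
def parse_bbox_line(line):
    """Turn one bounding_boxes.txt line into a PIL-style crop box
    (left, upper, right, lower)."""
    image_id, x, y, w, h = line.split()
    x, y, w, h = map(float, (x, y, w, h))
    return int(image_id), (x, y, x + w, y + h)

def box_inside(box, img_w, img_h):
    """A crop box that spills outside the image usually means a bug
    in the cropping script."""
    left, upper, right, lower = box
    return 0 <= left < right <= img_w and 0 <= upper < lower <= img_h

print(parse_bbox_line("1 60.0 27.0 325.0 304.0"))
# (1, (60.0, 27.0, 385.0, 331.0))
```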

EDIT
The environment I was having these issues on was Gradient. I’m still not sure why, but I tested the notebook on Kaggle and did not encounter any errors.


Original Post

Can anyone guide me here? I’m working my way through lesson one and have hit a roadblock at the Other Data Formats section.

Specifically, I’m getting this error message when I run the ImageDataBunch.from_folder command:

tfms = get_transforms(do_flip=False)

data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=26)

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py:537: UserWarning: You are labelling your items with CategoryList.
Your valid set contained the following unknown labels, the corresponding items have been discarded.
7
  if getattr(ds, 'warn', False): warn(ds.warn)

Which leads (I think) to a whole bunch of errors later:

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/basic_data.py:262: 
UserWarning: There seems to be something wrong with your dataset, for example, in the first batch can't access these elements in self.train_ds: 9023,4005,7043,7988,5679...
 warn(warn_msg)
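With from_folder, labels come from the subdirectory names, so the warning above usually means some class folder (here "7") exists under the validation folder but not under the training folder, and its items get discarded. A quick stdlib check (directory names are illustrative):

```python
from pathlib import Path

def labels_missing_from_train(path, train='train', valid='valid'):
    """Return class-folder names present under valid/ but absent from
    train/ - exactly the labels fastai warns about and discards."""
    path = Path(path)
    train_classes = {p.name for p in (path/train).iterdir() if p.is_dir()}
    valid_classes = {p.name for p in (path/valid).iterdir() if p.is_dir()}
    return valid_classes - train_classes
```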

How can I check what features the intermediate layers of a particular model are learning? And how can we display them in the notebook by highlighting those features in the image?
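In PyTorch this is usually done with forward hooks (fastai v1 wraps them in hook_output/hook_outputs), capturing a layer's activations as an image passes through so you can plot them. The hook pattern itself is simple; below is a torch-free sketch where Layer is a stand-in for an nn.Module:

```python
class Layer:
    """Minimal stand-in for nn.Module, supporting forward hooks."""
    def __init__(self, name):
        self.name, self.hooks = name, []

    def register_forward_hook(self, fn):
        self.hooks.append(fn)

    def __call__(self, x):
        out = [v * 2 for v in x]        # stand-in for the real computation
        for fn in self.hooks:
            fn(self, x, out)            # hooks see (module, input, output)
        return out

# Stash each layer's output so it can be inspected (or plotted) later.
activations = {}
conv1 = Layer('conv1')
conv1.register_forward_hook(lambda m, i, o: activations.__setitem__(m.name, o))
conv1([1, 2, 3])
print(activations)  # {'conv1': [2, 4, 6]}
```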

Can anyone clear up this doubt, please?
When we downloaded the pets dataset, we had a separate folder for annotations. Why then did we use a regex to get the label names when we could have gotten them from the annotations folder instead? Please help me with this question.

After running the notebook a couple of times in Google Colab, I find myself stuck at this point.
What I have tried so far:

  1. Reset and try again
  2. Shutdown the notebook and load the example again

Hi, I am new to this course and forum. Stupid question, but how does one search for relevant threads? I don’t see a search icon.

Hi Sdash, I hope you are having a fantastic day!

If you are logged in you should see the page above!


If you click the first icon above, you should be able to enter what you are searching for.

Cheers mrfabulous1 :smiley: :smiley:

Hi, I am new to fastai and deep learning. Can someone please help me understand the problem with using a different RegEx in ImageDataBunch.from_name_re.

I have set pat = r'(?:.)/(.).(?:.*)'
Using this regex I am getting the image annotations right, but data.classes is returning incorrect classes - e.g. Abyssinian_1 and Abyssinian_2 are being returned as different classes.

Please help me understand why pat = r'(?:.)/(.).(?:.)' works but pat = r'(?:.)/(.).(?:.)' doesn't.
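It is hard to read the two patterns above (the forum's formatting tends to strip asterisks and backslashes), but the usual cause of Abyssinian_1 and Abyssinian_2 appearing as separate classes is a capture group that swallows the trailing index. Using lesson 1's own pattern for contrast:

```python
import re

fname = '/images/Abyssinian_1.jpg'

# A group that runs all the way to the extension captures the index
# digits too, so every numbered file becomes its own class:
print(re.search(r'/([^/]+)\.jpg$', fname).group(1))      # Abyssinian_1

# Lesson 1's pattern stops the group before the trailing "_<digits>":
print(re.search(r'/([^/]+)_\d+.jpg$', fname).group(1))   # Abyssinian
```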

Hello everyone. I’ve just started trying to learn deep learning here on fast.ai and have been stuck at the very beginning (trying to create an image data set from google) for the past week now. This is my code, with the error I am receiving at the bottom ( and the error is repeated multiple times going further down the notebook).


I was told that my urls are Google cache images for deleted websites, but I am using the same code to download the images (
urls=Array.from(document.querySelectorAll('.rg_i')).map(el=> el.hasAttribute('data-src')?el.getAttribute('data-src'):el.getAttribute('data-iurl'));
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));
) as shown in the official walkthrough. Even when filtering my search results to show only images from the last 24 hours, I get all URLs with the same encryption. I am completely lost and stuck. Any help would be greatly appreciated. Thank you.


Actually, I think I may have figured out what my problem was. My max_pics was set to 200 but I only have 80 images. Once I changed it to 80, I stopped receiving the error. I am posting this in case anybody runs into the same problem, because I couldn’t find this solution anywhere.


In lesson2-download.ipynb it is said that you should not run the ImageCleaner method in Colab. I would like to let anyone who just started lesson one know that whatever issue existed when the notebook was made is no longer there. I ran ImageCleaner on Colab without any problem. You just need to remember to open another cell and run it maybe every 5 minutes so that your session does not time out; for me, I had a cell where I kept running a=5 at least every 5 minutes to keep the session alive.


Can anyone please point out what I am doing wrong in this?

This might be of help

It’s always good practice to check the forum for your question, as it might already have been answered - that would save you time :slight_smile:

Thank you so much for the help!

I had to import the v3 of the course (into the directory on Gradient) and encountered this problem afterwards when saving the model:

It seems the system can’t edit the file system of the v3 course directory, as shown when I tried to add a folder to the directory:

Does anyone know how to import a writable v3 file system or work around this problem?

I figured this out. It was just a strange error that got solved when I rebooted a bunch of programs.

For the image classification of dogs and cats in Lesson 1, the filenames in the path_img folder are used to get the labels of the pictures. What, then, is the use of the path_anno folder? What type of annotation information is stored and used? Is it fine to ignore the path_anno folder if the labels are present in the path_img filenames?

@jeremy path.ls() is convenient (as you point out) and discoverable, since it’s simply a method and comes up in tab completion. On the flip side, it builds muscle memory which will fail people whenever they encounter standard path objects or paths represented as strings.

In Jupyter/IPython, there’s an alternative which works with plain paths and strings alike: %ll {path}. It’s also fairly succinct IMHO, though definitely less discoverable, but it’s a good way to get used to variable interpolation in magics (or shell commands, !ls -l {path} works too of course) :slight_smile:
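For completeness, the muscle-memory-safe stdlib counterpart of path.ls() works on Path objects and plain strings alike:

```python
from pathlib import Path

def ls(path):
    """Plain-Python counterpart of fastai's path.ls(): list a
    directory's entries, accepting a Path or a string."""
    return sorted(p.name for p in Path(path).iterdir())
```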


Hello everyone,

I just started this course and have just finished lesson 1. I wanted to get some practice using lesson 1’s code, but I encountered a little problem and hope I can get an answer here.

So basically I want to do what Jeremy did in lesson 1 all over again, but on the CIFAR-10 dataset. However, after I downloaded CIFAR-10, it only has a test set and a label.txt. There is no training or validation set. I did some searching in the forum and found that other people have a training set for CIFAR-10. I just want to know if this is by design and I need to partition the training and validation sets myself, or if something has gone wrong.
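If it turns out you do need to carve out a validation set yourself, here is a stdlib sketch (the train/valid folder names are illustrative; adjust them to however your CIFAR-10 copy is laid out):

```python
import random
import shutil
from pathlib import Path

def make_valid_split(root, valid_pct=0.2, seed=42):
    """Move valid_pct of each class folder under root/train into a
    parallel root/valid tree, class by class."""
    rng = random.Random(seed)
    root = Path(root)
    for cls_dir in sorted((root/'train').iterdir()):
        if not cls_dir.is_dir():
            continue
        files = sorted(cls_dir.iterdir())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_pct)
        dest = root/'valid'/cls_dir.name
        dest.mkdir(parents=True, exist_ok=True)
        for f in files[:n_valid]:
            shutil.move(str(f), str(dest/f.name))
```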

Screenshot of my code and the output: