Lesson 1 In-Class Discussion ✅

  1. get_transforms() is imported from here - https://docs.fast.ai/vision.transform.html#get_transforms
  2. an argument is optional if it has been assigned a default value, as seen here:

from_name_re(path:PathOrStr, fnames:FilePathList, pat:str, valid_pct:float=0.2, **kwargs)

valid_pct has a default value (0.2) assigned to it

  1. Check out what **kwargs are here - https://book.pythontips.com/en/latest/args_and_kwargs.html

  2. You’re probably gonna need to understand Python well

I am playing around with the Caltech Birds dataset.

At first I tried a copy-paste approach. It wasn't good enough, so I made bigger classes based on general bird names and used stratification (so every class is present in both the train and validation datasets in a given proportion); maybe ImageDataBunch does this itself, I don't know. I used my own script for it.
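For reference, a stratified split can be done along these lines (a minimal sketch with toy filenames, not my actual script; as far as I know the plain valid_pct split in fastai is not stratified):

# Toy sketch of a stratified split: scikit-learn keeps each class in both
# splits in the given proportion (filenames and labels below are made up).
from sklearn.model_selection import train_test_split

files  = [f'albatross_{i}.jpg' for i in range(10)] + [f'warbler_{i}.jpg' for i in range(10)]
labels = ['albatross'] * 10 + ['warbler'] * 10

train_files, valid_files, train_labels, valid_labels = train_test_split(
    files, labels, test_size=0.2, stratify=labels, random_state=42)
print(len(train_files), len(valid_files))  # 16 / 4, with both classes present in each split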

Now I want to explore the bounding_boxes.txt annotations for better classification. I cropped the images with another script;
it works perfectly in my local tests.
But when I look at my dataset with show_batch() I get this:


I assume there are three possible causes:

  1. My cropping is incorrect
  2. The ImageDataBunch constructor does some additional crop on my images
  3. The show_batch() method doesn't show the full images, only some region (and in the dataset all the images are correct)

Any ideas what is going on and how I can check it?
ImageDataBunch creation code:

local_data = ImageDataBunch.from_folder(Path('./images'), ds_tfms=get_transforms(), valid='validation', size=299, bs=32).normalize(imagenet_stats)
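To narrow down which of the three causes it is, something like this could help (a minimal sketch, assuming fastai v1; it shows the first training file exactly as it is stored on disk, before any ds_tfms or resize):

from fastai.vision import *            # as in the lesson notebooks

fn = local_data.train_ds.x.items[0]    # path of the first training image
raw = open_image(fn)                   # open the file without the DataBunch transforms
raw.show(title=str(fn))
# If this looks correct, the cropping script (cause 1) is fine, and whatever
# show_batch() displays differently comes from get_transforms() (random crop,
# zoom, etc.) plus the size=299 resize - i.e. cause 2 or 3.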

EDIT
The environment I was having these issues on was Gradient. I'm still not sure why, but I tested the notebook on Kaggle and did not encounter any errors.


Original Post

Can anyone guide me here? I'm working my way through lesson one and have hit a roadblock at the Other Data Formats section.

Specifically, I'm getting this error message when I run the ImageDataBunch.from_folder command:

tfms = get_transforms(do_flip=False)

data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=26)

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py:537: UserWarning: You are labelling your items with CategoryList.
Your valid set contained the following unknown labels, the corresponding items have been discarded.
7
  if getattr(ds, 'warn', False): warn(ds.warn)

Which leads (I think) to a whole bunch of errors later:

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/basic_data.py:262: 
UserWarning: There seems to be something wrong with your dataset, for example, in the first batch can't access these elements in self.train_ds: 9023,4005,7043,7988,5679...
 warn(warn_msg)
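For anyone hitting the same warning: it means the validation split contains a label ('7' here) that never appears in the training split, so those items are dropped. A quick check along these lines might help (a sketch, assuming the standard train/valid folder layout expected by ImageDataBunch.from_folder; the path is illustrative, use whatever you passed in):

from pathlib import Path

path = Path('data/mnist_sample')       # illustrative
train_labels = {p.name for p in (path/'train').iterdir() if p.is_dir()}
valid_labels = {p.name for p in (path/'valid').iterdir() if p.is_dir()}
print('only in valid:', valid_labels - train_labels)   # these trigger the warning
print('only in train:', train_labels - valid_labels)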

How can I check what features the intermediate layers are learning in a particular model? And how can we display this in the notebook by highlighting those features in the image?
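One way to peek at this is with a forward hook (a minimal sketch in plain PyTorch, using a torchvision ResNet as a stand-in for whatever model you trained):

import torch
from torchvision import models

model = models.resnet34(pretrained=True).eval()
acts = {}

def save_acts(module, inp, out):
    acts['layer1'] = out.detach()       # keep the intermediate activations

handle = model.layer1.register_forward_hook(save_acts)
with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # a dummy image batch
handle.remove()

print(acts['layer1'].shape)             # torch.Size([1, 64, 56, 56])
# Each of the 64 channels is a feature map; plotting one with matplotlib's
# imshow shows which regions of the input activate that feature.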

Can anyone clear this doubt please?
When we downloaded the pets dataset, we had a separate folder for annotations. Then why did we use a regex to get the label names when we could have got them from the annotations folder instead? Please help me with this question.

After running the notebook a couple of times in Google Colab, I find myself stuck at this point.
What I have tried so far:

  1. Reset and try again
  2. Shut down the notebook and load the example again

Hi, I am new to this course and forum. Stupid question, but how does one search for relevant threads? I don't see a search icon.

Hi Sdash, I hope you are having a fantastic day!

If you are logged in you should see the page above!

image

If you click the first icon above, you should be able to enter what you are searching for.

Cheers mrfabulous1 :smiley: :smiley:

Hi, I am new to fastai and deep learning. Can someone please help me understand the problem with using a different RegEx in ImageDataBunch.from_name_re.

I have set pat = r'(?:.*)/(.*)\.(?:.*)'.
Using this regex, the labels do get extracted, but data.classes is returning incorrect classes - e.g. Abyssinian_1 and Abyssinian_2 are being returned as different classes.

Please help me understand why the pattern from the lesson works but this one doesn't.
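A small comparison might make the difference clearer (the filename below is just an example; the first pattern is the one from the lesson notebook, the second is the one above):

import re

fname = 'images/Abyssinian_1.jpg'
print(re.search(r'/([^/]+)_\d+.jpg$', fname).group(1))    # 'Abyssinian'   -> one class per breed
print(re.search(r'(?:.*)/(.*)\.(?:.*)', fname).group(1))  # 'Abyssinian_1' -> one class per file
# The second capture group runs all the way to the last '.', so the trailing
# number stays in the label and every file becomes its own class.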

Hello everyone. I've just started trying to learn deep learning here on fast.ai and have been stuck at the very beginning (trying to create an image dataset from Google) for the past week now. This is my code, with the error I am receiving at the bottom (and the error is repeated multiple times further down the notebook).


I was told that my URLs are Google cache images for deleted websites, but I am using the same code to download the images (
urls = Array.from(document.querySelectorAll('.rg_i')).map(el => el.hasAttribute('data-src') ? el.getAttribute('data-src') : el.getAttribute('data-iurl'));
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));
) as shown in the official walkthrough. Even when filtering my search results to only show images from the last 24 hours, I get URLs with the same encrypted-looking format. I am completely lost and stuck. Any help would be greatly appreciated. Thank you.


Actually, I think I may have figured out what my problem was. My max_pics was equal to 200 and I only had 80 images. Once I changed it to 80, I stopped receiving the error. I am posting this in case anybody runs into the same problem, because I couldn't find this solution anywhere.
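Roughly what the fix looks like in code (folder and file names here are illustrative):

from fastai.vision import *             # as in lesson2-download.ipynb

path = Path('data/bears')
dest = path/'black'
dest.mkdir(parents=True, exist_ok=True)
# the urls file only has about 80 lines, so cap max_pics at 80 instead of 200
download_images(path/'urls_black.csv', dest, max_pics=80)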


In lesson2-download.ipynb it says that you should not run the ImageCleaner widget in Colab. I would like to let anyone who has just started lesson one know that whatever issue existed when the notebook was written is no longer there. I ran ImageCleaner on Colab without any problem; you just need to remember to open another cell and run something in it maybe every 5 minutes so that your session does not time out. For me, I had a cell where I kept running a=5 at least every 5 minutes to keep the session alive.


Can anyone please point out what I am doing wrong in this?

This might be of help

It's always good practice to check the forum for your questions, as they might already have been answered, which will save you time :slight_smile:

Thank you so much for the help!

I had to import the v3 of the course (in the directory in Gradient) and encountered this problem afterwards when saving the model:

It seems the system can’t edit the file system of the v3 course directory, as shown when I tried to add a folder to the directory:

Does anyone know how to import a writable v3 file system or work around this problem?

I figured this out. It was just a strange error that got solved when I rebooted a bunch of programs.

For the image classification of dogs and cats in Lesson 1, the filenames in the path_img folder are used to get the labels of the pictures. Then what is the use of the path_anno folder? What type of annotation information is stored there and how is it used? Is it fine to ignore the path_anno folder if the labels are present in the path_img filenames?

@jeremy path.ls() is convenient (as you point out) and discoverable, since it’s simply a method and comes up in tab completion. On the flip side, it builds muscle memory which will fail people whenever they encounter standard path objects or paths represented as strings.

In Jupyter/IPython, there’s an alternative which works with plain paths and strings alike: %ll {path}. It’s also fairly succinct IMHO, though definitely less discoverable, but it’s a good way to get used to variable interpolation in magics (or shell commands, !ls -l {path} works too of course) :slight_smile:

image
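For reference, the three equivalents side by side in a notebook cell (assuming fastai has been imported, which is what adds .ls() to Path; the path itself is illustrative):

from fastai.vision import *   # fastai's import patches pathlib.Path with .ls()

path = Path('data')
path.ls()                     # fastai convenience method, returns a list of Paths
%ll {path}                    # IPython magic, works with Path objects and plain strings
!ls -l {path}                 # shell command, same variable interpolation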

Hello everyone,

I just started this course and just finished lesson 1. Then I wanted to do some practice using lesson 1’s code, but I encountered a little problem and hope I can get an answer here.

So basically I want to do what Jeremy did in lesson 1 all over again, but on the CIFAR-10 dataset. However, after I downloaded CIFAR-10, it only has a test set and a labels.txt; there is no training or validation set. I did some searching in the forum and found that other people have a training set for CIFAR-10. I just want to know whether this is by design and I need to partition the training and validation sets myself, or whether something went wrong.

Screenshot of my code and the output:
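In case it helps, a small check along these lines (a sketch, assuming the fastai-hosted CIFAR-10 that untar_data downloads) should show whether the archive came down completely:

from fastai.vision import *   # as in the lesson notebooks

path = untar_data(URLs.CIFAR)
print(sorted(p.name for p in path.iterdir()))
# Expected something like ['labels.txt', 'test', 'train']. If 'train' is missing,
# the download or extraction was probably incomplete; deleting both the cifar10
# folder and the cifar10.tgz archive under ~/.fastai/data and re-running
# untar_data should force a fresh download.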