Wiki: Lesson 1

No, not part 1 version 1. In the video, Rachel was referring to this course.


Thanks Rachel. I went through the code and ran notebook 1 yesterday.

Even though I have a GTX 1080 in my home system, I had issues running the notebook at that batch size of 224. I played around by increasing the size through 32, 64, 128, 160, and 192. I could never match the accuracy numbers; they were always lower than in the original notebook. I tried decreasing and increasing the learning rate, but the result was still the same.

I assume the code is using batch SGD. Do you have any reading material on how the batch size impacts the optimization?

I think I will play around with different optimization algorithms next. But as far as I have read, ‘Adam’ and newer algorithms may be faster, but SGD is still more accurate.

Finally, thanks to you and Jeremy for the effort you have put into this course.


If you use a smaller batch size, you may want to decrease the learning rate by the same ratio.
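As a rough illustration of that rule of thumb, here is a minimal sketch. The function name and the example values (a learning rate of 0.01 tuned at batch size 64) are hypothetical, not taken from the notebook.

```python
def scale_learning_rate(base_lr, base_bs, new_bs):
    """Scale the learning rate by the same ratio as the batch size."""
    return base_lr * new_bs / base_bs


# Hypothetical starting point: lr=0.01 was tuned for a batch size of 64.
# Halving the batch size halves the learning rate.
print(scale_learning_rate(0.01, 64, 32))  # 0.005
```

This is only a starting point; the scaled value may still need tuning.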


Hi Luke. I think you might have better luck if you ask this in the following topic.

I tried two different cases in which I downloaded 10 pictures each of yachts and cruise ships into the train folder and 5 each into the valid folder. The learning rate calculation graph doesn’t work.

Is there a minimum sample size for this to work? I used a similar data structure for differentiating between BMWs and Audis and had the same problem.

The learning rate finder runs for a maximum of one epoch, trying a different learning rate for each mini-batch in that epoch. If your batch size is larger than 5 and you have only 5 images, then one epoch is a single batch, i.e. one learning rate, which gives a single data point for the loss. Without multiple batches, it cannot plot how the loss changes across learning rates.
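To make the arithmetic concrete, here is a minimal sketch of how many points such a sweep would produce. It is not the fastai implementation: the function name, the exponential spacing, and the start and end learning rates are all assumptions for illustration.

```python
import math


def lr_finder_schedule(n_images, batch_size, start_lr=1e-5, end_lr=10.0):
    """Return one learning rate per mini-batch in a single epoch.

    The exponential spacing and the start_lr/end_lr defaults are assumed values.
    """
    n_batches = math.ceil(n_images / batch_size)
    if n_batches == 1:
        return [start_lr]
    ratio = (end_lr / start_lr) ** (1 / (n_batches - 1))
    return [start_lr * ratio ** i for i in range(n_batches)]


print(len(lr_finder_schedule(5, 64)))  # 1 point: nothing to plot
print(len(lr_finder_schedule(5, 1)))   # 5 points
print(len(lr_finder_schedule(32, 8)))  # 4 points
```

With only a handful of points the curve is still very coarse, which is why more data helps.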

You can try reducing the batch size (the bs parameter passed to the learner / data object) to 1 and see if that gives you a plot (since it will then try at most 5 batches with different learning rates). But for this to be useful, you might need to collect more data. A dataset of 32 images with a batch size of 8 might be a good place to start.

When I have a problem finding a good learning rate, I usually start with a rule of thumb of 1e-2 (0.01). But lr_find is the optimal way to find a good starting learning rate.


Use this to download images (it works like a charm; tried and tested).


Thanks for the link… Had to update it to work with Python 3.6, as urllib2 is not available in Python 3, I think. Submitted an update on GitHub too.
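For anyone hitting the same issue: in Python 3 the functionality of urllib2 lives in urllib.request and urllib.error. The sketch below is not the linked script, just a minimal example of fetching a file with the Python 3 module; the helper name download_file is hypothetical.

```python
import shutil
from urllib.request import urlopen  # Python 3 home of what urllib2.urlopen provided


def download_file(url, dest_path):
    """Fetch url and write the response body to dest_path."""
    with urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```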

Here is an interesting use of image recognition to fight corruption in the extractive industries. This webinar is being shared by DataKind, which helps non-profits make use of their data for good.

If you want an opportunity to use the skills you are gaining in this course, see the folks at DataKind.


I just finished building a deep learning PC early last month. I followed the general instructions from the last class version to install CUDA 8 and cuDNN 6. I see in the startup script for class version 2 that it’s using CUDA 9 and cuDNN 7. Will this course run with the old 8/6, or will I need to upgrade to 9/7?


Should work fine with the older versions, but some architectures will be far slower.

Thanks Cedric, solved my issue

Hi All,

I found this useful tool to download images from Google Images

I am building a simple human race image classifier. I have created a folder called ~/data/people_original with three folders in it called caucasian, african and asian, and populated each folder using the command

google-images-download download 'african man' 'african woman' --keywords '' --download-limit 100

from within each folder.

So now I have three folders, ~/data/people_original/asian, ~/data/people_original/african and ~/data/people_original/caucasian, each with 200 images in them.

I was wondering if anyone has any munging code that could be repurposed for splitting these images into the required folder structure that is present in the dogscats folder, i.e. models, sample, test1, train, valid.

I am guessing that this will be the way that most people will attempt to do the homework from lesson 1 and so figured this might be a useful snippet/recommended way of doing something that someone might have already done.

Or perhaps it is in (or could be in) the fastai library.

Kind regards,

Luke Byrne


Thanks for this amazing deep learning course.
I just shifted from v1 to v2.
Where can I access the v2 .ipynb notebooks?
Are there any setup instructions for Mac and PC for v2 of the course (conda yaml file)?
The main site only seems to contain v1 content.
I use a Mac for reviewing the notebooks with sample data,
and a PC with a GPU for more compute intensive tasks.

cpgrant, the v2 video link is at the top of this post.

Here it is again:

Hey Luke,
Try modifying this for your purposes – @rodjun created this script for the dogs vs. cats competition, but the idea is the same for what you’re doing:

Best of luck!
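For the train/valid split asked about above, a minimal sketch might look like the following. This is not @rodjun's script; the function name, the 20% validation fraction, and the destination folder are assumptions. It copies each class folder into train/<class> and valid/<class>, leaving the originals untouched.

```python
import random
import shutil
from pathlib import Path


def split_train_valid(src_dir, dst_dir, valid_fraction=0.2, seed=42):
    """Copy src_dir/<class>/* into dst_dir/train/<class> and dst_dir/valid/<class>."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_fraction)
        for i, f in enumerate(files):
            split = "valid" if i < n_valid else "train"
            target = dst_dir / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target / f.name)


# Example call (source path from the post above; destination name is hypothetical):
# split_train_valid(Path.home() / "data/people_original", Path.home() / "data/people")
```

The other dogscats folders (models, sample, test1) are not created here.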

I watched some of the webinars and they are pretty interesting. Thanks for posting it here.
I signed up as a volunteer some time ago but have not been able to contribute so far.
What is your opinion on DataKind and their projects?
Do you know of other similar initiatives?

Ok… tried it with around 175 images each of white tigers and zebras downloaded from Google Images, with 150 in the train set, plus a validation set.

First a few questions:

  1. What does an epoch mean? Does the system build the layers from scratch in each epoch, or does each epoch build on the layers from the previous epoch?
  2. I am used to scikit-learn's train_test_split with random shuffling. Here we set a batch size. Is the data within batches randomized every time? Each time I run an epoch, the loss and accuracy are slightly different.
  3. I used a batch size of 15. Still my learning rate schedule doesn’t work.
  4. This is what it does when I run data augmentation. Is it because the images are high definition? I have gone through the images and they aren’t that heavy.

Thanks for the help

I have not been able to connect with them for their bigger / longer term projects.
But I have participated in two of their “data dives”.
These are basically weekend-long, data-focused hackathons.

In NYC they get 50-100 really amazing folks from major financial and Internet organizations to come together to work on interesting data sets with interesting government and non-profit organizations. I’ve participated in two of these events. I have found these very useful personally. And I feel like I have helped do some good for the non-profit organizations.

The Taproot foundation also provides pro bono opportunities. Not specifically related to data science or machine learning. However, I can imagine that they would find folks with skills in these areas useful.


Not sure whether this helps or not…

  • The neural net is basically trying to improve its weights for better performance based on your metrics… An epoch is one full pass of your net over the whole dataset, so the number of epochs is the number of times your net has seen the whole data…

  • Not sure about randomisation of the dataset, but the data is split according to your batch size… It’s how many images the net sees in one go, processed simultaneously, like a pipeline…

I couldn’t recognise the last plots. (It seems that the images are zoomed in so far that you see individual pixels?)
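The two points above can be sketched in plain Python. This is not the fastai training loop: it assumes the training order is reshuffled every epoch (a common default, but not confirmed in this thread), and the counter merely stands in for the model weights, which persist from one epoch to the next rather than being rebuilt.

```python
import random


def iterate_epochs(n_samples, batch_size, n_epochs, seed=0):
    """Yield (epoch, batch_indices), reshuffling the sample order every epoch."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for epoch in range(n_epochs):
        rng.shuffle(indices)  # assumed: a new order each epoch
        for start in range(0, n_samples, batch_size):
            yield epoch, indices[start:start + batch_size]


weight_updates = 0  # stands in for model weights, which carry over between epochs
for epoch, batch in iterate_epochs(n_samples=300, batch_size=15, n_epochs=2):
    weight_updates += 1  # one gradient step per mini-batch
print(weight_updates)  # 40: 20 batches per epoch, 2 epochs
```

Under that shuffling assumption, the batches differ between runs, which would explain slightly different loss and accuracy values each time.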

Just one question…

Does the order of images in the dataset matter? (Provided the classes are equally distributed in terms of counts…)
