Wiki: Lesson 1


(ashish johnson) #261

Thanks for the terms and expressions! It is much appreciated; please keep doing it.


(Shivaraj Bakale) #262

Why do we always have to find a local minimum in deep learning algorithms? What purpose does it serve?


(Stas Bekman) #263

I ran into an issue with np.mean(): the call to accuracy_np(probs, y) was failing because it was being passed a one-dimensional array:

AxisError: axis 1 is out of bounds for array of dimension 1

Update: for some reason pip wasn’t installing the latest version of fastai. I replaced it by pulling directly from GitHub and it all works now, so it was a false alarm.
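
For anyone who hits the same traceback before updating: here is a minimal plain-NumPy sketch (not the fastai code itself, just an illustration) of why a one-dimensional array triggers that error when something reduces along axis 1:

    import numpy as np

    probs_1d = np.array([0.2, 0.9, 0.4])      # shape (3,): only axis 0 exists
    # np.argmax(probs_1d, axis=1)             # AxisError: axis 1 is out of bounds for array of dimension 1

    probs_2d = np.array([[0.2, 0.8],          # shape (n_samples, n_classes)
                         [0.7, 0.3]])
    preds = np.argmax(probs_2d, axis=1)       # array([1, 0]): per-row predictions work as expected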


(Alex) #264

Ideally we would want to find the global minimum of our loss function, which represents “how far away” we are from our desired values, but in practice chasing it can lead to overfitting.
From this paper: https://arxiv.org/abs/1412.0233

We empirically verify several hypotheses regarding learning with large-size networks:
• For large-size networks, most local minima are equivalent and yield similar performance on a test set.
• The probability of finding a “bad” (high value) local minimum is non-zero for small-size networks and decreases quickly with network size.
• Struggling to find the global minimum on the training set (as opposed to one of the many good local ones) is not useful in practice and may lead to overfitting.


(shweta ) #265

Hi everyone,
I have written my first blog post on Dogs vs Cats classification; please have a look and give suggestions for further improvement.

Thanks!!


(This Connection is Not Secure) #266

Hi, just some feedback. I was following an older version of this course a while ago, and I found it much, much easier to follow than this version.

The old one had a few utility methods and stuff (“utils.py” and “vgg16.py”!), but this new one comes with thousands of lines of “helpful” code in the fastai library, way too much to casually understand without a lot of work.

Now I feel like I’m not learning how to use Keras or Theano or TensorFlow or PyTorch; I’m just investing a lot of time into learning your made-for-this-course framework.

I’m willing to work hard, but if I put in the work to understand the fastai library, that knowledge isn’t transferable or useful. I’d much rather slowly build up all the code for image loading, transforming, model creation, etc. over time. Then at least that effort teaches me something that’s useful in the future.

As helpful as the fastai library is, it’s not likely to be used outside of this course. Rather than learn it, I’d like to learn how to do those things myself.


(Vamsi Uppala) #267

Can we please move the link for auto-generating test data to a more prominent position in the wiki? I didn’t pay enough attention to it until I had spent considerable time finding web scrapers, downloading and arranging images into folders, and feeling satisfied with my hours of work, before realizing there was an easier way to do it. :slight_smile:


(Kofi Asiedu Brempong) #268

In Lesson 1, around the 29th minute, @jeremy says that you can download some pictures, change the path to point to them, and use the same lines of code to train the neural network to recognize those images too.

I wanted to train it to recognize minerals, so I downloaded pictures of 2 minerals and changed the path to point to the folder containing them, but I’m getting some errors with the code.


(Rahul) #269

I remember there was a link to a PDF where someone had made notes, commenting in the Jupyter notebook itself and explaining the code. Does anyone know where I can find that?


(Dusten) #270

This was posted just two days ago, on May 15, 2018, and it was very helpful in getting the fast.ai coursework up and running in Sagemaker.

https://aws.amazon.com/blogs/machine-learning/running-fast-ai-notebooks-with-amazon-sagemaker/


#271

Did you make it work?
Check whether you have the right folder structure inside train and valid (in my example, I was comparing Chinese and Windsor chairs).

[image: screenshot of the train/valid folder structure]
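
In case it helps, here is a minimal sketch of the layout I mean, assuming the fastai 0.7 calls from the Lesson 1 notebook (PATH and the class names are just example placeholders):

    from fastai.transforms import *       # Lesson 1 imports (fastai 0.7)
    from fastai.conv_learner import *
    from fastai.dataset import *

    PATH = 'data/chairs/'                 # hypothetical dataset folder
    # Expected layout under PATH, one subfolder per class:
    #   train/chinese/*.jpg   train/windsor/*.jpg
    #   valid/chinese/*.jpg   valid/windsor/*.jpg
    data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(resnet34, 224))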


(John Richmond) #272

Thanks - any thoughts on Sagemaker vs Paperspace in terms of overall ease of use, setup, and flexibility? I have managed to do most things locally to date, but I’m thinking I need to move some stuff to the cloud now and am trying to decide which way to go.


(Dusten) #274

I use AWS for other things and so I already had an account with all the billing things configured.

From a Sagemaker point of view I do not know enough to tell you if it’s better than Paperspace.

There are many “shortcuts” that AWS provides that may reduce the time you spend maintaining an ML environment. Also, with the new pricing reductions, a p2.xlarge is $0.90 an hour.

You might find that you’ll develop on Paperspace but then run production workloads on AWS.


(John Richmond) #275

Thanks, I’ll probably start with Paperspace and see how Sagemaker comes along for a while.


(Dusten) #276

I can confirm that the Lesson 1 and 2 work runs on the ml.p2.xl machines, granted it’s a little slow at times.

The other thing to note is that AWS is one of the 7 cloud companies that get pre-release CPUs/GPUs/FPGAs before the rest of the marketplace.

If you need the cutting edge of computing, AWS may be the place.

I would also like to see someone try it on Azure.


(Kofi Asiedu Brempong) #277

@sayko I still haven’t been able to make it work.


(Waris Gill) #278

Hi Sir,

I am still confused about “sz”. Say sz = 224: does that mean it will reduce the resolution, or crop the image, if the image is larger than 224 × 224 pixels? And how should it change if I have a different dataset (medical images, satellite images, etc.)?


(Sam Lloyd) #279

By default, it just reduces (or increases) the resolution, but you have the option of applying crops, zooms, stretching, rotations etc

And yes, the sort of transforms you apply depends on the dataset. Sorry if that’s a bit vague!
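
To make that a bit more concrete, here is a minimal sketch of where sz and the optional augmentations plug in, assuming the fastai 0.7 calls from the Lesson 1 notebook (PATH is a placeholder for your data folder):

    from fastai.transforms import *       # Lesson 1 imports (fastai 0.7)
    from fastai.conv_learner import *
    from fastai.dataset import *

    arch = resnet34
    sz = 224                              # images are transformed to sz x sz before training

    # Basic transforms at the given size:
    tfms = tfms_from_model(arch, sz)

    # Or with augmentation layered on top (flips, small rotations, lighting changes, zoom):
    tfms_aug = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)

    PATH = 'data/dogscats/'               # point this at your own dataset
    data = ImageClassifierData.from_paths(PATH, tfms=tfms_aug)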


(Waris Gill) #280

Thank you so much. Actually, my second question was: if we have a dataset of medical or satellite images, should I decrease or increase sz?


(Sairam) #282

Thanks for doing this for all of us Jeremy & Rachel!

I have two questions re Lesson 1:

Regarding the Cyclical Learning Rate paper: is there a way we can use this method to determine the optimum learning rate even when we aren’t training with plain SGD? For example, what if our loss function is a combination of two different losses, or something like an entropy loss?

Where to put the sample images for the homework assignment: in the video, Jeremy asked us to pick a few images from 2 classes of our choice and train the network on those classes. In the data subfolder, we already have the dog and cat subfolders. Do we remove those and put in our new image class folders? If we don’t, the network would be trying to classify 4 different categories, right?
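
For context on the first question, this is the learning-rate finder call from the Lesson 1 notebook that I am referring to (fastai 0.7; arch and data are assumed to be set up as in the notebook):

    from fastai.conv_learner import *     # Lesson 1 imports (fastai 0.7)

    # `arch` and `data` assumed defined as in the Lesson 1 notebook
    learn = ConvLearner.pretrained(arch, data, precompute=True)
    lrf = learn.lr_find()                 # short run with a steadily increasing learning rate
    learn.sched.plot()                    # plot loss vs. learning rate to pick a good value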