Lesson 1 Homework Help

Hi friends :wave:

I started the Deep Learning for Coders course this week and I am super excited. I have finished a decent number of ML courses, but they mostly focused on regression and classification (SVM, linear/logistic regression, etc.), not so much neural nets. That being said, pardon any questions that don’t make sense :slight_smile:

So I wanted to create an image classifier that classifies signatures as “real” or “fake”. I downloaded a bunch of signatures from this dataset and manually created a bunch of fake signatures. Basically, you can think of a fake signature as a line, an X, or some other simple shape.

I managed to get a classifier with an error rate of around 3% without “unfreezing” it. I’m pretty excited about that :tada:

One thing I struggled with (and where most of my time went) was data collection and processing. Specifically, I struggled with creating data for a binary classifier (i.e. hot dog or not a hot dog). The only way that worked was to create a CSV with links to the images, with 0 for “fake signatures” and 1 for “real signatures”. Is there any other way of doing this? I really liked the approach Jeremy took in lesson 1 of just putting the label in the file name, but I was not able to make that work. :confused:

Another thing I’m trying to figure out is the best way to store your own notebooks. I created a VM in GC and it works like a champ; the only catch is that I cloned the original course repo and I can’t push to it. Do you just fork the repo or do something else? Another slightly confusing thing: where do you store your own datasets? I know that by default, when you download them with untar_data, they go to .fastai, but what about a dataset you upload yourself?

My last question is about unfreezing a model. When I do that, the last two error rates are 0, which smells super fishy :slight_smile: Does anyone have an idea why?

Thank you everyone, I’m super excited to go through this course! :muscle:

There are a number of ways to do this :slight_smile: In lesson 2, Jeremy shows a from_folder method: you simply have a “real” folder and a “fake” folder and you’re good to go!
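If the labels currently live in a CSV of 0s and 1s, getting to that one-folder-per-class layout could look roughly like the sketch below. It only uses the Python standard library; the column names `name` and `label`, the function name `sort_into_folders`, and the 0 = fake / 1 = real mapping are assumptions for illustration, so adjust them to match your file.

```python
import csv
import shutil
from pathlib import Path

# Assumed mapping, based on the CSV described in the question: 0 = fake, 1 = real.
LABEL_NAMES = {"0": "fake", "1": "real"}


def sort_into_folders(csv_path, image_root, out_root):
    """Copy each image listed in the CSV into out_root/<label name>/.

    Assumes the CSV has a header row with the (hypothetical) columns
    'name' (path relative to image_root) and 'label' (0 or 1).
    Returns how many images were copied per label.
    """
    image_root, out_root = Path(image_root), Path(out_root)
    counts = {name: 0 for name in LABEL_NAMES.values()}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            label = LABEL_NAMES[row["label"].strip()]
            src = image_root / row["name"]
            dest_dir = out_root / label
            dest_dir.mkdir(parents=True, exist_ok=True)
            # Files are copied, not moved, so the original data stays intact.
            # Note: images with the same file name would overwrite each other.
            shutil.copy2(src, dest_dir / src.name)
            counts[label] += 1
    return counts
```

Calling something like `sort_into_folders("labels.csv", "images", "signatures")` would then leave you with `signatures/real/` and `signatures/fake/`, which is the shape a folder-based loader expects.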

For storage, etc., I use Google Colaboratory and grab the notebook via “open from github”, and it’s good to go.

I usually upload my own datasets to Google Drive and connect Colab to my Drive so everything is there, or I just upload the data directly onto the Colab server.

As for the fishiness, try holding out some data that the model is never trained on, and see if it classifies it correctly :slight_smile:
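One way to set that data aside, as a minimal sketch: split your list of (file, label) pairs before any training, taking the same fraction from each class. The function name `hold_out_split`, the 20% fraction, and the fixed seed are just assumptions for the example, not anything from the course.

```python
import random
from collections import defaultdict


def hold_out_split(items, holdout_frac=0.2, seed=42):
    """Split (path, label) pairs into (train, held_out), label by label.

    At least one item per label is always held out, so very small
    classes lose a larger share than holdout_frac.
    """
    by_label = defaultdict(list)
    for path, label in items:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, held_out = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_held = max(1, round(len(group) * holdout_frac))
        held_out.extend(group[:n_held])
        train.extend(group[n_held:])
    return train, held_out
```

Train (and validate) only on `train`, then run the finished model on `held_out` and compare its predictions with the labels. If the error there is much higher than the 0 you see during training, something is leaking or overfitting.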

Welcome to the course and the forums!!!

Sounds good and thank you!