I started the Deep Learning for Coders course this week and I'm super excited. I've finished a decent number of ML courses, but they mostly focused on regression and classification (SVM, linear/logistic regression, etc.), not so much neural nets. That being said, pardon any questions that don't make sense.
So I wanted to create an image classifier that classifies signatures as "real" or "fake". I downloaded a bunch of signatures from this dataset and manually created a bunch of fake ones. Basically, you can think of a fake signature as a line, an X, or some other simple shape.
I managed to get a classifier with around a 3% error rate without "unfreezing" it. I'm pretty excited about that!
One thing I struggled with (and where most of my time went) was data collection and processing. Specifically, I struggled with structuring the data for a binary classifier (i.e. hot dog or not a hot dog). The only approach that worked was creating a CSV with links to the images and a label of 0 for "fake" signatures and 1 for "real" ones. Is there any other way of doing this? I really liked the approach Jeremy took in lesson 1, with the label just being part of the file name, but I wasn't able to make that work.
Another thing I'm trying to figure out is the best way to store my own notebooks. I created a VM in GC and it works like a champ; the only thing is that I pulled down the original course repo and I can't push to it. Do you just fork the repo, or do something else? One more slightly confusing thing was where to store your own datasets. I know that when you download them with untar_data they go to ~/.fastai by default, but what if you upload your own dataset?
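To frame the question a bit: since untar_data defaults to a folder under ~/.fastai, one option I considered was just parking uploaded datasets under the same root so everything lives in one place. A minimal sketch of that layout, with "signatures" as a made-up dataset name:

```python
from pathlib import Path

# Default download root used by untar_data (as I understand it);
# the exact subfolder may differ between fastai versions.
FASTAI_ROOT = Path.home() / ".fastai" / "data"

def dataset_path(name, root=FASTAI_ROOT):
    """Return (and create if needed) a directory for an uploaded dataset."""
    p = Path(root) / name
    p.mkdir(parents=True, exist_ok=True)
    return p

# e.g. upload your own files into this directory:
# my_data = dataset_path("signatures")
```

But I'm not sure whether mixing uploaded data into ~/.fastai is considered good practice, or whether people keep their own datasets somewhere else entirely.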
My last question is about unfreezing a model. When I do that, the last two error rates are 0, which smells super fishy. Do you guys have any idea why?
Thank you everyone, I’m super excited to go through this course!