Struggling to tell if I'm making progress (lec 3)

Hi!

I’m on lecture 3 and am trying to get a feel for learning rates, downsampling, reducing precision, and other techniques by playing around with them (using the cat face points dataset https://www.kaggle.com/crawford/cat-dataset), as Jeremy suggests in the class, but I’m feeling a bit lost.

  1. I’m having trouble keeping track of all the experiments I’m running and how they’re doing relative to each other. A single notebook read top to bottom is nice when you have just a couple of stages, but when you’re tweaking everything, just growing the notebook downward gets a bit unmanageable, and I don’t want to overwrite previous results. I’m thinking I may just start dumping all my error values, annotated roughly with what I was doing and which experiment was the parent, into one big CSV (rough sketch after this list).
  2. But even when looking at visualizations, I’m not sure what to make of them. I’ve found that there’s quite a bit of variance, e.g. if I forget to fix the seed, so I have no idea whether my tweaks to the model are helping or it’s just random noise. I haven’t seen any suggestion to, say, rerun with different seeds before deciding whether a model choice hurt or helped. Am I overestimating the risk here? (There’s a sketch of what I mean after this list.)
  3. fastai is quite good by default, so I don’t know how much of an improvement I can actually squeeze out of it. For example, I can look at a Kaggle competition’s results and I’m doing pretty well, yet I haven’t really done anything. Everything’s fine, my model’s performance is fine, but I don’t feel like I’ve learned or achieved anything by throwing the out-of-the-box setup at a dataset.
  4. Overall I’m having trouble telling if my model is making progress, so I’m having trouble figuring out if I’m making progress. I’m not sure that my playing around has helped me at all aside from some extra familiarity with the various ItemList types in the fastai library.
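
For point 1, this minimal sketch is roughly what I mean by one big CSV; the file name and columns are just placeholders I made up, not anything fastai provides:

```python
import csv
import os
from datetime import datetime

LOG_PATH = "experiments.csv"  # hypothetical log file, one row per run

def log_experiment(name, parent, notes, error):
    """Append one experiment's result to a running CSV log."""
    new_file = not os.path.exists(LOG_PATH)
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["timestamp", "name", "parent", "notes", "error"])
        writer.writerow([datetime.now().isoformat(timespec="seconds"),
                         name, parent, notes, error])

# e.g. after a training run in the notebook (values are illustrative):
# log_experiment("resnet34_lr3e-3", parent="baseline",
#                notes="fit_one_cycle, 5 epochs, 224px", error=0.123)
```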
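For point 2, here is a rough sketch of the rerun-with-different-seeds idea; `set_seed` and `run_experiment` are names I made up, and `train_fn` stands in for whatever data/learner setup and fit call the notebook already does:

```python
import random
import statistics

import numpy as np
import torch

def set_seed(seed):
    # Fix the main sources of randomness *before* building the data and learner.
    # Note: cuDNN can still introduce some nondeterminism on the GPU.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

def run_experiment(train_fn, seeds=(0, 1, 2, 3, 4)):
    """Run the same training function under several seeds and summarise the spread.

    train_fn() should build the data and learner, fit, and return a validation error.
    """
    errors = []
    for s in seeds:
        set_seed(s)
        errors.append(train_fn())
    mean, std = statistics.mean(errors), statistics.stdev(errors)
    print(f"error = {mean:.4f} +/- {std:.4f} over {len(errors)} seeds")
    return errors
```

The idea being: wrap one training run in a function, call `run_experiment` on it for the baseline and again for the tweak, and only believe a change that moves the mean error by clearly more than the spread.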

At this point I’m going to move on to lec 4 in the name of not getting stuck, but I have a feeling this will all come up again when I attempt the next notebook so I’d love some input on how to get more out of this course.

This may not be a very helpful suggestion, since it isn’t a very technical one.
I think you are doing the 2020 version. Either way, I suggest you move on and come back later. I remember Jeremy saying in the 2019 videos that you should watch each lecture at least three times, without stopping much in the middle of the video.
My personal suggestion is to add more comments to your code and recreate the notebook with a different dataset. We all struggle. Sometimes I even try another approach, and many of them cause new issues, but the result is that I learn a lot. I personally pick a dataset that is interesting to me and build my own dataset from the things I love. :stuck_out_tongue:

(My experience)

You will truly test your knowledge when you work on Kaggle competitions.

Start with something small (you will find enough trouble there).

If you are within the top 10%, then you are doing fine. Getting much further normally involves improving the dataset more than the model or training, so don’t spend more time on that.