I’m at the beginning of the course, about to finish the lesson 1 homework. I successfully submitted Dogs vs Cats and got into the top 50%, but I noticed that my batches.filenames are sorted:
['unknown\1.jpg', 'unknown\10.jpg', 'unknown\100.jpg', 'unknown\1000.jpg', 'unknown\10000.jpg']
and not random as in Jeremy’s notebook:
['unknown/9292.jpg', 'unknown/12026.jpg', 'unknown/9688.jpg', 'unknown/4392.jpg', 'unknown/779.jpg'].
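As far as I can tell, the order I get looks like a plain string sort of the file names. A minimal sketch to check that, using only standard Python (the `names` list is a made-up sample mixing the two listings above, not my actual directory):

```python
# Hypothetical sample of file names, mixing both listings above.
names = ["9292.jpg", "10.jpg", "779.jpg", "1.jpg", "1000.jpg", "100.jpg", "10000.jpg"]

# A plain string sort compares character by character,
# so "10000.jpg" ends up before "779.jpg".
string_sorted = sorted(names)

# Sorting on the numeric part of the name gives the "natural" order instead.
numeric_sorted = sorted(names, key=lambda n: int(n.split(".")[0]))

print(string_sorted)
print(numeric_sorted)
```

The first five entries of `string_sorted` come out as 1, 10, 100, 1000, 10000, which matches what I see in my batches.filenames.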
I didn’t give it much thought at first because I got good results, but now that I’m trying state-farm I get really bad results, with validation probabilities skewed towards c8 & c9… The only explanation I can think of is that, because those classes come last in training, the network learns to favor them.
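To make that suspicion concrete, here is a small sketch of what I think happens to the batches if samples are fed in sorted order with no shuffling. It does not use Keras at all. The class names `c0`..`c9` match state-farm, but the per-class count, the batch size, the seed and the helper `batch_classes` are all made up for illustration.

```python
import random

# Toy dataset: 10 classes (c0..c9) with 8 samples each, listed in sorted order.
# The counts and batch size are illustrative, not the real state-farm numbers.
classes = [f"c{i}" for i in range(10)]
samples = [(f"{c}/img_{i}.jpg", c) for c in classes for i in range(8)]
batch_size = 16

def batch_classes(items, size):
    """Return the sorted list of classes present in each consecutive batch."""
    return [sorted({label for _, label in items[i:i + size]})
            for i in range(0, len(items), size)]

# Sorted order: every batch holds only two classes, and the last one is all c8/c9.
sorted_batches = batch_classes(samples, batch_size)
print(sorted_batches[-1])  # ['c8', 'c9']

# Shuffled order: each (file name, label) pair moves as a unit,
# so labels stay correct while the classes get mixed across batches.
shuffled = samples[:]
random.Random(0).shuffle(shuffled)
shuffled_batches = batch_classes(shuffled, batch_size)
print(shuffled_batches[-1])
```

If this is what is going on, the thing for me to check is probably whether my training batches are shuffled. Keras’ `flow_from_directory` takes a `shuffle` argument, but I’m not sure that is the cause in my case.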
Is the sorting really a problem? What can I do about it?
P.S. I know there is a course notebook for state-farm, but I’m trying to do this myself first.