So I’ve been trying to repurpose VGG for the State Farm competition. For many hours, like a lot of folks, I was stuck at 10% accuracy (chance level with ten classes). Couldn’t even get close to Jeremy’s worst model from class. I finally figured out how to mess with the learning rate, but even that didn’t help.
After rereading forum posts yet again, I just decided to set all the dense layers as trainable, and re-ran things on my sample - got over 90% accuracy on the first shot. Then on the full data set - the first epoch got to val_acc: 0.735, and the second hit 98% accuracy (admittedly with plenty of overfitting).
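For anyone who wants to try the same thing, here is a rough sketch of what "set all the dense layers as trainable" amounts to. It assumes a Keras-style model, i.e. an object with a `layers` list whose entries carry a `trainable` flag. The helper name `unfreeze_dense_layers` and the tiny stand-in classes are made up for illustration so the snippet runs without Keras installed; it is not my exact notebook code.

```python
def unfreeze_dense_layers(model):
    """Set trainable=True on every layer whose class is named 'Dense'.

    Returns the names of the layers that were switched on.
    """
    changed = []
    for layer in model.layers:
        if type(layer).__name__ == "Dense":
            layer.trainable = True
            changed.append(layer.name)
    return changed


# --- Hypothetical stand-ins, only so the sketch runs without Keras ---
class _Layer:
    def __init__(self, name, trainable=False):
        self.name = name
        self.trainable = trainable


class Conv2D(_Layer):
    pass


class Dense(_Layer):
    pass


class ToyModel:
    def __init__(self, layers):
        self.layers = layers


# A frozen toy "VGG-like" stack: conv layers followed by dense layers.
toy = ToyModel([
    Conv2D("conv1"),
    Conv2D("conv2"),
    Dense("fc1"),
    Dense("fc2"),
    Dense("predictions"),
])

changed = unfreeze_dense_layers(toy)
print(changed)
```

With a real Keras model you would also need to call `compile(...)` again after flipping the flags, since changes to `trainable` only take effect on the next compile.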
Now that the initial shock has worn off, this isn’t as satisfying as I’d hoped. It’s not like I have a better understanding of the data, or a deep understanding of why retraining the dense layers is so effective (at least not yet). I just tried something, and it sort of worked.
I have an economics background. In that field, a model is typically informed by theory. Data mining is looked down upon as mere correlation, when what you need for policy making is a theory about (and statistics demonstrating) causality. At least correlation often has an intuitive interpretation, even if that interpretation can literally be dangerous. But this … it’s just a mashup of simple math and a ton of data.
Is anyone else struggling with this? I have trouble trusting what I’m seeing. Is there any external validity? I know there are a million blog posts about this issue, but I’d love to hear your thoughts, or be directed to some writings that you think grapple with this issue effectively.
I have a feeling that this is partly a generational thing - a generation growing up with self-driving cars and hacking neural nets from high school or college will trust them because they do work. Is my suspicion unreasonable? According to these papers, it’s not entirely unreasonable. But these may be outliers.
http://www.evolvingai.org/fooling
http://cs.nyu.edu/~zaremba/docs/understanding.pdf
Still, as someone who will likely be depending on neural networks to solve difficult real-world problems for real customers spending real money, I am a bit troubled by their black-box nature. They must be terrible to debug!