I'm quite interested in the 3rd NIPS 2017 sub-competition on Kaggle: building models that are resistant to adversarial attacks. (Note these attacks are usually called "adversarial examples", not GANs — GANs are a separate idea, though both involve an adversary.)
One of the things I keep thinking about is how knowledge is hierarchically represented in images: edges -> simple shapes -> complex shapes -> objects -> scene.
A cat looks like a cat because it has cat-like eyes, nose, and whiskers in the previous layer, and each of those parts is in turn made up of further cat-like features in the lower layers.
Even humans use this conceptual knowledge of object hierarchy to identify objects.
But adversarial attacks tweak just pixels to fool the model into predicting an object of the wrong class. This works because the network does not 'conceptually' store its knowledge of object hierarchies: while CNNs learn the hierarchy at the pixel level, the conceptual meaning of what the lower layers detect is never made explicit. Only the final prediction layer's neurons are mapped to labels.
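To make concrete how little "understanding" a pixel-level attack needs, here is a minimal FGSM-style sketch on a made-up 2-class linear model. Everything here (the weights, the input, the exaggerated epsilon) is invented for the toy; a real attack would use backprop through a trained CNN:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy 2-class linear "classifier" (weights are made up for illustration)
W = np.array([[ 1.0, 0.0],   # logit for class 0
              [-1.0, 0.0]])  # logit for class 1

def predict(x):
    return int(np.argmax(W @ x))

x = np.array([1.0, 0.0])     # clean input, correctly classified as class 0
y = 0

# FGSM: step in the direction of the sign of the loss gradient.
# For cross-entropy on a linear model, dL/dx = W.T @ (softmax(Wx) - onehot(y)).
p = softmax(W @ x)
grad = W.T @ (p - np.eye(2)[y])

eps = 1.5                    # exaggerated step size so the toy model flips
x_adv = x + eps * np.sign(grad)

print(predict(x), predict(x_adv))  # 0 1  -- pixels barely structured, label flips
```

The point is that the attack only ever touches the input values and the final decision boundary; no intermediate "concept" is ever contradicted, because none is ever checked.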
What if we gave labels to the objects found in the earlier layers as well? For example, using Matt Zeiler's deconv idea (or any other top-down visualization technique), we could identify the features that fire for 'cat eyes' and 'cat nose' and give them labels. We could then separately maintain a label hierarchy saying that 'cat eyes' and 'cat nose' are strong indicators that the next layer actually contains a 'cat', rather than a 'toaster'.
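A rough sketch of what such auxiliary "part" supervision could look like, on a toy two-layer numpy network. All shapes, weights, part labels, and the 0.3 weighting are invented; a real version would attach the auxiliary head to an intermediate layer of a trained CNN (this is essentially deep supervision):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p, label):
    return -np.log(p[label])

rng = np.random.default_rng(0)

# Layer-1 features stand in for "parts" ('cat eyes', 'cat nose', ...).
W1     = rng.normal(size=(8, 16))   # input(8) -> part features(16)
W_part = rng.normal(size=(16, 4))   # auxiliary head: 4 hypothetical part labels
W2     = rng.normal(size=(16, 3))   # main head: 3 object labels

x = rng.normal(size=8)
part_label, object_label = 2, 1     # hypothetical annotations

h = np.maximum(0, x @ W1)           # intermediate "part" representation
aux_loss  = cross_entropy(softmax(h @ W_part), part_label)
main_loss = cross_entropy(softmax(h @ W2), object_label)

# The intermediate layer is now trained to be *nameable*,
# not just useful for the final prediction.
total_loss = main_loss + 0.3 * aux_loss
print(total_loss)
```

The design choice is that an attacker now has to fool the part classifier and the object classifier consistently, instead of just nudging one final softmax.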
In other words, I'm talking about moving from pixel-level manipulation to better (conceptual) knowledge representation.
Conceptual knowledge representation is being explored in other areas as well — for example, in RL for game play.
If the model conceptually understood the image, a pure pixel-level hack should not work: the attacker would have to consistently fake every level of the hierarchy at once, not just nudge the final decision.
Heck, we could even take this a step further and map the learned label hierarchy onto word2vec embeddings. That would give us synonyms and hypernyms, and a semantic sanity check that should make the system far more resistant to adversarial attacks.
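A minimal sketch of that consistency check, using made-up 3-d label vectors in place of real word2vec embeddings (in practice you would load pretrained vectors, e.g. via gensim, and pick the threshold empirically):

```python
import numpy as np

# Toy label embeddings -- invented for illustration; real ones would
# come from a pretrained word2vec/GloVe model.
emb = {
    "cat":     np.array([0.9, 0.1, 0.0]),
    "kitten":  np.array([0.8, 0.2, 0.1]),
    "toaster": np.array([0.0, 0.1, 0.9]),
}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def plausible(part_label, object_label, threshold=0.5):
    """Flag predictions whose final label is semantically far from
    what the lower layers claim to have detected."""
    return cos(emb[part_label], emb[object_label]) > threshold

print(plausible("cat", "kitten"))   # True  -- hierarchy is consistent
print(plausible("cat", "toaster"))  # False -- likely an adversarial flip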
Let me know what you guys think...thanks for your time!