Get classification labels from convolution filters

taiharry108 · May 24, 2019, 6:50pm

I’m wondering if this is possible:

Train a CNN as a standard classifier, e.g. classifying cats and dogs.
Since different filters can recognize different things, for example, some of them should be able to recognize eyes.
The activation of those layers should tell us where the eyes are. Then we should be able to extract the “coordinates” of the eyes and use them as the labels for a regression problem.
Then we should be able to train a CNN that can track where the eyes are in an image.

However, I do not know how to find those layers. Do you guys think it may work? Any suggestion?

edxz7 · May 26, 2019, 12:25am

Hi Harry

Indeed, It’s possible to visualize the feature maps inside a convolutional neural network as you said, you can google “visualizing feature maps in a convolutional neural network” and you will find a good start, but if you want to track a particular feature, like the eyes in an image to do some inference for futures updates of the same image (like in the frames of a video) you will face serious problems.

First, you don’t know where are those layers which have the “eyes feature”. Second, for the neural network, all the features are meaningless by themselves.,(you maybe will require another neural network to check for the feature maps of the first). And finally all the features captured in the feature maps are rescaled and they have been lost all its context with respect to the original image.

On the other hand, a neural network called capsule network was design precisely to achieve the kind of things you described. Search on the internet for capsule networks and you will find plenty of useful information. Indeed the idea behind capsule networks is very clever.

A warning note: the capsule network is relatively recent (its paper was publish in 2017), so they are in an immature stage and it’s hard to make them work as they should. I tried to implement the capsule network published in the original paper to deploy an app capable to predict handwritten digits in real time under hard conditions and they perform poorly compare to others CNNs. I described a little more about this result here . The code I wrote to implement the capsule networks is in pytorch but I adapted it to be compatible with fastai. For some weird reason, I can’t get good results training this network with fastai (usually it’s very easy improve the accuracy of any architecture with fastai), but maybe you can make it work.

good, luck

taiharry108 · June 1, 2019, 3:10am

Thank you so much for your detailed reply! I’m gonna look into capsule network then. Hopefully I can make something from it!