Could a convolutional neural network, in theory, learn to count between 1 and, say, 4 dogs in an image? My understanding is that CNNs are good at recognizing the features of a dog but not at counting how many there are.
I am interested in the same question, and I am planning to run some experiments soon to see whether NNs can learn to count objects (I will use synthetic images, though, not real ones). I also found this interesting article on the topic: https://arxiv.org/abs/1807.09856
If you just want a quick “prototype”, I think you can use the code from the object localization with bounding boxes lecture with an appropriate dataset, and add a step that counts the number of bounding boxes labeled “dog” in the output after inference. There is a predefined maximum number of bounding boxes, but that can be adjusted if needed.
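As a rough sketch of that counting step: this assumes the detector's output has already been decoded into (label, score, box) records. The `Detection` type, the `count_label` helper, the sample values, and the 0.5 score threshold are all made up for illustration and are not taken from the lecture code.

```python
from typing import Iterable, NamedTuple, Tuple


class Detection(NamedTuple):
    """One decoded bounding box (hypothetical format, not the lecture's)."""
    label: str
    score: float                            # detector confidence in [0, 1]
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def count_label(detections: Iterable[Detection],
                label: str = "dog",
                min_score: float = 0.5) -> int:
    """Count boxes with the given label whose score reaches min_score."""
    return sum(1 for d in detections
               if d.label == label and d.score >= min_score)


# Made-up detections standing in for the model output after inference.
detections = [
    Detection("dog", 0.91, (10.0, 20.0, 110.0, 180.0)),
    Detection("dog", 0.78, (150.0, 30.0, 260.0, 200.0)),
    Detection("cat", 0.88, (300.0, 40.0, 380.0, 160.0)),
    Detection("dog", 0.22, (5.0, 5.0, 40.0, 40.0)),  # below the threshold
]

print(count_label(detections))
```

Note that the count is only as good as the detector: overlapping boxes on the same dog would be counted twice unless something like non-maximum suppression has been applied first.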
There are far more professional (and more complicated) solutions / architectures for this, though; maybe check out some of these:
Thanks, yes, I see there are methods for doing this. What I was really trying to understand is whether you can do this with a regular CNN architecture, and to get some intuition for why you can or can't.