Heatmaps for counting (Crowd counting)

Hi everyone,
I would like to perform a counting task (counting all the grape berries in an image).

From the dot annotations I generated a heatmap (by convolving the dot annotations with a Gaussian kernel and normalizing).
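For readers following along, here is a minimal sketch of that step. It is not my exact code: the function name `density_map`, the `sigma` value, and the choice to normalize each blob so it sums to 1 (so the whole map sums to the berry count) are assumptions made for illustration.

```python
import numpy as np

def density_map(points, height, width, sigma=4.0):
    """Build a density map from integer (row, col) dot annotations.

    Each dot becomes a Gaussian blob normalized to sum to 1, so the
    sum of the whole map equals the number of dots (an assumed
    normalization; other schemes exist).
    """
    dmap = np.zeros((height, width), dtype=np.float64)
    radius = int(3 * sigma + 0.5)  # truncate the kernel at about 3 sigma
    for row, col in points:
        # Clip the blob window to the image borders.
        r0, r1 = max(row - radius, 0), min(row + radius + 1, height)
        c0, c1 = max(col - radius, 0), min(col + radius + 1, width)
        yy = np.arange(r0, r1)[:, None] - row
        xx = np.arange(c0, c1)[None, :] - col
        blob = np.exp(-(yy ** 2 + xx ** 2) / (2 * sigma ** 2))
        # Renormalize the clipped blob so border dots still count as 1.
        dmap[r0:r1, c0:c1] += blob / blob.sum()
    return dmap

# Hypothetical annotations for a 64x64 image.
dots = [(10, 10), (0, 0), (30, 50)]
dmap = density_map(dots, 64, 64)
print(dmap.shape, dmap.sum())
```

With this normalization the predicted count for an image is simply the sum of the predicted map.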

The problem is I don’t know which DataBlock I should employ and how I should pass my labels.
The segmentation block expects a mask with discrete values (the categories), while my targets are images where each pixel has a float value (the density estimated at that pixel).

I hope I explained my problem clearly; please tell me if I should elaborate.
Thanks to anyone who will help, I really appreciate it =D


What does the density value represent? Can you bin up the values to change the floats into discrete categories?

Do you also have a number, the count of grapes, per target image in your training set?

Also, check out this post: Counting segmented objects. It suggests using cv2 once your segmentation is done, which seems like the way to go to me.



Hi, thanks a lot for your feedback.
Sadly there is no way to discretize the density values.
The density values are just computed by applying a Gaussian kernel to the dot annotations (a blurring filter).

I am posting here an article where the author talks about density maps for crowd counting. It has a nice explanation of how they are obtained and some cool images to grasp the intuition:

I do have the count of grapes per target image since I can just count all the dot annotations per image.

Concerning the “Counting segmented objects” topic, thanks a lot, I will go through it right now.

I would still like to give the density approach a try, so any help on that side would be hugely appreciated.

I took a look at the article and I think you should be able to use:

DataBlock(blocks=(ImageBlock, MaskBlock), …)

To make that work, I think you need to change the density map to a grayscale image (one that has only 1 channel, for example a shape of (1, 256, 256), versus a color RGB image that has a shape of (3, 256, 256)) and feed that into the MaskBlock.

Then create a dataloader and feed it through the unet_learner method.
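As a rough sketch of what "change the density map to a grayscale image" could look like, assuming the density map is a 2-D float NumPy array: the helper names `density_to_gray` and `gray_to_density` are made up for this example, and scaling by the per-image peak is just one possible choice.

```python
import numpy as np

def density_to_gray(dmap):
    """Scale a float density map to 0..255 and add a channel axis.

    Returns the (1, H, W) uint8 image and the scale factor needed to
    map gray levels back to density values.
    """
    peak = float(dmap.max())
    if peak == 0.0:
        return np.zeros((1,) + dmap.shape, dtype=np.uint8), 1.0
    scale = 255.0 / peak
    gray = np.round(dmap * scale).astype(np.uint8)
    return gray[None, ...], scale

def gray_to_density(gray, scale):
    """Undo density_to_gray, up to rounding error."""
    return gray[0].astype(np.float64) / scale

# Toy stand-in for a real density map.
rng = np.random.default_rng(0)
toy = rng.random((256, 256)) * 0.01
gray, scale = density_to_gray(toy)
restored = gray_to_density(gray, scale)
print(gray.shape, gray.dtype)
print("count before:", toy.sum(), "after round trip:", restored.sum())
```

Note that rounding to 256 gray levels loses some precision, so the count recovered from the grayscale image will differ slightly from the original one.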

If you have already tried that, what error message are you receiving?


Thanks again for your support.
I actually didn’t realize it was that easy, I feel pretty dumb right now =)
I was stuck with providing the float values.
I will try it next week, thanks a lot.


Hi again :)
I was rethinking my problem: if I do as you suggested, isn’t fastai going to treat it as a normal segmentation task (so trying to classify each pixel)?
That wouldn’t be optimal because it would lose the ordinal information (in a heatmap, class 12 is more similar to 13 than it is to 254, but the net wouldn’t know that because it treats gray levels as classes… am I wrong?)
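To make the concern concrete, here is a small NumPy sketch. It assumes the segmentation setup uses a standard cross-entropy loss over 256 unordered classes, and compares it with a squared-error (regression) loss. The helpers `make_probs` and `cross_entropy` are hypothetical names used only for this illustration.

```python
import numpy as np

def make_probs(predicted, n_classes=256, confidence=0.9):
    """Toy softmax output: `confidence` on one class, the rest spread evenly."""
    probs = np.full(n_classes, (1.0 - confidence) / (n_classes - 1))
    probs[predicted] = confidence
    return probs

def cross_entropy(probs, target):
    """Cross-entropy for a single pixel with an integer class target."""
    return -np.log(probs[target])

target = 12  # true gray level of the pixel

# Classification view: a near miss (13) and a far miss (254) cost the same.
ce_near = cross_entropy(make_probs(13), target)
ce_far = cross_entropy(make_probs(254), target)

# Regression view: the far miss is penalized much more.
mse_near = (13 - target) ** 2
mse_far = (254 - target) ** 2

print("cross-entropy near/far:", ce_near, ce_far)
print("squared error near/far:", mse_near, mse_far)
```

Under these assumptions the cross-entropy is identical for both mistakes, while the squared error grows with the distance from the true value, which is the ordinal information a per-pixel regression loss would keep.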


Yes, it is trying to classify each pixel. My understanding is that the numbers in each output channel are the probabilities that a pixel belongs to that category. There is one channel generated per category (for 1 category the output is (1, 256, 256), for 2 categories it is (2, 256, 256), and so on). If you have a grayscale output (one with a single channel), then the output is the probability of that pixel being the thing you are looking for. You are really only looking for the points to get a count, and given that, you might consider using the PointBlock instead of the MaskBlock.

The output, from what I have seen, should have the highest probabilities nearest the actual point, descending as you get further out. That sounds like a heat map to me. I have been thinking about the article and I’m not sure I understand the reasoning behind generating the heat map in the first place when you have the points data already.

Sorry about the delay in answering, I am working on this in my spare snippets of time.
I have tried the PointBlock, but it expects a fixed number of points, while the whole point is to count them… am I missing something?