Hi @alonso - in my experiment I found that a convnet handles position sensitivity quite well, somewhat to my surprise. My positive training examples have spikes (and some dips) toward the right edge of the image.
What I did have to do was be careful with image augmentation, which is to say I didn't do any at all. Since the classification task requires location sensitivity, I needed to avoid all translations, cropping/padding, flipping, etc.
This worked out for my scenario - YMMV.
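
For anyone wondering why those augmentations matter here, below is a toy sketch using NumPy. The 4x8 array and the `spike_column` helper are made up for illustration and are not my actual data or pipeline. It just shows that a flip or a shift moves a right-edge spike somewhere else, so the label no longer matches the location.

```python
import numpy as np

# Hypothetical 4x8 grayscale "image": flat background, spike in the last column.
image = np.zeros((4, 8), dtype=np.float32)
image[:, -1] = 1.0

def spike_column(img):
    """Return the index of the column with the largest total intensity."""
    return int(np.argmax(img.sum(axis=0)))

# Two common augmentations that change where the spike sits.
flipped = np.fliplr(image)            # horizontal flip
shifted = np.roll(image, -3, axis=1)  # shift 3 pixels left (wraps around)

print("original:", spike_column(image))
print("flipped: ", spike_column(flipped))
print("shifted: ", spike_column(shifted))
```

In this toy case the spike starts in the last column, lands in the first column after the flip, and moves three columns left after the shift. Augmentations that leave positions alone (small noise or brightness changes, for example) might still be safe, but I did not test that.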