How to interpret inference results (Lesson 3 Head Pose)

Hi! I’m unsure about how to interpret the inference results when using learn.predict(img) for ImagePoint problems such as the Head Pose problem of Lesson 3. My goal is to collect the predicted coordinates and use them for further processing later. What confuses me is the direct output of learn.predict(img). It doesn’t seem to make any sense to me.

When printed out, learn.predict(img) outputs a tuple where the first item is the original image shape (120x160 pixels), while the 2nd and 3rd items seem to be equal:

(ImagePoints (120, 160),
tensor([[-1.2336e-03, -2.0357e+00]]),
tensor([-1.2336e-03, -2.0357e+00]))

The predicted coordinates don’t seem to be in the output, and yet the correct point is displayed with img.show(y=learn.predict(img)[0]), just like in the Regression example at https://docs.fast.ai/tutorial.inference.html. This is what I don’t get: even though img.show() is only given what looks like the image shape, y=ImagePoints (120, 160), it still shows the predicted coordinate correctly on the image. What am I missing here? How can I find and collect the predicted ImagePoints or coordinates when using learn.predict(img)? Any hints to put me on the right track are highly appreciated!

Thanks a lot for the help, folks!

After going through a bunch of different predictions, it looks like the coordinates are actually in the 2nd and 3rd items of the output. The tensor([[-0.5004, -0.1986]]) seems to mean:
y = -50.04% from the center of the image
x = -19.86% from the center of the image
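
In other words, something like this should recover the pixel coordinates (a rough sketch based on my reading above, assuming the values are scaled to [-1, 1] with -1/1 at the image edges and 0 at the center, in (y, x) order):

h, w = 120, 160  # image size from the ImagePoints above
y_scaled, x_scaled = -0.5004, -0.1986  # example prediction
row = (y_scaled + 1) / 2 * h  # ≈ 30.0 pixels from the top
col = (x_scaled + 1) / 2 * w  # ≈ 64.1 pixels from the left
print(row, col)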

What I still don’t understand is how the img.show() method can display the coordinates correctly when it is only given the image shape and not the coordinates.

My guess would be that you can mathematically convert those percentages into coordinates based on the image size.

img.show(y=learn.predict(img)[0]) is never given the percentages (items [1] and [2]), only the size (item [0]). That’s what I’m baffled about. The percentages are the 2nd and 3rd items of the predict method’s output, but the show method is only given the 1st item. Or maybe I’m just missing something.

https://docs.fast.ai/vision.image.html#ImagePoints

See ImagePoints in the docs. It looks like the object already carries the points (along with the image size) inside it, and show() overlays those points onto the image, producing the combined result.


The coordinate data is stored in the ImagePoints object. You can see it by printing out the .data attribute; that contains the coordinates for the red dot which is passed to img.show. In other words,

learn.predict(img)[0].data

gives the predicted coordinates.
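
For collecting them for further processing, something along these lines should work (just a sketch; it assumes the [-1, 1] scaling and the (y, x) ordering discussed above, which I haven’t double-checked against the source):

pred = learn.predict(img)  # (ImagePoints, scaled tensor, scaled tensor)
scaled = pred[0].data[0]  # tensor([y, x]), presumably in the [-1, 1] range
h, w = img.size  # should be (height, width) for a fastai Image
row = (scaled[0].item() + 1) / 2 * h
col = (scaled[1].item() + 1) / 2 * w
print(row, col)  # predicted point in pixel coordinates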


How do you save the image with the point on it?

imgP.show(y=learn.predict(imgP)[0])
This just shows the image with the predicted point, but I need to save the image.

Any help?

Since Fast.ai’s img.show(y=point) only draws the dot at display time and does not actually modify the image that was opened, you’ll probably need to draw it separately with PIL or OpenCV and then save the image. Or you could directly set the pixel value at the predicted coordinates to [255, 0, 0].
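
For example, with PIL it could look roughly like this (just a sketch; the file names are placeholders, and the conversion from [-1, 1] to pixels follows the interpretation earlier in the thread):

from PIL import Image, ImageDraw

im = Image.open('img_8.png').convert('RGB')  # placeholder path
w, h = im.size  # PIL gives (width, height)
y_scaled, x_scaled = -0.5004, -0.1986  # example prediction from above
row = (y_scaled + 1) / 2 * h
col = (x_scaled + 1) / 2 * w
draw = ImageDraw.Draw(im)
r = 2  # dot radius in pixels
draw.ellipse([col - r, row - r, col + r, row + r], fill=(255, 0, 0))  # note PIL uses (x, y) order
im.save('img_8_with_point.png')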

Found a way to do that. Not sure if this is the best way:
imgP = open_image('/content/gdrive/My Drive/XSble/renamed/img_8.png')
predPoint = learn2.predict(imgP)
finalImg = show_image(imgP)  # show_image returns a matplotlib Axes displaying the original image
predPoint[0].show(finalImg)  # overlay the predicted ImagePoints on that Axes
finalImg.figure.savefig('plot2.png')  # save the figure, point included, to disk