Hi! I’m unsure about how to interpret the inference results when using learn.predict(img) for ImagePoint problems such as the Head Pose problem of Lesson 3. My goal is to collect the predicted coordinates and use them for further processing later. What confuses me is the direct output of learn.predict(img). It doesn’t seem to make any sense to me.
When printed out, learn.predict(img) outputs a tuple where the first item prints as the original image shape (120x160 pixels), while the 2nd and 3rd items appear to be identical tensors:
The predicted coordinates don’t seem to be in the output, but still the correct ImagePoint is displayed with img.show(y=learn.predict(img)[0]), just like in the Regression example of https://docs.fast.ai/tutorial.inference.html. This is what I don’t get. Even though img.show() is given the original image shape as y=ImagePoints (120, 160), it still shows the predicted coordinate correctly on the image. What am I missing here? How can I find and collect the predicted ImagePoint or coordinates when using learn.predict(img)? Any hints to put me on the right track are highly appreciated!
After going through a bunch of different predictions, it looks like the coordinates are actually in the 2nd and 3rd items of the output. The tensor([[-0.5004, -0.1986]]) seems to mean:
y = -50.04% from the center of the image
x = -19.86% from the center of the image
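If that reading is right, converting back to pixel coordinates is just rescaling. Here is a small pure-Python sketch (no fastai needed); the [-1, 1] scaling and the (y, x) ordering are assumptions based on the head-pose example above, and `scaled_to_pixels` is my own helper name, not a fastai function:

```python
def scaled_to_pixels(point, height, width):
    """point is a (y, x) pair scaled to [-1, 1]; returns (row, col) in pixels.
    Assumes -1 maps to the top/left edge and +1 to the bottom/right edge,
    so 0 is the image center."""
    y, x = point
    row = (y + 1) / 2 * height
    col = (x + 1) / 2 * width
    return row, col

# For the tensor([[-0.5004, -0.1986]]) prediction on a 120x160 image:
row, col = scaled_to_pixels((-0.5004, -0.1986), 120, 160)
print(round(row, 2), round(col, 2))  # ~30 rows down, ~64 columns across
```

That lands the point a bit above and left of center, which matches the "-50% / -20% from the center" reading.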
What I still don’t understand is how can the img.show() method display the coordinates correctly when it is only given the image shape and not the coordinates.
img.show(y=learn.predict(img)[0]) is never given the percentages (items [1] and [2]), only item [0], which prints as the size. That’s what I’m baffled about. The percentages are the 2nd and 3rd items of the predict method’s output, but the show method is only given the 1st item. Or maybe I’m just missing something.
See ImagePoints in the docs. It looks like that first item is an ImagePoints object that already carries the predicted coordinates (it just prints its target size), and show() overlays those points onto the image, producing the combined display.
Since fastai’s img.show(y=point) only draws the dot at display time and does not actually modify the underlying image, you’ll probably need to draw it separately with PIL or OpenCV and then save the image. Or you could directly set the pixel at the predicted coordinates to [255, 0, 0].
Found a way to do that. Not sure if this is the best way:
```python
imgP = open_image('/content/gdrive/My Drive/XSble/renamed/img_8.png')
predPoint = learn2.predict(imgP)
finalImg = show_image(imgP)           # matplotlib axes showing the original image
predPoint[0].show(finalImg)           # overlay the predicted ImagePoints on those axes
finalImg.figure.savefig('plot2.png')  # save the combined figure
```