Trying to adapt the CamVid example on page 42 of the book for my purposes.
Can’t figure out how to use the output of predict properly. I (think I) understand that it returns a 3-tuple, with the last element being a tensor reporting the confidence in each category for each pixel. I’ll call this tensor confid.
But how do I convert this to a mask? I figure that I can assign each position of confid a color, and then use it to parse the image manually.
So, I ask, isn’t there a more elegant way to do this?
The tuple isn’t color channels for the mask; it holds 3 different versions of your mask at various stages of “raw”ness. The documentation explains what each entry in the tuple is. The “rawest” prediction is the one with non-integer values. If your loss function has decoding built in, the usable version of the mask will be one of the other two, depending on how your dataloader transforms your data.
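If you do want to go from the raw confidences to a mask yourself, the usual trick is an argmax over the class axis rather than parsing the image by hand. Here is a minimal sketch in NumPy, assuming confid has shape (n_classes, height, width); check the shape of your own tensor first. The helper names, the toy values, and the palette are made up for illustration and are not part of fastai.

```python
import numpy as np

def confidences_to_mask(confid):
    """Collapse a (n_classes, H, W) score array into an (H, W) array of class indices."""
    # Assumption: the class axis comes first. Adjust `axis` if your tensor differs.
    return np.argmax(confid, axis=0)

def colorize(mask, palette):
    """Look up an RGB color for every class index in an (H, W) mask."""
    # Integer-array indexing: result has shape (H, W, 3).
    return palette[mask]

# Toy stand-in for `confid`: 3 classes over a 2x2 image.
confid = np.array([
    [[0.7, 0.1], [0.2, 0.3]],  # class 0 scores
    [[0.2, 0.8], [0.1, 0.3]],  # class 1 scores
    [[0.1, 0.1], [0.7, 0.4]],  # class 2 scores
])

# Hypothetical palette: one RGB row per class.
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0]], dtype=np.uint8)

mask = confidences_to_mask(confid)  # class index per pixel
rgb = colorize(mask, palette)       # color image you can display or save
```

If confid is a PyTorch tensor, `confid.argmax(dim=0)` does the same thing as the NumPy call. That said, one of the other two tuple entries may already be this mask, so it is worth inspecting those before rolling your own.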