Predicting coordinate positions for a UV map

I’m attempting to predict a UV map from a surface using the unet learner, trained on a dataset of surfaces and their corresponding UV maps. However, the results are less than desirable and I don’t have many ideas for how to improve them. I’m hoping to get some thoughts from the community.

With an MSE loss (prediction left, target right), the model produces something that resembles a UV map: it ‘segments’ the regions well and fills them with plausible values. But if I highlight lines of constant x and y, you can see that it fills them in erratically. Some values are missing or repeated, and the result isn’t smooth. This is probably to be expected, since the loss function doesn’t penalise either of these behaviours.
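
One possible way to make the loss sensitive to this is to add a term that compares the local differences of the prediction with those of the target. The snippet below is only a sketch of that idea, not something tested on this dataset: the function name `uv_loss`, the `grad_weight` parameter and its default value are hypothetical, and it assumes the prediction and target are float tensors of shape (batch, 2, height, width).

```python
import torch
import torch.nn.functional as F


def uv_loss(pred, target, grad_weight=1.0):
    """MSE plus an L1 penalty on mismatched finite differences.

    pred, target: float tensors of shape (batch, 2, height, width).
    grad_weight is a hypothetical weighting, not a tuned value.
    """
    mse = F.mse_loss(pred, target)

    # Finite differences between horizontally adjacent pixels.
    dx_pred = pred[..., :, 1:] - pred[..., :, :-1]
    dx_tgt = target[..., :, 1:] - target[..., :, :-1]

    # Finite differences between vertically adjacent pixels.
    dy_pred = pred[..., 1:, :] - pred[..., :-1, :]
    dy_tgt = target[..., 1:, :] - target[..., :-1, :]

    # Penalise local slopes that differ from the target's slopes,
    # which plain MSE does not look at directly.
    grad_term = F.l1_loss(dx_pred, dx_tgt) + F.l1_loss(dy_pred, dy_tgt)
    return mse + grad_weight * grad_term
```

Because it takes `(pred, target)` and returns a scalar, a function like this could be passed as the `loss_func` of the learner. Comparing against the target's differences, rather than penalising all variation, is meant to keep the genuine jumps at the edges of UV regions.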

Of course, these unets are able to learn positions very well, as segmentation tasks show, but the network seems to struggle with translating those positions into x,y coordinate values.
To demonstrate this, I tried a different test using UV maps that contain only lines of constant x and y.

As you can see, the positions of the lines are almost perfect, but the values at those positions are again badly estimated.

How could I improve my results?
Is there a loss function better suited to the task?
I read the CoordConv paper https://arxiv.org/pdf/1807.03247.pdf, which suggests that normal convolutions are bad at predicting coordinate positions. Has anyone experimented with CoordConv, or does anyone have a different approach?
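
For anyone unfamiliar with the paper, the core idea is to concatenate two extra channels holding each pixel's x and y position before a convolution. Below is a minimal PyTorch sketch of such a layer. The class names `AddCoords` and `CoordConv2d` are written here for illustration and are not taken from fastai or PyTorch, and scaling the coordinates to [-1, 1] is an assumption of this sketch.

```python
import torch
import torch.nn as nn


class AddCoords(nn.Module):
    """Append x and y coordinate channels to a (B, C, H, W) tensor."""

    def forward(self, x):
        b, _, h, w = x.shape
        # Coordinates scaled to [-1, 1] along each axis.
        ys = torch.linspace(-1.0, 1.0, h, device=x.device, dtype=x.dtype)
        xs = torch.linspace(-1.0, 1.0, w, device=x.device, dtype=x.dtype)
        # Broadcast to one channel each: xx varies along width, yy along height.
        xx = xs.view(1, 1, 1, w).expand(b, 1, h, w)
        yy = ys.view(1, 1, h, 1).expand(b, 1, h, w)
        return torch.cat([x, xx, yy], dim=1)


class CoordConv2d(nn.Module):
    """A Conv2d that sees two extra coordinate channels on its input."""

    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super().__init__()
        self.add_coords = AddCoords()
        # +2 input channels for the appended x and y coordinates.
        self.conv = nn.Conv2d(in_channels + 2, out_channels, kernel_size, **kwargs)

    def forward(self, x):
        return self.conv(self.add_coords(x))


# Usage: same call pattern as nn.Conv2d.
layer = CoordConv2d(3, 8, kernel_size=3, padding=1)
out = layer(torch.zeros(2, 3, 16, 16))  # shape (2, 8, 16, 16)
```

Where to place such layers in a unet (first layer only, or also in the decoder) is an open design choice; this sketch does not answer it.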