How to load polygon mask data from .txt file for Image segmentation using UNET

Hi All,
I am practicing Lesson 3, image segmentation, with satellite/aerial images to segment buildings. However, in my dataset the images are PNGs and the corresponding annotations/polygon masks are in a .txt file.
I am trying to figure out how to load this data. Can someone point me in the right direction?
Sample image:


Corresponding label.txt for this image:
[[576, 1279], [561, 1279], [557, 1279], [531, 1273], [535, 1266], [518, 1257], [514, 1265], [487, 1251], [491, 1242], [459, 1226], [494, 1159], [506, 1165], [508, 1162], [535, 1176], [533, 1179], [555, 1190], [550, 1201], [578, 1216], [575, 1221], [602, 1235], [610, 1221], [631, 1233], [632, 1230], [659, 1244], [658, 1246], [670, 1253], [633, 1279]]
[[209, 1090], [157, 1029], [168, 1020], [167, 1019], [190, 999], [192, 1002], [211, 985], [221, 997], [244, 978], [241, 974], [266, 952], [259, 943], [276, 929], [274, 926], [297, 907], [300, 910], [311, 900], [360, 958], [339, 976], [334, 970], [324, 979], [328, 985], [316, 996], [311, 990], [291, 1007], [296, 1013], [273, 1033], [271, 1031], [258, 1041], [264, 1047], [248, 1060], [243, 1054], [218, 1075], [222, 1080], [209, 1090]]
[[487, 1122], [412, 1106], [421, 1064], [413, 1063], [418, 1038], [425, 1039], [429, 1023], [420, 1022], [426, 993], [436, 995], [440, 977], [431, 975], [435, 958], [443, 960], [450, 928], [442, 926], [446, 909], [524, 925], [521, 939], [523, 939], [517, 969], [514, 969], [510, 992], [495, 989], [488, 1020], [495, 1021], [489, 1052], [501, 1054], [496, 1078], [499, 1078], [493, 1108], [490, 1107], [487, 1122]]
[[885, 1037], [875, 965], [906, 961], [904, 947], [956, 940],

You need some code that rasterises the coordinates - that is, turns the polygons into a numpy array. If no code was provided with the data, there are plenty of algorithms on Stack Overflow etc.
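
For instance, here's a minimal sketch of that idea using PIL's ImageDraw. The one-polygon-per-line layout of label.txt, the [x, y] point order, and the function name are assumptions on my part, so check them against your data:

```python
import ast
import numpy as np
from PIL import Image, ImageDraw

def masks_from_txt(label_path, img_size):
    """Rasterise the polygons in a label .txt into a binary numpy mask.

    Assumes one polygon per line, each a Python-style list of [x, y]
    points, and img_size given as (width, height).
    """
    mask = Image.new('L', img_size, 0)           # single channel, background = 0
    draw = ImageDraw.Draw(mask)
    with open(label_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            points = ast.literal_eval(line)      # e.g. [[576, 1279], [561, 1279], ...]
            draw.polygon([tuple(p) for p in points], fill=1)  # building = 1
    return np.array(mask, dtype=np.uint8)
```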

You then have the choice of (1) saving the mask arrays as PNGs, which fastai can read natively, or (2) writing a function that turns polygon coordinates into arrays on the fly as images are read during training. Look at open_mask_rle in the fastai source for a similar example. I prefer (1) as I hate re-computing anything, but sometimes file-saving constraints mean (2) is better, or you may just find it more elegant. Good luck!
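
A minimal sketch of option (1), assuming the masks_from_txt helper above and a hypothetical images/ and labels/ directory layout where each foo.png has a matching foo.txt:

```python
from pathlib import Path
from PIL import Image

img_dir, lbl_dir, mask_dir = Path('images'), Path('labels'), Path('masks')
mask_dir.mkdir(exist_ok=True)

for img_path in img_dir.glob('*.png'):
    w, h = Image.open(img_path).size             # mask must match the image size
    mask = masks_from_txt(lbl_dir / f'{img_path.stem}.txt', (w, h))
    Image.fromarray(mask).save(mask_dir / f'{img_path.stem}_mask.png')

# fastai can then read the saved masks natively, e.g. via
# get_y_fn = lambda x: mask_dir / f'{x.stem}_mask.png'
```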


Convert your points into an SVG and then use that to “rasterize” the mask.

Things to take care of:

  • afaik you have two classes (1 and 0), so you have to find a way to prevent or filter out anti-aliasing along your edges (i.e. threshold at 0/1), otherwise you end up with some unwanted labels; see the sketch after this list.

  • correctly rescale your SVG so it is the same scale/size as your reference image (input). Then try displaying your new mask to check that it looks right.
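
If you go the SVG route, here's a rough sketch assuming cairosvg is installed and the same one-polygon-per-line label format as above; the final threshold removes the anti-aliased edge values, and rendering at the input image's exact size keeps the scales matched:

```python
import ast
import io
import cairosvg
import numpy as np
from PIL import Image

def svg_mask_from_txt(label_path, width, height):
    # Build one <polygon> element per line of the label file
    polys = []
    with open(label_path) as f:
        for line in f:
            if line.strip():
                pts = ast.literal_eval(line)
                pts_str = ' '.join(f'{x},{y}' for x, y in pts)
                polys.append(f'<polygon points="{pts_str}" fill="white"/>')
    svg = (f'<svg xmlns="http://www.w3.org/2000/svg" '
           f'width="{width}" height="{height}">{"".join(polys)}</svg>')
    # Rasterise at exactly the input image's size so the scales match
    png = cairosvg.svg2png(bytestring=svg.encode(),
                           output_width=width, output_height=height)
    gray = np.array(Image.open(io.BytesIO(png)).convert('L'))
    return (gray > 127).astype(np.uint8)  # threshold away anti-aliased edges
```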
