I have been working on a mapping project in which it would be useful to train an image classifier on imagery with more than 3 bands/channels. I’m currently working with imagery that has 2 classes and 6 bands. I think I mostly have it sorted, but I can’t work out how to tell the model to output a prediction for each class (2 predictions per image) instead of a single prediction per image.
This is how I’m loading in the data:

    import os

    import rasterio
    import torch
    from fastai.vision.all import *

    # open an image and convert it to a tensor
    def open_img(path):
        ms_img = rasterio.open(path).read().astype('float32') / 255.0
        im = torch.from_numpy(ms_img)
        return im

    # get the image label from the folder name
    def get_label(path):
        label = os.path.basename(os.path.dirname(path))
        return label

    db = DataBlock(
        blocks=(TransformBlock(open_img), CategoryBlock),
        get_items=get_image_files,
        get_y=get_label,
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
    )
    ds = db.datasets(source=path)
    dl = db.dataloaders(source=path, bs=4)

    # one_batch() returns an (inputs, targets) tuple, so unpack it before printing
    xb, yb = dl.one_batch()
    print(xb.shape, yb)
    # torch.Size([4, 6, 1000, 1000]) TensorCategory([0, 0, 1, 0], device='cuda:0')
Then I’m setting up the learner like this:

    def print_input(predictions, targets):
        print(predictions)
        print(targets)

    learn = cnn_learner(dl, resnet18, n_in=6, n_out=1, metrics=error_rate,
                        loss_func=print_input).to_fp16()
    # tensor([[-1.1084],
    #         [-2.2383],
    #         [ 1.8320],
    #         [ 2.2969]], device='cuda:0', grad_fn=<CopyBackwards>)
    # TensorCategory([1, 0, 0, 0], device='cuda:0')
So I think my problem is that the output above is only giving me one prediction for each input image, and what I want is two predictions for each image, one for each class.
Also, I believe I should be using ‘CrossEntropyLossFlat’ as the loss function, but I needed a way to see what the model was outputting, which is why I added ‘print_input’ as the loss function (I get that this is a bit odd, but I’m getting desperate).
I believe what I’m after is for the model to output a prediction for each class, like this:

    tensor([[-1.1084, -1.1084],
            [-2.2383, -2.2383],
            [ 1.8320,  1.8320],
            [ 2.2969,  2.2969]], device='cuda:0', grad_fn=<CopyBackwards>)
    TensorCategory([1, 0, 0, 0], device='cuda:0')
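To make the shape I’m after concrete, here is a minimal pure-Python sketch of how I understand a cross-entropy loss to consume one logit per class per image. It does not use fastai or PyTorch; the `cross_entropy` helper and the logit values are made up for illustration, and only the `[4, 2]` shape and the integer targets mirror the batch above. My guess is that passing `n_out=1` to the learner is what limits the head to a single output, though I have not confirmed that.

```python
import math

def cross_entropy(logits, targets):
    """Mean cross-entropy over a batch of per-class logits and integer labels.

    Hypothetical helper for illustration; not the fastai/PyTorch implementation.
    """
    losses = []
    for row, target in zip(logits, targets):
        # log-sum-exp, subtracting the row max for numerical stability
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        # negative log-probability of the true class
        losses.append(log_sum_exp - row[target])
    return sum(losses) / len(losses)

# Made-up logits with the desired shape: 4 images x 2 classes
logits = [[0.3, -1.1], [1.2, -2.2], [-0.5, 1.8], [2.3, -0.4]]
targets = [1, 0, 0, 0]

loss = cross_entropy(logits, targets)
# predicted class = index of the largest logit in each row
predicted = [row.index(max(row)) for row in logits]

print(len(logits), len(logits[0]))
print(predicted)
print(loss)
```

With a single output per image there is no per-class row to index with the target, which is why I think the `[4, 1]` tensor above cannot be used with a class-index target as it stands.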
If anyone could let me know what I’m doing wrong here it would be greatly appreciated.