I’m working on a project to classify high-resolution .png images using Fastai, but my images are stored in 16-bit format. I’m having trouble finding a clear way to use them with Fastai, as it typically expects 8-bit images.
I’ve tried normalizing the pixel values to the 0-255 range before loading, but this seems to affect the accuracy of my model. I’ve also considered using a custom data loader, but I’m not sure how to implement it correctly.
Does anyone have experience using 16-bit images with Fastai? Are there specific libraries or techniques that I could try?
There are more sophisticated methods, but this one is simple, and fastai is built on PyTorch, so you can always drop down to PyTorch and work in float32:
import torch
import numpy as np
from PIL import Image
# Load a 16-bit PNG image
image_path = 'path_to_your_image.png'
image = Image.open(image_path)
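# Convert the image to a numpy array in float32 format
# uint16 -> float32 is lossless, so the full 16-bit range of the original image is preserved.
# Caveat: Pillow keeps 16 bits for grayscale PNGs (mode "I;16"), but 16-bit RGB PNGs are
# typically reduced to 8 bits per channel on load, so check image.mode before relying on this.
image_array = np.array(image, dtype=np.float32)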
# Optional normalization methods (uncomment the desired method)
# Method 1: Simple scaling to [0, 1] range
# This method scales pixel values to the [0, 1] range, which is a common preprocessing step.
# normalized_image = image_array / 65535.0
# Method 2: Standardization (zero-mean, unit-variance)
# This method transforms the image to have a mean of 0 and a standard deviation of 1, which can help with the training stability of some models.
# mean = np.mean(image_array)
# std = np.std(image_array)
# normalized_image = (image_array - mean) / std
# Method 3: Custom scaling based on known min/max values
# If you know the specific range of your image data, you can use this method to scale pixel values to [0, 1] or another range.
# min_val = 0 # Adjust based on your data
# max_val = 65535 # Adjust based on your data
# normalized_image = (image_array - min_val) / (max_val - min_val)
# Convert the numpy array to a PyTorch tensor
# Regardless of the normalization method used, the conversion to a tensor is straightforward and lossless.
tensor_image = torch.tensor(image_array, dtype=torch.float32)
# If normalization was applied, you would convert the normalized image instead
# tensor_image = torch.tensor(normalized_image, dtype=torch.float32)
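The tensor above has shape (H, W), while most image models expect channels first, usually three of them. Here is a minimal sketch of a loader that wraps the steps above into one function. The names `array_to_tensor` and `load_16bit_png` are made up for illustration, and it assumes a single-channel (grayscale) 16-bit PNG whose full range is 0 to 65535:

```python
import numpy as np
import torch
from PIL import Image


def array_to_tensor(image_array, max_val=65535.0, n_channels=3):
    """Scale a 2-D 16-bit array to [0, 1] and return a (n_channels, H, W) float32 tensor."""
    if image_array.ndim != 2:
        raise ValueError(f"expected a 2-D grayscale array, got shape {image_array.shape}")
    # Go straight from uint16 to float32, with no 8-bit intermediate step.
    scaled = image_array.astype(np.float32) / max_val
    tensor = torch.from_numpy(scaled)  # shape (H, W)
    # Add a channel axis, then repeat it so the shape matches 3-channel pretrained models.
    return tensor.unsqueeze(0).repeat(n_channels, 1, 1)


def load_16bit_png(path_or_file):
    """Open a 16-bit grayscale PNG (path or file-like object) and convert it to a tensor."""
    with Image.open(path_or_file) as img:
        image_array = np.array(img)
    return array_to_tensor(image_array)
```

To use this inside fastai, one option (as far as I know, so please check the docs) is to wrap the result in `TensorImage` and pass the function as `type_tfms` to a `TransformBlock` in your `DataBlock`. Also note that ImageNet normalization stats assume 8-bit photos, so it may be worth computing the mean and std on your own data instead.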
I hope this helps a little with accuracy, though I'm not an expert.