Hello everyone! I would appreciate your help or any tips in my battle with the following dataset.
I am working with the MedMNIST dataset, for which the authors created a small PyTorch library. The images are stored compressed in .npz format. I would like to apply batch transforms to my data, but I am not using a DataBlock. I managed to create a DataLoaders object by following the MNIST example from the book. So far, the data loading part of my code looks like this:
# Imports assumed from the names used below
from fastai.vision.all import *
import medmnist
from medmnist import INFO
import torchvision.transforms as TF

data_flag = 'breastmnist'
download = True
BATCH_SIZE = 128
info = INFO[data_flag]
task = info['task']
n_channels = info['n_channels']
n_classes = len(info['label'])
DataClass = getattr(medmnist, info['python_class'])

# Apply minimal preprocessing
data_transform = TF.Compose([
    TF.ToTensor(),
    TF.Normalize(mean=[.5], std=[.5])
])

# Download the partitions
train_dataset = DataClass(split='train', transform=data_transform, download=download)
test_dataset = DataClass(split='test', transform=data_transform, download=download)

# Load them into a fastai DataLoader
training_dl = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
valid_dl = DataLoader(test_dataset, batch_size=BATCH_SIZE)

# Create the DataLoaders
dls = DataLoaders(training_dl, valid_dl)
Does anyone who has worked with a similar workflow have tips for applying batch transforms? I would like to use the fastai transforms, not the PyTorch ones. Thank you!