I am currently working on implementing a basic MNIST classifier with fast.ai. The dataset I use is the one from this Kaggle competition: Digit Recognizer | Kaggle
The issue is that for this competition, the MNIST images and their labels are all in a single .csv file, one row per image.
I could of course convert these rows to image files or use a tabular data loader to deal with the data. However, since I am working on this project to learn more about fast.ai, I am looking for a “cleaner” solution.
The tabular approach has the downside that fine-tuning a ResNet or a similar pretrained model doesn't seem possible (at least it didn't work when I tried), so the network has to learn everything about the structure of images from the ground up.
Converting the rows to tensors, the tensors to image files, and then loading those files with an ImageDataLoaders would work, and for a dataset this size it wouldn't be a performance problem. But it seems unnecessarily convoluted: converting a dataframe to tensors and the tensors to image files, only for fastai to load the files and convert them right back into tensors, doesn't sit right with me.
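For reference, the dataframe-to-tensor step itself is only a few lines; it's the detour through image files that feels redundant. A minimal sketch, assuming the Kaggle column layout (`label`, `pixel0` … `pixel783`) — the synthetic DataFrame here just stands in for `pd.read_csv('train.csv')`:

```python
import numpy as np
import pandas as pd
import torch

# Synthetic stand-in for the Kaggle file; in practice you would use
# df = pd.read_csv('train.csv')  (file name is an assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 256, size=(100, 785)),
                  columns=['label'] + [f'pixel{i}' for i in range(784)])
df['label'] = rng.integers(0, 10, size=100)

# Each row holds 784 pixel values; reshape to (N, 1, 28, 28) and scale to [0, 1].
x = torch.tensor(df.drop(columns='label').values, dtype=torch.float32)
x = x.view(-1, 1, 28, 28) / 255.
y = torch.tensor(df['label'].values)
```

At this point `x` already has the shape a CNN expects, which is exactly why writing it back out as PNGs feels like a round trip.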
So is there a way to load the data in the given format into fastai without resorting to a workaround like first saving it out as image files? Perhaps by converting a TabularDataLoaders so that it can be consumed by a vision_learner?
As I understand it, fastai has a layered API. Is there alternatively a way to go a layer deeper and pass tensors created from the csv file, together with their labels, directly to fastai?