However, after splitting the CSV into train/valid/test dataframes along with their respective label dataframes, I don’t know how to proceed. Previously there was a way to get a databunch from image arrays, but now if I run ??ImageClassifierData.from_arrays it gives me an error.
Does anyone have an updated tutorial on how to get a databunch from NumPy arrays?
In the Kaggle competition the images are formatted differently than in the Fast.AI version of the dataset used in the lessons. To replicate the lesson, you first need to reshape each row's pixel columns into a 28x28 array, since the original dataframe has a dedicated column for each pixel (784 per image) and therefore can’t be read directly as an image.
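A rough sketch of that reshape step, in case it helps. It builds a small random dataframe as a stand-in for the competition CSV, so the column names (`label`, `pixel0` ... `pixel783`) and the row count are assumptions for illustration; swap in your own dataframe and column names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the competition CSV: a `label` column followed by
# one column per pixel. The column names here are assumptions, not taken from
# the actual file.
rng = np.random.default_rng(0)
n_rows = 5
pixel_cols = [f"pixel{i}" for i in range(28 * 28)]
df = pd.DataFrame(
    rng.integers(0, 256, size=(n_rows, 28 * 28)), columns=pixel_cols
)
df.insert(0, "label", rng.integers(0, 10, size=n_rows))

# Separate the labels from the pixels, then turn every flat row of
# 784 values into a 28x28 grayscale image.
labels = df["label"].to_numpy()
images = df[pixel_cols].to_numpy(dtype=np.uint8).reshape(-1, 28, 28)

print(images.shape)  # (5, 28, 28)
```

Each `images[i]` is then a single 28x28 array with its class in `labels[i]`. One option from here is to write the arrays out as image files so a folder-based loader can pick them up, but that part is not shown.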
I think Jeremy prefers using the Datablock API over the direct Databunch method since it makes it clearer what’s going on. To create the databunch with the Datablock API, you can follow the notebook in lesson 7. Here’s the combined code for creating it:
data = (ImageList.from_folder(path, convert_mode='L')