I am trying to load datasets from HuggingFace datasets for image classification problems. For example, for Fashion MNIST, I have the following code:
from datasets import load_dataset, load_dataset_builder
from fastai.data.core import DataLoaders
from fastai.vision.all import *
from torch.utils.data import DataLoader
import torch
ds_name = "fashion_mnist"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
train_ds = load_dataset(ds_name, split="train").with_format("torch", device=device)
valid_ds = load_dataset(ds_name, split="test").with_format("torch", device=device)
train_dl = DataLoader(train_ds, batch_size=256, shuffle=True, num_workers=1, pin_memory=True)
valid_dl = DataLoader(valid_ds, batch_size=256, shuffle=False, num_workers=1, pin_memory=True)
dls = DataLoaders(train_dl, valid_dl)
learn = vision_learner(dls, "convnext_base", metrics=error_rate)
In the last line I get an AssertionError saying that n_out is not defined and could not be inferred from the data. I am not sure how to specify which key in train_ds and valid_ds holds the output labels, since DataLoaders does not take a y_name keyword argument.
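One direction that might address this, sketched under assumptions rather than as a confirmed fix: each example in the Hugging Face dataset is a dict (for Fashion MNIST, with "image" and "label" keys), whereas fastai generally expects batches as (x, y) tuples. The helper names below (make_tuple_collate, count_classes) are hypothetical, and the stack parameter is a placeholder for whatever batching function applies, for instance torch.stack when the examples are tensors.

```python
def make_tuple_collate(x_key="image", y_key="label", stack=list):
    """Build a collate function that turns dict examples into an (x, y) batch.

    `stack` combines the per-example values into one batch object. It defaults
    to `list` so this sketch runs without PyTorch; passing `torch.stack` is an
    assumption about how it would be used with tensor-formatted examples.
    """
    def collate(examples):
        xs = stack([ex[x_key] for ex in examples])
        ys = stack([ex[y_key] for ex in examples])
        return xs, ys
    return collate


def count_classes(examples, y_key="label"):
    """Count the distinct label values seen in an iterable of dict examples."""
    return len({int(ex[y_key]) for ex in examples})


# Toy stand-in for dataset rows; real rows would hold image tensors.
toy_examples = [
    {"image": [0, 1], "label": 0},
    {"image": [2, 3], "label": 3},
    {"image": [4, 5], "label": 3},
]

collate = make_tuple_collate()
xs, ys = collate(toy_examples)
n_classes_seen = count_classes(toy_examples)
```

If this is on the right track, the collate function could be passed as collate_fn to each DataLoader, and the class count (10 for Fashion MNIST) passed as n_out to vision_learner. A loss function would probably need to be passed explicitly as well, since plain PyTorch DataLoaders do not carry the metadata fastai normally infers these from. None of this has been run against the code above.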