Is there a way to load the images from a HuggingFace vision dataset in ImageDataLoaders?

So after Lecture 2 I wanted to put a model into production via HuggingFace. In the process, I learned about HuggingFace Datasets as well. That gave me an idea: train a vision model, but instead of downloading images via DuckDuckGo, simply use an existing dataset on HuggingFace.

I was able to load the dataset from HuggingFace, but the next challenge was to use it with the fastai library.

I have a piece of code like:

from datasets import load_dataset, Image

dataset = load_dataset('<name_of_dataset>', split='train').cast_column('image', Image())
print(dataset['image'][0])

This gives me a PIL image object for one of the images in the training split. It prints on the console:

<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=224x224 at 0x12ED...>

I looked in the fastai docs, and all the ImageDataLoaders factory methods take a path to a directory and recursively collect image files under it, but I didn’t find a way to work directly with in-memory PIL image objects.

Does anybody know how to plug in HuggingFace dataset with the fastai data loaders?

Hello,

You can feed a HuggingFace dataset to fastai without ever writing the images to disk. Here’s one way to do it:

  1. Install the necessary libraries (transformers isn’t needed for this):
pip install datasets fastai

  2. Load the HuggingFace dataset:
from datasets import load_dataset

dataset = load_dataset("<name_of_dataset>", split="train")

  3. Build the DataLoaders with fastai’s DataBlock API:

Fastai’s ImageDataLoaders factory methods all expect paths to image files on disk, and there is no ImageDataLoaders.ImageSource class to subclass. The lower-level DataBlock API, however, accepts any indexable collection as its item source, and a HuggingFace dataset is exactly that: indexing it with an integer returns a dict of columns. In recent fastai versions (2.7+), PILImage.create also accepts an in-memory PIL image directly, so the decoded images can be passed straight through:

from fastai.vision.all import DataBlock, ImageBlock, CategoryBlock, Resize

def get_image_data_loaders(dataset, img_size=224, bs=32):
    """
    Creates fastai DataLoaders from a HuggingFace image dataset.

    Args:
        dataset: A HuggingFace dataset with "image" and "label" columns.
        img_size (int, optional): The image size for resizing. Defaults to 224.
        bs (int, optional): The batch size. Defaults to 32.

    Returns:
        DataLoaders: A fastai DataLoaders object ready for training.
    """
    dblock = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        # Each item is a row dict; the "image" column is already decoded
        # to a PIL image by the datasets library's Image feature.
        get_x=lambda row: row["image"],
        get_y=lambda row: row["label"],
        item_tfms=Resize(img_size),
    )
    # The default RandomSplitter holds out 20% of the rows for validation.
    return dblock.dataloaders(dataset, bs=bs)

With this, the HuggingFace dataset plugs straight into fastai’s data pipeline, with no intermediate image files on disk.

I hope this helps.