ImageClassifierData.from_arrays where classes are also images

So instead of class labels, my Y data consists of images.
I am thinking of using ImageClassifierData.from_arrays, but that would mean loading all the data into memory in one go.
How would you do it?

I assumed the reason you are doing things this way is that your scenario is similar to super resolution or neural style transfer, or that you are trying to optimize the image pixels instead of weights/class labels.

If so, this is how I do it:

  1. Create the dataset by sub-classing FilesDataset.
import os
from fastai.dataset import FilesDataset, open_image

class MatchedFilesDataset(FilesDataset):
    def __init__(self, fnames, y, transform, path):
        self.y = y
        assert(len(fnames) == len(y))
        super().__init__(fnames, transform, path)
    def get_y(self, i):
        # The Y (target) is itself an image, loaded from disk on demand
        return open_image(os.path.join(self.path, self.y[i]))
    def get_c(self):
        # No class labels, so report 0 classes
        return 0
  2. Create the ModelData object. Code example:
from pathlib import Path
from fastai.conv_learner import *  # brings in folder_source, get_cv_idxs, split_by_idx, tfms_from_model, TfmType, ImageData, vgg16, np, ...

# Define global variables
PATH = Path('data/imagenet')
PATH_TRN = PATH / 'train'
arch = vgg16
sz_lr = 72       # size of the (low-resolution) input images
sz_hr = 288      # size of the (high-resolution) target images; example value
bs = 32          # batch size; example value
keep_pct = 1.

# Return the filenames and labels for a folder within a path
fnames_full, label_arr_full, all_labels = folder_source(PATH, 'train')
fnames_full = ['/'.join(Path(fn).parts[-2:]) for fn in fnames_full]
fnames = np.array(fnames_full, copy=False)

# Split the dataset into a train set and a validation set
val_idxs = get_cv_idxs(len(fnames), val_pct=min(0.01/keep_pct, 0.1))
((val_x, trn_x), (val_y, trn_y)) = split_by_idx(val_idxs, np.array(fnames), np.array(fnames))

# Data transformation and augmentation pipeline;
# tfm_y=TfmType.PIXEL applies the same spatial transforms to the target images
aug_tfms = [RandomDihedral(tfm_y=TfmType.PIXEL)]
tfms = tfms_from_model(arch, sz_lr, tfm_y=TfmType.PIXEL, aug_tfms=aug_tfms, sz_y=sz_hr)

# Create the train/val datasets
datasets = ImageData.get_ds(MatchedFilesDataset, (trn_x, trn_y), (val_x, val_y), tfms, path=PATH_TRN)

# Create the ModelData object
md = ImageData(PATH, datasets, bs, num_workers=4, classes=None)  # note that classes is set to None
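
From there, a minimal sketch of how the ModelData can be wrapped in a Learner with the old fastai API (this is an assumption based on the super-resolution style of training; net_model is a placeholder for whatever nn.Module you are training, and the MSE loss is just an example for image targets):

import torch.nn.functional as F
import torch.optim as optim
from fastai.core import to_gpu
from fastai.learner import Learner
from fastai.model import SingleModel

# net_model = ...  # your own nn.Module that maps input images to output images
learn = Learner(md, SingleModel(to_gpu(net_model)), opt_fn=optim.Adam)
learn.crit = F.mse_loss  # pixel-wise loss, since the targets are images
learn.fit(1e-3, 1)       # one cycle of training at lr=1e-3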

Yes, that is what I need. It turns out I need a U-Net, and fast.ai already has it implemented. In step 2, how do I add a test set that I can get predictions for with TTA?

Why not use TTA on the validation set?

I don’t have Y for the test set; I want to create predictions. I thought TTA was only for the test set, where you don’t have the Y?

Hi @ptah, you predict on augmented versions of the images (just as, at training time, you train on augmented versions => weight updates) and then average your classification predictions across these augmentations. The idea is to get a more robust CV estimate and better performance (e.g. the dog might be off-centre, but TTA captures this with its random cropping), though I’m not sure how this works for image masking competitions?
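
A minimal sketch of what that averaging looks like with the old fastai API (classification case; it assumes learn has already been trained, and is_test=True assumes a test set was attached to the ModelData):

import numpy as np

log_preds, y = learn.TTA()             # predictions on augmented copies of the validation set
probs = np.mean(np.exp(log_preds), 0)  # average the class probabilities across the augmentations
# log_preds, _ = learn.TTA(is_test=True)  # same idea on a test set, if one was attached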

Yes, I was thinking about it a bit more and averaging will make the result blurry. I will stick with learner.predict().
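
For reference, a quick sketch of that call with the old API (is_test=True assumes a test set was attached to the ModelData):

preds = learn.predict()                     # plain, non-augmented predictions on the validation set
# test_preds = learn.predict(is_test=True)  # or on the test set, if one was attached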

I’m trying to create a dataset for a VAE, so I have a similar problem: getting the target to be an image rather than a label (in this case, of course, the target is just the input, but I think it’s still the same issue). I want to try something like what you’ve done here, but where is FilesDataset defined (i.e., what’s the import)?

[UPDATE: Okay, it looks like it’s in the “old” branch on git… so deprecated, presumably.]
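
If it helps, here is a rough sketch of the same idea for the autoencoder/VAE case with the old (0.7) branch, where the target is just the input image (AutoencoderFilesDataset is a made-up name; FilesDataset and open_image live in fastai.dataset there):

import os
from fastai.dataset import FilesDataset, open_image  # fastai 0.7 ("old") location

class AutoencoderFilesDataset(FilesDataset):
    # Hypothetical subclass: the target is simply the input image again,
    # which is what an autoencoder/VAE reconstruction loss needs.
    def get_y(self, i): return open_image(os.path.join(self.path, self.fnames[i]))
    def get_c(self): return 0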