FastAI creating custom ItemBase for vector based images

5wBY2bu · April 13, 2020, 6:22pm

I am doing machine learning with vector-based images instead of raster-based images. For now, the images are made up of only lines. My input data is many rows, one per line, each giving the x,y start point and the x,y end point of that line. For example:

0 15 22 75 li
100 10 2 17 li

The above represents an image with two lines. The first line starts at (0, 15) and ends at (22, 75). None of the existing fastai data types seem suitable for doing machine learning on this type of data, which leads me to believe I need to create my own ItemBase and ItemList classes. I read the tutorial “Creating a custom ItemBase subclass” but I’m still really confused. I don’t know where to even start.
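To make the format concrete, here is a minimal sketch of how the two example rows could be parsed into coordinate tuples. The helper name `parse_lines` is hypothetical and only for illustration, and it assumes every record has exactly five space-separated fields ending in `li`.

```python
from typing import List, Tuple

# One line segment: (x_start, y_start, x_end, y_end)
Line = Tuple[int, int, int, int]

def parse_lines(text: str) -> List[Line]:
    """Parse rows of 'x0 y0 x1 y1 li' into integer 4-tuples (hypothetical helper)."""
    lines: List[Line] = []
    for row in text.splitlines():
        parts = row.split()
        # Skip blank rows and anything that is not a 'li' (line) record.
        if len(parts) != 5 or parts[4] != "li":
            continue
        x0, y0, x1, y1 = (int(p) for p in parts[:4])
        lines.append((x0, y0, x1, y1))
    return lines

sample = "0 15 22 75 li\n100 10 2 17 li"
segments = parse_lines(sample)
print(segments)  # [(0, 15, 22, 75), (100, 10, 2, 17)]
```

Each image then becomes a list of 4-tuples whose length equals the number of lines in the image.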

Some more background info: the eventual goal is a GAN. I do have a GAN working in PyTorch that generates images, but it trains very slowly and never gets quite where I’d like it to be. I think the advanced techniques used in fastai would be a big improvement, for example the scheduled learning rate and momentum.

If it helps, this is my current PyTorch-based data loader. It’s pretty simple.

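import pandas as pd
import torch
import torchvision

# eps_rows, eps_max and batch_size are globals defined elsewhere (not shown).

def load_one_eps(item_path):
    # Skip the first 20 lines of the file, then read eps_rows rows of "x0 y0 x1 y1 li".
    df = pd.read_csv(item_path, sep=' ',
                     header=None,
                     dtype={0: 'int64', 1: 'int64', 2: 'int64', 3: 'int64', 4: 'string'},
                     skiprows=20,
                     nrows=eps_rows)
    # Keep the four coordinate columns (0..3) and divide by eps_max.
    tensor = torch.from_numpy(df.loc[:, 0:3].values).float().div(eps_max)
    return tensor

def load_dataset():
    data_path = 'inputs2'
    # DatasetFolder expects one subfolder per class under data_path.
    train_dataset = torchvision.datasets.DatasetFolder(
        root=data_path,
        loader=load_one_eps,
        extensions='eps'
    )
    train_loader = torch.utils.data.DataLoader(
        train_dataset,
        batch_size=batch_size,
        num_workers=0,
        shuffle=True
    )
    return train_loader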

Other questions I have: how should I be normalizing this data? And is it possible to train on images with different shapes? This would be ideal, because vector-based images that are the same “size” could have just a few lines, or a few thousand lines. In my case that would translate into more rows of data.
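One possible direction for both questions, sketched in plain PyTorch rather than fastai: scale coordinates into [-1, 1] and pad each batch to the longest image, keeping a mask of which rows are real lines. The names `normalize`, `collate_lines` and `canvas_max` are hypothetical, with `canvas_max` standing in for a known maximum coordinate. The sketch also assumes each item is a bare (n_lines, 4) tensor, so the (sample, target) pairs that DatasetFolder returns would need unpacking first.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def normalize(lines: torch.Tensor, canvas_max: float) -> torch.Tensor:
    """Map coordinates from [0, canvas_max] to [-1, 1] (canvas_max is an assumption)."""
    return lines.float() / canvas_max * 2.0 - 1.0

def collate_lines(batch):
    """Pad a list of (n_lines, 4) tensors to a common length and build a mask."""
    lengths = torch.tensor([item.shape[0] for item in batch])
    # Result has shape (batch, max_lines, 4); short images are padded with zeros.
    padded = pad_sequence(batch, batch_first=True, padding_value=0.0)
    # mask[i, j] is True where row j of image i is a real line, False where it is padding.
    mask = torch.arange(padded.shape[1])[None, :] < lengths[:, None]
    return padded, mask

a = normalize(torch.tensor([[0, 15, 22, 75], [100, 10, 2, 17]]), canvas_max=100.0)
b = normalize(torch.tensor([[5, 5, 50, 50]]), canvas_max=100.0)
padded, mask = collate_lines([a, b])
print(padded.shape)  # torch.Size([2, 2, 4])
print(mask)
```

Because a padded row of zeros is also a valid point at the canvas centre after normalization, the mask is what lets a model or loss ignore the padding. A function like `collate_lines` could be passed as `collate_fn` to `torch.utils.data.DataLoader`.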