Unsupervised image classification problem

shafei · March 23, 2020, 7:15pm

Hi,

I am trying to solve an image classification problem where I have 17,000 images in a single folder ('data/images').

There are 20 classes, as shown below. Each class has a txt file that contains only the names of image files.

data/
├── images/ # dir for jpg files
├── aeroplane.txt # aeroplane object class labels
├── bicycle.txt # bicycle object class labels
├── bird.txt # bird object class labels
├── boat.txt # boat object class labels
├── bottle.txt # bottle object class labels
├── bus.txt # bus object class labels
├── car.txt # car object class labels
├── cat.txt # cat object class labels
├── chair.txt # chair object class labels
├── cow.txt # cow object class labels
├── diningtable.txt # dining table object class labels
├── dog.txt # dog object class labels
├── horse.txt # horse object class labels
├── motorbike.txt # motorbike object class labels
├── person.txt # person object class labels
├── pottedplant.txt # potted plant object class labels
├── sheep.txt # sheep object class labels
├── sofa.txt # sofa object class labels
├── train.txt # train object class labels
├── tvmonitor.txt # TV monitor object class labels
├── LICENSE
└── README.md
So, as it stands, I do not have a labeled dataset to train on. I am trying to use the fastai library to create training and validation datasets, build the model on them, and try an unsupervised model.
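If each class txt file does list the images that belong to that class, those lists could be turned into labels. The sketch below assumes one image file name per line in each txt file, which the post does not confirm; `labels_from_txt` is a hypothetical helper, not part of fastai.

```python
from collections import defaultdict
from pathlib import Path


def labels_from_txt(data_dir, classes):
    """Map each image name to the classes whose txt file lists it.

    Assumption: every <class>.txt holds one image file name per line.
    """
    labels = defaultdict(list)
    for cls in classes:
        txt = Path(data_dir) / f"{cls}.txt"
        for line in txt.read_text().splitlines():
            name = line.strip()
            if name:  # skip blank lines
                labels[name].append(cls)
    return dict(labels)


# Example call (hypothetical, needs the data folder on disk):
# labels = labels_from_txt("data", ["aeroplane", "bicycle", "bird"])
```

An image listed in more than one txt file ends up with several classes, so the result may describe a multi-label problem rather than a single-label one.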

My problem is creating the validation dataset. I am also not sure of the best approach to take for the unsupervised model.

from fastai.vision import *

classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

path = Path('data/images')

for folder in classes:
    dest = path/folder
    dest.mkdir(parents=True, exist_ok=True)

for c in classes:
    print(c)
    verify_images(path/c, delete=True, max_size=500)

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2, ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

from fastai.basic_train import *

train_size = 1000
train_size = int(train_size * 1.25)
bs = 128
size = 28

import torch
import torchvision.transforms as transforms
import torchvision.models as models
import torch.nn as nn
import torch.optim as optim
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

def get_data(train_ds, valid_ds, bs):
    return (
        DataLoader(train_ds, batch_size=bs, shuffle=True),
        DataLoader(valid_ds, batch_size=bs * 2),
    )

# The get_data call is not working: I am having trouble defining valid_ds
data, valid_data = get_data(train_ds=train, valid_ds=valid_ds, bs=bs)
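One possible way to obtain a `valid_ds` is to hold out a random fraction of a single dataset with PyTorch's `random_split`. This is only a sketch: `split_dataset` is a hypothetical helper, and the random tensors stand in for a real image dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split


def split_dataset(full_ds, valid_pct=0.2, seed=42):
    """Randomly split one dataset into (train_ds, valid_ds)."""
    n_valid = int(len(full_ds) * valid_pct)
    n_train = len(full_ds) - n_valid
    generator = torch.Generator().manual_seed(seed)  # reproducible split
    return random_split(full_ds, [n_train, n_valid], generator=generator)


def get_data(train_ds, valid_ds, bs):
    return (
        DataLoader(train_ds, batch_size=bs, shuffle=True),
        DataLoader(valid_ds, batch_size=bs * 2),
    )


# Stand-in data: 100 random 3x28x28 "images"; swap in a real image Dataset.
full_ds = TensorDataset(torch.randn(100, 3, 28, 28))
train_ds, valid_ds = split_dataset(full_ds, valid_pct=0.2)
train_dl, valid_dl = get_data(train_ds, valid_ds, bs=128)
```

The validation loader is not shuffled and uses a larger batch size, since no gradients need to be stored during validation.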

Architecture of model

conv = nn.Conv2d
act_fn = nn.ReLU
bn = nn.BatchNorm2d
rec_loss = "mse"

To have an unsupervised model, I would set up an encoder and decoder architecture, but I am not exactly sure how this should be done.
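For reference, a minimal convolutional autoencoder in plain PyTorch might look like the sketch below. It assumes 3-channel 28x28 inputs (matching `size = 28`) and reuses the Conv2d / BatchNorm2d / ReLU building blocks with an MSE reconstruction loss. The layer widths and the `ConvAutoencoder` name are illustrative choices, not something taken from the post or from fastai.

```python
import torch
import torch.nn as nn


class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 3x28x28 -> 16x14x14 -> 32x7x7
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
        )
        # Decoder: 32x7x7 -> 16x14x14 -> 3x28x28
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


model = ConvAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # reconstruction loss, as in rec_loss = "mse"

# One training step on a random stand-in batch; the input is also the target.
x = torch.randn(8, 3, 28, 28)
recon = model(x)
loss = loss_fn(recon, x)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After training, the encoder output could be flattened and clustered (for example with k-means) to group images without labels, though such clusters are not guaranteed to line up with the 20 classes.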

Thanks,

Aly