Since the first layer is a Linear layer, the error is most likely caused by the shape of your input tensor. Since you are working with images, the last dimension of each sample needs to be of size 128*128.
You can check the shape of the input tensor by grabbing a single batch and then looking at a single sample. How you get that batch depends on how your data loader was created; most likely this will work: batch = next(iter(dis.train))
Once you have the batch, just index into it to find the dimensions of one sample.
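To make the inspection step concrete, here is a minimal sketch using a stand-in dataset of random images (the dataset, loader, and batch size here are hypothetical placeholders for your `dis.train`):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for your data: 10 RGB images of size 128x128.
images = torch.randn(10, 3, 128, 128)
labels = torch.randint(0, 2, (10,))
loader = DataLoader(TensorDataset(images, labels), batch_size=4)

# Grab one batch, then index into it to see a single sample's shape.
batch = next(iter(loader))   # here a (inputs, targets) pair
inputs, targets = batch
print(inputs.shape)          # shape of the whole batch
print(inputs[0].shape)       # shape of one sample, e.g. (3, 128, 128)
```

If your loader yields a dictionary or a different structure, the indexing will differ, but the idea is the same: look at one sample's `.shape`.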
The layer nn.Flatten(start_dim=1) changes the dimensions of the input from (3,128,128) to (3,128*128).
My question is: shouldn't I use nn.Flatten(start_dim=0) instead of nn.Flatten(start_dim=1), since the next layer expects an input of dimension (3*128*128) rather than (3,128*128)?
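The difference between the two settings can be checked directly. This sketch assumes a batched input of shape (4, 3, 128, 128), where dimension 0 is the batch dimension:

```python
import torch
from torch import nn

x = torch.randn(4, 3, 128, 128)  # a batch of 4 samples

# start_dim=1 keeps the batch dimension and flattens everything after it,
# giving one flat vector of length 3*128*128 per sample.
flat_per_sample = nn.Flatten(start_dim=1)(x)
print(flat_per_sample.shape)     # (4, 3*128*128)

# start_dim=0 flattens the batch dimension too, merging all samples
# into one long vector -- usually not what a Linear layer expects.
flat_all = nn.Flatten(start_dim=0)(x)
print(flat_all.shape)            # (4*3*128*128,)
```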