Text multi-label classification - sigmoid results are bigger than 1 (occasionally)

ranih · September 13, 2018, 2:35am

Hey,

I’m building a text classifier with 60 labels. I trained it and the results look good in general, but sometimes I’m getting values bigger than 1. After calling exp(log_prob), one of the values in the prediction (a 60x1 array) was bigger than 1 (it was the correct label, if that matters).

I’m afraid I have a bug. Is it possible to get values bigger than 1?

Thanks!

This is the code I’m using (I copied just the relevant parts):

class MultiLabelClassifier(nn.Module):

    def __init__(self, y_range=None):
        super().__init__()
        # Optional (low, high) pair used to rescale the sigmoid output.
        self.y_range = y_range

    def forward(self, input):
        x, raw_outputs, outputs = input
        # Squash the raw scores into (0, 1), one independent value per label.
        x = F.sigmoid(x)
        if self.y_range:
            x = x * (self.y_range[1] - self.y_range[0])
            x = x + self.y_range[0]

        return x, raw_outputs, outputs


m = get_rnn_classifer(bptt, max_seq=20*70, n_class=num_classes, n_tok=vs, emb_sz=em_sz, n_hid=nh, n_layers=nl, pad_token=1, layers=[em_sz*3, 50, num_classes], drops=[drops[4], 0.1], dropouti=drops[0], wdrop=drops[1], dropoute=drops[2], dropouth=drops[3])

opt_fn = partial(optim.Adam, betas=(0.7, 0.99))
learn = RNN_Learner(md, TextModel(to_gpu(m)), opt_fn=opt_fn)
learn.reg_fn = partial(seq2seq_reg, alpha=2, beta=1)
learn.clip=25.
learn.metrics = [accuracy_thresh(0.5)]

learn.load_encoder('lm1_enc')
learn.crit = F.binary_cross_entropy
learn.model.add_module('2', MultiLabelClassifier())

learn.freeze_to(-1)
learn.fit(lrs, 1, wds=wd, cycle_len=1, use_clr=(8,3))
torch.save(learn.model, 'torch_class_0.h5')

Prediction code:

m = torch.load('torch_class_0.h5')
m.reset()
m.eval()

inp = [7271, 43, 1623, 970, 7, 1426, 33, 77, ] # tokens list
inp = np.array(inp)
inp = V(np.transpose(inp))
inp = inp.unsqueeze(1)
res = m(inp)[0]
res = np.exp(to_np(res))

print(sorted(res[0], reverse=True))

ranih · September 13, 2018, 6:44am

Solved. Is there a way to delete the post?

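The problem was that I loaded the PyTorch model without my last layer, so the outputs were raw scores rather than sigmoid probabilities, and exp of any positive raw score is bigger than 1.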
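For anyone who lands here with the same symptom, here is a minimal sketch of why a missing final sigmoid layer can produce values above 1. It uses only the standard library; the `logits` values and the helper `outputs_look_like_probabilities` are made up for illustration and are not part of fastai or PyTorch.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function: maps any real score into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def outputs_look_like_probabilities(values):
    """Hypothetical sanity check: every value must lie in [0, 1]."""
    return all(0.0 <= v <= 1.0 for v in values)


# Made-up raw scores for three labels, standing in for what a model
# without its final sigmoid layer might return.
logits = [-2.0, 0.0, 3.0]

# exp of a raw score is unbounded above: any positive score gives a value > 1.
exp_of_logits = [math.exp(z) for z in logits]

# The sigmoid of the same scores always stays strictly between 0 and 1.
probs = [sigmoid(z) for z in logits]

print(outputs_look_like_probabilities(exp_of_logits))
print(outputs_look_like_probabilities(probs))
```

A check like this right after loading a saved model would probably have caught the missing layer earlier. Note also that exp is only the right inverse for log-probabilities: once the sigmoid layer is in place, the model output is already in (0, 1), and applying exp to it would give values between 1 and e.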