ULMFiT: Understanding learn.predict output

I’ve written a test script for testing the classifier trained using train_clas.py (on a custom dataset).

Now I don’t have any OOB test data, so I usually just send in a single text string, tokenize it, load the classifier, run predict, and evaluate the prediction(s) myself. Something like this:

tst_sent = tok2id(dir_path, tst_tok, max_vocab=30000, min_freq=1)
tst_ds = TextDataset(tst_sent, np.zeros(len(tst_sent)))
tst_dl = DataLoader(tst_ds, bs//2, transpose=True, num_workers=1, pad_idx=1, sampler=None)
md = ModelData(dir_path, None, None, tst_dl)
learn = RNN_Learner(md, TextModel(m), opt_fn=opt_fn)

prob = learn.predict(is_test=True)
# output: [[-11.48408  13.15376]] <-- what are these numbers?

pred = np.argmax(prob, axis=1)
# output: [1]

My question is simple: what does learn.predict() return in the ULMFiT-based classifier?


These numbers are raw per-class scores (logits). The class with the highest score is the one the model predicts.

If the classes are class_a, class_b you can get the result with something like this:

classes = ["class_a", "class_b"]
return classes[np.argmax(raw_scores)]

Thanks for your reply. What I had meant to ask was: what function was used to get these scores?

It’s a PoolingLinearClassifier. See https://github.com/fastai/fastai/blob/master/fastai/lm_rnn.py#L175 for the code.


Ah, right. So I just softmax it and I’d have a nice distribution in [0,1).
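
As a minimal sketch of that step, the snippet below applies a numerically stable softmax to the two scores from the question. It uses only the Python standard library so it runs on its own; the helper name softmax_scores is illustrative and is not part of fastai.

```python
import math

def softmax_scores(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)                        # subtract the max to avoid overflow
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

raw_scores = [-11.48408, 13.15376]           # scores printed in the question
probs = softmax_scores(raw_scores)
predicted = probs.index(max(probs))          # index of the most probable class
print(probs, predicted)
```

Softmax preserves the ordering of the scores, so the argmax is the same before and after. In exact arithmetic each probability lies strictly between 0 and 1.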

Imagine that I already have the model and just need to test a single prediction.
What is the best way to do this?

  1. Tokenize your sentence (such as with the Tokenizer module)
  2. Map the tokens to indexes. The logic will be the same as here.
    … and the rest of it is the same as the code snippet in the question.
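
The index-mapping step can be sketched as follows. The vocabulary here is a made-up toy list; it assumes the convention from the fastai IMDb scripts that index 0 is the unknown token and index 1 is padding, which is why unseen words fall back to 0.

```python
import collections

# Hypothetical toy vocabulary; a real one would be loaded from itos.pkl.
itos = ['_unk_', '_pad_', 'the', 'movie', 'was', 'great']

# Reverse lookup: token -> index, with unknown tokens mapped to 0.
stoi = collections.defaultdict(lambda: 0, {tok: i for i, tok in enumerate(itos)})

tokens = ['the', 'movie', 'was', 'fantastic']   # 'fantastic' is not in itos
ids = [stoi[tok] for tok in tokens]
print(ids)
```

The vocabulary used at prediction time has to be the same one the classifier was trained with, otherwise the indexes point at the wrong embeddings.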

Use the (newish) script designed for exactly this: https://github.com/fastai/fastai/blob/master/courses/dl2/imdb_scripts/predict_with_classifier.py


To make a single prediction:

  1. Make sure you have the network built.

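import os
import pickle
import collections
from pathlib import Path

import numpy as np
import torch
from fastai.text import *  # fastai 0.7, as in the imdb_scripts

def classifier_model_network(dir_path, cuda_id=0):
    """
    Build the classifier network (weights are loaded separately).

    :param dir_path: data directory that contains tmp/itos.pkl
    :param cuda_id: GPU id, or -1 for CPU
    """
    # dropmult, bptt, label_class, em_sz, nh and nl are assumed to be
    # module-level settings that match the ones used for training.
    if not hasattr(torch._C, '_cuda_setDevice'):
        print('CUDA not available. Setting device=-1.')
        cuda_id = -1

    dir_path = Path(dir_path)
    # load vocabulary lookup
    with open(dir_path / 'tmp' / 'itos.pkl', 'rb') as f:
        itos = pickle.load(f)
    n_tokens = len(itos)

    dps = np.array([0.4, 0.5, 0.05, 0.3, 0.4]) * dropmult

    m = get_rnn_classifier(bptt, 20 * 70, label_class, n_tokens, emb_sz=em_sz, n_hid=nh, n_layers=nl,
                           layers=[em_sz * 3, 50, label_class], drops=[dps[4], 0.1],
                           dropouti=dps[0], wdrop=dps[1], dropoute=dps[2], dropouth=dps[3])
    m.eval()  # evaluation mode: dropout is disabled at inference time
    return m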
  2. Fetch the model Learner.
def get_learner(dir_path, model_network, modelData, cuda_id=0):
    if not hasattr(torch._C, '_cuda_setDevice'):
        print('CUDA not available. Setting device=-1.')
        cuda_id = -1
    # map GPU-saved weights onto the CPU when no GPU is used
    map_location = 'cpu' if cuda_id == -1 else None

    dir_path = Path(dir_path)

    learner = RNN_Learner(modelData, TextModel(to_gpu(model_network)))

    loaded_weights = torch.load(os.path.join(dir_path, "models/fwd_clas_1.h5"), map_location=map_location)
    # copy the saved weights into the model (assumes the file holds a
    # state_dict saved from this same architecture)
    learner.model.load_state_dict(loaded_weights)
    learner.model.eval()  # evaluation mode: dropout is disabled

    # confirm that the model parameters match those of the loaded weights
    for k, v in loaded_weights.items():
        print(k, np.all(v == learner.model.state_dict()[k]))

    return learner

def predict(learner: Learner, X):
    return [softmax(x) for x in learner.predict_dl(create_dl(X))]

def softmax(x):
    """Compute softmax values for a 1-D array of scores x."""
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / np.sum(e, axis=0)

def build_learner(dir_path, cuda_id=0):
    # convenience wrapper: build the network, then wrap it in a Learner
    network = classifier_model_network(dir_path, cuda_id)
    learn = get_learner(dir_path, network,
                        ModelData(dir_path, None, None),
                        cuda_id)
    return learn

  3. Now make the prediction with the predict method.
# fetch the ids of the text
def get_tokens(sentence, lang='en'):
    """
    Fetch word tokens for the sentence.

    :param sentence: raw text to tokenize
    :param lang: language code for the tokenizer
    """
    # BOS and FLD are assumed to be the field markers used during training
    text = f'\n{BOS} {FLD} 1 ' + sentence
    return Tokenizer(lang=lang).proc_text(text)

def convert2ids(tokens: list, tok2id_model_path):
    with open(tok2id_model_path, 'rb') as f:
        itos = pickle.load(f)
    # tokens missing from the vocabulary map to index 0
    stoi = collections.defaultdict(lambda: 0, {v: k for k, v in enumerate(itos)})

    predict_lm = np.array([stoi[p] for p in tokens])
    return predict_lm

dir_path = ' '  # data directory of the trained and saved model
lm_model_path = ' '  # path to the itos.pkl file in that data directory
toks = get_tokens('some text sentence you want to predict')
ids = convert2ids(toks, lm_model_path)

# `learner` is the Learner obtained in the previous step
predict(learner, ids)


What’s inside the create_dl function?

Here is the create_dl function:

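def create_dl(X):
    # wrap the single sequence X in a one-item dataset with a dummy label
    tst_ds = TextDataset([X], np.zeros(1))
    # bs is assumed to be the batch size defined elsewhere in the script
    tst_dl = DataLoader(tst_ds, bs//2, transpose=True, num_workers=1, pad_idx=1, sampler=None)
    return tst_dl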

I have a question about the script.
On line 100, why is the first prediction being sent to softmax?
return softmax(numpy_preds[0])[0]

I have a classifier with 300 labels and when I feed a 351 token string to it, my “numpy_preds” is shaped (351, 300). This confuses me since it appears to be returning a prediction for each input token (the outputs of each time step of the lstm?). If this is the case, then I suppose the last of these predictions would be the one to send to softmax:
return softmax(numpy_preds[-1])[0]
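
One possible explanation, offered as an assumption and not a confirmed diagnosis: the DataLoader calls earlier in this thread pass transpose=True, which suggests the model reads its input as (sequence length, batch size). Under that assumption, a single sentence wrapped as one row of 351 ids would be read as 351 one-token sequences, giving one prediction per token. The pure-Python sketch below only illustrates the two layouts; the shape helper is hypothetical.

```python
def shape(nested):
    """Return (rows, cols) of a rectangular list of lists."""
    return (len(nested), len(nested[0]))

token_ids = list(range(351))             # stand-in for a 351-token sentence

as_row = [token_ids]                     # 1 x 351: read as length 1, batch of 351
as_column = [[t] for t in token_ids]     # 351 x 1: one sequence of 351 tokens

print(shape(as_row))
print(shape(as_column))
```

With the column layout the classifier would be expected to return a single row of 300 scores, and that row is what would go to softmax.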

Either way, I’m getting nonsensical results for my predictions. My model was reporting .91 accuracy while training, but I can’t figure out how to actually use the model.

My evaluation code:

for i in range(inputs.shape[0]):
    tensor = torch.from_numpy(np.array([inputs[i]]))
    preds = model(Variable(tensor))
    preds = preds[0].data.numpy()
    pred = softmax(preds[0])  #also tried -1
    idx = np.argmax(pred)
    if targets[i]==idx: