Can't replicate ULMFiT validation predictions

So each column (not row) is an example (there are 48 in each batch per your batch size hyperparameter).

Thus the first example is: rx[:,0]

What are those numbers? You are correct … they are the corresponding indices in your vocab for the tokens in each example. Padding is applied before the text as needed, so you can expect to see a run of 1's at the start, depending on the length of your text.
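A quick way to see that layout (a sketch; val_dl, rx, and itos are the names used later in this thread, and a pad index of 1 is assumed):

rx, ry = next(iter(val_dl))
print(rx.shape)                  # (seq_len, batch_size), e.g. torch.Size([160, 48])

first = rx[:, 0]                 # the first example is the first column
n_pad = int((first == 1).sum())  # leading 1s are the pre-text padding
print(n_pad, [itos[int(i)] for i in first[n_pad:n_pad + 5]])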

I have no idea what the incrementing numbers are … they may just be row indices added by the print output rather than values in your matrix. I dunno.

Ok, got it, thanks!

So now I change how I get the data to this:

# get the first example (the first column) of the validation set
print("validation dataset row:", rx[:,0])

# convert it to string tokens
pred_toks = [itos[i] for i in rx[:,0]]

# convert it back to vocab indexes
pred_idxs = [stoi[p] for p in pred_toks]

# print these indexes out
print("converted row:", pred_idxs)

The array pred_idxs is the same as the original, so that is good.
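An explicit version of that check, as a one-line sketch:

# the stoi/itos round trip should be lossless for in-vocab tokens
assert [int(i) for i in rx[:, 0]] == pred_idxs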

Now I do this:

# pass to the model to make the prediction
result, *_  = m(variable)

result

and get:

Variable containing:
  7.8798 -10.7048
  8.7254 -11.7704
  .. truncated...
  5.5648  -7.5580
 18.9520 -19.4077
[torch.cuda.FloatTensor of size 160x2 (GPU 0)]

How do I get from that to:

array([ 0.88123, -0.47445], dtype=float32)

(assuming they are the per-class probabilities)

What is “variable”?

It's hard for me to interpret what your results refer to, but it looks like class predictions (for classes 0 and 1) across a batch of 160 examples. I think the dimensions of variable are wrong, whatever it is. For example, in your first post you show 10 predictions for a single example that happens to contain 10 tokens; what you should see is a single prediction.

Look at the dimensions of rx and make sure variable is the same. I think you may simply have the dimensions mixed up.
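To make that concrete, this (sketch) is the kind of check I mean:

print(rx.shape)        # (seq_len, batch_size), e.g. torch.Size([160, 48])
print(variable.shape)  # for a single example this should be (160, 1), not (1, 160)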

My understanding is that this generates predictions against the validation set. preds.shape is (185, 2), which is (num_rows_in_validation_set, num_classes), so I think this is right.

The ultimate aim is to make a prediction on some text, but I figured if I use a row from the validation set then I’ll be able to tell I’m doing it correctly.

So I guess the question is:

If I have an array of correctly encoded text (i.e., I have applied stoi to each word and have an array of the resulting indexes), how do I make a prediction?

This seems like it should be easier than I’ve found it, so I assume I’m missing something.

variable.shape, rx[:,0].shape gives me (torch.Size([1, 160]), torch.Size([160])).

So you see that they aren’t the same. Assuming your text has 160 tokens in it, you want something like 160x1.

It’s hard for me to tell what you are doing wrong because I can’t see your full source … just the bits and pieces of your code you want to share.

Make sure you understand the dimensions of your batches and what each axis represents. From there, put your numericalized data into the same format, with the batch-size dimension (the columns) equal to 1. Pass that into your model and you should be golden.
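Concretely, something like this (a sketch, reusing pred_idxs from above):

import numpy as np

# (seq_len, 1): one column = one example, matching the batch layout
seq = np.array(pred_idxs).reshape(-1, 1)
print(seq.shape)  # e.g. (160, 1)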

That code above is literally my code. It’s straight out of the IMDB sentiment classification notebook with a different dataset.

To recap:

# get predictions against the validation set
preds = learn.predict() 
# preds[0] is what I'm trying to replicate. 
# preds[0] is: array([ 0.88123, -0.47445], dtype=float32)

# get the first batch from the validation set; the first example is its first column
rx, ry = next(iter(val_dl))
print("validation dataset row:", rx[:,0])

# convert it to string tokens
pred_toks = [itos[i] for i in rx[:,0]]

# convert it back to vocab indexes
pred_idxs = [stoi[p] for p in pred_toks]

# print these indexes out
print("converted row:", pred_idxs)

# they appear to be the same, so the prediction should be the same

# get the model
m = learn.model

# set the encoder's batch size to 1
m[0].bs = 1

# put into evaluation mode
m.eval()
m.reset()

# convert the list of indexes to a PyTorch tensor
# (note: wrapping in an extra list gives shape (1, 160), i.e. one row, not one column)
tensor = T([pred_idxs])

# we need a PyTorch variable
variable = V(tensor)

# pass to the model to make the prediction
result, *_  = m(variable)

result

Well this sucked.

If anyone is interested, here is how to do it:

rx, ry = next(iter(val_dl))

# convert it to string tokens
pred_toks = [itos[i] for i in rx[:,0]]

# convert it back to vocab indexes
pred_idxs = [stoi[p] for p in pred_toks]

print(rx[:, :1].shape) # is torch.Size([160, 1])

# make an array the same shape
test_input = np.swapaxes(np.array([pred_idxs]), 0, 1)
print(test_input.shape) # is (160, 1)

# these two should be the same

# predict on the original data
print(learn.predict_array(test_input)[0])

# predict on the reconstructed data
print(learn.predict_array(rx[:, :1])[0])
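(For what it's worth, np.array(pred_idxs).reshape(-1, 1) produces the same (160, 1) array as the swapaxes call.)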

Only saw this thread now (feel free to tag me in future ULMFiT-related questions).

Sorry that you’ve found this so hard to use, @nickl. I was meaning to add an evaluation script but didn’t get around to it yet.

Would you consider submitting a PR to add your script to the imdb_scripts repo to make it easier for others to use in the future?

@sebastianruder Thanks, for sure.

What do you want for a PR? Just an example of how to use a pretrained model on text (which was basically what I was trying to do)? Or is the reconstruction of the validation set useful too?

I think the best thing would be to have an example of predicting on a sentence or text. That should make it easier for people to play around with the model on the command line.

Besides that, I think having an example where we evaluate on a separate test set would also be useful. I can also add this once I have time.

I can do the prediction example for sure.

Cool! That’d be awesome! Feel free to submit a PR with an initial version or post here if you want feedback.

Hey @sebastianruder I’ve done a bit of work on this. I wanted to make it as clear and simple as possible.

I think this version is simpler and better than the one above.

I do have one question. Is there a way to create an rnn_classifer without all those parameters being required? A lot of them look like they should only be needed at train time.

Also, generally, does this code look right to you?

def load_model(self):

    bptt, em_sz, nh, nl = 70, 400, 1150, 3
    # dropout probabilities, as in the IMDB notebook
    dps = np.array([0.4, 0.5, 0.05, 0.3, 0.4]) * 0.5
    num_classes = 2  # this is the number of classes we want to predict
    vs = len(self.itos)

    # 20*70 is the classifier's maximum sequence length (max_seq)
    self.model = get_rnn_classifer(bptt, 20*70, num_classes, vs, emb_sz=em_sz, n_hid=nh, n_layers=nl, pad_token=1,
            layers=[em_sz*3, 50, num_classes], drops=[dps[4], 0.1],
            dropouti=dps[0], wdrop=dps[1], dropoute=dps[2], dropouth=dps[3])

    trained_encoder_path = str((self.MODEL_PATH/'lm1_enc.h5').absolute())
    trained_classifier_path = str((self.MODEL_PATH/'clas_2.h5').absolute())

    # load the fine-tuned encoder weights, then the full classifier weights
    self.model[0].load_state_dict(torch.load(trained_encoder_path, map_location=lambda storage, loc: storage))
    self.model.load_state_dict(torch.load(trained_classifier_path, map_location=lambda storage, loc: storage))

    self.model.reset()
    self.model.eval()


def predict_text(self, text):

    # prefix text with tokens:
    #   xbos: beginning of sentence
    #   xfld 1: we are using a single field here
    input_str = 'xbos xfld 1 ' + text

    # predictions are done on batches of input;
    # we only have a single text, so make it a batch of one
    texts = [input_str]

    # tokenize using the fastai wrapper around spacy
    tok = Tokenizer().proc_all_mp(partition_by_cores(texts))
    
    # turn into integers for each word
    encoded = [self.stoi[p] for p in tok[0]]
    
    # we want a [x,1] array where x is the number 
    #  of words inputted (including the prefix tokens)
    ary = np.reshape(np.array(encoded),(-1,1))

    # turn this array into a tensor
    tensor = torch.from_numpy(ary) 

    # wrap in a torch Variable       
    variable = Variable(tensor)
    
    # do the predictions
    predictions = self.model(variable)    

    # convert the output scores back to numpy
    numpy_preds = predictions[0].data.numpy()

    return numpy_preds[0]
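Here's a hypothetical usage sketch (the wrapper class name is made up; the outputs are raw scores, so apply a softmax if you want probabilities):

import numpy as np

predictor = SentimentPredictor()  # hypothetical class holding load_model/predict_text
predictor.load_model()
scores = predictor.predict_text("I loved this movie")

# softmax converts the two raw class scores into probabilities
probs = np.exp(scores) / np.exp(scores).sum()
print(probs)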

@nickl I think it would be better to do it for a batch of texts rather than just a single text.

Maybe? It makes the code a lot more complicated, and there is the existing TestSetDataLoader (or whatever it is called) which does exactly that.

I don’t think there are any examples of making a single prediction as you would need in an interactive application.
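For reference, a batched version would look roughly like this (a sketch; it assumes the pre-text padding with pad index 1 described earlier in the thread):

import numpy as np

def batch_indices(encoded_texts, pad_idx=1):
    # pre-pad each text to the longest length, then stack so columns are examples
    max_len = max(len(t) for t in encoded_texts)
    padded = [[pad_idx] * (max_len - len(t)) + t for t in encoded_texts]
    return np.array(padded).T  # shape (max_len, n_texts)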

As said above, I think predicting on a single text input would definitely be useful.

@nickl, did you add those functions to a model class, or why do they take self as an argument? It might be better to just add them to the script for now. Also, you should be able to just load the final model, without loading the encoder and classifier separately.
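That is, something like this on its own should be enough (a sketch; it assumes the classifier checkpoint contains the encoder weights as well):

# loading just the full classifier checkpoint (sketch)
self.model.load_state_dict(torch.load(trained_classifier_path, map_location=lambda storage, loc: storage))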

For the tokenization, you probably don’t need to partition by cores for a single text input.

Looks good otherwise.

I’m traveling from today for a week, so will be less responsive. Feel free to submit a PR once it’s ready and I’ll take a look at it once I’m back or someone else can in the meantime.

@sebastianruder I pulled that code from another thing I’m working on, which is where the self arguments come from. I’ll clean that up.

I’ll test loading the final model only. I thought I tried it and it failed, but I don’t remember the specifics.

If you are coming to ACL then welcome to Australia! I’m sadly in both Adelaide and Sydney while it is on but not in Melbourne - otherwise I’d buy you drinks/coffee/something.

Pull request with text prediction script available at https://github.com/fastai/fastai/pull/641

Tagging @sebastianruder

I think these two results may be different because rx[:, :1] will have padding before the text?