NLP Prediction KeyError "tensor(0)"

Hello,

I followed this project to fine-tune an existing German model: https://github.com/jfilter/ulmfit-for-german.
Everything worked so far, but after training the text_classifier_learner I can't predict on new data. Every time I call learn.predict(str) I get the error from the title (KeyError: tensor(0)).

What I did so far:
I load the language_model_learner like this (the model was trained before the 1.0.53 "Major new changes and features" release):

config = awd_lstm_lm_config.copy()
config['n_hid'] = 1150
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5, pretrained=False, config=config)
learn_lm.load('./nets/ulmfit_for_german_jfilter')

I fine-tune it and save it with learn_lm.save_encoder('enc').

Then I load it with the text_classifier_learner:

config = awd_lstm_clas_config.copy()
config['n_hid'] = 1150
learn = text_classifier_learner(data_train, AWD_LSTM, drop_mult=0.5, config=config)
learn.load_encoder('enc', device='cuda:0')

and trained the classifier on existing data.

Everything works fine (the accuracy during training looks OK), but I can't predict anything with learn.predict("str") because of the error stated above.

Does anybody have an idea what the problem could be?

Yours,
Pa

No one can answer without knowing how you assembled your data. learn.predict expects its input to be prepared the same way.
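
To illustrate the point about preparing the input the same way, here is a minimal sketch in plain Python. It assumes nothing about fastai itself: the toy vocabulary, the special tokens and the `encode` helper are all made up for the example. The idea is only that one function does the cleaning, tokenizing and numericalizing, and that the same function is used for the training texts and for the text you want a prediction for.

```python
import re

# Toy vocabulary (made up); in the thread this role is played by the BPEmb vocabulary.
itos = ["xxunk", "xxbos", "xxeos", "hallo", "welt"]
stoi = {tok: i for i, tok in enumerate(itos)}

def encode(text):
    """Hypothetical helper: clean, tokenize and numericalize one string."""
    tokens = re.findall(r"\w+", text.lower())
    ids = [stoi.get(tok, stoi["xxunk"]) for tok in tokens]
    # Wrap the sequence in begin/end markers, as the training data was.
    return [stoi["xxbos"]] + ids + [stoi["xxeos"]]

# Training data and prediction input go through the very same function.
train_ids = [encode(t) for t in ["Hallo Welt", "Welt hallo"]]
query_ids = encode("Hallo, unbekannt!")
```

If the prediction path uses a different cleaning step or a different vocabulary than the training path, the model may still run but will see token ids it was not trained on.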

Unfortunately, I do not know what you mean by "assembly".
I used an existing word-embedding layer like this:

bpemb_de = BPEmb(lang="de", vs=25000, dim=300)
itos = dict(enumerate(bpemb_de.words + ['xxpad']))
voc = Vocab(itos)
df_valid = df_valid.text.apply(lambda x: bpemb_de.encode_ids_with_bos_eos(clean(x, stp_lang='german')))

But I think I know what you mean… I just wrote this method for prediction:

def con(x):
    return torch.from_numpy(np.array(bpemb_de.encode_ids_with_bos_eos(clean(x, stp_lang='german'))))

learn.predict(con("str"))

But now I get the error:

ValueError: only one element tensors can be converted to Python scalars

Am I still getting something wrong?

Basically, I followed the instructions in this notebook.

I did exactly the same and have the same error…

Did you solve it?

I found out what it is.

The itos has to be converted to a list :)
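
A likely explanation for why the list matters, sketched in plain Python so it runs without fastai or PyTorch. The assumption here is that the vocabulary gets indexed with 0-dim tensors rather than plain ints: such a tensor can be used as a list index, but as a dict key it is hashed by identity, so the lookup misses and you get KeyError: tensor(0). The `ScalarLike` class below is a hypothetical stand-in for that tensor behaviour, and the tokens are made up.

```python
class ScalarLike:
    """Hypothetical stand-in for a 0-dim integer tensor: usable as a list
    index via __index__, but hashed by identity rather than by value."""
    def __init__(self, value):
        self.value = value
    def __index__(self):
        return self.value
    def __repr__(self):
        return f"tensor({self.value})"

words = ["xxbos", "hallo", "welt"]   # made-up tokens
itos_dict = dict(enumerate(words))   # what the original post built
itos_list = list(words)              # the fix: a plain list

def lookup(itos, i):
    """Index the vocabulary and report a KeyError instead of raising it."""
    try:
        return itos[i]
    except KeyError as err:
        return f"KeyError: {err}"

idx = ScalarLike(0)
dict_result = lookup(itos_dict, idx)  # misses: the key 0 is not the object idx
list_result = lookup(itos_list, idx)  # works: list indexing calls __index__
print(dict_result)
print(list_result)
```

Applied to the code earlier in the thread, this would mean building itos as a list (for example with list(...) around the words plus 'xxpad') instead of dict(enumerate(...)) before passing it to Vocab.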

Could someone provide a notebook with a working example?

I am also trying to implement a German model, but the code at https://github.com/jfilter/ulmfit-for-german
seems very outdated and uses an old version of fastai.

In particular, I do not know how to use the following lines:

TextClasDataBunch.from_ids(...)

I think the class should now be TextDataLoaders, but it no longer has a 'from_ids' method.

Vocab(itos)

I cannot find this class. Is it based on 'torchtext.vocab.Vocab'?

Thanks for any hints.