I want to use both the forward and backward wiki103 models without fine-tuning to do generation, something similar to the example below:
TEXT = "I liked this movie because"
N_WORDS = 40
N_SENTENCES = 2
preds = [learn.predict(TEXT, N_WORDS, temperature=0.75)
         for _ in range(N_SENTENCES)]
In https://github.com/fastai/fastbook/blob/master/02_production.ipynb, the book says that, when a model is exported, fastai “even saves the definition of how to create the Dataloaders [which] (…) is important because otherwise you’d have to redefine how to transform your data in order to use your model in production.”
So my expectation was that, after downloading the WIKI103_BWD and WIKI103_FWD models, I’d only have to load them with something like
learn_inference_bwd = load_learner(<path to wiki103 bwd model>)
learn_inference_fwd = load_learner(<path to wiki103 fwd model>)
and then use them for generation as in the code shown above. Is there a way to make this work as simply as that sounds? So far, I haven’t managed to.
I’ve seen examples (e.g., ULMFit without finetuning) where the suggested way to get the learner is something like:
preTrainedWt103Path = DATA_PATH/'models/wt103'
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5, pretrained=False)
learn.load_pretrained(wgts_fname=preTrainedWt103Path/'fwd_wt103.h5', itos_fname=preTrainedWt103Path/'itos_wt103.pkl', strict=False)
Elsewhere, in the current docs (https://docs.fast.ai/tutorial.text.html#The-ULMFiT-approach), I found something similar, which also needs a ‘language_model_learner’:
learn = language_model_learner(dls_lm, AWD_LSTM, metrics=[accuracy, Perplexity()], path=path, wd=0.1).to_fp16()
But, to get any of these, I’d need dataloaders, which from the page above would be something like:
dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)
These look like what you’d do to (further) train an existing model on more data, so my question is: why do I need data loaders (pointing to a path) when I’m just using a pre-trained model to do some basic generation?
If I do need dataloaders, what should I point them to when I’m not going to do any fine-tuning? Does anyone have a snippet of code that loads, for instance, WIKI103_BWD and does a straight generation from a text prompt?
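To make the question more concrete: the one piece that presumably has to come from somewhere is the vocabulary. The model only sees integer token ids, so the prompt has to be mapped through the same itos list (e.g., the itos_wt103.pkl mentioned above) that the pre-trained weights were trained with, and the predicted ids have to be mapped back. Below is a toy sketch of that mapping. The five-token vocabulary and the helper names `numericalize` and `textify` are made up for illustration, and plain whitespace splitting stands in for the real tokenizer; this is not fastai's implementation.

```python
# Toy illustration only: a hypothetical five-token vocabulary, not the real
# WikiText-103 itos list, and not fastai's tokenizer.
itos = ["xxunk", "i", "liked", "this", "movie"]   # id -> token
stoi = {tok: i for i, tok in enumerate(itos)}     # token -> id


def numericalize(text):
    """Lowercase, split on whitespace, map tokens to ids (unknown -> 0)."""
    return [stoi.get(tok, 0) for tok in text.lower().split()]


def textify(ids):
    """Map ids back to their tokens and join them with spaces."""
    return " ".join(itos[i] for i in ids)


ids = numericalize("I liked this movie because")
print(ids)           # [1, 2, 3, 4, 0]  ("because" is not in the toy vocab)
print(textify(ids))  # i liked this movie xxunk
```

So what I'm really asking is whether the dataloaders are needed for anything beyond this kind of mapping when no training happens.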
Any help would be very welcome!