After creating an Arabic MultiFiT model (based on work by @pierreguillou), I tried to use it for inference. I can load the learner from the export.pkl file, but when I call predict, sentencepiece tries to load its encoding model from a hard-coded path, “/root/.fastai/data/[wiki-path]/tmp/spm.model”. I can, of course, create this path locally and copy spm.model into it, but that makes deployment awkward. I also tried passing the text pre-encoded by sentencepiece, but that did not work.
How can I get around this? Here’s the error:
/usr/local/lib/python3.6/dist-packages/sentencepiece.py in Load(self, filename)
117 def Load(self, filename):
--> 118 return _sentencepiece.SentencePieceProcessor_Load(self, filename)
120 def LoadOrDie(self, filename):
OSError: Not found: "/root/.fastai/data/.../tmp/spm.model": No such file or directory Error #2
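Until the path handling changes, one blunt workaround is to stage a local spm.model at the path the exported learner expects before calling predict. A minimal sketch, assuming only the standard library; the throwaway paths below are stand-ins for your real model file and for whatever cache directory your own traceback shows:

```python
import shutil
import tempfile
from pathlib import Path

def stage_spm_model(local_spm: Path, expected_dir: Path) -> Path:
    """Copy a local spm.model into the directory the exported learner expects."""
    expected_dir.mkdir(parents=True, exist_ok=True)  # recreate the hard-coded path
    dst = expected_dir / "spm.model"
    shutil.copy(str(local_spm), str(dst))
    return dst

# Demo with throwaway paths (stand-ins for your real model and cache dir):
work = Path(tempfile.mkdtemp())
(work / "spm.model").write_bytes(b"\x00")  # placeholder bytes for a real model
staged = stage_spm_model(work / "spm.model", work / "data" / "tmp")
```

Running something like this at app start-up (with the real paths) lets load_learner and predict find spm.model without you having to bake the training machine's folder layout into your deployment image by hand.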
When loading the learner, does learn.data.path give you anything?
Yes, that’s the current folder, where the exported model is loaded from.
learn = load_learner('/content/','ar_classifier_hard_sp15_multifit.pkl')
Result (from colab notebook): PosixPath('/content')
I guess at least part of the hard-coded path is stored in learn somewhere.
The trained SentencePiece model is saved in cache_dir (an argument of SentencePieceTokenizer that you can set to whatever you like). It's very likely that the absolute path gets saved when exporting the learner; I can check whether we can save it as a relative path instead, which would probably be easier for deployment.
Thanks Sylvain, that would be great. I recall switching fastai versions (to 1.0.57) while building the databunch because of sentencepiece. For now, I got it to work through Docker on Heroku and it seems to run fine. Not a great coder, but here's what I did:
from fastai.core import Config  # fastai v1

data_path = Config.data_path()      # ~/.fastai/data by default
name = 'arwiki/corpus2_100/tmp/'
path_t = data_path/name             # recreate the path the model expects
Here’s the ‘polished’ app: Arabic Sentiment Analyzer
First, AbuFadl, congrats on this neat project.
I am having the same problem, but I want to use several inference learners at the same time, and moving different spm.model files around for each predict call is not optimal.
Is there a way to set this path in the inference learner instance?
Is there a way to set this path before I export my model after training, that does not affect the export?
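One idea along those lines is to overwrite the processor's cache_dir right after load_learner, so each learner instance looks for spm.model in its own folder. The exact attribute chain varies across fastai versions (learn.data.x.processor is typical for v1, but check yours); the sketch below exercises the helper against a stand-in object rather than a real learner:

```python
from pathlib import Path
from types import SimpleNamespace

def set_spm_cache(learn, cache_dir):
    """Point every text processor's cache_dir at a local folder.

    Assumes the SentencePiece processor hangs off learn.data.x.processor,
    which is typical for fastai v1 but worth verifying in your version.
    """
    for proc in learn.data.x.processor:
        if hasattr(proc, "cache_dir"):
            proc.cache_dir = Path(cache_dir)

# Stand-in for a loaded learner, just to exercise the helper:
fake_proc = SimpleNamespace(cache_dir=Path("/root/.fastai/data/arwiki/tmp"))
fake_learn = SimpleNamespace(
    data=SimpleNamespace(x=SimpleNamespace(processor=[fake_proc]))
)
set_spm_cache(fake_learn, "/content")
```

With a real learner you would call set_spm_cache(learn, path_to_folder_holding_spm_model) once per loaded learner, which avoids shuffling spm.model files between predict calls.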
I had some success by changing the process method of the class SPProcessor2:
def process(self, ds):
I can create a working inference learner if I recreate the folder structure for spm.model and export.pkl. I could set the correct path in process with ds.path (cache_dir = ds.path). However, in the method _encode_batch() I then get the error “unk is not defined”.
The module versions used for export and load are identical. This really should not depend on the folder structure.