dnzzl
(Thomas Legrand)
#1
Hello everyone,
I’m trying to deploy ULMFiT models for several languages (English, Spanish, and French).
I used exactly the same workflow for all languages, based on sgugger’s DeepFrench notebook, with the corresponding pre-trained weights.
The English model is 138 MB, the Spanish one 170 MB, and the French one 93 MB.
However, at inference time I see big differences on the same machine under the same conditions:
- English: around 0.32 s
- Spanish: around 0.17 s
- French: around 10 s!
What could explain this difference? How can I improve it?
Bonus question: has anyone managed to export the model to the ONNX format?
Thank you for your help
sgugger
#2
Is this all with the same exact version of fastai? I pushed things to make the predict method for language models faster recently.
dnzzl
(Thomas Legrand)
#3
It is indeed; everything was trained with 1.0.43.dev0.
Maybe I’ll wait for the next release to sort out the CPU weights and the new learner arguments.
dnzzl
(Thomas Legrand)
#4
Is it possible the issue is not directly related to fastai but rather to spaCy (#3242)?
sgugger
#5
It’s likely, since that’s the only difference I can see between the languages.