Porting Text Learner to PyTorch

shensmobile · July 17, 2020, 6:17pm

Hi all!

I’ve been using fast.ai to start learning about NLP and have had amazing success with the text learner. I’ve developed a few models that I’d like to deploy in some Python apps that I’m playing around with. For testing, I’ve just been importing everything from fastai.text, loading my saved model into a learner, and running learn.predict(). The problem is that the full fast.ai library is quite large, which makes the installed package too large.

I think I have two ways around this:

1. Hope to god that there’s a way to reduce the footprint of the fast.ai library and continue performing inference using learn.predict()
2. Export the model weights and import them into PyTorch so I don’t need to import the full fast.ai library

I’ve been thinking about using option 2 since I have a bit more control there, but I’m unsure how to proceed. I’ve made data loaders for images going into resnet50, but I have not worked as much with RNNs in PyTorch. What is learn.predict() doing when I feed it text? Is it just tokenizing the input text and putting it into the AWD-LSTM model? How do I make sure that I tokenize my input text using the same dict/tokens as I was using during training with fast.ai?
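For anyone reading along with the same question about reusing the training vocabulary, here is a rough sketch of the numericalization step in plain Python. It assumes the vocabulary was saved from training as an index-to-string list (fast.ai calls this `itos`) and that it contains the `xxunk` unknown-word token. The helper names (`build_stoi`, `simple_tokenize`, `numericalize`) and the tiny vocab are made up for illustration. The tokenizer is only a simplified stand-in: fast.ai's real pipeline uses spaCy plus extra rules and special tokens such as `xxmaj`, so you would need to reproduce or reuse it to get identical ids.

```python
import re
from typing import Dict, List

# Special tokens that fast.ai puts in its text vocab.
UNK, PAD, BOS = "xxunk", "xxpad", "xxbos"


def build_stoi(itos: List[str]) -> Dict[str, int]:
    """Invert the index-to-string list saved at training time."""
    return {tok: idx for idx, tok in enumerate(itos)}


def simple_tokenize(text: str) -> List[str]:
    """Simplified stand-in tokenizer (NOT fast.ai's spaCy-based one).

    Lowercases, splits words from punctuation, and prepends xxbos.
    """
    return [BOS] + re.findall(r"\w+|[^\w\s]", text.lower())


def numericalize(tokens: List[str], stoi: Dict[str, int]) -> List[int]:
    """Map tokens to ids, sending out-of-vocabulary tokens to xxunk."""
    unk_id = stoi[UNK]
    return [stoi.get(tok, unk_id) for tok in tokens]


# Hypothetical tiny vocab; in practice, load the itos list saved from training.
itos = [UNK, PAD, BOS, "the", "movie", "was", "great", "."]
stoi = build_stoi(itos)

tokens = simple_tokenize("The movie was AWESOME.")
ids = numericalize(tokens, stoi)
print(tokens)
print(ids)
```

The resulting list of ids is what would be turned into a tensor (with a batch dimension) and passed to the model. The key point is that the mapping comes from the saved training vocab rather than from a freshly built one, so every id means the same thing it did during training.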

If anyone has any advice or guidance, I’d really appreciate it. Thank you so much for your help!

morgan · July 21, 2020, 10:39am

Have you seen the work @muellerzr is doing with fastinference? If not, I think it should be of help.