Using QRNN in Language Models


This was very specific, since one part of the model needs a C-accelerated library (here cupy) because the QRNN block isn't implemented in PyTorch. For the rest, I just took the existing PyTorch implementation and adapted it to fit the existing fastai modules.
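For readers wondering what the accelerated part computes, here is a minimal pure-Python sketch. It assumes the forget-mult recurrence h_t = f_t * x_t + (1 - f_t) * h_{t-1} used by common QRNN implementations. The function name and the list-of-lists layout are illustrative only and are not the fastai API.

```python
def forget_mult(f, x, h_init=None):
    """Sequential forget-mult: h_t = f_t * x_t + (1 - f_t) * h_{t-1}.

    f, x: lists of time steps, each a list of floats of equal width.
    h_init: optional initial hidden state (defaults to zeros).
    Returns the list of hidden states, one per time step.
    """
    width = len(x[0])
    h_prev = list(h_init) if h_init is not None else [0.0] * width
    outputs = []
    # Each step depends on the previous one, so the loop cannot be
    # vectorised over time.
    for f_t, x_t in zip(f, x):
        h_t = [ft * xt + (1.0 - ft) * hp
               for ft, xt, hp in zip(f_t, x_t, h_prev)]
        outputs.append(h_t)
        h_prev = h_t
    return outputs
```

Because the loop is sequential in time, running it step by step in Python is slow on long sequences, which is the motivation for a compiled kernel.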

I never said this; I said there was no shared pretrained QRNN model yet. In my experiments, a QRNN model pretrained on WT-103 and then fine-tuned on IMDB worked better, as with the LSTMs.



@sgugger I've installed cupy-cuda91 with pip and am using fastai 1.0.39. The package itself seems to be installed and the import goes through, but when I run text_classifier_learner(data_clas, qrnn=True) I get FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/.local/share/virtualenvs/ubuntu-7Wf190Ea/lib/python3.6/site-packages/fastai/text/qrnn/forget_mult_cuda.cpp'
I'm not sure why the file was not installed. Any suggestions would be most appreciated!



You don't need cupy anymore with fastai v1, since I rewrote the C extensions. I'm not sure why you don't have those files with fastai 1.0.39; they are definitely in the repo. Maybe try a dev install?
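Before reinstalling, it can help to confirm which files are actually missing from the installed package. The sketch below uses only the standard library. The helper name `missing_package_files` is hypothetical, and the path in the usage comment is taken from the error message above.

```python
import importlib.util
from pathlib import Path


def missing_package_files(package_name, relative_paths):
    """Return the relative paths that are absent from an installed package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ImportError(f"package {package_name!r} is not installed")
    # Root directory of the installed package on disk.
    root = Path(list(spec.submodule_search_locations)[0])
    return [p for p in relative_paths if not (root / p).is_file()]


# Usage (requires fastai to be installed):
# missing_package_files("fastai", ["text/qrnn/forget_mult_cuda.cpp"])
```

A non-empty result means the install did not ship those files, in which case a dev install from the repo is a reasonable next step.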



@sgugger I am getting an error when trying to run learn.lr_find() with QRNN:

`~/SageMaker/envs/fastai/lib/python3.7/site-packages/fastai/text/qrnn/ in compile(self)
107 if self.ptx is None:
--> 109 program = _NVRTCProgram(kernel.encode(), ''.encode())
110 GPUForgetMult.ptx = program.compile()

NameError: name '_NVRTCProgram' is not defined`

Any idea why this could be the case? I installed cupy in the right environment and can import it in my notebook.



This is the old code, and I don't know much about cupy, so I can't help you on this one. Did you try the new version?


(dinesh) #26

Hi, I have created a language model and a classification model with awd_lstm, using the standard parameters and the QRNN flag set to true. I have a problem when loading the encoder of the trained language model into the classification model: it reports that some layers are missing. Please let me know if anyone has a solution, or share your code if you have done something similar. Thanks in advance!
I am new to the forum, so please excuse me if my question is trivial.


(Bobak Farzin) #27

Can you share the code you are running or get it to fail with the IMDB_Sample dataset? From there it might be more clear what is going on in your case. I have had success with transfer learning and replicating the IMDB result shared in class with AWD_LSTM, so I can probably help you figure it out.
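A "missing layers" error when loading weights comes from a mismatch between the parameter names in the saved file and those in the target model. One possible cause, and it is only an assumption here, is that the two models were built with different settings. A small helper like the one below can show exactly which names differ. The helper and the key names in the usage comment are hypothetical.

```python
def compare_state_keys(saved_keys, model_keys):
    """Report parameter names present on one side but not the other."""
    saved, model = set(saved_keys), set(model_keys)
    return {
        # Names the model expects but the saved file lacks.
        "missing_in_saved": sorted(model - saved),
        # Names in the saved file that the model does not have.
        "unexpected_in_saved": sorted(saved - model),
    }


# Usage with made-up key names:
# compare_state_keys(["encoder.weight", "rnns.0.a"],
#                    ["encoder.weight", "rnns.0.b"])
```

With PyTorch, the two lists would typically come from the keys of the loaded state dict and from `model.state_dict().keys()`.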


(dinesh) #28

Here's my code. I have now tried to run it on Colab and hit another problem: it reports that a module "forget_mult_cuda" is not available.


(Bobak Farzin) #29

I requested access. I suspect the fastai version you have installed doesn't match the syntax you are using (an older version versus the more current one). How does your code compare to the doc examples here?


(dinesh) #30

I have given access. I am using QRNN for the same example as in the doc. Once again, thanks for your time. Can you share any of your implementations of QRNN?


(Bobak Farzin) #31

You are not the first one to be confused by this! Rather than defining the config as a dict, you make a copy of the default, tweak it, and pass it to the language_model_learner constructor. This way you are sure to have all the params that are needed to train. I put that into your notebook; can you see it? It should look like this:

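from fastai.text import *  # fastai v1: language_model_learner, AWD_LSTM, awd_lstm_lm_config

# Start from the default config so every required key is present, then enable QRNN.
config = awd_lstm_lm_config.copy()
config['qrnn'] = True

# data_lm is the language-model DataBunch built earlier in the notebook.
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5, config=config, pretrained=False)
learn.fit_one_cycle(1, 1e-2)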

Then proceed as usual. Let me know if you have any trouble with it.
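To illustrate why copying the default works better than writing a small dict by hand, here is a self-contained sketch. `DEFAULT_CONFIG` and `build_model` are stand-ins invented for this example. Their keys and values are illustrative only and are not the real fastai defaults.

```python
# Stand-in for a library's default config (illustrative keys and values).
DEFAULT_CONFIG = {"emb_sz": 400, "n_hid": 1150, "n_layers": 3, "qrnn": False}


def build_model(emb_sz, n_hid, n_layers, qrnn):
    """Hypothetical constructor that requires every config key."""
    return {"emb_sz": emb_sz, "n_hid": n_hid, "n_layers": n_layers, "qrnn": qrnn}


# A hand-written partial dict lacks the other required keys.
partial = {"qrnn": True}
try:
    build_model(**partial)
    partial_ok = True
except TypeError:
    partial_ok = False

# Copy the default, then override only what you need.
config = DEFAULT_CONFIG.copy()
config["qrnn"] = True
model = build_model(**config)
```

Working on a copy also leaves the shared default untouched, so other learners created later in the same session still see the original values.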


(dinesh) #32

Thank you so much, man! It's working now!


(dinesh) #33

Hello, I am trying the same thing with the Transformer as well. There I got a similar error while loading the encoder. Could you look into the code, if you don't mind?


(Bobak Farzin) #34

Looks like a typo. Also, QRNN is not relevant to the Transformer.