Using QRNN in Language Models

Can you explain how you adapted it to the fastai library? Or generally how you would go about adapting other implementations?

How come the pretrained LM with QRNN wasn’t as good as the one without?

I’m training LMs using QRNN and so far I’m just writing down the params I used for each training session and the results after each session. I remember Jeremy changing the dropouts depending on the results from each session, so the whole thing seemed a bit “organic”.

What’s a good way to document the LM training process?
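One lightweight option, offered as a sketch rather than an established fastai workflow: append one JSON record per training session to a log file, so the hyperparameters and the results stay together. The function names `log_run` and `load_runs` and the metric names in the usage note are hypothetical; only `drop_mult` and the learning rate come from this thread.

```python
import json
from datetime import datetime, timezone


def log_run(path, params, results):
    """Append one training session (params + results) as a JSON line."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "params": params,
        "results": results,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_runs(path):
    """Read back every logged session, oldest first."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Usage would look something like `log_run("lm_runs.jsonl", {"drop_mult": 0.5, "lr": 1e-2, "qrnn": True}, {"valid_loss": ...})` after each session. Because every record is self-describing, changing which dropouts you track between sessions does not break the file.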

This case was very specific, since one part of the model needs C-accelerated libraries (here cupy) because the QRNN block isn’t implemented in PyTorch. For the rest, I just took the existing PyTorch implementation and adapted it to fit inside the existing fastai modules.
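For readers wondering what the accelerated part computes: the piece that needs a custom kernel is the sequential "forget-mult" recurrence, usually written as h_t = f_t * x_t + (1 - f_t) * h_{t-1}. Below is a minimal pure-Python reference for a single channel, assuming that formulation (some write-ups swap the roles of f_t and 1 - f_t). The name `forget_mult` is only illustrative and this is not fastai's code.

```python
from typing import List


def forget_mult(x: List[float], f: List[float], h0: float = 0.0) -> List[float]:
    """Sequential reference: h_t = f_t * x_t + (1 - f_t) * h_{t-1}."""
    if len(x) != len(f):
        raise ValueError("x and f must have the same length")
    h_prev = h0
    out = []
    for x_t, f_t in zip(x, f):
        # Each step depends on the previous hidden state, so the loop over
        # time cannot be vectorised away; this is what the kernel speeds up.
        h_prev = f_t * x_t + (1.0 - f_t) * h_prev
        out.append(h_prev)
    return out
```

With f_t = 1 the state is replaced by the input, and with f_t = 0 the previous state is carried over unchanged.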

I never said this; I said there was no shared pretrained QRNN model yet. In my experiments, using a QRNN model pretrained on WT-103 and then fine-tuned on IMDB worked better, as with the LSTMs.


@sgugger I’ve installed cupy-cuda91 with pip and am using fastai 1.0.39. The package itself seems to be installed and the import goes through, but when I run text_classifier_learner(data_clas, qrnn=True) I get FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/.local/share/virtualenvs/ubuntu-7Wf190Ea/lib/python3.6/site-packages/fastai/text/qrnn/forget_mult_cuda.cpp'
I’m not sure why the file was not installed. Any suggestions would be most appreciated!

You don’t need cupy anymore with fastai v1, since I rewrote the C extensions. I’m not sure why you don’t have those files with fastai 1.0.39; they are definitely in the repo. Maybe try a dev install?
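To confirm whether the source files really are missing from an installed package, a quick check along these lines may help. It is only a sketch: the helper names `package_dir` and `missing_files` are made up, and the relative path is taken from the error message above, so it may differ in other fastai versions.

```python
import importlib.util
from pathlib import Path
from typing import Iterable, List, Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the install directory of a package, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def missing_files(root: Path, relative_paths: Iterable[str]) -> List[str]:
    """List the expected files that do not exist under root."""
    return [p for p in relative_paths if not (root / p).is_file()]


if __name__ == "__main__":
    root = package_dir("fastai")
    if root is None:
        print("fastai is not importable in this environment")
    else:
        # Path copied from the FileNotFoundError in this thread.
        expected = ["text/qrnn/forget_mult_cuda.cpp"]
        print(root, "missing:", missing_files(root, expected))
```

If the file shows up as missing, the installed package was built without the extension sources, which is consistent with the suggestion to try a dev install from the repo.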

@sgugger I am getting an error when trying to run learn.lr_find() with QRNN:

`~/SageMaker/envs/fastai/lib/python3.7/site-packages/fastai/text/qrnn/ in compile(self)
107 if self.ptx is None:
--> 109 program = _NVRTCProgram(kernel.encode(), ''.encode())
110 GPUForgetMult.ptx = program.compile()

NameError: name '_NVRTCProgram' is not defined`

Any idea why this could be the case? I installed cupy in the right environment and can import it in my notebook.

This is the old code, and I don’t know much about cupy, so I can’t help you with this one. Did you try the new version?

Hi, I have created a language model and a classification model with AWD_LSTM, using the standard parameters plus the flag qrnn=True. I have a problem loading the encoder of the trained language model into the classification model: it reports that some layers are missing. Please let me know if anyone has a solution, or share your code if you have done something similar. Thanks in advance!
I am new to the forum, so please excuse me if my question is trivial.

Can you share the code you are running or get it to fail with the IMDB_Sample dataset? From there it might be more clear what is going on in your case. I have had success with transfer learning and replicating the IMDB result shared in class with AWD_LSTM, so I can probably help you figure it out.

Here’s my code. I just tried to run it on Colab and hit another problem: it says that a module named "forget_mult_cuda" is not available.

I requested access. I suspect a mismatch between the version of fastai you have installed and the syntax you are using. How does your code compare to the doc examples here?

I have given access. I am using QRNN for the same example as in the docs. Once again, thanks for your time. Can you share any of your QRNN implementations?

You are not the first one to be confused by this! Rather than defining the config as a dict from scratch, make a copy of the default, tweak it, and pass the copy to the language_model_learner constructor. This way you are sure to have all the params that are needed to train. I put that into your notebook; can you see it? It should look like this:

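from fastai.text import *  # fastai v1

# Start from the default config so every required key is present, then override.
config = awd_lstm_lm_config.copy()
config['qrnn'] = True

# Pass the modified copy, not the untouched default.
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5, config=config, pretrained=False)
learn.fit_one_cycle(1, 1e-2)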

Then proceed as usual. Let me know if you have any trouble with it.
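To illustrate why copying the default is safer than writing a dict by hand, here is a small standalone sketch. `DEFAULT_CONFIG` is a stand-in with illustrative keys and values, not fastai's actual default, and `make_config` is a hypothetical helper.

```python
# Stand-in for a library default config; keys and values are illustrative only.
DEFAULT_CONFIG = {"emb_sz": 400, "n_hid": 1150, "n_layers": 3, "qrnn": False}


def make_config(defaults: dict, **overrides) -> dict:
    """Copy the defaults and apply overrides, rejecting misspelled keys."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    config = defaults.copy()   # the shared default stays untouched
    config.update(overrides)
    return config


config = make_config(DEFAULT_CONFIG, qrnn=True)
```

A hand-written `{"qrnn": True}` would be missing every other key the model constructor expects, while the copy keeps them all and changes only the one flag. The extra key check is optional, but it catches typos such as `qrn=True` early.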


Thank you so much man… It’s working now!!

Hello, I am trying the same thing with the Transformer as well, and I got a similar error while loading the encoder. Could you look into the code, if you don’t mind?

Looks like a typo. Also, the QRNN option is not relevant to the Transformer.

Yaa got it thanks!

I am curious: has anyone tried using SRU blocks?

They seem to be a slight variation of QRNN, but look simpler and might be faster. For some reason they did not get a lot of love, but apart from random “this is not innovative enough” rejections, I cannot tell whether they have actually been tried.


I tried to train a QRNN on an AMD GPU and got this error:

2020-01-27 13:33:45 UTC -- Traceback (most recent call last):
2020-01-27 13:33:45 UTC --   File "", line 66, in <module>
2020-01-27 13:33:45 UTC --     learn = language_model_learner(data_lm, AWD_LSTM, config=config, pretrained=False, drop_mult=0.1, wd=0.1) #new
2020-01-27 13:33:45 UTC --   File "/usr/local/lib/python3.6/dist-packages/fastai/text/", line 206, in language_model_learner
2020-01-27 13:33:45 UTC --     model = get_language_model(arch, len(data.vocab.itos), config=config, drop_mult=drop_mult)
2020-01-27 13:33:45 UTC --   File "/usr/local/lib/python3.6/dist-packages/fastai/text/", line 197, in get_language_model
2020-01-27 13:33:45 UTC --     encoder = arch(vocab_sz, **config)
2020-01-27 13:33:45 UTC --   File "/usr/local/lib/python3.6/dist-packages/fastai/", line 66, in _init
2020-01-27 13:33:45 UTC --     old_init(self, *args,**kwargs)
2020-01-27 13:33:45 UTC --   File "/usr/local/lib/python3.6/dist-packages/fastai/text/models/", line 88, in __init__
2020-01-27 13:33:45 UTC --     from .qrnn import QRNN
2020-01-27 13:33:45 UTC --   File "/usr/local/lib/python3.6/dist-packages/fastai/text/models/", line 11, in <module>
2020-01-27 13:33:45 UTC --     forget_mult_cuda = load(name='forget_mult_cuda', sources=[fastai_path/f for f in files])
2020-01-27 13:33:45 UTC --   File "/root/.local/lib/python3.6/site-packages/torch/utils/", line 679, in load
2020-01-27 13:33:45 UTC --     is_python_module)
2020-01-27 13:33:45 UTC --   File "/root/.local/lib/python3.6/site-packages/torch/utils/", line 865, in _jit_compile
2020-01-27 13:33:45 UTC --     with_cuda=with_cuda)
2020-01-27 13:33:45 UTC --   File "/root/.local/lib/python3.6/site-packages/torch/utils/", line 899, in _write_ninja_file_and_build
2020-01-27 13:33:45 UTC --     verbose)
2020-01-27 13:33:45 UTC --   File "/root/.local/lib/python3.6/site-packages/torch/utils/", line 966, in _prepare_ldflags
2020-01-27 13:33:45 UTC --     extra_ldflags.append('-L{}'.format(_join_cuda_home('lib64')))
2020-01-27 13:33:45 UTC --   File "/root/.local/lib/python3.6/site-packages/torch/utils/", line 1254, in _join_cuda_home
2020-01-27 13:33:45 UTC --     raise EnvironmentError('CUDA_HOME environment variable is not set. '
2020-01-27 13:33:45 UTC -- OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

Is it possible to train a QRNN on an AMD GPU? What should I do?

No, it uses a custom CUDA kernel that PyTorch compiles just in time, which I’m guessing does not work on a non-NVIDIA GPU.
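Before enabling the QRNN option, it can be worth checking whether a CUDA toolkit is visible at all, since that is what the traceback above complains about. The sketch below is a rough approximation and not PyTorch's actual lookup logic: it checks the `CUDA_HOME` environment variable named in the error and then falls back to looking for `nvcc` on the PATH. The function name `guess_cuda_home` is hypothetical.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def guess_cuda_home(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Best-effort guess at the CUDA install root, or None if nothing is found."""
    home = env.get("CUDA_HOME")
    if home:
        return home
    nvcc = which("nvcc")
    if nvcc:
        # Assumption: nvcc lives at <cuda root>/bin/nvcc.
        return os.path.dirname(os.path.dirname(nvcc))
    return None


if __name__ == "__main__":
    root = guess_cuda_home()
    print("CUDA root:", root if root else "not found (QRNN kernel cannot be built)")
```

On a machine with only an AMD GPU this would normally report that nothing was found, which matches the answer above: without the CUDA toolkit, the kernel cannot be compiled.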