Using QRNN in Language Models

#21

This was very specific, since there is a part of the model where we need C-accelerated libraries (here cupy) because the QRNN block isn't implemented in pytorch. For the rest, I just took the existing pytorch implementation and adapted it inside the existing modules of fastai.
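For context, the recurrence that the custom kernel accelerates is the QRNN "forget mult", h_t = f_t * x_t + (1 - f_t) * h_{t-1}, which is sequential in t and therefore slow in plain pytorch. A minimal numpy reference of the recurrence itself (a sketch only, not fastai's actual CUDA implementation):

```python
import numpy as np

def forget_mult(f, x, h0=None):
    """Reference QRNN forget-mult: h_t = f_t * x_t + (1 - f_t) * h_{t-1}.

    f, x: arrays of shape (seq_len, hidden), with f in [0, 1].
    h0: optional initial hidden state of shape (hidden,).
    """
    h = np.zeros(x.shape[1]) if h0 is None else h0
    out = np.empty_like(x)
    for t in range(x.shape[0]):  # this sequential loop is what the CUDA kernel fuses
        h = f[t] * x[t] + (1.0 - f[t]) * h
        out[t] = h
    return out
```

With f equal to 1 everywhere the output is just x; with f equal to 0 the initial state is carried through unchanged, which is a quick sanity check on the recurrence.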

I never said this, I said there was no shared pretrained QRNN model yet. In my experiments, using a QRNN model pretrained on WT-103 then finetuned on imdb worked better, as with the LSTMs.

2 Likes

#22

@sgugger I've installed cupy-cuda91 with pip according to https://docs-cupy.chainer.org/en/stable/install.html and am using fastai 1.0.39. The package itself seems to be installed and the import goes through, but when I run text_classifier_learner(data_clas, qrnn=True) I get FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/.local/share/virtualenvs/ubuntu-7Wf190Ea/lib/python3.6/site-packages/fastai/text/qrnn/forget_mult_cuda.cpp'
Not sure why the file is not installed, any suggestions would be most appreciated!
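One quick way to confirm whether a wheel actually shipped a source file is to look inside the installed package directory. A generic helper for that check (the fastai path in the usage line is just the one from the error message above):

```python
import importlib.util
import os

def package_has_file(pkg, relpath):
    """Check whether an installed package's directory contains a given file."""
    spec = importlib.util.find_spec(pkg)
    if spec is None or not spec.submodule_search_locations:
        return False  # package not installed, or it is a single module
    root = list(spec.submodule_search_locations)[0]
    return os.path.exists(os.path.join(root, relpath))

# e.g. package_has_file('fastai', 'text/qrnn/forget_mult_cuda.cpp')
```

If this returns False for a file that is present in the repo, the installed copy is stale or incomplete and reinstalling (or a dev install) is the likely fix.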

0 Likes

#23

You don't need cupy anymore with fastai v1 since I rewrote the C extensions. I'm not sure why you don't have those files with fastai 1.0.39, they are definitely in the repo. Maybe try a dev install?
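For reference, a dev (editable) install pulls the package straight from the repo, so the qrnn source files are guaranteed to be present. A sketch of the usual commands (the `[dev]` extras name is an assumption; check the repo's own install notes):

```shell
# Editable install of fastai from source, so site-packages points at the repo checkout
git clone https://github.com/fastai/fastai
cd fastai
pip install -e ".[dev]"
```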

0 Likes

#24

@sgugger I am getting an error when trying to run learn.lr_find() with QRNN:

~/SageMaker/envs/fastai/lib/python3.7/site-packages/fastai/text/qrnn/forget_mult.py in compile(self)
    107         if self.ptx is None:
    108
--> 109             program = _NVRTCProgram(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
    110             GPUForgetMult.ptx = program.compile()
    111

NameError: name '_NVRTCProgram' is not defined

Any idea why this could be the case? I installed cupy in the right environment and can import it in my notebook.

0 Likes

#25

This is the old code, and I don't know much about cupy, so I can't help you on this one. Did you try the new version?

0 Likes

(dinesh) #26

Hi, I have created a language model and a classification model with AWD_LSTM standard parameters and the flag qrnn=True. I have a problem while loading the encoder of the trained language model into the classification model: it says that some layers are missing. Please let me know if anyone has a solution, or share your code if you have done a similar thing. Thanks in advance!
I am new to the forum, please excuse me if my question is trivial.

0 Likes

(Bobak Farzin) #27

Can you share the code you are running, or get it to fail with the IMDB_Sample dataset? From there it might be clearer what is going on in your case. I have had success with transfer learning and replicating the IMDB result shared in class with AWD_LSTM, so I can probably help you figure it out.

0 Likes

(dinesh) #28

https://colab.research.google.com/drive/1xdfrJm8H0RAewvB31QdrVkIKjYA9Mj7F#scrollTo=kD3i656hhLrW

Here's my code. Right now, when I try to run this on colab, I get another problem: it says the module "forget_mult_cuda" is not available.

1 Like

(Bobak Farzin) #29

I requested access. I suspect the version of fastai you are on doesn't match the syntax you are using. How does your code compare to the doc examples here?

0 Likes

(dinesh) #30

I have given access. I am using QRNN for the same example in the docs. Once again, thanks for your time. Can you share any of your implementations of QRNN?

0 Likes

(Bobak Farzin) #31

You are not the first one to be confused by this! Rather than defining the config as a dict from scratch, you make a copy of the default, tweak it, and pass it to the language_model_learner constructor. This way you are sure to have all the params that are needed to train. I put that into your notebook, can you see it? It should look like this:

config = awd_lstm_lm_config.copy()
config['qrnn'] = True

learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5, config=config, pretrained=False)
learn.fit_one_cycle(1, 1e-2)

Then proceed as usual. Let me know if you have any trouble with it.
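The `.copy()` matters because the default config is a shared module-level dict: mutating it in place would silently change it for every learner created afterwards. A minimal stand-in illustration of the copy-then-tweak pattern (the dict below is a placeholder mimicking fastai's awd_lstm_lm_config, not the real object):

```python
# Stand-in for a shared module-level default config (placeholder values).
awd_lstm_lm_config = {'emb_sz': 400, 'n_hid': 1152, 'n_layers': 3, 'qrnn': False}

def make_qrnn_config(default):
    """Copy-then-tweak: return a QRNN variant without touching the shared default."""
    config = default.copy()
    config['qrnn'] = True
    return config

config = make_qrnn_config(awd_lstm_lm_config)
```

After this, `config['qrnn']` is True while the shared default still has `qrnn` False, so later learners built from the default are unaffected.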

1 Like

(dinesh) #32

Thank you so much man… It's working now!!

0 Likes

(dinesh) #33

Hello, I am trying the same thing with the Transformer as well. There I got a similar error while loading the encoder model. Can you look into the code if you don't mind…
https://colab.research.google.com/drive/1xdfrJm8H0RAewvB31QdrVkIKjYA9Mj7F#scrollTo=kD3i656hhLrW

1 Like

(Bobak Farzin) #34

Looks like a typo. And QRNN is not relevant with Transformer.

0 Likes

(dinesh) #35

Yeah, got it, thanks!

0 Likes