learn.summary() throws RuntimeError: Input and hidden tensors are not at the same device, found input tensor at cuda:1 and hidden tensor at cpu

wgpubs · February 12, 2020, 1:32am

When I run learn.summary() it throws this error:

RuntimeError: Input and hidden tensors are not at the same device, found input tensor at cuda:1 and hidden tensor at cpu

Here’s the code …

best_model = SaveModelCallback(monitor='accuracy', comp='max', fname=f'{m_pre}lm_bestmodel{data_suf}')
rnn_regularizer = RNNRegularizer(alpha=2., beta=1.)

learn = language_model_learner(lm_dls,
                               arch=AWD_LSTM, 
                               drop_mult=0.7,
                               opt_func=partial(Adam, wd=wd),
                               moms=(0.8, 0.7, 0.8),
                               path=LM_PATH,
                               cbs=[rnn_regularizer, best_model],
                               metrics=[accuracy, Perplexity()]).to_fp16(clip=0.12)

learn.summary()

The only thing I can think of that I’m doing differently is setting the GPU to #1 via …

torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}')

… at the beginning of my notebook.
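
For anyone debugging something similar: a quick sanity check is to list the devices a model's parameters and buffers live on and compare them with the device you expect. The following is only a minimal sketch in plain PyTorch. `devices_of` is a hypothetical helper (not part of fastai or PyTorch), and the small `nn.LSTM` stands in for the learner's model (with fastai you would presumably pass `learn.model`). It falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn

def devices_of(module: nn.Module) -> set:
    """Return the set of devices holding the module's parameters and buffers."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {t.device for t in tensors}

# Stand-in for the real model (assumption: any nn.Module works the same way).
model = nn.LSTM(input_size=4, hidden_size=8)

# Use the currently selected GPU if there is one, otherwise the CPU.
if torch.cuda.is_available():
    target = torch.device("cuda", torch.cuda.current_device())
else:
    target = torch.device("cpu")

model.to(target)
print(devices_of(model))  # a single-element set when everything is on one device
```

Note that `Module.to()` only moves registered parameters and buffers. Tensors kept as plain Python attributes, such as a cached hidden state, are not moved and are not reported by this check.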

muellerzr · February 12, 2020, 1:44am

wgpubs:

The only thing I can think of that I’m doing differently is setting the GPU to #1 via …
torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}')
… at the beginning of my notebook.

Does it work if you skip that step? If not, how are you building your DataLoaders? And if you’re using the DataBlock API, does dblock.summary() run without errors?

wgpubs · February 12, 2020, 1:47am

lm_blocks = (TextBlock.from_df(corpus_cols, is_lm=True), )

lm_dblock = DataBlock(blocks=lm_blocks, 
                      get_x=ColReader('text'),
                      splitter=ColSplitter(col='is_valid'))

lm_dls = lm_dblock.dataloaders(df, bs=bsz, seq_len=bptt)

lm_dblock.summary(df) works fine.

The stack trace seems to point at something in the model …

...
~/anaconda3/envs/playground-nlp/lib/python3.7/site-packages/torch/nn/modules/rnn.py in forward_tensor(self, input, hx)
    541         unsorted_indices = None
    542 
--> 543         output, hidden = self.forward_impl(input, hx, batch_sizes, max_batch_size, sorted_indices)
    544 
    545         return output, self.permute_hidden(hidden, unsorted_indices)

~/anaconda3/envs/playground-nlp/lib/python3.7/site-packages/torch/nn/modules/rnn.py in forward_impl(self, input, hx, batch_sizes, max_batch_size, sorted_indices)
    524         if batch_sizes is None:
    525             result = _VF.lstm(input, hx, self._get_flat_weights(), self.bias, self.num_layers,
--> 526                               self.dropout, self.training, self.bidirectional, self.batch_first)
    527         else:
    528             result = _VF.lstm(input, batch_sizes, hx, self._get_flat_weights(), self.bias,

RuntimeError: Input and hidden tensors are not at the same device, found input tensor at cuda:1 and hidden tensor at cpu
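
The message says the LSTM received its input on cuda:1 while the hidden state it was handed was still on the CPU. As a rough illustration of the general pattern (not the fix that was applied in fastai, which this thread does not describe), the sketch below moves a hidden-state tuple to the input's device before the forward call. `to_device_of` is a hypothetical helper, and the code runs on the CPU when no GPU is present.

```python
import torch
from torch import nn

def to_device_of(hidden, ref):
    """Move an (h, c) hidden-state tuple to the device of the reference tensor."""
    return tuple(h.to(ref.device) for h in hidden)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
lstm = nn.LSTM(input_size=4, hidden_size=8).to(device)

# Input of shape (seq_len, batch, input_size), created on the model's device.
x = torch.randn(5, 2, 4, device=device)

# Hidden state created without a device argument lands on the default device
# (the CPU unless configured otherwise), which is how a mismatch can arise.
hidden = (torch.zeros(1, 2, 8), torch.zeros(1, 2, 8))

# Align the hidden state with the input before calling the LSTM.
hidden = to_device_of(hidden, x)
output, hidden = lstm(x, hidden)
print(output.shape)  # torch.Size([5, 2, 8])
```

Creating the hidden state directly with `device=x.device` avoids the extra copy. The helper is mainly useful when the state is cached between batches.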

wgpubs · February 12, 2020, 3:51am

I’m getting this error even when I try running the sample code here: https://dev.fast.ai/tutorial.ulmfit

sgugger · February 12, 2020, 3:30pm

Should be fixed now.

wgpubs · February 12, 2020, 8:38pm

Works great …

DG11 · November 13, 2020, 8:35pm

The https://dev.fast.ai/tutorial.ulmfit tutorial link is now broken. Is there a different resource the team would recommend?

muellerzr · November 14, 2020, 12:48am

Replace dev with docs in the URL:

https://docs.fast.ai/tutorial.ulmfit