Has anyone tried training a seq2seq model using PyTorch? I've been trying to train a model for a while but keep getting RuntimeError: cuda runtime error (2) : out of memory no matter what I do. I've tried a smaller hidden size, fewer layers, a smaller batch_size, and limiting the max_sequence_length, but I still get the same error. My next move is to try a machine with more GPU memory, but I was wondering if there's anything I could do to check for memory leaks and/or other issues that might be making it run out of memory.
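On the memory-leak part of the question, one rough way to check is to log GPU memory after every training step and see whether it keeps climbing. The sketch below is only an illustration: the helper name cuda_memory_mb is made up for this example, and it assumes a reasonably recent PyTorch where torch.cuda.memory_allocated, torch.cuda.memory_reserved and torch.cuda.max_memory_allocated are available.

```python
try:
    import torch
except ImportError:  # lets the sketch load on machines without PyTorch
    torch = None


def cuda_memory_mb():
    """Return current CUDA memory stats in MB, or None if CUDA is unavailable.

    Hypothetical helper for illustration; not part of PyTorch itself.
    """
    if torch is None or not torch.cuda.is_available():
        return None
    mb = 1024 ** 2
    return {
        # memory currently occupied by live tensors
        "allocated": torch.cuda.memory_allocated() / mb,
        # memory held by PyTorch's caching allocator (>= allocated)
        "reserved": torch.cuda.memory_reserved() / mb,
        # highest "allocated" value seen so far in this process
        "peak": torch.cuda.max_memory_allocated() / mb,
    }


if __name__ == "__main__":
    # Call this once per training step and compare the numbers over time.
    print(cuda_memory_mb())
```

If "allocated" keeps growing from step to step while the batch shape stays the same, something is probably holding on to tensors between iterations. A common cause is accumulating the loss tensor itself (for example total_loss += loss) instead of a plain number (total_loss += loss.item()), which keeps each step's graph alive.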
This usually means you need a smaller batch size. How much memory does your GPU have, and what is your batch size?
Using a batch size of 8 with 6GB of GPU memory.
Managed to work it out. The embedding I was using had too high a dimensionality (300) and too large a vocab size (around 70000). Reducing both brought the memory usage down.
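For a feel of why the vocab size and embedding dimension matter, here is a back-of-the-envelope estimate. It is a sketch under assumptions, not a measurement of the model above: it assumes float32 values (4 bytes each), dense gradients, and a hypothetical sequence length of 50. The function names are made up for this example.

```python
def embedding_memory_mb(vocab_size, embedding_dim, bytes_per_param=4,
                        optimizer_states=0):
    """Approximate MB needed for an embedding table during training.

    Counts the weights, one dense gradient of the same size, and any
    extra per-parameter optimizer buffers (e.g. 2 for Adam).
    """
    params = vocab_size * embedding_dim
    copies = 1 + 1 + optimizer_states  # weights + gradient + optimizer buffers
    return params * bytes_per_param * copies / 1024 ** 2


def logits_memory_mb(batch_size, seq_len, vocab_size, bytes_per_value=4):
    """Approximate MB for the decoder's output scores over the vocabulary."""
    return batch_size * seq_len * vocab_size * bytes_per_value / 1024 ** 2


if __name__ == "__main__":
    # Numbers from the thread: vocab of about 70000, dimension 300, batch size 8.
    print(embedding_memory_mb(70000, 300))
    print(embedding_memory_mb(70000, 300, optimizer_states=2))
    # seq_len=50 is an assumed value, not something stated in the thread.
    print(logits_memory_mb(8, 50, 70000))
```

The embedding table is only part of the picture. In a typical seq2seq decoder the output projection and the per-step scores also scale with the vocab size, and activations saved for the backward pass add to that, so shrinking the vocab tends to reduce memory in several places at once.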