Lesson 11 Memory Error pickle dump

(Jon Lo) #1

Hi there,

I am getting a memory error when I run the following code. I am using Paperspace’s P5000 machine (16GB).

def get_vecs(lang, ft_vecs):
    vecd = {w:ft_vecs.get_word_vector(w) for w in ft_vecs.get_words()}
    pickle.dump(vecd, open(PATH/f'wiki.{lang}.pkl','wb'))
    return vecd

en_vecd = get_vecs('en', en_vecs)
fr_vecd = get_vecs('fr', fr_vecs)

MemoryError                               Traceback (most recent call last)
in ()
----> 1 en_vecd = get_vecs('en', en_vecs)
      2 fr_vecd = get_vecs('fr', fr_vecs)

in get_vecs(lang, ft_vecs)
      1 def get_vecs(lang, ft_vecs):
      2     vecd = {w:ft_vecs.get_word_vector(w) for w in ft_vecs.get_words()}
----> 3     pickle.dump(vecd, open(PATH/f'wiki.{lang}.pkl','wb'))
      4     return vecd


I am not using the English and French sentences as a dataset, in case that makes any difference. I don’t think it does, though: if I’m understanding the code correctly, the memory error occurs when I pickle-dump the fastText English word vectors, which are almost 9GB in size.

Any help would be appreciated. Do I need to upgrade from 16GB or is there a memory leak somewhere?
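
One variant that might be worth trying before upgrading is to write the vectors out one record at a time instead of pickling the whole dict in a single call. This is only a minimal sketch and not a confirmed fix: it assumes the peak comes from holding the full dict while pickle serializes it, which I have not verified. The names dump_vecs_streaming and load_vecs_streaming and the .stream.pkl file name are made up for illustration; only get_words and get_word_vector come from the code above.

```python
import pickle
from pathlib import Path


def dump_vecs_streaming(lang, ft_vecs, path):
    """Write one (word, vector) pickle record per word; return the count.

    Hypothetical helper: ft_vecs only needs get_words() and
    get_word_vector(w), as in the original get_vecs.
    """
    out = Path(path) / f'wiki.{lang}.stream.pkl'  # illustrative file name
    n = 0
    with open(out, 'wb') as f:
        for w in ft_vecs.get_words():
            # Each pickle.dump call uses a fresh pickler, so nothing
            # accumulates across records, and no full dict is built here.
            pickle.dump((w, ft_vecs.get_word_vector(w)), f,
                        protocol=pickle.HIGHEST_PROTOCOL)
            n += 1
    return n


def load_vecs_streaming(lang, path):
    """Yield (word, vector) pairs until the end of the file is reached."""
    src = Path(path) / f'wiki.{lang}.stream.pkl'
    with open(src, 'rb') as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return
```

Usage would look like dump_vecs_streaming('en', en_vecs, PATH), followed later by dict(load_vecs_streaming('en', PATH)) to rebuild the dict. Rebuilding the full dict still needs enough RAM to hold all the vectors, so this would only help if the dump step itself is what runs out of memory.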