CUDA OOM on text_classifier_learner after unfreezing all layers

I’m trying to train a text classifier on an 8GB Nvidia RTX card, and I can’t unfreeze all the layers and retrain without running out of memory.

I’ve looked at the OOM debugging docs and tried using gpu_mem_restore_ctx to clean up any potential memory leaks, but I’m still getting errors.
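For reference, this is roughly how I’m wrapping the training call (a minimal sketch assuming fastai v1’s fastai.utils.mem module; the fit arguments are just placeholders for my own):

from fastai.utils.mem import gpu_mem_restore_ctx

# If the fit dies with an OOM error, the context manager should clear the
# exception's frames and release cached CUDA memory on the way out.
with gpu_mem_restore_ctx():
    learn.fit_one_cycle(1, max_lr=1e-3)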

To clarify, I am using a language model encoder that I trained with bs=32 (it trained fine without any errors). Using that encoder, with the classifier set to bs=8 and bptt=20, I can unfreeze and train down to about layer 3 (watching nvidia-smi, each group I unfreeze takes more and more memory), but after I unfreeze() all layers I get an OOM error.
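Roughly, the setup looks like the sketch below (fastai v1 text API; the variable names, the 'ft_enc' encoder name, the column names, and the drop_mult values are placeholders for my actual setup):

from fastai.text import (TextLMDataBunch, TextClasDataBunch, AWD_LSTM,
                         language_model_learner, text_classifier_learner)

# Language model: fine-tuned with bs=32, then the encoder is saved
data_lm = TextLMDataBunch.from_df(path, train_df, valid_df, text_cols='text', bs=32)
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.fit_one_cycle(1, 1e-2)
learn_lm.save_encoder('ft_enc')

# Classifier: bs=8 and bptt=20, reusing the LM vocab and the saved encoder
data_clas = TextClasDataBunch.from_df(path, train_df, valid_df, text_cols='text',
                                      label_cols='label', vocab=data_lm.vocab, bs=8)
learn = text_classifier_learner(data_clas, AWD_LSTM, bptt=20, drop_mult=0.5)
learn.load_encoder('ft_enc')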

Are these parameters still too much for an 8GB card to handle? I have some pseudocode below to illustrate when the issue occurs.

# Attempt 1: train the head, then unfreeze everything at once
learn.freeze_to(-1)
learn.fit_one_cycle(**kwargs)    # trains fine

learn.unfreeze()
learn.fit_one_cycle(**kwargs)    # out of memory on the first epoch

# Attempt 2: gradual unfreezing, one layer group at a time
learn.freeze_to(-1)
learn.fit_one_cycle(1, **kwargs) # trains fine

learn.freeze_to(-2)
learn.fit_one_cycle(1, **kwargs) # trains, with more CUDA memory used

learn.freeze_to(-3)
learn.fit_one_cycle(1, **kwargs) # trains, with more and more CUDA memory used

learn.unfreeze()
learn.fit_one_cycle(3, **kwargs) # out of memory on the first epoch
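In case it helps, the growing memory usage I’m describing can also be seen from inside the notebook rather than nvidia-smi (plain PyTorch calls, nothing fastai-specific), run between the fits above:

import torch

# How much CUDA memory PyTorch currently has allocated, and the peak so far,
# to confirm usage grows as more layer groups are unfrozen.
print(torch.cuda.memory_allocated() / 1024**2, 'MB allocated')
print(torch.cuda.max_memory_allocated() / 1024**2, 'MB peak')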

What mostly confuses me is why I can train my LM with bs=32 on around 30K records, yet with this classifier on only around 1,500 records I can’t get past unfreezing all the layers.

Any thoughts would be appreciated.

As a side note, I have fastai installed in editable mode and have the latest changes pulled.

Thanks,
Andrew