I followed the docs to create a language model, fine-tuned it on some industry-specific data, and then trained a classification model. I saved the
- data using save_data()
- encoder (learn.save_encoder())
- model weights (learn.save())
Now for the fun bit. I’m recreating the model on a new system with:
data_clas = load_data(Path(''), 'data_clas.pkl', bs=16)
learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5, metrics=[accuracy, top_k_accuracy], pretrained=True)
learn.load_encoder('clas_enc')
learn.load('stage-4_3')
Then I run learn.validate() to get the score. On a GPU instance (this is all in Google Colab), this works fine and I get an accuracy of ~60%. If I switch to a CPU instance and run the exact same code, I get an accuracy of ~55%. I’ve tried multiple times, factory-resetting the instances in between.
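To rule out a corrupted copy of the saved files, here is a stdlib-only sketch I can run on both instances and compare the output (the file paths are assumptions based on my setup; fastai normally writes encoder/model weights under a models/ subfolder):

```python
import hashlib
from pathlib import Path

def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Run on both the GPU and the CPU instance; identical digests mean the
# discrepancy is not caused by the files differing between machines.
for name in ["data_clas.pkl", "models/clas_enc.pth", "models/stage-4_3.pth"]:
    p = Path(name)
    if p.exists():
        print(name, file_sha256(p))
```

If the digests match on both machines, the difference has to come from how the weights are loaded or evaluated, not from the files themselves.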
If, on the GPU instance, I change pretrained to False, I get almost the same low accuracy, ~55%. So something about how the pretrained model is loaded on the CPU might be the issue? Can anyone shed light on this?