How to export a classifier with SubwordTokenizer?

Hi, I’m trying to train a text classifier based on a ULMFiT model with the latest fastai2. If I use a SubwordTokenizer, I’m unable to export the classifier. For the language model itself that is fine, but the classifier has to be deployed elsewhere, so I need to export it. The export fails because the SubwordTokenizer contains a SwigPyObject, which cannot be pickled:

File "/app/text_ml/model_implementations/fastai_ulmfit/", line 287, in train_classifier
File "/usr/local/lib/python3.8/site-packages/fastai/", line 375, in export, self.path/fname, pickle_module=pickle_module, pickle_protocol=pickle_protocol)
File "/usr/local/lib/python3.8/site-packages/torch/", line 379, in save
_save(obj, opened_zipfile, pickle_module, pickle_protocol)
File "/usr/local/lib/python3.8/site-packages/torch/", line 484, in _save
TypeError: cannot pickle 'SwigPyObject' object
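
The traceback does not say which attribute holds the SwigPyObject. A minimal sketch for narrowing that down is below. The helper name `find_unpicklable` is made up for illustration, it only walks `__dict__` attributes (not lists or dicts of transforms, which a fastai Learner also contains), and a `threading.Lock` stands in for the native object because it is unpicklable as well.

```python
import pickle
import threading
from types import SimpleNamespace


def find_unpicklable(obj, max_depth=3, _path="obj", _seen=None):
    """Return attribute paths under `obj` that fail to pickle.

    Hypothetical helper: it only follows instance attributes (__dict__),
    so containers such as lists are reported as a whole.
    """
    _seen = set() if _seen is None else _seen
    if id(obj) in _seen or max_depth < 0:
        return []
    _seen.add(id(obj))

    try:
        pickle.dumps(obj)
        return []  # this object pickles fine, nothing to report
    except Exception:
        pass

    bad = []
    children = getattr(obj, "__dict__", None)
    if isinstance(children, dict):
        for name, value in children.items():
            bad.extend(
                find_unpicklable(value, max_depth - 1, f"{_path}.{name}", _seen)
            )
    # If no child could be blamed, the object itself is the culprit.
    return bad or [_path]


# Stand-in object graph: the lock plays the role of the SWIG handle.
demo = SimpleNamespace(
    name="clf",
    tok=SimpleNamespace(vocab=["a", "b"], handle=threading.Lock()),
)
print(find_unpicklable(demo))
```

Pointing the same helper at the learner before calling export should at least show whether the tokenizer, a metric, or a callback is what blocks pickling.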

My definitions are the following:

tok = SubwordTokenizer(lang=self._language, sp_model=self._language_model.paths["tuned_model_path"]/'spm.model')
dblocks = DataBlock(
    blocks=(TextBlock.from_df('text', tok=tok, vocab=self._language_model.get_vocabulary()), CategoryBlock),
    # (remaining DataBlock arguments not shown)
)
dls = dblocks.dataloaders(self.df, bs=batch_size, num_workers=2)
early_stopping_cb = partial(EarlyStoppingCallback, monitor='valid_loss', min_delta=0.01, patience=2)

classifier = text_classifier_learner(
    # (leading arguments not shown)
    metrics=[accuracy, error_rate, Recall(average='macro')],
    cbs=[CSVLogger, early_stopping_cb()]
)

If I remove the tok=tok part from the TextBlock definition, the export works, but then the subword tokenizer is obviously not used.
I’ve found that I need to remove the extra metrics and callbacks to be able to export, like this:

classifier.metrics = []

Is there a good way to also remove the tokenizer so that it doesn’t affect the rest of the model?
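
One generic Python pattern for this kind of problem, independent of fastai, is to drop the unpicklable handle when the object is pickled and rebuild it from the model path on load. The sketch below only illustrates that pattern: `ReloadableHandle` is a made-up class, not part of fastai or SentencePiece, and a `threading.Lock` stands in for the native tokenizer object.

```python
import pickle
import threading


class ReloadableHandle:
    """Keeps only the model path when pickled and rebuilds the handle on load.

    Hypothetical wrapper for illustration; it is not a fastai or
    SentencePiece class.
    """

    def __init__(self, model_path):
        self.model_path = str(model_path)
        self.handle = self._load()

    def _load(self):
        # Stand-in for the native object. A lock is used only because it
        # cannot be pickled either; real code would reload the tokenizer
        # from self.model_path here.
        return threading.Lock()

    def __getstate__(self):
        state = self.__dict__.copy()
        state["handle"] = None  # drop the unpicklable part
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.handle = self._load()  # recreate the handle after unpickling


def roundtrip(obj):
    """Pickle and unpickle an object, as an export/load cycle would."""
    return pickle.loads(pickle.dumps(obj))
```

With this, `roundtrip(ReloadableHandle("spm.model"))` returns a copy with a fresh handle, at the cost that the model file has to exist at the same path wherever the object is loaded.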

Never mind, the actual issue was an old version of SentencePiece (which I had downgraded to debug another problem). Upgrading back to 0.1.96 solved it.
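
For anyone hitting the same error, a quick check of the installed version could look like the sketch below. The helper names are made up, and 0.1.96 is simply the version that worked for me, not a confirmed minimum. The parsing is deliberately naive (it stops at the first non-numeric part); `packaging.version` would be more robust if it is available.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '0.1.96' into (0, 1, 96); parsing stops at a non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(package, minimum):
    """Return True/False if the package is installed, None if it is not."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None
    return version_tuple(installed) >= version_tuple(minimum)


# Prints True, False, or None depending on the local environment.
print("sentencepiece >= 0.1.96:", is_at_least("sentencepiece", "0.1.96"))
```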