Hi All,
My first fast.ai forum post!
For NLP applications, when we create a DataBunch for the language model (data_lm) or a DataBunch for the classification problem (data_clas), we can retrieve the original training data from the DataBunch (for example, data_lm.train_ds[0][0] prints the first training example used for the language model).
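For concreteness, here is a minimal sketch of what I mean (fastai v1 assumed; the CSV name and text column are placeholders on my side):

from fastai.text import TextLMDataBunch

# Build the language-model DataBunch from a (placeholder) CSV of raw text
data_lm = TextLMDataBunch.from_csv('.', 'texts.csv', text_cols='text')

# The original training examples remain accessible on the DataBunch
print(data_lm.train_ds[0][0])   # first training example used for the language model
print(len(data_lm.train_ds))    # every training row is still reachable this way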
I have a requirement where, at inference time, I am not allowed to keep the training data in any form in the model/DataBunch; only the model parameters may be saved.
Is that possible? (I'm actually not sure why the training data is carried along this way. Is it used anywhere downstream after the model has been trained?)
In other words, when I load a trained classifier for inference, I currently have to apply the following steps:
data_bunch = TextClasDataBunch.load('text_data_bunch_path')
classifier = text_classifier_learner(data_bunch, drop_mult=0.5)
classifier.load('trained_classifier_path')
classifier.predict(x)
In the data_bunch step, I am loading the DataBunch, which means I can view the training data used to build the model. Can I circumvent this step?
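For example, if the fastai version I am on has Learner.export and load_learner (I believe the more recent 1.0.x releases do; that is an assumption on my part, and I have not verified how much data state they retain), something along these lines is what I am hoping for. The file name 'classifier_export.pkl' is just a placeholder:

# After training: serialise the learner for inference only
classifier.export('classifier_export.pkl')

# At inference time: no TextClasDataBunch.load step needed
from fastai.basic_train import load_learner
classifier = load_learner('.', 'classifier_export.pkl')
classifier.predict(x)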
Thanks a lot for the help!
Krishna