I am working on productionizing a computer vision model that was trained using FastAI. I was sent a saved model file which I am able to successfully load on my machine using
load_learner('filename'). I was able to test the model by predicting on a few images with
learner.predict('my_image_filename.jpg'), and found that the predictions matched what I expected.
In our production environment, I will need to perform model inference on a single image, but it is a requirement that the inference uses a pure PyTorch forward pass (the system would require significant rearchitecting to accommodate a fastai
I need to understand how
Learner.predict loads and processes a single image so that I can replicate this in pure PyTorch. Is there any way to extract that portion of the code so that I can run it to load/process/transform my images into a tensor that can then be sent through the forward method of
learner.model? I’ve been reading through the docs and forums to try and find this, but I haven’t been able to find where that specific processing code exists, though it must exist since it has to be executed for the
predict function to work.
I have tried to reverse engineer the methods reading the docs and the forums and re-create them in pure PyTorch, and I’ve got somewhat close to results that match
Learner.predict, but still significantly off such that the model performance will be greatly degraded if I use it in production.
Can anyone point me to any attributes/methods that might help uncover what the predict method is doing to load/process images?
For some more specific context, the dataloader that was used to train the model was defined as
dls = ImageDataLoaders.from_df(train_df_f, valid_col='is_valid', seed=365, fn_col='FileName', label_col='Label', y_block=CategoryBlock, bs=BATCH_SIZE, item_tfms=Resize(224) )
I’ve attempted to recreate this in pure PyTorch using several different methods, and here is the one that got the most similar output to
Learner.predict, despite still being significantly different on many test images:
from PIL import Image # Load image input_image = Image.open("my_image.jpg").convert("RGB") # Convert to torch.Tensor values in [0, 255] converter = PILToTensor() tensor_image = converter(img) # Apply necessary transforms tensor_image = (tensor_image.float() / 255.0) resize = T.Resize(224) normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) inputs = normalize(resize(tensor_image)).unsqueeze(0) model = learner.model model.eval() with torch.no_grad(): predictions = model(inputs) predicted_probabilities = predictions.softmax(1)
Using this PyTorch code gives directionally correct predictions, but with much less confidence than the fastai predict method.
Please let me know any pointers, really hoping to find a way to get to parity without rewriting my ML inference library to accommodate fastai for this single use case.