[Nearly Solved] How to do single image normalization

I found an elegant solution for normalizing and predicting on a single image (see below). However, it relies on tfms_from_stats to do the resize and normalization, which seems to be deprecated in V1. How might the same thing be done in V1? I’ve searched the docs and forums for a few hours and haven’t come across anything yet. Specifically, I’m not seeing anything in the current tfms method that would allow for using image_stats.

environment:
  BUCKET_NAME: pytorch-serverless
  STATE_DICT_NAME: dogscats-resnext50.h5
  IMAGE_SIZE: 224
  IMAGE_STATS: ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
  LABELS_PATH: lib/labels.txt

STATS = A(*eval(os.environ['IMAGE_STATS']))
SZ = int(os.environ.get('IMAGE_SIZE', '224'))
TFMS = tfms_from_stats(STATS, SZ)[-1]


class SetupModel(object):
	model = classification_model()
	labels = get_labels(os.environ['LABELS_PATH'])

	def __init__(self, f):
		self.f = f
		file_path = f'/tmp/{STATE_DICT_NAME}'
		download_file(BUCKET_NAME, STATE_DICT_NAME, file_path)
		state_dict = torch.load(file_path, map_location=lambda storage, loc: storage)
		self.model.load_state_dict(state_dict), self.model.eval()
		os.remove(file_path)

	def __call__(self, *args, **kwargs):
		return self.f(*args, **kwargs)


def build_pred(label_idx, log, prob):
	label = SetupModel.labels[label_idx]
	return dict(label=label, log=float(log), prob=float(prob))


def parse_params(params):
	image_url = urllib.parse.unquote_plus(params.get('image_url', ''))
	n_labels = len(SetupModel.labels)
	top_k = int(params.get('top_k', 3))
	if top_k < 1: top_k = n_labels
	return dict(image_url=image_url, top_k=min(top_k, n_labels))


def predict(img):
	batch = [T(TFMS(img))]
	inp = VV_(torch.stack(batch))
	return SetupModel.model(inp).mean(0)

If all you want to do is load an image, convert it to a tensor and normalize it, you can simply use the PyTorch classes from torchvision.transforms, something like:

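from PIL import Image
from torchvision import transforms

# img_size, stats and img_path are assumed to be defined elsewhere; stats is a (mean, std) pair
tfms = transforms.Compose([
    transforms.Resize(img_size),
    transforms.CenterCrop(img_size),
    transforms.ToTensor(),
    transforms.Normalize(*stats)
])
img_tensor = tfms(Image.open(img_path))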
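For anyone wondering what the Normalize step amounts to, here is a minimal sketch of the same arithmetic in plain PyTorch. It assumes the ImageNet stats quoted in the first post and uses a random tensor as a stand-in for a real image, so it illustrates the calculation rather than replacing the transform.

```python
import torch

# ImageNet channel statistics, as listed under IMAGE_STATS in the first post.
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

# Stand-in for the output of ToTensor(): a float image of shape (C, H, W)
# with values in [0, 1]. A real image would come from PIL instead.
img = torch.rand(3, 224, 224)

# Subtract the per-channel mean and divide by the per-channel std.
# Indexing with [:, None, None] broadcasts the stats over height and width.
normalized = (img - mean[:, None, None]) / std[:, None, None]

print(normalized.shape)
```

The shape is unchanged; only the value range of each channel shifts.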

Thanks. Would turning it into a PyTorch tensor mean that the model load and predict also have to be done in PyTorch? I’ll test, complete a code snippet, and post it here.

What I pasted already returns a tensor; that’s what the transforms.ToTensor() bit does. So you can, for example, do img_tensor.unsqueeze(0) to turn it into a single-element batch and pass it to your model.
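A rough sketch of the batching step, in case it helps. The tiny network here is a hypothetical stand-in (not the model from this thread) so that the snippet runs on its own, and the random tensor stands in for a preprocessed image.

```python
import torch

# Stand-in for a preprocessed image of shape (C, H, W).
img_tensor = torch.rand(3, 224, 224)

# Add a leading batch dimension: (C, H, W) -> (1, C, H, W).
batch = img_tensor.unsqueeze(0)
same_batch = img_tensor[None]  # indexing with None does the same thing

# Hypothetical two-class model, used only to show the forward pass.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(8, 2),
)
model.eval()

# no_grad() skips gradient tracking, which is not needed for inference.
with torch.no_grad():
    logits = model(batch)

print(batch.shape, logits.shape)
```

The output has one row per image in the batch, so a single image gives a tensor of shape (1, n_classes).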

Thanks. This runs … just the last little bit left now: how do I convert the logits to probs? Please help if you can!

The predicted class (preds) works beautifully, and there is definitely a noticeable improvement over the non-normalized input.

Current working code is as follows:

# imports in addition to all the fastai stuff
import os
import torch
import torchvision
from PIL import Image

# load fastai model
global learn
learn = load_learner(model_path)
learn.model.eval()

# torchvision normalization with ImageNet stats
normalize = torchvision.transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)
preprocess = torchvision.transforms.Compose([
   torchvision.transforms.Resize(256),  # Resize replaces the deprecated Scale
   torchvision.transforms.CenterCrop(224),
   torchvision.transforms.ToTensor(),
   normalize
])

# open image with PIL
image = Image.open(os.path.join(DIRECTORY, file))

# preprocess and normalize the opened image
img_tensor = preprocess(image)

# run the model and output results
# img_tensor[None] adds the batch dimension, so unsqueeze_ and Variable are not needed
output = learn.model(img_tensor[None].to('cuda'))
print(output)
_, preds = torch.max(output, 1)
print(preds)


sample 'output'
tensor([[-3.0123, -4.3282, -4.7471, -3.2989,  2.0190, -2.9281, -0.3604, -4.1799,
          5.0330, 11.0573,  3.6053,  0.8978, -3.0342, -0.5653, -1.0075, -0.6104,
         -0.8558, -5.8437, -0.1695,  0.1227, -4.5250,  2.3117, -1.5703, -1.6393,
         -3.5580, -2.1925, -2.7243, -0.8255, -1.4389, -5.4793,  3.8727, -0.4162,
         -2.9487, -5.7978, -1.1575, -1.1646,  1.5273, -0.7545, -4.6195, -1.1141,
          3.5043, -0.8944,  7.5363, -5.0029, -4.4060,  3.3839, -0.0992, -2.0310,
          4.6154, -1.2845, -2.3053, -1.8481, -0.9691, -0.9248, -0.5452,  1.5256,
          1.0301, -4.8581, -5.5591,  0.6924,  3.0580, -2.3678,  1.0883,  1.6486,
          1.3088,  1.2242, -0.5292,  0.5927,  0.2839, -4.1949, -2.9982, -0.1012,
         -1.1480,  5.5464, -0.3118, -1.7959, -2.4166,  6.6228, -1.9556, -1.0945,
          2.9208, -0.1738, -1.2499, -5.4003, -0.5292, -2.5430, -4.7775, -2.7871,
         -4.8601, -2.3245, -2.6065, -2.3421, -2.0678,  1.4453, -2.3948,  2.5799,
          8.4218,  2.2024, -2.3860, -4.6700, -1.7071, -0.9541,  2.0593, -2.3841,
         -1.0903,  3.4441,  4.2378, -3.1824, -3.6216, -3.1373, -3.4921, -1.5992,
         -5.2424, -6.0088, -2.4245,  1.9542,  0.1370,  2.8128, -1.0553,  3.7047,
         -1.8671,  2.3636,  2.2904, -4.2111, -3.2302, -0.6135, -0.9715, -5.7585,
         -1.8986,  0.8957, -2.6144, -1.4075,  0.3923, -0.7298, -0.2251, -1.8145,
         -4.7164,  9.4076, -3.0702, -5.3035, -2.2240, -4.6071, -2.3181, -4.4616,
          5.4417, -3.6285,  0.3728,  1.8761, -3.3457,  0.4523,  0.4862, -1.0786]],
       device='cuda:0', grad_fn=)
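
For the remaining step, one common way to turn logits into probabilities is a softmax over the class dimension. The sketch below assumes the model's last layer returns raw logits with no softmax already applied, which the negative values in the sample output suggest. The five logits are stand-in values, not the full output above.

```python
import torch

# Stand-in logits for one image and five classes. In the thread this would be
# `output` from learn.model(...), with shape (1, n_classes).
output = torch.tensor([[-3.0123, 2.0190, 5.0330, 11.0573, 3.6053]])

# Softmax over dim=1 (the class dimension) gives probabilities that sum to 1.
probs = torch.softmax(output, dim=1)

# Highest-probability classes, similar to the top_k parameter in the first post.
top_probs, top_idx = probs.topk(3, dim=1)

print(probs)
print(top_idx, top_probs)
```

Softmax preserves the ordering of the logits, so the top index matches preds from torch.max. If the result is only being read out, calling .detach().cpu() on it first drops the gradient history and moves it off the GPU.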