I’ve used fast.ai to train an image classifier, and it works really well. Thanks for this great software!
My question is about running the image classifier more efficiently.
My application has a two-stage process: first an object detector (YOLO) finds all of the objects in the image, and then a classifier runs on each detected object. The classifier is given a subset of the original image, cropped to include only the detected object.
(You may ask: why doesn’t the object detector do the classification? It’s because my experimentation has found the system performs better if object detection and object classification use separate models.)
At the moment, I’m running classification serially: I loop over the detected objects, pass each cropped image to fast.ai, and run inference on it. The problem is that this is horribly inefficient, especially for images with many detected objects. I’m wondering whether I can classify all detected objects in a single inference pass.
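To illustrate what I mean, here is a rough plain-PyTorch sketch (not fastai-specific; `classify_boxes` and the toy model are made up for illustration): crop every box, resize the crops to a common size, stack them into one batch, and run a single forward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def classify_boxes(model, image, boxes, size=(32, 32)):
    """Classify every bounding box of one image in a single forward pass.

    image: (C, H, W) tensor; boxes: list of (x1, y1, x2, y2) pixel coords.
    """
    crops = []
    for x1, y1, x2, y2 in boxes:
        crop = image[:, y1:y2, x1:x2].unsqueeze(0)      # (1, C, h, w)
        # Resize each crop to a common size so they can be batched.
        crops.append(F.interpolate(crop, size=size, mode="bilinear",
                                   align_corners=False))
    batch = torch.cat(crops)                            # (N, C, *size)
    with torch.no_grad():
        return model(batch).argmax(dim=1)               # one inference for N boxes

# Toy usage with a dummy classifier standing in for the trained model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 5))
image = torch.rand(3, 224, 224)
boxes = [(0, 0, 50, 50), (100, 100, 180, 200)]
labels = classify_boxes(model, image, boxes)            # shape (2,)
```

This is the behaviour I’d like, but expressed inside a single fast.ai model rather than a hand-rolled loop around it.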
So my question is: how can I design a fast.ai image classifier so that it takes an input image, plus a list of bounding boxes, and returns the classification of each bounding box — in a single model?
Surely this isn’t an uncommon situation, right?