How to go from accurate binary image classification to (multi) object detection?

Let’s say I have a well-trained NN that classifies images of cats and dogs, where each training image is nicely cropped around the face of the animal (either cat or dog), one animal per image.

Now I want to use this trained NN to locate faces of cats and dogs in images that could have several faces of cats or dogs, even faces of other animals or other objects, or no faces at all.

Is there a straightforward way to do this with fastai? A sort of function like (making this up) locateObjects(image, model), where image is the target image with potentially many objects, and model is the trained classifier?

Or is it better instead to train the NN with the original images (before cropping), with labels that describe not only cat or dog, but also their location?

Thank you!

PS: If I am not mistaken, I heard Jeremy say this would be covered in future lessons, but I am not sure I understood correctly.

The basics of object detection were covered in one of the later lessons of part 2 V2. You might have a look there. It is not based on fastai 1.0, but the principles are the same.
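
As far as I know there is no ready-made function like the locateObjects(image, model) you describe, but to make the idea concrete, here is a rough sketch of the naive way to reuse a crop classifier for localization: slide a fixed-size window over the image and keep the crops the classifier is confident about. This is only an illustration, not how the course lessons do it. The names locate_objects and dummy_classify, and the parameters window, stride and threshold, are all made up for this example; classify stands in for whatever wraps your trained model and returns a (label, confidence) pair.

```python
import numpy as np

def locate_objects(image, classify, window=64, stride=32, threshold=0.9):
    """Naive sliding-window localization with a crop classifier.

    image:    H x W x C array.
    classify: hypothetical callable, crop -> (label, confidence).
    Returns a list of (x, y, width, height, label, confidence) tuples.
    """
    height, width = image.shape[:2]
    detections = []
    # Visit every window position on a regular grid.
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            crop = image[y:y + window, x:x + window]
            label, confidence = classify(crop)
            # Keep only the crops the classifier is confident about.
            if confidence >= threshold:
                detections.append((x, y, window, window, label, confidence))
    return detections

def dummy_classify(crop):
    # Stand-in for a real model (assumption): treats bright crops as "cat"
    # and uses the mean pixel value as the confidence.
    return "cat", float(crop.mean())
```

Keep in mind the limits of this approach. A classifier trained only on cat and dog faces has no notion of "neither", so its confidence on background or other animals is not reliable. A single window size also misses faces at other scales, and overlapping windows usually need to be merged afterwards (for example with non-maximum suppression). That is a large part of why training on the uncropped images with location labels, as you suggest, is generally the better route.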
