Multiclass on a binary problem with imbalanced classes

Suppose you have an image dataset that looks like this:

Class Cat 100 images
Class Dog 10 images
Class Bird 1000 images
Class Human 10000 images
Class other animals 10 images
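
To make the imbalance concrete, here is a small sketch that computes the binary split and inverse-frequency class weights from the counts above. The weighting formula `total / (n_classes * count)` is the common "balanced" heuristic and is my own assumption here, not something I have validated on this data; the dictionary keys and function name are just illustrative.

```python
# Image counts per class, copied from the list above.
counts = {"cat": 100, "dog": 10, "bird": 1000, "human": 10000, "other": 10}

total = sum(counts.values())

# Binary view of the same data: human vs. everything else.
binary_counts = {
    "human": counts["human"],
    "non_human": total - counts["human"],
}

def inverse_frequency_weights(class_counts):
    """Weight each class by total / (n_classes * count) (the 'balanced' heuristic)."""
    n = sum(class_counts.values())
    k = len(class_counts)
    return {name: n / (k * cnt) for name, cnt in class_counts.items()}

multiclass_weights = inverse_frequency_weights(counts)
binary_weights = inverse_frequency_weights(binary_counts)

print(binary_counts)
print(multiclass_weights)
print(binary_weights)
```

In the binary view the ratio is roughly 9:1 (10000 humans vs. 1120 non-humans), whereas in the multiclass view the smallest classes are outnumbered 1000:1 by the human class.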

Now you want to build a classifier that detects whether an image shows a human or not.
Additionally, you absolutely want to know when an image does not include a human, so no animal may ever be classified as a human.
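
One way I could imagine enforcing the "no animal classified as human" requirement, for either kind of model, is to pick the decision threshold on a validation set so that no non-human image scores above it. The sketch below assumes the model outputs a score `p_human` per image; the function name and the validation scores are made up for illustration, and zero false positives on a validation set is of course no guarantee on new data.

```python
import numpy as np

def strict_human_threshold(p_human, is_human):
    """Return a threshold t such that predicting 'human' only when
    p_human > t yields zero false positives on this validation set."""
    p_human = np.asarray(p_human, dtype=float)
    is_human = np.asarray(is_human, dtype=bool)
    non_human_scores = p_human[~is_human]
    # Highest score any non-human image received; humans must score strictly above it.
    return float(non_human_scores.max()) if non_human_scores.size else 0.0

# Hypothetical validation scores, invented for this example.
p_human = np.array([0.99, 0.95, 0.80, 0.60, 0.70, 0.10])
is_human = np.array([True, True, True, True, False, False])

t = strict_human_threshold(p_human, is_human)
pred_human = p_human > t

# Fraction of true humans still recognised as human under the strict threshold.
human_recall = pred_human[is_human].mean()
# Number of non-humans wrongly labelled human (zero by construction on this set).
false_positives = int(pred_human[~is_human].sum())

print(t, human_recall, false_positives)
```

The price of the strict threshold is that some real humans fall below it and get labelled non-human, which matches the priority stated above.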

Is it better to train a network for binary classification (human vs. others),
or to train a multiclass classifier and afterwards relabel all the non-human predictions as "others"?
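
To clarify what I mean by relabeling, here is a minimal sketch assuming the multiclass network outputs one logit per class. The class order, the helper names, and the example logits are hypothetical. It shows two ways to collapse the five-class output into human vs. non-human: relabel the argmax, or threshold P(human) directly (P(non-human) is then the sum of the remaining class probabilities).

```python
import numpy as np

# Assumed class order of the multiclass output (hypothetical).
CLASSES = ["cat", "dog", "bird", "human", "other"]
HUMAN = CLASSES.index("human")

def softmax(logits):
    """Numerically stable row-wise softmax."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def relabel_argmax(probs):
    """Take the multiclass argmax, then map every animal class to non-human."""
    return probs.argmax(axis=1) == HUMAN

def relabel_threshold(probs, threshold=0.5):
    """Use P(human) as a binary score and predict human only above the threshold."""
    return probs[:, HUMAN] > threshold

# Invented logits for three images.
logits = np.array([
    [0.1, 0.0, 0.2, 3.0, 0.0],  # clearly human
    [1.0, 0.9, 1.0, 1.2, 0.9],  # human is the argmax, but P(human) is below 0.5
    [2.5, 0.0, 0.3, 0.5, 0.0],  # clearly cat
])
probs = softmax(logits)

print(relabel_argmax(probs))
print(relabel_threshold(probs))
```

The second image shows where the two rules disagree: the argmax says human even though most of the probability mass sits on the animal classes, so the threshold rule is the more conservative of the two for my requirement.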

My thoughts so far:

Pro multiclass:

Adding classes increases the model complexity, and since the classes are quite imbalanced, I would hope that a multiclass model is more likely to detect the small classes dog and cat.
Training will take longer because the model is more complex, but in the end I would expect better results.

Pro binary:

Since the data is limited, I should not penalize the model for failing to separate dog, bird, and cat. A general human vs. non-human model should be much more robust.

Has anyone run experiments on problems like this?

I would be thankful for your feedback.