DL for Antibiotics

Has anyone read through this paper yet? https://www.sciencedirect.com/science/article/pii/S0092867420301021

They claim to train a model on graph-based embeddings with only 2500 training samples, of which only ~200 were actual positives (antibiotic molecules). I just can’t understand how they can build a useful model from so few training examples, even with the ensembling they describe in the paper.
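
To put rough numbers on the imbalance, here is a quick back-of-the-envelope sketch. It assumes the "~200" is exactly 200 positives, which is my rounding and not a figure taken from the paper, and the variable names are just for illustration.

```python
# Back-of-the-envelope class balance for the figures quoted above.
# Assumption: "~200" is treated as exactly 200 positives for illustration.
n_total = 2500
n_positive = 200
n_negative = n_total - n_positive

# Fraction of training samples that are positive (antibiotic) molecules.
positive_rate = n_positive / n_total

# Accuracy of a trivial model that always predicts "not an antibiotic".
majority_baseline_accuracy = n_negative / n_total

print(f"positive rate: {positive_rate:.1%}")
print(f"all-negative baseline accuracy: {majority_baseline_accuracy:.1%}")
```

So under that assumption a model that never predicts a positive would already be right on most of the training set, which is part of why the raw sample count alone makes me skeptical.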

Am I missing something here?

