Drug Classification paper using fast.ai

I wrote a paper that uses fast.ai to classify drug molecules into their likely ‘therapeutic use’ classes, using only images of the molecular structures generated in Python with the ‘rdkit’ package.
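For anyone curious how a molecule becomes a picture, here is a minimal sketch of rendering a structure with RDKit. The SMILES input, the 224x224 image size, and the helper name `smiles_to_image` are illustrative assumptions, not the exact settings from the paper.

```python
from rdkit import Chem
from rdkit.Chem import Draw


def smiles_to_image(smiles, size=(224, 224)):
    """Render a SMILES string as a PIL image (hypothetical helper).

    Returns None when RDKit cannot parse the string.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # Draw.MolToImage returns a PIL image of the requested size.
    return Draw.MolToImage(mol, size=size)


# Aspirin, used here purely as an example molecule.
img = smiles_to_image("CC(=O)Oc1ccccc1C(=O)O")
print(img.size)
```

Images produced this way can be saved with `img.save(...)` into one folder per class, which is a layout that image classifiers commonly read from.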

We compared the performance of the convolutional neural network (CNN) with that of a random forest using 1024-bit molecular fingerprints. The random forest did slightly better, but the CNN was close behind. Given that we have fewer than 10,000 examples and the classes are heavily imbalanced, I was happy with the CNN performance.
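As a rough illustration of what a 1024-bit molecular fingerprint looks like, the sketch below computes one with RDKit. It assumes a Morgan (circular) fingerprint with radius 2, which is a common choice but may not be the fingerprint type used in the paper. The helper name `smiles_to_fingerprint` is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Assumption: Morgan fingerprint, radius 2, folded to 1024 bits.
_generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=1024)


def smiles_to_fingerprint(smiles):
    """Return a list of 0/1 ints of length 1024, or None if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    bit_vect = _generator.GetFingerprint(mol)
    # ToBitString gives a string of '0'/'1' characters, one per bit.
    return [int(bit) for bit in bit_vect.ToBitString()]


# Aspirin, used here purely as an example molecule.
bits = smiles_to_fingerprint("CC(=O)Oc1ccccc1C(=O)O")
print(len(bits), sum(bits))
```

Each molecule becomes a fixed-length binary vector, so a table of these vectors can be passed directly to a tabular model such as a random forest.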

Check out the paper on bioRxiv here: