tabularGP: gaussian processes with fastai!

I just finished writing tabularGP, an implementation of gaussian processes built on top of fastai V1 and pytorch :partying_face:

It was designed so that you can trivially take a fastai tabular deep learning model and turn it into a tabular gaussian process model (the example notebooks try to be exhaustive).

For those unfamiliar with gaussian processes, they offer:

  • very good performance on small (a few dozen to a few thousand points) datasets
  • few hyperparameters to tune
  • free uncertainty estimation on the outputs
  • free feature importance computation
  • easy transfer learning
  • very little overfitting thanks to built-in bayesian regularisation
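To give an intuition for where the "free" uncertainty estimation in the list above comes from (this is an illustrative from-scratch sketch of exact gaussian process regression in numpy, not tabularGP's actual API, and the kernel and noise values are arbitrary choices for the demo):

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    # squared-exponential kernel between two sets of 1D points
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-3):
    # exact GP regression: posterior mean AND variance at the test points
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)  # O(n^3) in the training set size: the scaling bottleneck
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# tiny demo: confident near the training points, uncertain far away
x_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(x_train)
mean, var = gp_posterior(x_train, y_train, np.array([1.0, 10.0]))
```

The posterior variance falls out of the same computation as the prediction itself, which is why the uncertainty is "free"; the Cholesky factorisation is also where the cubic cost on large datasets comes from.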

In practice, I found gaussian processes to be better than deep neural networks when you have small data (anywhere from 10 to 5000 training points) and a relatively simple space (no complex interactions between features).

While working on an upcoming paper, I got a 33% RMSE improvement over a carefully tuned neural network with minimal effort (on a 4000-point training set): so it is definitely worth testing :partying_face:

There is a catch, however: on a large dataset the algorithm will be both slow and beaten by neural networks, due to its high computational complexity. But if you have a smallish dataset, it's worth knowing and using :slight_smile: