tabularGP: gaussian processes with fastai!

I just finished writing tabularGP, an implementation of gaussian processes built on top of fastai V1 and pytorch :partying_face:

It was designed so that you can trivially take a fastai tabular deep learning model and turn it into a tabular gaussian process model (the example notebooks try to be exhaustive).

For those unfamiliar with gaussian processes, they:

  • work very well on small datasets (a few dozen to a few thousand points)
  • have few hyperparameters to tune
  • give uncertainty estimates on the outputs for free
  • give feature importance for free
  • make transfer learning easy
  • overfit very little thanks to built-in Bayesian regularisation
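
To make the "free uncertainty estimation" point concrete, here is a minimal sketch of exact gaussian process regression in plain NumPy. It is not tabularGP's API: the function names (`rbf_kernel`, `gp_predict`), the RBF kernel, the fixed hyperparameters and the toy sine dataset are all assumptions picked for illustration. It only shows that the prediction comes with a standard deviation, and that this standard deviation grows once you move away from the training data.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between two sets of row vectors."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    # Clamp tiny negative values caused by floating point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2, lengthscale=1.0):
    """Exact GP regression: returns predictive mean and standard deviation."""
    k_tt = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test, lengthscale)
    k_ss = rbf_kernel(x_test, x_test, lengthscale)
    # Cholesky factorisation of the training covariance (the expensive step).
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Toy dataset (an assumption for this sketch): 20 noisy samples of sin(x).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(20, 1))
y_train = np.sin(x_train[:, 0]) + 0.05 * rng.normal(size=20)

# One test point inside the training range, one far outside it.
x_test = np.array([[0.0], [10.0]])
mean, std = gp_predict(x_train, y_train, x_test)
print("mean:", mean)
print("std: ", std)
```

The second test point sits far from every training point, so the model falls back to its prior there: the mean goes to zero and the standard deviation goes back to the prior value. That is the behaviour you get without any extra work on the model side.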

In practice, I found gaussian processes to be better than deep neural networks when you have small data (anywhere from 10 to 5000 training points) and a relatively simple space (no complex interactions between features).

While working on an upcoming paper, I got a 33% RMSE improvement over a carefully tuned neural network with minimal effort (on a training set of 4000 points), so it is definitely worth testing :partying_face:

There is a catch, however: on a large dataset the algorithm will be both slow, due to its high computational complexity, and beaten by neural networks. But if you have a smallish dataset, it's worth knowing and using :slight_smile:
