tabularGP: Gaussian processes with fastai!

I just finished writing tabularGP, an implementation of Gaussian processes built on top of fastai V1 and PyTorch :partying_face:

It was designed so that you can trivially take a fastai tabular deep learning model and turn it into a tabular Gaussian process model (the example notebooks try to be exhaustive).

For those unfamiliar with Gaussian processes, they:

  • work very well on small datasets (a few dozen to a few thousand points)
  • have few hyperparameters to tune
  • give free uncertainty estimates on the outputs
  • give free feature importance computation
  • make transfer learning easy
  • overfit very little thanks to built-in Bayesian regularisation
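
To make the "free uncertainty estimation" point concrete, here is a minimal pure-Python sketch of exact Gaussian process regression in one dimension. It is not tabularGP's implementation: the names (rbf, gp_predict), the fixed RBF kernel with unit lengthscale and the small noise term are all assumptions chosen for illustration, and no hyperparameters are trained.

```python
import math

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential (RBF) covariance between two scalars."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def cholesky(K):
    """Lower-triangular L such that L @ L.T == K (K must be positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L.T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(xs, ys, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    # Covariance of the training points, with a small noise term on the diagonal.
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))   # K^-1 y
    k_star = [rbf(x_new, a) for a in xs]           # covariance with x_new
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

# Toy data (made up for the example).
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
print(gp_predict(xs, ys, 1.0))    # at a training point: low variance
print(gp_predict(xs, ys, 10.0))   # far from the data: variance back near the prior
```

The variance comes out of the same linear algebra as the mean, which is why the uncertainty estimate costs nothing extra: it is small near the training points and grows back towards the prior variance far away from them.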

In practice, I found Gaussian processes to be better than deep neural networks when you have small data (anywhere from 10 to 5000 training points) and a relatively simple space (no complex interactions between features).

While working on an upcoming paper, I got a 33% RMSE improvement over a carefully tuned neural network with minimal effort (on a 4000-point training set), so it is definitely worth testing :partying_face:

There is a catch, however: on a large dataset the algorithm will be both slow and beaten by neural networks, due to its high computational complexity. But if you have a smallish dataset, it's worth knowing and using :slight_smile:

I just updated the code for fastai V2 :slight_smile:

There is a single error left that I have not been able to solve so far: calling the predict method raises "TypeError: list indices must be integers or slices, not list", which does not happen with fastai's tabularLearner (cf. example 1). Any help would be appreciated.

It took some time, but all the bugs are now squashed and you can use Gaussian processes with fastai :partying_face:

Thank you so much for this wonderful library. Is there a version that works with fastai V1?

Thank you :slight_smile:

The code was first developed with fastai V1; you can find the latest V1 version here in the commit history (checking out the corresponding commit should let you install it without too much trouble).

cholesky_cpu: U(1,1) is zero, singular U

Have you faced this error? (I am not able to share more details.) Please let me know if you have seen it.

Thanks a lot.

It is mentioned in the readme: this is usually solved by using a lower learning rate and/or a higher nb_training_points.

The root cause is that the points and covariance function produced a linear system that cannot be solved.
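
To illustrate that root cause, here is a small pure-Python sketch (not tabularGP code; the helper names, the RBF kernel and the tolerance are assumptions made for the example). It shows two ways the points and the covariance function can yield a singular matrix on which a Cholesky factorisation hits a zero pivot: two coincident points, and a lengthscale so large that distinct points become indistinguishable. The second case is one plausible way an overly large training step could lead to the error, though I have not checked that this is what happens inside the library.

```python
import math

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential (RBF) covariance between two scalars."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def covariance(points, lengthscale=1.0):
    """Covariance matrix of a list of scalar points."""
    return [[rbf(a, b, lengthscale) for b in points] for a in points]

def cholesky(K, tol=1e-12):
    """Lower-triangular Cholesky factor; raises ValueError on a (near-)zero pivot."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    raise ValueError(f"pivot {i} is zero: singular matrix")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def is_factorizable(K):
    """True if the Cholesky factorisation succeeds."""
    try:
        cholesky(K)
        return True
    except ValueError:
        return False

# Distinct points, reasonable lengthscale: well-behaved matrix.
print(is_factorizable(covariance([0.0, 1.0])))                    # → True
# Two identical points: two identical rows, hence a singular matrix.
print(is_factorizable(covariance([1.0, 1.0])))                    # → False
# Huge lengthscale: every covariance rounds to 1.0, same problem.
print(is_factorizable(covariance([0.0, 1.0], lengthscale=1e10)))  # → False
```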
