User Guide

There are three main steps needed to train and use the different models: initialization, training, and prediction.

Initialization

GP vs VGP vs SVGP

There are currently three possible Gaussian Process models:

    GP(X_train, y_train, kernel)
    VGP(X_train, y_train, kernel, likelihood, inference)
    SVGP(X_train, y_train, kernel, likelihood, inference, n_inducingpoints)
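
For example, initializing each model could look as follows. This is a minimal sketch: the SqExponentialKernel constructor and the data shapes are assumptions that depend on the package version, not verified API.

    using AugmentedGaussianProcesses

    X_train = rand(100, 2)          # 100 samples, 2 features
    y_train = rand([-1, 1], 100)    # binary labels
    y_reg = randn(100)              # continuous targets for regression

    # Exact GP regression (Gaussian likelihood only)
    gp = GP(X_train, y_reg, SqExponentialKernel())

    # Variational GP for binary classification
    vgp = VGP(X_train, y_train, SqExponentialKernel(),
              LogisticLikelihood(), AnalyticVI())

    # Sparse variational GP with 20 inducing points
    svgp = SVGP(X_train, y_train, SqExponentialKernel(),
                LogisticLikelihood(), AnalyticVI(), 20)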

Likelihood

GP can only have a Gaussian likelihood; VGP and SVGP offer more choices. Here are the ones currently implemented:

Regression

For regression, four likelihoods are available:

    GaussianLikelihood
    StudentTLikelihood
    LaplaceLikelihood
    HeteroscedasticLikelihood

Classification

For classification, one can select among:

    LogisticLikelihood
    BayesianSVM

Event Likelihoods

For likelihoods such as Poisson or Negative Binomial, a parameter (e.g. the event rate) is approximated by σ(f). Two likelihoods are implemented:

    Poisson
    NegBinomialLikelihood

Multi-class classification

There are two available likelihoods for multi-class classification:

    LogisticSoftMaxLikelihood
    SoftMaxLikelihood

More options

You can also write your own likelihood, following the template below.
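
A minimal sketch of what such a template could look like; the abstract supertype and the required method names are assumptions, not the package's verified interface:

    # Hypothetical skeleton; in the real package you would subtype its
    # abstract likelihood type (name assumed here).
    abstract type AbstractLikelihood end

    struct MyLikelihood{T<:Real} <: AbstractLikelihood
        σ²::T    # likelihood parameter, here a noise variance
    end

    # Log-density of an observation y given the latent value f,
    # using a Gaussian as a concrete example
    function loglikelihood(l::MyLikelihood, y::Real, f::Real)
        return -0.5 * (log(2π * l.σ²) + (y - f)^2 / l.σ²)
    end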

Inference

Inference can be done in various ways; the following methods are currently available (see the compatibility table below):

    AnalyticVI
    GibbsSampling
    QuadratureVI
    MCIntegrationVI

The last two methods (QuadratureVI and MCIntegrationVI) rely on a numerical approximation of an integral; I therefore recommend using VanillaGradDescent, as it will use the natural gradient updates anyway. Adam seems to give erratic results.
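
In practice the inference object is passed at model construction. A sketch, where the StudentTLikelihood argument (degrees of freedom), the QuadratureVI optimiser keyword, and the VanillaGradDescent constructor are assumptions:

    # Analytic variational updates (recommended whenever available)
    model = VGP(X_train, y_reg, SqExponentialKernel(),
                StudentTLikelihood(3.0), AnalyticVI())

    # Quadrature-based VI variant; the optimiser keyword is an assumption
    # model = VGP(X_train, y_reg, SqExponentialKernel(),
    #             StudentTLikelihood(3.0),
    #             QuadratureVI(optimiser = VanillaGradDescent()))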

Compatibility table

Not all inference methods are implemented/valid for all likelihoods; here is the compatibility table between them.

| Likelihood/Inference      | AnalyticVI | GibbsSampling | QuadratureVI | MCIntegrationVI |
|---------------------------|------------|---------------|--------------|-----------------|
| GaussianLikelihood        | ✓          |               |              |                 |
| StudentTLikelihood        | ✓          | ✓             | ✓            |                 |
| LaplaceLikelihood         | ✓          | ✓             | ✓            |                 |
| HeteroscedasticLikelihood | ✓          | (dev)         | (dev)        |                 |
| LogisticLikelihood        | ✓          | ✓             | ✓            |                 |
| BayesianSVM               | ✓          | (dev)         |              |                 |
| LogisticSoftMaxLikelihood | ✓          | ✓             | (dev)        |                 |
| SoftMaxLikelihood         |            |               | (dev)        | ✓               |
| Poisson                   | ✓          | ✓             | (dev)        |                 |
| NegBinomialLikelihood     | ✓          | ✓             | (dev)        |                 |

(dev) means that the feature is possible and may be developed and tested, but it is not available yet. All contributions or requests are very welcome!

Additional Parameters

Hyperparameter optimization

One can optimize the kernel hyperparameters as well as the inducing-point locations by maximizing the ELBO. All derivations are already hand-coded (no AD needed). One can select the optimization scheme via:
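
A sketch of what this could look like; the optimiser keyword and the ADAM constructor are assumptions that depend on the package version:

    # Hypothetical: pick the optimiser used for the kernel hyperparameters
    # and the inducing-point locations (keyword name assumed)
    model = SVGP(X_train, y_train, SqExponentialKernel(),
                 LogisticLikelihood(), AnalyticVI(), 20,
                 optimiser = ADAM(0.01))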

PriorMean

The mean keyword allows you to set different types of prior means.
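
For instance, a sketch where the ConstantMean constructor name is an assumption based on common conventions, not verified API:

    # Hypothetical: a constant prior mean instead of the default zero mean
    model = GP(X_train, y_reg, SqExponentialKernel(),
               mean = ConstantMean(1.0))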

IndependentPriors

When there are multiple latent Gaussian Processes, one can decide to use a common prior for all of them or a separate prior for each latent GP. A common prior has the advantage that fewer computations are required to optimize the hyperparameters.
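
This choice is typically made at construction time. A sketch, where the IndependentPriors keyword and the LogisticSoftMaxLikelihood signature are assumptions:

    # Hypothetical: share one prior across the latent GPs of a 3-class
    # model to save hyperparameter computations
    y_multi = rand(1:3, 100)    # labels for 3 classes
    model = VGP(X_train, y_multi, SqExponentialKernel(),
                LogisticSoftMaxLikelihood(3), AnalyticVI(),
                IndependentPriors = false)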

Training

Training is straightforward after initializing the model; simply run:

    train!(model; iterations=100, callback=callbackfunction)

The callback option runs a function at every iteration; the callback function should be defined as:

    function callbackfunction(model, iter)
        # do things here, e.g. log metrics or inspect the model state
    end

Prediction

Once the model has been trained, it is finally possible to compute predictions. There are always three possibilities:

    predict_f(model, X_test)    # mean (and variance) of the latent function f
    predict_y(model, X_test)    # predicted output y
    proba_y(model, X_test)      # probability of the output y
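
For a binary classifier, a usage sketch (X_test is assumed to have the same number of features as X_train):

    X_test = rand(20, 2)
    μ = predict_f(model, X_test)    # latent mean at the test points
    ŷ = predict_y(model, X_test)    # predicted labels
    p = proba_y(model, X_test)      # probability of the positive class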

Miscellaneous

🚧 In construction – Should be developed in the near future 🚧

Saving/Loading models

Once a model has been trained, it is possible to save its state to a file by using save_trained_model(filename, model); a partial version of the model will be saved in filename.

It is then possible to reload this file by using load_trained_model(filename). However, note that it will not be possible to train the model any further! This function is only meant for making further predictions.
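
A usage sketch (the filename and its extension are illustrative only):

    save_trained_model("model_state.jld", model)     # save the trained state
    loaded = load_trained_model("model_state.jld")   # prediction-only model
    ŷ = predict_y(loaded, X_test)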

🚧 Pre-made callback functions 🚧

There is one (for now) premade function that returns an MVHistory object and a callback function for the training of binary classification problems. The callback will store the ELBO and the variational parameters at every iteration included in `iter_points`. If `X_test` and `y_test` are provided, it will also store the test accuracy and the mean and median test log-likelihood.