API Library
Module
General Framework for the data augmented Gaussian Processes
Model Types
AugmentedGaussianProcesses.GP
— Type. Class for Gaussian Process models
GP(X::AbstractArray{T1,N1}, y::AbstractArray{T2,N2}, kernel::Union{Kernel,AbstractVector{<:Kernel}};
noise::Real=1e-5, opt_noise::Bool=true, verbose::Int=0,
optimizer::Bool=Adam(α=0.01),atfrequency::Int=1,
mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
IndependentPriors::Bool=true,ArrayType::UnionAll=Vector)
Argument list :
Mandatory arguments
- `X` : input features, should be a matrix N×D where N is the number of observations and D the number of dimensions
- `y` : input labels, can be either a vector of labels for multiclass and single output or a matrix for multi-outputs (note that only one likelihood can be applied)
- `kernel` : covariance function, can be either a single kernel or a collection of kernels for multiclass and multi-output models
Keyword arguments
- `noise` : Initial noise of the model
- `opt_noise` : Flag for optimizing the noise σ² = Σ(y-f)²/N
- `mean` : Prior mean, either a constant, a vector, or a `PriorMean` object (see the Prior Means section)
- `verbose` : How much the model prints (0:nothing, 1:very basic, 2:medium, 3:everything)
- `optimizer` : Optimizer for the kernel hyperparameters (to be selected from GradDescent.jl)
- `IndependentPriors` : Flag for setting independent or shared parameters among latent GPs
- `atfrequency` : Number of variational-parameter iterations between two hyperparameter optimizations
- `ArrayType` : Option to use a different array type for storage (allows for GPU usage)
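As an illustration, here is a minimal sketch of building and training an exact GP regression model. The `RBFKernel(1.0)` constructor argument (a lengthscale) and the toy data are assumptions made for the example, not part of the signature above.

```julia
using AugmentedGaussianProcesses

# Toy regression data: X is an N×D matrix, y a vector of N targets.
X = rand(100, 1)
y = sin.(vec(X) .* 2π) .+ 0.1 .* randn(100)

# Exact GP with an RBF kernel; the Gaussian noise is optimized during training.
model = GP(X, y, RBFKernel(1.0); noise=1e-3, opt_noise=true, verbose=2)
train!(model, iterations=50)
```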
AugmentedGaussianProcesses.VGP
— Type. Class for variational Gaussian Process models (non-sparse)
VGP(X::AbstractArray{T1,N1},y::AbstractArray{T2,N2},kernel::Union{Kernel,AbstractVector{<:Kernel}},
likelihood::LikelihoodType,inference::InferenceType;
verbose::Int=0,optimizer::Union{Bool,Optimizer,Nothing}=Adam(α=0.01),atfrequency::Integer=1,
mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
IndependentPriors::Bool=true,ArrayType::UnionAll=Vector)
Argument list :
Mandatory arguments
- `X` : input features, should be a matrix N×D where N is the number of observations and D the number of dimensions
- `y` : input labels, can be either a vector of labels for multiclass and single output or a matrix for multi-outputs (note that only one likelihood can be applied)
- `kernel` : covariance function, can be either a single kernel or a collection of kernels for multiclass and multi-output models
- `likelihood` : likelihood of the model, currently implemented : Gaussian, Bernoulli (with logistic link), Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model, can be analytic, numerical or by sampling; check the model documentation to know what is available for your likelihood, see the Compatibility Table
Keyword arguments
- `verbose` : How much the model prints (0:nothing, 1:very basic, 2:medium, 3:everything)
- `optimizer` : Optimizer for the kernel hyperparameters (to be selected from GradDescent.jl)
- `atfrequency` : Number of variational-parameter iterations between two hyperparameter optimizations
- `mean` : Prior mean, either a constant, a vector, or a `PriorMean` object (see the Prior Means section)
- `IndependentPriors` : Flag for setting independent or shared parameters among latent GPs
- `ArrayType` : Option to use a different array type for storage (allows for GPU usage)
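For orientation, a sketch of a non-sparse variational GP for binary classification, combining the `LogisticLikelihood` and `AnalyticVI` types documented further below; the kernel argument and the toy data are illustrative assumptions.

```julia
using AugmentedGaussianProcesses

# Toy binary classification data with ±1 labels.
X = rand(200, 2)
y = sign.(X[:, 1] .- X[:, 2])

# VGP with the augmented logistic likelihood and closed-form variational updates.
model = VGP(X, y, RBFKernel(0.5), LogisticLikelihood(), AnalyticVI())
train!(model, iterations=100)
```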
AugmentedGaussianProcesses.SVGP
— Type. Class for sparse variational Gaussian Process models
SVGP(X::AbstractArray{T1},y::AbstractArray{T2},kernel::Union{Kernel,AbstractVector{<:Kernel}},
likelihood::LikelihoodType,inference::InferenceType, nInducingPoints::Int;
verbose::Int=0,optimizer::Union{Optimizer,Nothing,Bool}=Adam(α=0.01),atfrequency::Int=1,
mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
IndependentPriors::Bool=true,Zoptimizer::Union{Optimizer,Nothing,Bool}=false,
ArrayType::UnionAll=Vector)
Argument list :
Mandatory arguments
- `X` : input features, should be a matrix N×D where N is the number of observations and D the number of dimensions
- `y` : input labels, can be either a vector of labels for multiclass and single output or a matrix for multi-outputs (note that only one likelihood can be applied)
- `kernel` : covariance function, can be either a single kernel or a collection of kernels for multiclass and multi-output models
- `likelihood` : likelihood of the model, currently implemented : Gaussian, Student-T, Laplace, Bernoulli (with logistic link), Bayesian SVM, Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model, can be analytic, numerical or by sampling; check the model documentation to know what is available for your likelihood, see the Compatibility Table
- `nInducingPoints` : number of inducing points
Keyword arguments
- `verbose` : How much the model prints (0:nothing, 1:very basic, 2:medium, 3:everything)
- `optimizer` : Optimizer for the kernel hyperparameters (to be selected from GradDescent.jl)
- `atfrequency` : Number of variational-parameter iterations between two hyperparameter optimizations
- `mean` : Prior mean, either a constant, a vector, or a `PriorMean` object (see the Prior Means section)
- `IndependentPriors` : Flag for setting independent or shared parameters among latent GPs
- `Zoptimizer` : Optimizer for the inducing point locations (to be selected from GradDescent.jl)
- `ArrayType` : Option to use a different array type for storage (allows for GPU usage)
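As a sketch, a sparse variational GP for robust regression with 50 inducing points, trained stochastically with `AnalyticSVI`; the kernel argument, the likelihood parameter, and the toy data are assumptions made for the example.

```julia
using AugmentedGaussianProcesses

# Larger toy regression problem where a sparse model pays off.
X = rand(5_000, 3)
y = X * [1.0, -0.5, 2.0] .+ 0.2 .* randn(5_000)

# SVGP with a Student-t likelihood (ν = 3), analytic stochastic updates on
# mini-batches of 100 points, and 50 inducing points.
model = SVGP(X, y, RBFKernel(1.0), StudentTLikelihood(3.0), AnalyticSVI(100), 50)
train!(model, iterations=200)
```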
Likelihood Types
Gaussian Likelihood
Classical Gaussian noise : $p(y|f) = \mathcal{N}(y|f,\epsilon)$
GaussianLikelihood(ϵ::T=1e-3) #ϵ is the variance
There is no augmentation needed for this likelihood, which is already conjugate.
Student-T likelihood
Student-t likelihood for regression: $\frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\sigma\Gamma(\nu/2)}\left(1+(y-f)^2/(\sigma^2\nu)\right)^{-(\nu+1)/2}$, see the wiki page
StudentTLikelihood(ν::T,σ::Real=one(T)) #ν is the number of degrees of freedom
#σ is the variance for local scale of the data.
For the analytical solution, the likelihood is augmented with a latent variable $\omega \sim \mathcal{IG}(\frac{\nu}{2},\frac{\nu}{2})$, where $\mathcal{IG}$ is the inverse-Gamma distribution. See the paper Robust Gaussian Process Regression with a Student-t Likelihood.
Laplace likelihood
Laplace likelihood for regression: $\frac{1}{2\beta}\exp\left(-\frac{|y-f|}{\beta}\right)$ see wiki page
LaplaceLikelihood(β::T=1.0) # Laplace likelihood with scale β
For the analytical solution, the likelihood is augmented with a latent variable $\omega \sim \text{Exp}\left(\omega \mid \frac{1}{2 \beta^2}\right)$, where Exp is the Exponential distribution. The variational posterior is approximated as $q(\omega) = \mathcal{GIG}\left(\omega \mid a,b,p\right)$, a Generalized Inverse Gaussian distribution.
Logistic Likelihood
Bernoulli likelihood with a logistic link: $p(y|f) = \sigma(yf) = \frac{1}{1+\exp(-yf)}$ (for more info see the wiki page)
LogisticLikelihood()
For the analytic version, the likelihood is augmented with a latent variable $\omega \sim \text{PG}(\omega\mid 1, 0)$, where PG is the Pólya-Gamma distribution. See the paper Efficient Gaussian Process Classification Using Pólya-Gamma Data Augmentation.
Heteroscedastic Likelihood
Gaussian with heteroscedastic noise given by another gp: $p(y|f,g) = \mathcal{N}(y|f,(\lambda\sigma(g))^{-1})$
HeteroscedasticLikelihood([kernel=RBFKernel(),[priormean=0.0]])
Augmentation is described here (#TODO)
Bayesian SVM
The Bayesian SVM is a Bayesian interpretation of the classical SVM. $p(y|f) \propto \exp\left(-2\max(1-yf,0)\right)$
BayesianSVM()
For the analytic version, the likelihood is augmented with a latent variable $\omega$ with an improper flat prior $1_{[0,\infty)}$ (its posterior is nevertheless a valid Generalized Inverse Gaussian distribution). For reference see this paper.
SoftMax Likelihood
Multiclass likelihood with Softmax transformation: $p(y=i|\{f_k\}) = \exp(f_i)/\sum_{j} \exp(f_j)$
There is no possible augmentation for this likelihood
The Logistic-Softmax likelihood
The multiclass likelihood with a logistic-softmax mapping: $p(y=i|\{f_k\}) = \sigma(f_i)/\sum_k \sigma(f_k)$, where σ is the logistic function. It has similar properties to the softmax mapping.
For the analytical version, the likelihood is augmented multiple times. A paper with the details is under submission.
Poisson Likelihood
Inference Types
AnalyticVI
Variational Inference solver for conjugate or conditionally conjugate likelihoods (non-Gaussian likelihoods are made conjugate via augmentation). All data is used at each iteration (use AnalyticSVI for stochastic updates).
AnalyticVI(;ϵ::T=1e-5)
Keywords arguments
- `ϵ::T` : convergence criteria
AugmentedGaussianProcesses.AnalyticSVI
— Function. AnalyticSVI: Stochastic Variational Inference solver for conjugate or conditionally conjugate likelihoods (non-Gaussian likelihoods are made conjugate via augmentation)
AnalyticSVI(nMinibatch::Integer;ϵ::T=1e-5,optimizer::Optimizer=InverseDecay())
- `nMinibatch::Integer` : Number of samples per mini-batch
Keywords arguments
- `ϵ::T` : convergence criteria
- `optimizer::Optimizer` : Optimizer used for the variational updates. Should be an Optimizer object from the [GradDescent.jl](https://github.com/jacobcvt12/GradDescent.jl) package. Default is `InverseDecay()` (ρ=(τ+iter)^-κ)
GibbsSampling
Draw samples from the true posterior via Gibbs Sampling.
GibbsSampling(;ϵ::T=1e-5,nBurnin::Int=100,samplefrequency::Int=10)
Keywords arguments
- `ϵ::T` : convergence criteria
- `nBurnin::Int` : Number of samples discarded before starting to save samples
- `samplefrequency::Int` : Frequency of sampling
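A sketch of using Gibbs sampling instead of variational inference; whether a given likelihood supports sampling should be checked in the Compatibility Table, and the data, kernel argument, and number of iterations here are illustrative assumptions.

```julia
using AugmentedGaussianProcesses

X = rand(150, 2)
y = sign.(X[:, 1] .- X[:, 2])

# Sample from the augmented posterior: discard 200 burn-in samples,
# then keep one sample every 5 iterations.
model = VGP(X, y, RBFKernel(1.0), LogisticLikelihood(),
            GibbsSampling(nBurnin=200, samplefrequency=5))
train!(model, iterations=2000)  # iterations correspond to sampling steps here
```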
QuadratureVI
Variational Inference solver approximating the gradients via numerical integration (quadrature)
QuadratureVI(ϵ::T=1e-5,nGaussHermite::Integer=20,optimizer::Optimizer=Momentum(η=0.0001))
Keyword arguments
- `ϵ::T` : convergence criteria
- `nGaussHermite::Int` : Number of points for the integral estimation
- `optimizer::Optimizer` : Optimizer used for the variational updates. Should be an Optimizer object from the [GradDescent.jl](https://github.com/jacobcvt12/GradDescent.jl) package. Default is `Momentum(η=0.0001)`
AugmentedGaussianProcesses.QuadratureSVI
— Function. QuadratureSVI
Stochastic Variational Inference solver approximating the gradients via numerical integration (quadrature)
QuadratureSVI(nMinibatch::Integer;ϵ::T=1e-5,nGaussHermite::Integer=20,optimizer::Optimizer=Adam(α=0.1))
- `nMinibatch::Integer` : Number of samples per mini-batch
Keyword arguments
- `ϵ::T` : convergence criteria, which can be user defined
- `nGaussHermite::Int` : Number of points for the integral estimation (for the QuadratureVI)
- `optimizer::Optimizer` : Optimizer used for the variational updates. Should be an Optimizer object from the [GradDescent.jl](https://github.com/jacobcvt12/GradDescent.jl) package. Default is `Adam(α=0.1)`
MCIntegrationVI
Constructor for Variational Inference via MC Integration approximation.
MCIntegrationVI(;ϵ::T=1e-5,nMC::Integer=1000,optimizer::Optimizer=Adam(α=0.1))
Keyword arguments
- `ϵ::T` : convergence criteria, which can be user defined
- `nMC::Int` : Number of samples per data point for the integral evaluation
- `optimizer::Optimizer` : Optimizer used for the variational updates. Should be an Optimizer object from the [GradDescent.jl](https://github.com/jacobcvt12/GradDescent.jl) package. Default is `Adam(α=0.1)`
AugmentedGaussianProcesses.MCIntegrationSVI
— Function. MCIntegrationSVI(nMinibatch::Integer;ϵ::T=1e-5,nMC::Integer=1000,optimizer::Optimizer=Adam(α=0.1))
Constructor for Stochastic Variational Inference via MC integration approximation.
Argument
- `nMinibatch::Integer` : Number of samples per mini-batch
Keyword arguments
- `ϵ::T` : convergence criteria, which can be user defined
- `nMC::Int` : Number of samples per data point for the integral evaluation
- `optimizer::Optimizer` : Optimizer used for the variational updates. Should be an Optimizer object from the [GradDescent.jl](https://github.com/jacobcvt12/GradDescent.jl) package. Default is `Adam(α=0.1)`
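When no conjugate augmentation is available, the numerical solvers above can be used instead. A sketch with quadrature-based VI follows; it assumes the chosen likelihood is compatible with this inference type (see the Compatibility Table), and the data and kernel argument are illustrative.

```julia
using AugmentedGaussianProcesses

X = rand(100, 2)
y = sign.(X[:, 1] .- X[:, 2])

# Same kind of model as before, but the expected log-likelihood gradients are
# estimated by numerical quadrature instead of an analytic augmentation.
model = VGP(X, y, RBFKernel(1.0), LogisticLikelihood(), QuadratureVI())
train!(model, iterations=100)
```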
Functions and methods
AugmentedGaussianProcesses.train!
— Function. train!(model::AbstractGP;iterations::Integer=100,callback=0,conv_function=0)
Function to train the given GP model.
Keyword Arguments
- `iterations::Int` : Number of iterations (not necessarily epochs!) for training
- `callback::Function` : Callback function called at every iteration. Should be of type `function(model,iter) ... end`
- `conv_function::Function` : Convergence function to be called every iteration, should return a scalar and take the same arguments as `callback`
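A sketch of the callback and convergence hooks; the model setup, the callback body, and the placeholder convergence metric are illustrative assumptions.

```julia
using AugmentedGaussianProcesses

X = rand(100, 1)
y = sin.(vec(X))
model = GP(X, y, RBFKernel(1.0))

# Print a short message every 10 iterations.
cb(model, iter) = iter % 10 == 0 && println("iteration $iter done")

# Convergence function: any scalar metric works; a placeholder value here.
conv(model, iter) = 0.0

train!(model, iterations=100, callback=cb, conv_function=conv)
```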
AugmentedGaussianProcesses.predict_f
— Function. Compute the mean of the predicted latent distribution of f on X_test for the variational GP model. Return also the variance if covf=true and the full covariance if fullcov=true.
Compute the mean of the predicted latent distribution of f on X_test for a sparse GP model. Return also the variance if covf=true and the full covariance if fullcov=true.
AugmentedGaussianProcesses.predict_y
— Function. predict_y(model::AbstractGP{T,<:RegressionLikelihood},X_test::AbstractMatrix)
Return the predictive mean of X_test
predict_y(model::AbstractGP{T,<:ClassificationLikelihood},X_test::AbstractMatrix)
Return the predicted most probable sign of X_test
predict_y(model::AbstractGP{T,<:MultiClassLikelihood},X_test::AbstractMatrix)
Return the predicted most probable class of X_test
predict_y(model::AbstractGP{T,<:EventLikelihood},X_test::AbstractMatrix)
Return the expected number of events for the locations X_test
AugmentedGaussianProcesses.proba_y
— Function. proba_y(model::AbstractGP,X_test::AbstractMatrix)
Return the probability distribution p(y_test|model,X_test) :
- Tuple of vectors of mean and variance for regression
- Vector of probabilities of y_test = 1 for binary classification
- DataFrame with one column per class containing the probability of each class for multi-class classification
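A sketch of the prediction functions on a trained classification model; the keyword name `covf` for the latent prediction follows the description above, and the data, kernel argument, and training length are illustrative assumptions.

```julia
using AugmentedGaussianProcesses

X = rand(200, 2)
y = sign.(X[:, 1] .- X[:, 2])
model = VGP(X, y, RBFKernel(1.0), LogisticLikelihood(), AnalyticVI())
train!(model, iterations=50)

X_test = rand(20, 2)
f_mean, f_var = predict_f(model, X_test, covf=true)  # latent mean and variance
y_pred = predict_y(model, X_test)                    # most probable label (±1)
p = proba_y(model, X_test)                           # probability of y_test = 1
```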
Kernels
Radial Basis Function Kernel, also called RBF or SE (Squared Exponential)
Matern Kernel
Kernel functions
Create the covariance matrix between the matrices X1 and X2 with the covariance function kernel
Compute the covariance matrix of the matrix X, optionally computing only the diagonal terms
Compute the covariance matrix between the matrices X1 and X2 with the covariance function kernel in the preallocated matrix K
Compute the covariance matrix of the matrix X in the preallocated matrix K, optionally computing only the diagonal terms
Return the variance of the kernel
Return the lengthscale of the IsoKernel
Return the lengthscales of the ARD Kernel
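A sketch of the kernel-matrix helpers; the kernel constructor argument, the argument order (data first, kernel last), and whether these helpers are exported at the top level are assumptions made for illustration.

```julia
using AugmentedGaussianProcesses

k = RBFKernel(1.0)             # lengthscale argument assumed
X1 = rand(10, 2); X2 = rand(5, 2)

K12 = kernelmatrix(X1, X2, k)  # 10×5 cross-covariance between X1 and X2
K11 = kernelmatrix(X1, k)      # 10×10 covariance of X1 with itself
v = getvariance(k)             # variance of the kernel
ls = getlengthscales(k)        # lengthscale(s) of the kernel
```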
Prior Means
ZeroMean
ZeroMean()
Construct a prior mean fixed to 0 that cannot be changed.
ConstantMean
ConstantMean(c::T=1.0;opt::Optimizer=Adam(α=0.01))
Construct a prior mean with constant value c. Optionally set an optimizer opt (Adam(α=0.01) by default).
EmpiricalMean
EmpiricalMean(c::V=1.0;opt::Optimizer=Adam(α=0.01)) where {V<:AbstractVector{<:Real}}
Construct a prior mean with the vector of values c. Optionally give an optimizer opt (Adam(α=0.01) by default).
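A sketch of passing a non-zero prior mean to a model; the kernel argument and the toy data are illustrative assumptions.

```julia
using AugmentedGaussianProcesses

X = rand(100, 1)
y = 2.0 .+ sin.(vec(X))

# GP whose prior mean is an optimizable constant instead of the default ZeroMean.
model = GP(X, y, RBFKernel(1.0); mean=ConstantMean(2.0))
train!(model, iterations=50)
```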
Index
AugmentedGaussianProcesses.AnalyticVI
AugmentedGaussianProcesses.BayesianSVM
AugmentedGaussianProcesses.ConstantMean
AugmentedGaussianProcesses.EmpiricalMean
AugmentedGaussianProcesses.GP
AugmentedGaussianProcesses.GaussianLikelihood
AugmentedGaussianProcesses.GibbsSampling
AugmentedGaussianProcesses.HeteroscedasticLikelihood
AugmentedGaussianProcesses.KernelModule.MaternKernel
AugmentedGaussianProcesses.KernelModule.RBFKernel
AugmentedGaussianProcesses.LaplaceLikelihood
AugmentedGaussianProcesses.LogisticLikelihood
AugmentedGaussianProcesses.LogisticSoftMaxLikelihood
AugmentedGaussianProcesses.MCIntegrationVI
AugmentedGaussianProcesses.PoissonLikelihood
AugmentedGaussianProcesses.QuadratureVI
AugmentedGaussianProcesses.SVGP
AugmentedGaussianProcesses.SoftMaxLikelihood
AugmentedGaussianProcesses.StudentTLikelihood
AugmentedGaussianProcesses.VGP
AugmentedGaussianProcesses.ZeroMean
AugmentedGaussianProcesses.AnalyticSVI
AugmentedGaussianProcesses.KernelModule.getlengthscales
AugmentedGaussianProcesses.KernelModule.getvariance
AugmentedGaussianProcesses.KernelModule.kernelmatrix
AugmentedGaussianProcesses.KernelModule.kernelmatrix!
AugmentedGaussianProcesses.MCIntegrationSVI
AugmentedGaussianProcesses.QuadratureSVI
AugmentedGaussianProcesses.predict_f
AugmentedGaussianProcesses.predict_y
AugmentedGaussianProcesses.proba_y
AugmentedGaussianProcesses.train!