API Library
Module
AugmentedGaussianProcesses.AugmentedGaussianProcesses
— Module
General framework for data-augmented Gaussian Processes
Model Types
AugmentedGaussianProcesses.GP
— Type
Class for Gaussian Process models
GP(X::AbstractArray{T}, y::AbstractArray, kernel::Kernel;
    noise::Real=1e-5, opt_noise::Bool=true, verbose::Int=0,
    optimiser=ADAM(0.01), atfrequency::Int=1,
    mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
    IndependentPriors::Bool=true, ArrayType::UnionAll=Vector)
Argument list:

Mandatory arguments
- `X` : input features, should be a matrix N×D where N is the number of observations and D the number of dimensions
- `y` : input labels, can be either a vector of labels for multiclass and single output or a matrix for multi-outputs (note that only one likelihood can be applied)
- `kernel` : covariance function, can be either a single kernel or a collection of kernels for multiclass and multi-output models

Keyword arguments
- `noise` : initial noise of the model
- `opt_noise` : flag for optimizing the noise σ² = Σ(y-f)²/N
- `mean` : prior mean, given as a `PriorMean` object; see the Prior Means documentation
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the Flux.jl library, see the list of Optimisers and this list. Default is `ADAM(0.01)`
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `atfrequency` : number of variational-parameter iterations between each hyperparameter optimization
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
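A minimal usage sketch, assuming toy regression data and the `SqExponentialKernel` from KernelFunctions.jl (all values are illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

# Toy data: 100 observations with 2 features
X = rand(100, 2)
y = sin.(X[:, 1]) .+ 0.1 .* randn(100)

# Exact GP regression with default noise and hyperparameter optimization
model = GP(X, y, SqExponentialKernel())
train!(model; iterations=50)
ŷ = predict_y(model, rand(10, 2))
```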
AugmentedGaussianProcesses.VGP
— Type
Class for variational Gaussian Process models (non-sparse)
VGP(X::AbstractArray{T}, y::AbstractVector,
    kernel::Kernel,
    likelihood::LikelihoodType, inference::InferenceType;
    verbose::Int=0, optimiser=ADAM(0.01), atfrequency::Integer=1,
    mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
    IndependentPriors::Bool=true, ArrayType::UnionAll=Vector)
Argument list:

Mandatory arguments
- `X` : input features, should be a matrix N×D where N is the number of observations and D the number of dimensions
- `y` : input labels, can be either a vector of labels for multiclass and single output or a matrix for multi-outputs (note that only one likelihood can be applied)
- `kernel` : covariance function, a single kernel from the KernelFunctions.jl package
- `likelihood` : likelihood of the model; currently implemented: Gaussian, Bernoulli (with logistic link), Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model, can be analytic, numerical, or by sampling; check the model documentation to know what is available for your likelihood, see the Compatibility Table

Keyword arguments
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the Flux.jl library, see the list of Optimisers and this list. Default is `ADAM(0.01)`
- `atfrequency` : number of variational-parameter iterations between each hyperparameter optimization
- `mean` : prior mean, given as a `PriorMean` object; see the Prior Means documentation
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
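A sketch of binary classification with the augmented analytic solver (data and kernel are illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

# Toy binary data with labels in {-1, 1}
X = rand(200, 2)
y = sign.(X[:, 1] .- 0.5)

# Variational GP with a logistic link, trained with analytic (augmented) VI
model = VGP(X, y, SqExponentialKernel(), LogisticLikelihood(), AnalyticVI())
train!(model; iterations=100)
p = proba_y(model, X)  # probability of y = 1 at the training inputs
```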
AugmentedGaussianProcesses.SVGP
— Type
Class for sparse variational Gaussian Processes
SVGP(X::AbstractArray{T1}, y::AbstractVector{T2}, kernel::Kernel,
     likelihood::LikelihoodType, inference::InferenceType, nInducingPoints::Int;
     verbose::Int=0, optimiser=ADAM(0.001), atfrequency::Int=1,
     mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
     Zoptimiser=false,
     ArrayType::UnionAll=Vector)
Argument list:

Mandatory arguments
- `X` : input features, should be a matrix N×D where N is the number of observations and D the number of dimensions
- `y` : input labels, can be either a vector of labels for multiclass and single output or a matrix for multi-outputs (note that only one likelihood can be applied)
- `kernel` : covariance function, can be either a single kernel or a collection of kernels for multiclass and multi-output models
- `likelihood` : likelihood of the model; currently implemented: Gaussian, Student-T, Laplace, Bernoulli (with logistic link), Bayesian SVM, Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model, can be analytic, numerical, or by sampling; check the model documentation to know what is available for your likelihood, see the Compatibility Table
- `nInducingPoints` : number of inducing points

Keyword arguments
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the Flux.jl library, see the list of Optimisers and this list. Default is `ADAM(0.001)`
- `atfrequency` : number of variational-parameter iterations between each hyperparameter optimization
- `mean` : prior mean, given as a `PriorMean` object; see the Prior Means documentation
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `Zoptimiser` : optimiser used for the inducing point locations. Should be an Optimiser object from the Flux.jl library, see the list of Optimisers and this list. Default is `false` (the inducing point locations are not optimized)
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
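A sketch of sparse regression with stochastic updates (dataset size, mini-batch size, and number of inducing points are illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

# A larger toy regression dataset
X = rand(2000, 3)
y = sin.(X[:, 1]) .+ 0.1 .* randn(2000)

# Sparse variational GP: 50 inducing points, AnalyticSVI on mini-batches of 64
model = SVGP(X, y, SqExponentialKernel(), GaussianLikelihood(),
             AnalyticSVI(64), 50)
train!(model; iterations=200)
```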
Likelihood Types
AugmentedGaussianProcesses.GaussianLikelihood
— Type
GaussianLikelihood(σ²::T=1e-3) # σ² is the variance
Gaussian noise:
$p(y|f) = \mathcal{N}(y\,|\,f, \sigma^2)$
There is no augmentation needed for this likelihood, which is already conjugate to a Gaussian prior.
AugmentedGaussianProcesses.StudentTLikelihood
— Type
StudentTLikelihood(ν::T, σ::Real=one(T))
Student-t likelihood for regression: `ν` is the number of degrees of freedom and `σ` is the scale of the data.
For the analytical solution, it is augmented via:
$p(y|f,\omega) = \mathcal{N}(y\,|\,f, \sigma^2\omega)$
where $\omega \sim \mathcal{IG}(0.5\nu,\, 0.5\nu)$ and $\mathcal{IG}$ is the inverse-Gamma distribution. See the paper Robust Gaussian Process Regression with a Student-t Likelihood.
AugmentedGaussianProcesses.LaplaceLikelihood
— Type
LaplaceLikelihood(β::T=1.0) # Laplace likelihood with scale β
Laplace likelihood for regression:
$p(y|f) = \frac{1}{2\beta}\exp\left(-\frac{|y-f|}{\beta}\right)$
(see the wiki page)
For the analytical solution, it is augmented via:
$p(y|f,\omega) = \mathcal{N}(y\,|\,f, \omega)$
where $\omega \sim \text{Exp}\left(\omega\,\middle|\,\frac{1}{2\beta^2}\right)$ and Exp is the exponential distribution. We use the variational distribution $q(\omega) = \text{GIG}(\omega\,|\,a, b, p)$.
AugmentedGaussianProcesses.LogisticLikelihood
— Type
LogisticLikelihood()
Bernoulli likelihood with a logistic link:
$p(y|f) = \sigma(yf) = \frac{1}{1 + \exp(-yf)}$
(for more info see the wiki page)
For the analytic version, the likelihood is augmented via:
$p(y|f,\omega) \propto \exp\left(\frac{yf}{2} - \frac{f^2\omega}{2}\right)$
where $\omega \sim \text{PG}(\omega\,|\,1, 0)$ and PG is the Pólya-Gamma distribution. See the paper Efficient Gaussian Process Classification Using Pólya-Gamma Data Augmentation.
AugmentedGaussianProcesses.HeteroscedasticLikelihood
— Type
HeteroscedasticLikelihood(λ::T=1.0)
Gaussian likelihood with heteroscedastic noise given by another GP:
$p(y|f,g) = \mathcal{N}\left(y\,\middle|\,f, (\lambda\,\sigma(g))^{-1}\right)$
where σ is the logistic function. The augmentation will be described in a future paper.
AugmentedGaussianProcesses.BayesianSVM
— Type
BayesianSVM()
The Bayesian SVM is a Bayesian interpretation of the classical SVM:
$p(y|f,\omega) = \frac{1}{\sqrt{2\pi\omega}}\exp\left(-\frac{(1 + \omega - yf)^2}{2\omega}\right)$
where $\omega \sim \mathbb{1}_{[0,\infty)}$ has an improper flat prior (its posterior is, however, a valid distribution: a Generalized Inverse Gaussian). For reference see this paper.
AugmentedGaussianProcesses.SoftMaxLikelihood
— Type
SoftMaxLikelihood()
Multiclass likelihood with softmax transformation:
$p(y = i\,|\,\{f_k\}_{k=1}^{K}) = \frac{\exp(f_i)}{\sum_{k=1}^{K}\exp(f_k)}$
There is no possible augmentation for this likelihood.
AugmentedGaussianProcesses.LogisticSoftMaxLikelihood
— Type
LogisticSoftMaxLikelihood()
The multiclass likelihood with a logistic-softmax mapping:
$p(y = i\,|\,\{f_k\}_{k=1}^{K}) = \frac{\sigma(f_i)}{\sum_{k=1}^{K}\sigma(f_k)}$
where σ is the logistic function. This likelihood has the same properties as softmax.
For the analytical version, the likelihood is augmented multiple times. More details can be found in the paper Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation.
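A sketch of a multiclass model with this likelihood (the 3-class toy data is illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

# Toy 3-class data with integer labels
X = rand(150, 2)
y = rand(1:3, 150)

# Multiclass VGP using the conjugate logistic-softmax augmentation
model = VGP(X, y, SqExponentialKernel(),
            LogisticSoftMaxLikelihood(), AnalyticVI())
train!(model; iterations=50)
```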
AugmentedGaussianProcesses.PoissonLikelihood
— Type
PoissonLikelihood(λ::T=1.0)
Poisson likelihood, where a Poisson distribution is defined at every point in space (careful, it is different from continuous Poisson processes):
$p(y|f) = \text{Poisson}(y\,|\,\lambda\,\sigma(f))$
where σ is the logistic function. Augmentation details will be released at some point (open an issue if you want to see them).
AugmentedGaussianProcesses.NegBinomialLikelihood
— Type
NegBinomialLikelihood(r::Int=10)
Negative binomial likelihood with number of failures `r`:
$p(y|f) = \binom{y + r - 1}{y}\,\sigma(f)^y\,(1 - \sigma(f))^r$
where σ is the logistic function.
Inference Types
AugmentedGaussianProcesses.AnalyticVI
— Type
AnalyticVI
Variational inference solver for conjugate or conditionally conjugate likelihoods (non-Gaussian likelihoods are made conjugate via augmentation). All data is used at each iteration (use AnalyticSVI for stochastic updates).
AnalyticVI(; ϵ::T=1e-5)
Keyword arguments
- `ϵ::T` : convergence criterion
AugmentedGaussianProcesses.AnalyticSVI
— Function
AnalyticSVI
Stochastic variational inference solver for conjugate or conditionally conjugate likelihoods (non-Gaussian likelihoods are made conjugate via augmentation).
AnalyticSVI(nMinibatch::Integer; ϵ::T=1e-5, optimiser=RobbinsMonro())
Argument
- `nMinibatch::Integer` : number of samples per mini-batch
Keyword arguments
- `ϵ::T` : convergence criterion
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `RobbinsMonro()` (ρ=(τ+iter)^-κ)
AugmentedGaussianProcesses.GibbsSampling
— Type
GibbsSampling(; ϵ::T=1e-5, nBurnin::Int=100, samplefrequency::Int=1)
Draw samples from the true posterior via Gibbs sampling.
Keyword arguments
- `ϵ::T` : convergence criterion
- `nBurnin::Int` : number of samples discarded before starting to save samples
- `samplefrequency::Int` : frequency at which samples are saved
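A sampling sketch, assuming a likelihood with a known augmentation such as LogisticLikelihood (data and sampler settings are illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

# Toy binary data
X = rand(100, 2)
y = sign.(X[:, 1] .- 0.5)

# Gibbs sampling on the augmented model: discard 200 burn-in samples,
# then keep every sample
model = VGP(X, y, SqExponentialKernel(), LogisticLikelihood(),
            GibbsSampling(nBurnin=200, samplefrequency=1))
train!(model; iterations=1000)
```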
AugmentedGaussianProcesses.QuadratureVI
— Type
QuadratureVI
Variational inference solver approximating the gradients via Gauss-Hermite numerical quadrature.
QuadratureVI(; ϵ::T=1e-5, nGaussHermite::Integer=20, optimiser=Momentum(0.0001))
Keyword arguments
- `ϵ::T` : convergence criterion
- `nGaussHermite::Int` : Number of points for the integral estimation
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.0001)`
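A sketch of quadrature-based VI as an alternative to the augmented analytic solver, here with a Student-t likelihood (ν = 3 and the data are illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

X = rand(100, 1)
y = sin.(vec(X)) .+ 0.1 .* randn(100)

# Gradients of the expected log-likelihood are estimated by Gauss-Hermite
# quadrature instead of an augmentation
model = VGP(X, y, SqExponentialKernel(), StudentTLikelihood(3.0), QuadratureVI())
train!(model; iterations=100)
```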
AugmentedGaussianProcesses.QuadratureSVI
— Function
QuadratureSVI
Stochastic variational inference solver approximating the gradients via Gauss-Hermite numerical quadrature.
QuadratureSVI(nMinibatch::Integer; ϵ::T=1e-5, nGaussHermite::Integer=20, optimiser=Momentum(0.0001))
Argument
- `nMinibatch::Integer` : number of samples per mini-batch
Keyword arguments
- `ϵ::T` : convergence criterion, which can be user defined
- `nGaussHermite::Int` : Number of points for the integral estimation (for the QuadratureVI)
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.0001)`
AugmentedGaussianProcesses.MCIntegrationVI
— Type
MCIntegrationVI(; ϵ::T=1e-5, nMC::Integer=1000, optimiser=Momentum(0.001))
Variational Inference solver by approximating gradients via MC Integration.
Keyword arguments
- `ϵ::T` : convergence criterion, which can be user defined
- `nMC::Int` : Number of samples per data point for the integral evaluation
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.001)`
AugmentedGaussianProcesses.MCIntegrationSVI
— Function
MCIntegrationSVI(; ϵ::T=1e-5, nMC::Integer=1000, optimiser=Momentum(0.0001))
Stochastic variational inference solver approximating the gradients via Monte Carlo integration.
Argument
- `nMinibatch::Integer` : number of samples per mini-batch
Keyword arguments
- `ϵ::T` : convergence criterion, which can be user defined
- `nMC::Int` : Number of samples per data point for the integral evaluation
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.0001)`
Functions and methods
AugmentedGaussianProcesses.train!
— Function
train!(model::AbstractGP; iterations::Integer=100, callback=0, convergence=0)
Function to train the given GP `model`.
Keyword Arguments
- `iterations::Int` : number of iterations (not necessarily epochs!) for training
- `callback::Function` : callback function called at every iteration; should be of the form `function(model, iter) ... end`
- `convergence::Function` : convergence function called at every iteration; should return a scalar and take the same arguments as `callback`
train!(model::AbstractGP; iterations::Integer=100, callback=0, conv_function=0)
Function to train the given GP `model`.
Keyword Arguments
- `iterations::Int` : number of iterations (not necessarily epochs!) for training
- `callback::Function` : callback function called at every iteration; should be of the form `function(model, iter) ... end`
- `conv_function::Function` : convergence function called at every iteration; should return a scalar and take the same arguments as `callback`
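A sketch of a training loop with a logging callback, reusing a `model` built as in the examples above (the print frequency is arbitrary):

```julia
# Callback of the documented form function(model, iter) ... end
logger(model, iter) = iter % 10 == 0 && println("Iteration $iter")

train!(model; iterations=100, callback=logger)
```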
AugmentedGaussianProcesses.predict_f
— Function
Compute the mean of the predicted latent distribution of `f` on `X_test` for the variational GP `model`.
Also returns the diagonal variance if `covf=true` and the full covariance if `fullcov=true`.
AugmentedGaussianProcesses.predict_y
— Function
predict_y(model::AbstractGP, X_test::AbstractMatrix)
Return
- the predictive mean of `X_test` for regression
- the sign of `X_test` for classification
- the most likely class for multi-class classification
- the expected number of events for an event likelihood
AugmentedGaussianProcesses.proba_y
— Function
proba_y(model::AbstractGP, X_test::AbstractMatrix)
Return the probability distribution p(y_test | model, X_test):
- a tuple of vectors of mean and variance for regression
- a vector of probabilities of y_test = 1 for binary classification
- a DataFrame with one column per class containing the class probabilities for multi-class classification
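A sketch comparing the three prediction functions, assuming a trained `model` as in the examples above and illustrative test inputs:

```julia
X_test = rand(10, 2)

μ = predict_f(model, X_test; covf=false)     # latent mean only
μ, σ² = predict_f(model, X_test; covf=true)  # latent mean and diagonal variance
ŷ = predict_y(model, X_test)                 # prediction in label space
p = proba_y(model, X_test)                   # predictive probabilities
```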
Kernels
- RBFKernel
- MaternKernel
Kernel functions
- kernelmatrix
- kernelmatrix!
- getvariance
- getlengthscales
(Docstrings for these kernels and kernel functions are currently missing from the build.)
Prior Means
AugmentedGaussianProcesses.ZeroMean
— Type
ZeroMean
ZeroMean()
Construct a prior mean set to 0 that cannot be updated.
AugmentedGaussianProcesses.ConstantMean
— Type
ConstantMean
ConstantMean(c::T=1.0; opt=ADAM(0.01))
Construct a prior mean with constant `c`. Optionally set an optimiser `opt` (`ADAM(0.01)` by default).
AugmentedGaussianProcesses.EmpiricalMean
— Type
EmpiricalMean
EmpiricalMean(c::V=1.0; opt=ADAM(0.01)) where {V<:AbstractVector{<:Real}}
Construct an empirical mean with values `c`. Optionally give an optimiser `opt` (`ADAM(0.01)` by default).
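A brief sketch of passing a prior mean to a model (the constant value is illustrative):

```julia
using AugmentedGaussianProcesses, KernelFunctions

# Toy data centered around 2
X = rand(100, 2)
y = 2.0 .+ randn(100)

# Constant prior mean at the level of the data, updated during training
model = GP(X, y, SqExponentialKernel(); mean=ConstantMean(2.0))
train!(model; iterations=50)
```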
Index
AugmentedGaussianProcesses.AnalyticVI
AugmentedGaussianProcesses.BayesianSVM
AugmentedGaussianProcesses.ConstantMean
AugmentedGaussianProcesses.EmpiricalMean
AugmentedGaussianProcesses.GP
AugmentedGaussianProcesses.GaussianLikelihood
AugmentedGaussianProcesses.GibbsSampling
AugmentedGaussianProcesses.HeteroscedasticLikelihood
AugmentedGaussianProcesses.LaplaceLikelihood
AugmentedGaussianProcesses.LogisticLikelihood
AugmentedGaussianProcesses.LogisticSoftMaxLikelihood
AugmentedGaussianProcesses.MCIntegrationVI
AugmentedGaussianProcesses.NegBinomialLikelihood
AugmentedGaussianProcesses.PoissonLikelihood
AugmentedGaussianProcesses.QuadratureVI
AugmentedGaussianProcesses.SVGP
AugmentedGaussianProcesses.SoftMaxLikelihood
AugmentedGaussianProcesses.StudentTLikelihood
AugmentedGaussianProcesses.VGP
AugmentedGaussianProcesses.ZeroMean
AugmentedGaussianProcesses.AnalyticSVI
AugmentedGaussianProcesses.MCIntegrationSVI
AugmentedGaussianProcesses.QuadratureSVI
AugmentedGaussianProcesses.predict_f
AugmentedGaussianProcesses.predict_y
AugmentedGaussianProcesses.proba_y
AugmentedGaussianProcesses.train!