API Library
Module
AugmentedGaussianProcesses.AugmentedGaussianProcesses — Module

General framework for data-augmented Gaussian Processes.
Model Types
AugmentedGaussianProcesses.GP — Type

Class for Gaussian Processes models.
GP(X::AbstractArray{T}, y::AbstractArray, kernel::Kernel;
   noise::Real=1e-5, opt_noise::Bool=true, verbose::Int=0,
   optimiser=ADAM(0.01), atfrequency::Int=1,
   mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
   IndependentPriors::Bool=true, ArrayType::UnionAll=Vector)

Argument list:
Mandatory arguments
- `X` : input features; a matrix of size N×D where N is the number of observations and D the number of dimensions
- `y` : input labels; either a vector of labels for multiclass and single-output problems, or a matrix for multi-output problems (note that only one likelihood can be applied)
- `kernel` : covariance function; either a single kernel or a collection of kernels for multiclass and multi-output models
Keyword arguments
- `noise` : initial noise of the model
- `opt_noise` : flag for optimizing the noise σ² = Σ(y-f)²/N
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `ADAM(0.01)`
- `atfrequency` : number of variational parameter iterations between hyperparameter optimizations
- `mean` : PriorMean object, see the documentation on it in the Prior Means section
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
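A minimal usage sketch, assuming toy data and an illustrative kernel choice (`SqExponentialKernel` comes from KernelFunctions.jl; the data, noise level, and iteration count are not part of the API):

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

# Toy regression data: 100 observations in 2 dimensions
X = rand(100, 2)
y = sin.(X[:, 1]) .+ 0.1 .* randn(100)

# Exact GP with optimisable Gaussian noise
model = GP(X, y, SqExponentialKernel(); noise=1e-3, opt_noise=true)
train!(model; iterations=50)

# Latent mean and diagonal variance at new inputs
μ, σ² = predict_f(model, rand(10, 2))
```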
AugmentedGaussianProcesses.VGP — Type

Class for variational Gaussian Processes models (non-sparse).

VGP(X::AbstractArray{T}, y::AbstractVector,
    kernel::Kernel,
    likelihood::LikelihoodType, inference::InferenceType;
    verbose::Int=0, optimiser=ADAM(0.01), atfrequency::Integer=1,
    mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
    IndependentPriors::Bool=true, ArrayType::UnionAll=Vector)

Argument list:
Mandatory arguments
- `X` : input features; a matrix of size N×D where N is the number of observations and D the number of dimensions
- `y` : input labels; either a vector of labels for multiclass and single-output problems, or a matrix for multi-output problems (note that only one likelihood can be applied)
- `kernel` : covariance function; a single kernel from the KernelFunctions.jl package
- `likelihood` : likelihood of the model; currently implemented: Gaussian, Bernoulli (with logistic link), Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model; can be analytic, numerical or by sampling. Check the model documentation to know what is available for your likelihood, see the Compatibility Table
Keyword arguments
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `ADAM(0.01)`
- `atfrequency` : number of variational parameter iterations between hyperparameter optimizations
- `mean` : PriorMean object, see the documentation on it in the Prior Means section
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
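A classification sketch under the same assumptions as above (toy ±1-labelled data, illustrative settings):

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

# Toy binary classification data with labels in {-1, 1}
X = rand(200, 2)
y = sign.(X[:, 1] .- 0.5)

# Non-sparse variational GP: logistic likelihood + analytic (augmented) VI
model = VGP(X, y, SqExponentialKernel(), LogisticLikelihood(), AnalyticVI())
train!(model; iterations=100)

ŷ = predict_y(model, rand(20, 2))  # most probable sign per test point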
AugmentedGaussianProcesses.MCGP — Type

Class for Monte-Carlo Gaussian Processes models (non-sparse), trained via sampling-based inference such as GibbsSampling.

MCGP(X::AbstractArray{T1,N1}, y::AbstractArray{T2,N2}, kernel::Union{Kernel,AbstractVector{<:Kernel}},
     likelihood::LikelihoodType, inference::InferenceType;
     verbose::Int=0, optimiser=ADAM(0.01), atfrequency::Integer=1,
     mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
     IndependentPriors::Bool=true, ArrayType::UnionAll=Vector)

Argument list:
Mandatory arguments
- `X` : input features; a matrix of size N×D where N is the number of observations and D the number of dimensions
- `y` : input labels; either a vector of labels for multiclass and single-output problems, or a matrix for multi-output problems (note that only one likelihood can be applied)
- `kernel` : covariance function; either a single kernel or a collection of kernels for multiclass and multi-output models
- `likelihood` : likelihood of the model; currently implemented: Gaussian, Bernoulli (with logistic link), Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model; can be analytic, numerical or by sampling. Check the model documentation to know what is available for your likelihood, see the Compatibility Table
Keyword arguments
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `ADAM(0.01)`
- `atfrequency` : number of variational parameter iterations between hyperparameter optimizations
- `mean` : PriorMean object, see the documentation on it in the Prior Means section
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
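A sampling sketch with toy data; treating each `train!` iteration as one sampling step is an assumption here, not a documented guarantee:

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

X = rand(100, 2)
y = sign.(X[:, 1] .- 0.5)

# Posterior samples via Gibbs sampling instead of a variational approximation
model = MCGP(X, y, SqExponentialKernel(), LogisticLikelihood(),
             GibbsSampling(nBurnin=100, samplefrequency=1))
train!(model; iterations=500)  # sampling steps
```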
AugmentedGaussianProcesses.SVGP — Type

Class for sparse variational Gaussian Processes.

SVGP(X::AbstractArray{T1}, y::AbstractVector{T2}, kernel::Kernel,
     likelihood::LikelihoodType, inference::InferenceType, nInducingPoints::Int;
     verbose::Int=0, optimiser=ADAM(0.001), atfrequency::Int=1,
     mean::Union{<:Real,AbstractVector{<:Real},PriorMean}=ZeroMean(),
     Zoptimiser=false,
     ArrayType::UnionAll=Vector)

Argument list:
Mandatory arguments
- `X` : input features; a matrix of size N×D where N is the number of observations and D the number of dimensions
- `y` : input labels; either a vector of labels for multiclass and single-output problems, or a matrix for multi-output problems (note that only one likelihood can be applied)
- `kernel` : covariance function; either a single kernel or a collection of kernels for multiclass and multi-output models
- `likelihood` : likelihood of the model; currently implemented: Gaussian, Student-T, Laplace, Bernoulli (with logistic link), Bayesian SVM, Multiclass (softmax or logistic-softmax), see Likelihood Types
- `inference` : inference for the model; can be analytic, numerical or by sampling. Check the model documentation to know what is available for your likelihood, see the Compatibility Table
- `nInducingPoints` : number of inducing points
Keyword arguments
- `verbose` : how much the model prints (0: nothing, 1: very basic, 2: medium, 3: everything)
- `optimiser` : optimiser used for the kernel parameters. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `ADAM(0.001)`
- `atfrequency` : number of variational parameter iterations between hyperparameter optimizations
- `mean` : PriorMean object, see the documentation on it in the Prior Means section
- `IndependentPriors` : flag for setting independent or shared parameters among latent GPs
- `Zoptimiser` : optimiser used for the inducing point locations. Should be an Optimiser object from the Flux.jl library (see the lists above). Default is `false`, i.e. the inducing point locations are not optimised
- `ArrayType` : option for using a different array type for storage (allows GPU usage)
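A sparse sketch; the dataset size, the 100-sample mini-batches, and the 50 inducing points are illustrative assumptions:

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

# A larger toy dataset where sparsity pays off
X = rand(5000, 2)
y = sign.(X[:, 1] .- 0.5)

# 50 inducing points, stochastic updates on mini-batches of 100 samples
model = SVGP(X, y, SqExponentialKernel(), LogisticLikelihood(),
             AnalyticSVI(100), 50)
train!(model; iterations=200)
```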
AugmentedGaussianProcesses.OnlineSVGP — Type

Class for online sparse variational Gaussian Processes.
Likelihood Types
AugmentedGaussianProcesses.GaussianLikelihood — Type

GaussianLikelihood(σ²::T=1e-3)  # σ² is the variance

Gaussian noise:
\[ p(y|f) = \mathcal{N}(y \mid f, \sigma^2)\]
There is no augmentation needed for this likelihood, which is already conjugate to a Gaussian prior.
AugmentedGaussianProcesses.StudentTLikelihood — Type

StudentTLikelihood(ν::T, σ::Real=one(T))

Student-t likelihood for regression:
\[ p(y|f,\nu,\sigma) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\sigma\,\Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{(y-f)^2}{\sigma^2\nu}\right)^{-\frac{\nu+1}{2}}\]
where ν is the number of degrees of freedom and σ is the local scale of the data.
For the analytical solution, it is augmented via:
\[ p(y|f,\omega) = \mathcal{N}(y \mid f, \sigma^2\omega)\]
where $\omega \sim \mathcal{IG}(0.5\nu, 0.5\nu)$ and IG is the inverse-gamma distribution. See the paper Robust Gaussian Process Regression with a Student-t Likelihood.
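For instance, a robust-regression sketch with toy outliers (data, ν = 3, and iteration count are illustrative assumptions); the augmentation above is what makes AnalyticVI applicable here:

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

X = rand(100, 1)
y = sin.(2π .* X[:, 1]) .+ 0.05 .* randn(100)
y[1:5] .+= 3.0  # a few outliers

# Heavy tails (ν = 3) make the fit robust to the outliers
model = VGP(X, y, SqExponentialKernel(), StudentTLikelihood(3.0), AnalyticVI())
train!(model; iterations=100)
```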
AugmentedGaussianProcesses.LaplaceLikelihood — Type

LaplaceLikelihood(β::T=1.0)  # Laplace likelihood with scale β

Laplace likelihood for regression:
\[ p(y|f) = \frac{1}{2\beta}\exp\left(-\frac{|y-f|}{\beta}\right)\]
see wiki page
For the analytical solution, it is augmented via:
\[ p(y|f,\omega) = \mathcal{N}(y \mid f, \omega^{-1})\]
where $\omega \sim \text{Exp}\left(\omega \mid \tfrac{1}{2\beta^2}\right)$ and Exp is the exponential distribution. We use the variational distribution $q(\omega) = \mathcal{GIG}(\omega \mid a, b, p)$, where GIG is the generalized inverse Gaussian distribution.
AugmentedGaussianProcesses.LogisticLikelihood — Type

LogisticLikelihood()

Bernoulli likelihood with a logistic link:
\[ p(y|f) = \sigma(yf) = \frac{1}{1+\exp(-yf)},\]
(for more info, see the wiki page)
For the analytic version of the likelihood, it is augmented via:
\[ p(y|f,\omega) = \exp\left(\tfrac{1}{2}\left(yf - (yf)^2\omega\right)\right)\]
where $\omega \sim \text{PG}(\omega \mid 1, 0)$ and PG is the Pólya-Gamma distribution. See the paper: Efficient Gaussian Process Classification Using Polya-Gamma Data Augmentation.
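The practical consequence, as a sketch (the toy data is an assumption): with the augmentation the logistic model admits closed-form updates via AnalyticVI, whereas the unaugmented model needs numerical inference such as QuadratureVI:

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

X, y = rand(200, 2), rand([-1, 1], 200)  # any ±1-labelled data

# Augmented: closed-form coordinate-ascent updates
m_analytic = VGP(X, y, SqExponentialKernel(), LogisticLikelihood(), AnalyticVI())

# Unaugmented alternative: gradients estimated by Gauss-Hermite quadrature
m_quad = VGP(X, y, SqExponentialKernel(), LogisticLikelihood(), QuadratureVI())
```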
AugmentedGaussianProcesses.HeteroscedasticLikelihood — Type

HeteroscedasticLikelihood(λ::T=1.0)

Gaussian likelihood with heteroscedastic noise given by another GP:
\[ p(y|f,g) = \mathcal{N}\left(y \mid f, (\lambda\,\sigma(g))^{-1}\right)\]
where σ is the logistic function.
Augmentation will be described in a future paper
AugmentedGaussianProcesses.BayesianSVM — Type

BayesianSVM()

The Bayesian SVM is a Bayesian interpretation of the classical SVM.
\[ p(y|f) \propto \exp\left(-2\max(1-yf, 0)\right)\]
For the analytic version of the likelihood, it is augmented via:
\[ p(y|f,\omega) = \frac{1}{\sqrt{2\pi\omega}}\exp\left(-\frac{(1+\omega-yf)^2}{2\omega}\right)\]
where $\omega \sim \mathbb{1}_{[0,\infty)}$ has an improper flat prior (its posterior is nonetheless a valid distribution, a Generalized Inverse Gaussian). For reference see this paper.
AugmentedGaussianProcesses.SoftMaxLikelihood — Type

SoftMaxLikelihood()

Multiclass likelihood with softmax transformation:
\[ p(y=i \mid \{f_k\}) = \frac{\exp(f_i)}{\sum_k \exp(f_k)}\]
There is no possible augmentation for this likelihood.
AugmentedGaussianProcesses.LogisticSoftMaxLikelihood — Type

LogisticSoftMaxLikelihood(num_class)

The multiclass likelihood with a logistic-softmax mapping:
\[ p(y=i \mid \{f_k\}_{k=1}^K) = \frac{\sigma(f_i)}{\sum_{k=1}^K \sigma(f_k)}\]
where σ is the logistic function. This likelihood has the same properties as the softmax.
For the analytical version, the likelihood is augmented multiple times. More details can be found in the paper Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation
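A multiclass sketch (toy integer labels and a three-class problem are assumptions); thanks to the augmentation, AnalyticVI applies here too:

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

X = rand(300, 2)
y = rand(1:3, 300)  # three classes

model = VGP(X, y, SqExponentialKernel(), LogisticSoftMaxLikelihood(3), AnalyticVI())
train!(model; iterations=100)
predict_y(model, rand(10, 2))  # most likely class per test point
```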
AugmentedGaussianProcesses.PoissonLikelihood — Type

PoissonLikelihood(λ=1.0)

Poisson likelihood, where a Poisson distribution is defined at every point in space (careful, this is different from continuous Poisson processes):
\[ p(y|f) = \text{Poisson}(y \mid \lambda\,\sigma(f))\]
where σ is the logistic function. Augmentation details will be released at some point (open an issue if you want to see them).
AugmentedGaussianProcesses.NegBinomialLikelihood — Type

NegBinomialLikelihood(r::Real=10)

Negative binomial likelihood with number of failures r:
\[ p(y|r,f) = \binom{y+r-1}{y}\,(1-\sigma(f))^r\,\sigma(f)^y = \frac{\Gamma(y+r)}{\Gamma(y+1)\,\Gamma(r)}\,(1-\sigma(f))^r\,\sigma(f)^y\]
where σ is the logistic function.
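A count-data sketch (toy counts and r = 10 are illustrative; that this likelihood pairs with AnalyticVI is an assumption to be checked against the Compatibility Table):

```julia
using AugmentedGaussianProcesses
using KernelFunctions: SqExponentialKernel

X = rand(150, 1)
y = rand(0:10, 150)  # toy event counts

model = VGP(X, y, SqExponentialKernel(), NegBinomialLikelihood(10), AnalyticVI())
train!(model; iterations=100)
predict_y(model, rand(5, 1))  # expected number of events per test point
```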
Inference Types
AugmentedGaussianProcesses.AnalyticVI — Type

AnalyticVI(; ϵ::T=1e-5)

Variational inference solver for conjugate or conditionally conjugate likelihoods (non-Gaussian likelihoods are made conjugate via augmentation). All data is used at each iteration (use AnalyticSVI for stochastic updates).

Keyword arguments

- `ϵ::T` : convergence criteria
AugmentedGaussianProcesses.AnalyticSVI — Function

AnalyticSVI(nMinibatch::Integer; ϵ::T=1e-5, optimiser=RobbinsMonro())

Stochastic variational inference solver for conjugate or conditionally conjugate likelihoods (non-Gaussian likelihoods are made conjugate via augmentation).

Argument

- `nMinibatch::Integer` : Number of samples per mini-batch
Keyword arguments
- `ϵ::T` : convergence criteria
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `RobbinsMonro()` (ρ = (τ+iter)^(-κ))

AugmentedGaussianProcesses.GibbsSampling — Type

GibbsSampling(; ϵ::T=1e-5, nBurnin::Int=100, samplefrequency::Int=1)

Draw samples from the true posterior via Gibbs sampling.
Keyword arguments

- `ϵ::T` : convergence criteria
- `nBurnin::Int` : Number of samples discarded before starting to save samples
- `samplefrequency::Int` : Frequency of sampling
AugmentedGaussianProcesses.QuadratureVI — Type

QuadratureVI(ϵ::T=1e-5, nGaussHermite::Integer=20, optimiser=Momentum(0.0001))

Variational inference solver that approximates the gradients via numerical integration (Gauss-Hermite quadrature).
Keyword arguments
- `ϵ::T` : convergence criteria
- `nGaussHermite::Int` : Number of points for the integral estimation
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.0001)`

AugmentedGaussianProcesses.QuadratureSVI — Function

QuadratureSVI(nMinibatch::Integer; ϵ::T=1e-5, nGaussHermite::Integer=20, optimiser=Momentum(0.0001))

Stochastic variational inference solver that approximates the gradients via numerical integration (Gauss-Hermite quadrature).
Argument

- `nMinibatch::Integer` : Number of samples per mini-batch

Keyword arguments
- `ϵ::T` : convergence criteria, which can be user defined
- `nGaussHermite::Int` : Number of points for the integral estimation (for the QuadratureVI)
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.0001)`

AugmentedGaussianProcesses.MCIntegrationVI — Type

MCIntegrationVI(; ϵ::T=1e-5, nMC::Integer=1000, optimiser=Momentum(0.001))

Variational inference solver that approximates the gradients via Monte Carlo integration.
Keyword arguments
- `ϵ::T` : convergence criteria, which can be user defined
- `nMC::Int` : Number of samples per data point for the integral evaluation
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.001)`

AugmentedGaussianProcesses.MCIntegrationSVI — Function

MCIntegrationSVI(nMinibatch::Integer; ϵ::T=1e-5, nMC::Integer=1000, optimiser=Momentum(0.0001))

Stochastic variational inference solver that approximates the gradients via Monte Carlo integration.
Argument
- `nMinibatch::Integer` : Number of samples per mini-batch

Keyword arguments
- `ϵ::T` : convergence criteria, which can be user defined
- `nMC::Int` : Number of samples per data point for the integral evaluation
- `natural::Bool` : Use natural gradients
- `optimiser` : Optimiser used for the variational updates. Should be an Optimiser object from the [Flux.jl](https://github.com/FluxML/Flux.jl) library, see list here [Optimisers](https://fluxml.ai/Flux.jl/stable/training/optimisers/) and on [this list](https://github.com/theogf/AugmentedGaussianProcesses.jl/tree/master/src/inference/optimisers.jl). Default is `Momentum(0.0001)`

Functions and methods
AugmentedGaussianProcesses.train! — Function

train!(model::AbstractGP; iterations::Integer=100, callback=0, convergence=0)

Function to train the given GP model.

Keyword arguments

- `iterations::Int` : Number of iterations (not necessarily epochs!) for training
- `callback::Function` : Callback function called at every iteration. Should be of the form `function(model, iter) ... end`
- `convergence::Function` : Convergence function called at every iteration; should return a scalar and take the same arguments as `callback`
train!(model::AbstractGP, X::AbstractMatrix, y::AbstractVector; obsdim=1, iterations::Int=10, callback=nothing, conv=0)
train!(model::AbstractGP, X::AbstractVector, y::AbstractVector; iterations::Int=20, callback=nothing, conv=0)

Function to train the given GP model.

Keyword arguments

- `iterations::Int` : Number of iterations (not necessarily epochs!) for training
- `callback::Function` : Callback function called at every iteration. Should be of the form `function(model, iter) ... end`
- `conv::Function` : Convergence function called at every iteration; should return a scalar and take the same arguments as `callback`
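For example, a sketch of a logging callback (`model` is any model built as in the earlier examples; the helper name `log_progress` is ours):

```julia
# Called at every iteration as callback(model, iter)
log_progress(model, iter) = iter % 10 == 0 && println("iteration $iter")

train!(model; iterations=100, callback=log_progress)
```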
AugmentedGaussianProcesses.predict_f — Function

predict_f(m::AbstractGP, X_test, cov::Bool=true, diag::Bool=true)

Compute the mean of the predicted latent distribution of f on X_test for the variational GP model.
Also returns the diagonal variance if cov=true, and the full covariance if diag=false.
AugmentedGaussianProcesses.predict_y — Function

predict_y(model::AbstractGP, X_test::AbstractVector)
predict_y(model::AbstractGP, X_test::AbstractMatrix; obsdim=1)

Return
- the predictive mean of X_test for regression
- the sign of X_test for classification
- the most likely class for multi-class classification
- the expected number of events for an event likelihood
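Putting both prediction functions together, as a sketch (`model` and the test inputs are assumed to come from the earlier examples):

```julia
X_test = rand(10, 2)

# Latent GP: predictive mean and diagonal variance of f
μ_f, σ²_f = predict_f(model, X_test)

# Output space: mean / sign / class / expected counts, depending on the likelihood
ŷ = predict_y(model, X_test)
```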
Prior Means
AugmentedGaussianProcesses.ZeroMean — Type

ZeroMean()

Construct a prior mean set to 0, which cannot be updated.
AugmentedGaussianProcesses.ConstantMean — Type

ConstantMean(c::Real=1.0; opt=ADAM(0.01))

Construct a prior mean with constant c. Optionally set an optimiser opt (ADAM(0.01) by default).
AugmentedGaussianProcesses.EmpiricalMean — Type

EmpiricalMean(c::AbstractVector{<:Real}; opt=ADAM(0.01))

Construct an empirical mean with values c. Optionally give an optimiser opt (ADAM(0.01) by default).
Index
- AugmentedGaussianProcesses.AnalyticVI
- AugmentedGaussianProcesses.BayesianSVM
- AugmentedGaussianProcesses.ConstantMean
- AugmentedGaussianProcesses.EmpiricalMean
- AugmentedGaussianProcesses.GP
- AugmentedGaussianProcesses.GaussianLikelihood
- AugmentedGaussianProcesses.GibbsSampling
- AugmentedGaussianProcesses.HeteroscedasticLikelihood
- AugmentedGaussianProcesses.LaplaceLikelihood
- AugmentedGaussianProcesses.LogisticLikelihood
- AugmentedGaussianProcesses.LogisticSoftMaxLikelihood
- AugmentedGaussianProcesses.MCGP
- AugmentedGaussianProcesses.MCIntegrationVI
- AugmentedGaussianProcesses.NegBinomialLikelihood
- AugmentedGaussianProcesses.OnlineSVGP
- AugmentedGaussianProcesses.PoissonLikelihood
- AugmentedGaussianProcesses.QuadratureVI
- AugmentedGaussianProcesses.SVGP
- AugmentedGaussianProcesses.SoftMaxLikelihood
- AugmentedGaussianProcesses.StudentTLikelihood
- AugmentedGaussianProcesses.VGP
- AugmentedGaussianProcesses.ZeroMean
- AugmentedGaussianProcesses.AnalyticSVI
- AugmentedGaussianProcesses.MCIntegrationSVI
- AugmentedGaussianProcesses.QuadratureSVI
- AugmentedGaussianProcesses.predict_f
- AugmentedGaussianProcesses.predict_y
- AugmentedGaussianProcesses.train!