
CategoricalLikelihood compatibility with LatentGP #404

Closed · ancorso opened this issue Jul 15, 2024 · 5 comments · Fixed by JuliaGaussianProcesses/AugmentedGPLikelihoods.jl#129


ancorso commented Jul 15, 2024

I would like to model a multi-class dataset using LatentGPs and the CategoricalLikelihood (from GPLikelihoods.jl). The CategoricalLikelihood requires multiple latent GPs and expects their output to be an AbstractVector{<:AbstractVector{<:Real}}. However, the multi-output GP design concatenates all outputs into one long vector, which is not the structure a LatentGP with a CategoricalLikelihood needs. Below is an example:

```julia
using AbstractGPs, KernelFunctions, GPLikelihoods

gpm = GP(IndependentMOKernel(Matern52Kernel()))
x = rand(100)
X = MOInput(x, 3)  # isotopic multi-output inputs: 3 outputs per point

rand(gpm(X)) # produces a 300-element vector

cgp = LatentGP(gpm, CategoricalLikelihood(), 1e-3)
cgpx = cgp(X)

res = rand(cgpx) # produces 300 values for `f` and a single sample for `y`,
                 # because the softmax is applied over the full vector
```
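
For reference, the structure I would expect the likelihood to consume is one latent vector per input point. Here is a hand-rolled sketch of that reshaping (assuming `MOInput(x, 3)` orders the flattened sample output-by-output, i.e. all 100 values for output 1 first; `F`, `fs`, and `ys` are just names I'm making up):

```julia
f = rand(gpm(X))                        # 300-element concatenated sample
F = reshape(f, length(x), 3)            # 100×3: one column per latent GP
fs = [collect(r) for r in eachrow(F)]   # 100 inner vectors of 3 latents each
lik = CategoricalLikelihood()
ys = map(rand ∘ lik, fs)                # one categorical sample per input point
```

(Whether each inner vector should have K or K-1 entries presumably depends on the link the CategoricalLikelihood is constructed with, so the number of latent GPs would need to match that convention.)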

Let me know if this is the wrong way of handling categorical likelihoods or if there is a recommendation on how to get this working out of the box. Happy to work on a PR.


theogf commented Jul 15, 2024

Hey! Yes, the interface is far from ideal right now. You can find a multi-class example (using a different inference approach) here: https://juliagaussianprocesses.github.io/AugmentedGPLikelihoods.jl/dev/examples/categorical/


ancorso commented Jul 15, 2024

Thanks for the quick reply! I did see that example, but I ran into some issues with the `aug_elbo` calculation (which is ultimately what I am after here).

If I understand correctly, your representation for the mean and variance of the distribution over inducing points is an ArraysOfArrays object (that's why the CAVI algorithm broadcasts the line `posts_u = u_posterior.(Ref(fz), ms, Ss)`). However, the current call to `aug_elbo` (`aug_elbo(lik, u_posterior(fz, m, S), x, y)`, which isn't in a code block, by the way) neither includes this broadcasting nor internally handles the resulting vector of posteriors if you do broadcast. Could you clarify how that example is supposed to work?
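
Concretely, here is the mismatch as I read it (a sketch; `fz`, `ms`, `Ss`, and `u_posterior` are the names from the example script, and the types are my guesses):

```julia
# Broadcasting over the per-class means/covariances gives a vector of
# posteriors, one per latent GP:
posts_u = u_posterior.(Ref(fz), ms, Ss)  # one posterior per class

# ...whereas the call shown in the docs constructs a single posterior,
# so it's unclear how `aug_elbo` is meant to see all the latent processes:
# aug_elbo(lik, u_posterior(fz, m, S), x, y)
```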

Thanks so much!


ancorso commented Jul 15, 2024

It seems like the output of this line: https://github.com/JuliaGaussianProcesses/AugmentedGPLikelihoods.jl/blob/41336971ee8882a147e996cf4e791831422da393/examples/categorical/script.jl#L138
should perhaps be transformed into an AbstractVector{<:AbstractVector{<:Normal}}?
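
Something like the following transposition, maybe (a sketch; `posts_u` is the broadcast vector of per-class posteriors from above, and `qf_per_class`/`qf_per_point` are names I'm inventing):

```julia
# One length-N vector of Normal marginals per class...
qf_per_class = [marginals(post(x)) for post in posts_u]
# ...transposed into one length-K vector of Normals per data point:
qf_per_point = [getindex.(qf_per_class, i) for i in eachindex(x)]
```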


theogf commented Jul 15, 2024

That's a good point! I will try to fix the script in the repo directly!


ancorso commented Jul 16, 2024

> That's a good point! I will try to fix the script in the repo directly!

That would be a big help, thank you!
