Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log_lik fails with matrices in models fit by ulam #432

Open
jebyrnes opened this issue Apr 16, 2024 · 6 comments
Open

log_lik fails with matrices in models fit by ulam #432

jebyrnes opened this issue Apr 16, 2024 · 6 comments

Comments

@jebyrnes
Copy link

I was trying to compare the no correlation, brownian, and OU models in the phlogenetic section, but, ran into issues with the log_lik argument

library(rethinking)
library(dplyr)

data(Primates301)

dstan <- Primates301 |>
  mutate(name = as.character(name)) |>
  filter(!is.na(group_size),
         !is.na(body),
         !is.na(brain)
  )

# for matrices
spp_obs <- dstan$name

# A list of data
dat_list <- list(
  N_spp = nrow(dstan),
  M = standardize(log(dstan$body)),
  B = standardize(log(dstan$brain)),
  G = standardize(log(dstan$group_size)),
  Imat = diag(nrow(dstan)) )

no_phylo <- ulam(
  alist(
    #likelihood
    B ~ multi_normal( mu , SIGMA ),
    
    # DGP
    mu <- a + bM*M + bG*G,
    
    # Autocorr
    matrix[N_spp,N_spp]: SIGMA <- Imat * sigma_sq,
    
    #priors
    a ~ normal( 0 , 1 ),
    c(bM,bG) ~ normal( 0 , 0.5 ),
    sigma_sq ~ exponential( 1 )
  ), data=dat_list , chains=4 , cores=4,
  log_lik = TRUE)

Yields

Compiling Stan program...
Semantic error in '/var/folders/qs/k7z4lcnx7rdg5hbhp2jr7rs00000gn/T/Rtmpr1OeQW/model-f57a72635ad9.stan', line 35, column 4 to column 11:
   -------------------------------------------------
    33:          mu[i] = a + bM * M[i] + bG * G[i];
    34:      }
    35:      log_lik = multi_normal_lpdf( B | mu , SIGMA );
             ^
    36:  }
    37:  
   -------------------------------------------------

Ill-typed arguments supplied to assignment operator =:
The left hand side has type
  vector
and the right hand side has type
  real

make: *** [/var/folders/qs/k7z4lcnx7rdg5hbhp2jr7rs00000gn/T/Rtmpr1OeQW/model-f57a72635ad9.hpp] Error 1

Error: An error occured during compilation! See the message above for more information.

And, correct me if I'm wrong, but, we need log_lik = TRUE. Otherwise, we get issues like

# from model fit where log_lik = FALSE
WAIC(no_phylo)

Error in rep(0, n_obs) : invalid 'times' argument

@rmcelreath
Copy link
Owner

This is tricky, because WAIC will need the multi_normal likelihood decomposed, not summed up as Stan does it. So the code in the gq will need to be completely different than the code in the model block. I don't believe there is a way to get multi_normal_lpdf to return a vector, which is what we need.

@jebyrnes
Copy link
Author

I'm always here to find fun bugs for you!

Alternately, if lppd could calculate the Log Likelihood matrix directly from a model (extract data and likelihood function) that could solve it.... but, I can see from playing with that, it's a PITA. But, a thought.

@jebyrnes
Copy link
Author

Ah, yes, I see even using sim() on the above with log_lik = FALSE fails. Interesting.

Error in Imat * sigma_sq : non-conformable arrays

@jebyrnes
Copy link
Author

BTW: you get a much more workable model if you move where you insert the covariance matrix and make it more similar to the GP models earlier in the chapter. Then again, I don't love using variance again - but, the coefs are the same otherwise.

AND now you can get simulated values and compare models. Haven't gone much further, but, perhaps worth looking at?

dat_list <- list(
  N_spp = nrow(dstan),
  sp_id = 1:nrow(dstan),
  M = standardize(log(dstan$body)),
  B = standardize(log(dstan$brain)),
  G = standardize(log(dstan$group_size)),
  Imat = diag(nrow(dstan)) )

no_phylo_withlik <- ulam(
  alist(
    #likelihood
    B ~ dnorm( mu , sigma ),
    
    # DGP
    mu <- a[sp_id] + bM*M + bG*G,
    
    # Autocorr
    vector[N_spp]:a ~ multi_normal(0, K),
    matrix[N_spp,N_spp]: K <- Imat * sigma_sq,
    
    #priors
    a ~ normal( 0 , 1 ),
    c(bM,bG) ~ normal( 0 , 0.5 ),
    sigma_sq ~ exponential( 1 ),
    sigma ~ exponential (1)
  ), data=dat_list , chains=4 , cores=4, log_lik = TRUE)

@rmcelreath
Copy link
Owner

Yeah that form is great for computation and extension to GLMs. But less transparent for teaching, because it has an extra distribution. Tradeoffs.

@jebyrnes
Copy link
Author

I mean, yes and no. It's the same form as the GP stuff above (i.e., like a random intercept), so, if anything, I think there is some good scaffolding there. Just have to explain the extra distribution. Potato Potahto. I'll try it with my students today and see if it works!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants