Processing math: 100%

Exploring priors

Michael A. Spence, James A. Martindale and Michael J. Thomson

2023-04-24

Introduction

In this document we demonstrate the prior distributions and how to we chose the default priors, based on the example of four species in the North Sea.

Truth

Random walk parameter

The truth follows a multivariate random walk with covariance parameter Λy, ytN(yt1,Λy). In EcoEnsemble, the prior distribution for this parameter is the inverse Wishart distribution. The inverse Wishart distribution is descried by two parameters, the scale matrix Ψ and the degrees of freedom ν. we were unsure of correlations of the four species and therefore set the off diagonal elements of Ψ to zero.

To get an idea of the diagonal elements were examined the marginal distribution of the diagonal elements of Λy. To do this we explored a univariate random walk, yt,iN(yt1,i,λ2). If y0,i=0, then the distribution after T times is yT,iN(0,Tλ2). The example in EcoEnsemble runs for 33 years without any observations from single-species models, and we can investigate what we believe the change in biomass is in 2050 relative to 2017, without any other information. To do this we will look sample from an inverse Wishart distribution, using rstan, with varying degrees of freedom.

inv_wish_st_mod <- stan_model(model_code="data {
  int<lower=0> N;
  real nu;
  matrix[N,N] mat_cov;
}
parameters {
  cov_matrix[N] Sigma;
}
model {
  Sigma ~ inv_wishart(nu, mat_cov);
}
")

The lower the degrees of freedom, the less informative the prior. However, this comes at the expense of computational efficiency and ν>d1 is required. We therefore want ν to be as large as possible without excluding any plausible values. We look at 8 degrees of freedom:

fit.iw <- sampling(inv_wish_st_mod,data=list(N=4,nu=2*4,mat_cov=diag(4)))

We look at the distribution of the first diagonal element

ex.fit.iw <- extract(fit.iw)
plot(density(ex.fit.iw$Sigma[,1,1]),main="",xlab=expression(lambda),xlim=c(0,2))

which in 2050 leads to

sd.2050 <- sqrt(33*ex.fit.iw$Sigma[,1,1])
plot(density(sqrt(33*ex.fit.iw$Sigma[,1,1])),main="",
     xlab="standard deviation in 2050",xlim=c(0,5))

For example, this means that the SSB of cod in 2050, assuming in 2017 it is exactly the value from the assessment (ICES (2020)), will be

x <- seq(5,20,length.out=1000)
plot(x,colMeans(sapply(x, function(xs){dnorm(xs,tail(SSB_obs$Cod,n=1),sd.2050)})),
     type="l",axes=F,ylab="density",xlab="Cod in 2050 (1000 tonnes)")
labsx <- log(signif(exp(seq(5,20,length.out=7)),1))
axis(1,at=labsx,labels=c(exp(labsx)/1000))
axis(2)
abline(v=tail(SSB_obs$Cod,n=1))
abline(v=range(SSB_obs$Cod),lty=2)
box()

The solid vertical line is the assessment’s SSB in 2017, with the dashed lines being the maximum and minimum SSB values from 1984 until 2017. Therefore we used the fairly uninformative prior λyinv-Wishart(I,8).

Initial distribution of SSB

We also require a prior on the distribution of SSB in 1983, that is y0N(0,1000).

Individual discrepancies

Long-term Individual discrepancy

The long-term individual discrepancy for the kth model is γkN(0,Cγ) with Cγ=ΠγPγΠγ, where Pγ is a correlation matrix and Πγ=diag(πγ,1,,πγ,4). We put a uniform distribution on all correlation matrices, PγLKJ(1), and π2γ,igamma(1,1). This prior gives more weight around zero but has an expectation of 1.

xs <- seq(0,5,0.001)
plot(xs,dgamma(xs,1,1),xlab=expression(pi[gamma,i]^2),ylab="density",type="l")

The difference between two simulators, γk,iγl,iN(0,2π2γ,i), for kl. The standard deviation of the difference of two models is

pis <- rgamma(1e5,1,1)
plot(density(sqrt(2 * pis)),xlab="Standard deviation",ylab="Density",main="")

and the distribution of the absolute difference is

xs <- seq(0,5,length.out=1000)
plot(xs,2*colMeans(sapply(xs, function(x){dnorm(x,0,sqrt(2*pis))})),
     type="l",axes=T,ylab="density",xlab="Difference of two models")

We do not expect the models to differ by much than one order of magnitude and therefore we used this prior.

Short-term Individual discrepancy

The short-term individual discrepancy for the kth model follows an auto-regressive process, z(t)kN(Rkz(t1)k,Λk) with Rk=diag(rk,1,,rk,4) Λk=ΠkPkΠk, where Pk is a correlation matrix and Πk=diag(πk,1,,πk,4).

The distribution was f(Pk|α,β){N1i=1Nj=i+11B(αρ,i,j,βρ,i,j)s(Pk,i,j)αρ,i,j1(1s(Pk,i,j))βρ,i,j1if P is positive definite0otherwise. with s(ρ)=ρ+12, and hyperpriors αρ,i,jgamma(0.1,0.1) and βρ,i,jgamma(0.1,0.1). Sampling from this prior

cor_pri_st <- stan_model(model_code = " functions{
  real priors_cor_hierarchical_beta(matrix ind_st_cor, int N, matrix M){
    real log_prior = 0;
    for (i in 1:(N-1)){
      for (j in (i+1):N){
        log_prior += beta_lpdf(0.5*(ind_st_cor[i, j] + 1)| M[i, j], M[j, i] );
      }
    }
    return log_prior;
  }
}  
data {

  vector[4] cor_p;
}

parameters {
  matrix <lower=0>[5,5] beta_pars;
  corr_matrix[5] rho[4];
}
model {
  
  for (i in 1:4){
    for (j in (i+1):5){
      beta_pars[i,j] ~ gamma(cor_p[1],cor_p[2]);
      beta_pars[j,i] ~ gamma(cor_p[3],cor_p[4]);
    }
  }
  for(i in 1:4){
    target += priors_cor_hierarchical_beta(rho[i],5,beta_pars);
    diagonal(beta_pars) ~ std_normal();
  }
}
")
fit_cor <- sampling(cor_pri_st,data=list(cor_p=0.1 * c(1,1,1,1)),chains=4)
ex.fit <- extract(fit_cor)
def.par <- par(no.readonly=TRUE) #Old pars to reset afterwards
par(mfrow=c(2,2))
plot(density(ex.fit$beta_pars[,1,2]),xlab=expression(alpha[rho]),main="")
plot(density(log(ex.fit$beta_pars[,1,2]/ex.fit$beta_pars[,2,1])),main="",
     xlab="Expected log odds")
plot(density(ex.fit$rho[,1,1,2]),xlab=expression(rho),main="")
xs <- seq(-1,1,length.out=100)
lines(xs,dbeta((xs+1)/2,2,2)/2,col="red")
plot(density(apply(ex.fit$rho[,,1,2],1,var)),xlim=c(0,0.35),main="",
     xlab=expression(var(rho)))
par(def.par)

The above plot shows the different aspects of the emergent distribution of this prior. The top left plot shows the marginal distribution of parameter of the beta distribution with the top right shows the log odds of the expectation of the beta distribution. The bottom left plot shows the marginal distribution of ρk,i,i, with the red curve being the distribution that gives a uniform distribution over all correlation matrices. We gave slightly more weight to correlation matrices that are less correlated. The bottom right plot shows the variance of ρk,i,i over k. The variance of a uniform distribution between -1 and 1 is 1/3 and the variance of the marginal distribution of uniform correlation parameters is 1/5. We allowed the prior to be fairly uninformative over this space, as we were unsure how ρk,i,i and ρl,i,i related to one another, however, we do believe they may be similar and therefore we increased our beliefs around 0.

The ith diagonal element of Πk, πk,i, was ln(π2k,i)N(μπ,i,σ2π,i), with zeros on the off diagonals. The hyperpriors were μπ,iN(3,12) and σ2π,iinv-gamma(8,4). Sampling from this leads to

st_mod1 <- stan_model(model_code = "data {
  vector[4] gam_p;
}
parameters {
  vector [5] log_sha_st_var[4];
  vector[5] gamma_mean;
  vector<lower=0> [5] gamma_var;
}
model {
  gamma_mean ~ normal(gam_p[1],gam_p[2]);
  gamma_var ~ inv_gamma(gam_p[3],gam_p[4]);
  for (i in 1:4){
    log_sha_st_var[i] ~ normal(gamma_mean,sqrt(gamma_var));
    
  }
}
generated quantities{
  vector[5] sha_st_var[4];
  for (i in 1:4){
    sha_st_var[i] = exp(log_sha_st_var[i]);
  }
}
")
test.fit_norm <- sampling(st_mod1,data=list(gam_p=c(-3,1,8,4)),chains=4)
ex.fit <- extract(test.fit_norm)
def.par <- par(no.readonly=TRUE) #old pars
par(mfrow=c(2,2))
plot(density(ex.fit$gamma_mean[,1]),main="",xlab=expression(mu[pi]))
plot(density(ex.fit$gamma_var[,1]),xlim=c(0,1),main="",xlab=expression(sigma[pi]^2))
plot(density(ex.fit$sha_st_var[,1,1]),xlim=c(0,1),main="",xlab=expression(pi["k,i"]^2))
plot(density(apply(ex.fit$sha_st_var[,,1],1,var)),xlim=c(0,0.2),
     xlab=expression(var(pi["k,i"]^2)) ,main="")
par(def.par)

The top left plot shows the marginal distribution of μπ,i and the top right plot being σ2π,i. The bottom left plot shows the prior distribution of pi2k,i. For the same reason that describes the variance of the random walk on the truth, we felt that σ2π,i should not be larger than 0.2. For a similar reason to the correlation parameters, the variance of σ2π,i will also be small.

For the auto regressive parameter, rk,i, we set a boundary avoiding prior rk,i+12beta(2,2).

hist(rbeta(1e5,2,2)*2 - 1,probability = T,xlab=expression(r["k,i"]),main="")

Shared discrepancies

Long-term shared discrepancy

We set the prior on the long-term shared discrepancy to be δN(0,52I). This is a fairly vague prior as we did not expect the simulators to differ by the truth by more than 5 degrees of magnitude.

Short-term shared discrepancy

The shared short-term discrepancy for the kth model follows an auto-regressive process, η(t)N(Rηη(t1),Λη) with Rη=diag(rη,1,,rη,4) Λk=ΠkPkΠk, where Pη is a correlation matrix and Πη=diag(πη,1,,πη,4).

We put a uniform prior on all correlation matrices, PηLKJ(1), and, for similar reasons for the variance of the random walk on the truth we set π2η,igamma(1,10), which has an expectation of 0.1, and is

plot(density(rgamma(1e5,1,10)),xlim=c(0,0.5), main="",xlab=expression(pi[eta]^2))

and the standard deviation of

plot(density(sqrt(rgamma(1e5,1,10))),xlim=c(0,0.5), main="",xlab=expression(pi[eta]))

For the auto regressive parameter, rk,i, we set a boundary avoiding prior rη,i+12beta(2,2) or

hist(rbeta(1e5,2,2)*2 - 1,probability = T,xlab=expression(r["k,i"]),main="")

Prior predictive distribution

The prior predictive distribution is then

priors <- EnsemblePrior(4,
                        ind_st_params =IndSTPrior("hierarchical",list(-3,1,8,4),
                                                  list(0.1,0.1,0.1,0.1),AR_params=c(2,2)),
                        ind_lt_params = IndLTPrior("lkj",list(1,1),1),
                        sha_st_params = ShaSTPrior("lkj",list(1,10),1,AR_params=c(2,2)),
                        sha_lt_params = 5
)
prior_density <- prior_ensemble_model(priors, M = 4)
samples <- sample_prior(observations = list(SSB_obs, Sigma_obs),
             simulators = list(list(SSB_ewe, Sigma_ewe,"EwE"),
                   list(SSB_fs ,  Sigma_fs,"FishSums"),
                   list(SSB_lm ,  Sigma_lm,"LeMans"),
                   list(SSB_miz, Sigma_miz,"mizer")),
             priors=priors,
             sam_priors = prior_density)
plot(samples)

References

ICES. 2020. “Report of the Working Group on the Assessment of Demersal Stocks in the North Sea and Skagerrak.” ICES Scientific Reports. 2:61. 1353, ICES, Copenhagen. https://doi.org/10.17895/ices.pub.6092.