This vignette demonstrates how to access most of data stored in a stanfit object. A stanfit object (an object of class "stanfit"
) contains the output derived from fitting a Stan model using Markov chain Monte Carlo or one of Stan’s variational approximations (meanfield or full-rank). Throughout the document we’ll use the stanfit object obtained from fitting the Eight Schools example model:
library(rstan)
fit <- stan_demo("eight_schools", refresh = 0)
Warning: There were 2 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Warning: Examine the pairs() plot to diagnose sampling problems
class(fit)
[1] "stanfit"
attr(,"package")
[1] "rstan"
There are several functions that can be used to access the draws from the posterior distribution stored in a stanfit object. These are extract
, as.matrix
, as.data.frame
, and as.array
, each of which returns the draws in a different format.
The extract
function (with its default arguments) returns a list with named components corresponding to the model parameters.
list_of_draws <- extract(fit)
print(names(list_of_draws))
[1] "mu" "tau" "eta" "theta" "lp__"
In this model the parameters mu
and tau
are scalars and theta
is a vector with eight elements. This means that the draws for mu
and tau
will be vectors (with length equal to the number of post-warmup iterations times the number of chains) and the draws for theta
will be a matrix, with each column corresponding to one of the eight components:
head(list_of_draws$mu)
[1] 8.2844122 10.4875959 14.5663896 16.9862816 6.1280491 0.6397778
head(list_of_draws$tau)
[1] 3.7212721 7.9431579 7.4726591 2.1506500 0.8270646 23.9183048
head(list_of_draws$theta)
iterations [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 5.65264 8.405672 8.244079 4.858207 1.780556 5.4690908
[2,] 26.10733 13.100819 7.789508 3.938870 7.037026 0.6458839
[3,] 19.61089 16.904865 10.481563 14.742122 11.756919 9.8503130
[4,] 12.55857 13.721895 13.981145 18.673081 13.433269 14.8926715
[5,] 5.31915 4.231046 5.055644 5.361466 6.129745 5.6578080
[6,] 18.29333 8.602399 8.709651 -14.487728 12.621866 -13.3436096
iterations [,7] [,8]
[1,] 13.996459 9.011893
[2,] 13.531454 5.038733
[3,] 15.648081 10.754291
[4,] 15.906008 17.339201
[5,] 6.782138 6.877611
[6,] 28.327480 6.919374
The as.matrix
, as.data.frame
, and as.array
functions can also be used to retrieve the posterior draws from a stanfit object:
matrix_of_draws <- as.matrix(fit)
print(colnames(matrix_of_draws))
[1] "mu" "tau" "eta[1]" "eta[2]" "eta[3]" "eta[4]"
[7] "eta[5]" "eta[6]" "eta[7]" "eta[8]" "theta[1]" "theta[2]"
[13] "theta[3]" "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
[19] "lp__"
df_of_draws <- as.data.frame(fit)
print(colnames(df_of_draws))
[1] "mu" "tau" "eta[1]" "eta[2]" "eta[3]" "eta[4]"
[7] "eta[5]" "eta[6]" "eta[7]" "eta[8]" "theta[1]" "theta[2]"
[13] "theta[3]" "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
[19] "lp__"
array_of_draws <- as.array(fit)
print(dimnames(array_of_draws))
$iterations
NULL
$chains
[1] "chain:1" "chain:2" "chain:3" "chain:4"
$parameters
[1] "mu" "tau" "eta[1]" "eta[2]" "eta[3]" "eta[4]"
[7] "eta[5]" "eta[6]" "eta[7]" "eta[8]" "theta[1]" "theta[2]"
[13] "theta[3]" "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
[19] "lp__"
The as.matrix
and as.data.frame
methods essentially return the same thing except in matrix and data frame form, respectively. The as.array
method returns the draws from each chain separately and so has an additional dimension:
print(dim(matrix_of_draws))
print(dim(df_of_draws))
print(dim(array_of_draws))
[1] 4000 19
[1] 4000 19
[1] 1000 4 19
By default all of the functions for retrieving the posterior draws return the draws for all parameters (and generated quantities). The optional argument pars
(a character vector) can be used if only a subset of the parameters is desired, for example:
mu_and_theta1 <- as.matrix(fit, pars = c("mu", "theta[1]"))
head(mu_and_theta1)
parameters
iterations mu theta[1]
[1,] 6.885489 9.488512
[2,] 10.482859 10.680251
[3,] 6.351460 5.359162
[4,] 8.458084 10.278552
[5,] 2.932725 9.897608
[6,] 6.693939 3.495184
Summary statistics are obtained using the summary
function. The object returned is a list with two components:
fit_summary <- summary(fit)
print(names(fit_summary))
[1] "summary" "c_summary"
In fit_summary$summary
all chains are merged whereas fit_summary$c_summary
contains summaries for each chain individually. Typically we want the summary for all chains merged, which is what we’ll focus on here.
The summary is a matrix with rows corresponding to parameters and columns to the various summary quantities. These include the posterior mean, the posterior standard deviation, and various quantiles computed from the draws. The probs
argument can be used to specify which quantiles to compute and pars
can be used to specify a subset of parameters to include in the summary.
For models fit using MCMC, also included in the summary are the Monte Carlo standard error (se_mean
), the effective sample size (n_eff
), and the R-hat statistic (Rhat
).
print(fit_summary$summary)
mean se_mean sd 2.5% 25%
mu 7.770245541 0.13765216 5.2594172 -2.8486394 4.5287242
tau 6.755682436 0.15191247 5.7732529 0.2960321 2.5602897
eta[1] 0.403269632 0.01450684 0.9174930 -1.4656835 -0.1971825
eta[2] 0.005374107 0.01385929 0.8765383 -1.7584611 -0.5516050
eta[3] -0.176228133 0.01427253 0.9026743 -1.8924412 -0.7989253
eta[4] -0.034664075 0.01420498 0.8984021 -1.8829474 -0.6222808
eta[5] -0.364837092 0.01368433 0.8654733 -2.0589947 -0.9305557
eta[6] -0.213148827 0.01388587 0.8782192 -1.8677761 -0.8227277
eta[7] 0.343032306 0.01417097 0.8962510 -1.4535196 -0.2361785
eta[8] 0.064963225 0.01450109 0.9171292 -1.7740982 -0.5433603
theta[1] 11.566795772 0.15424593 8.2528233 -1.5791540 6.1535357
theta[2] 7.846049194 0.09793331 6.1938464 -4.9071872 4.0747315
theta[3] 6.154196370 0.12201919 7.7171711 -11.7021529 2.0143652
theta[4] 7.611080767 0.10369559 6.5582851 -5.7462150 3.6532580
theta[5] 4.987566923 0.10333770 6.5356500 -9.2813210 1.2844960
theta[6] 5.981117518 0.10749754 6.7987415 -8.5771651 2.0238109
theta[7] 10.673833541 0.11487101 6.9596098 -1.1177540 5.9626370
theta[8] 8.516756196 0.12558625 7.6371340 -6.2369708 4.1016994
lp__ -39.443632837 0.07376289 2.6397883 -45.5118497 -41.0216339
50% 75% 97.5% n_eff Rhat
mu 7.89363478 11.1375647 17.873010 1459.853 1.0000213
tau 5.31016481 9.3174222 21.755167 1444.290 1.0001265
eta[1] 0.43180538 1.0244886 2.132712 4000.000 1.0001984
eta[2] 0.00489788 0.5676211 1.714141 4000.000 0.9996242
eta[3] -0.19623300 0.4257311 1.621619 4000.000 1.0002980
eta[4] -0.02331332 0.5460867 1.734768 4000.000 0.9996089
eta[5] -0.37357803 0.1877611 1.358239 4000.000 1.0011933
eta[6] -0.23361531 0.3595369 1.559855 4000.000 0.9997669
eta[7] 0.37466690 0.9320453 2.029201 4000.000 0.9999762
eta[8] 0.06316176 0.6761046 1.859379 4000.000 1.0001782
theta[1] 10.41210203 15.6633759 31.347733 2862.712 0.9999234
theta[2] 7.79958012 11.6780950 20.612174 4000.000 0.9992580
theta[3] 6.64078394 10.7976316 20.258949 4000.000 0.9995698
theta[4] 7.68433816 11.6971702 20.847699 4000.000 1.0000845
theta[5] 5.44637698 9.3590565 16.626620 4000.000 0.9994977
theta[6] 6.33226227 10.3782163 18.754118 4000.000 0.9994949
theta[7] 9.92787896 14.5613726 26.419400 3670.700 1.0011985
theta[8] 8.31578329 12.8689000 24.546756 3698.083 0.9999282
lp__ -39.20229304 -37.5522164 -35.117664 1280.744 0.9999940
If, for example, we wanted the only quantiles included to be 10% and 90%, and for only the parameters included to be mu
and tau
, we would specify that like this:
mu_tau_summary <- summary(fit, pars = c("mu", "tau"), probs = c(0.1, 0.9))$summary
print(mu_tau_summary)
mean se_mean sd 10% 90% n_eff Rhat
mu 7.770246 0.1376522 5.259417 1.658820 14.15110 1459.853 1.000021
tau 6.755682 0.1519125 5.773253 1.017328 14.12485 1444.290 1.000127
Since mu_tau_summary
is a matrix we can pull out columns using their names:
mu_tau_80pct <- mu_tau_summary[, c("10%", "90%")]
print(mu_tau_80pct)
10% 90%
mu 1.658820 14.15110
tau 1.017328 14.12485
For models fit using MCMC the stanfit object will also contain the values of parameters used for the sampler. The get_sampler_params
function can be used to access this information.
The object returned by get_sampler_params
is a list with one component (a matrix) per chain. Each of the matrices has number of columns corresponding to the number of sampler parameters and the column names provide the parameter names. The optional argument inc_warmup (defaulting to TRUE
) indicates whether to include the warmup period.
sampler_params <- get_sampler_params(fit, inc_warmup = FALSE)
sampler_params_chain1 <- sampler_params[[1]]
colnames(sampler_params_chain1)
[1] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__"
[5] "divergent__" "energy__"
To do things like calculate the average value of accept_stat__
for each chain (or the maximum value of treedepth__
for each chain if using the NUTS algorithm, etc.) the sapply
function is useful as it will apply the same function to each component of sampler_params
:
mean_accept_stat_by_chain <- sapply(sampler_params, function(x) mean(x[, "accept_stat__"]))
print(mean_accept_stat_by_chain)
[1] 0.9003524 0.8687311 0.8546695 0.8594700
max_treedepth_by_chain <- sapply(sampler_params, function(x) max(x[, "treedepth__"]))
print(max_treedepth_by_chain)
[1] 4 5 5 5
The Stan program itself is also stored in the stanfit object and can be accessed using get_stancode
:
code <- get_stancode(fit)
The object code
is a single string and is not very intelligible when printed:
print(code)
[1] "data {\n int<lower=0> J; // number of schools \n real y[J]; // estimated treatment effects\n real<lower=0> sigma[J]; // s.e. of effect estimates \n}\nparameters {\n real mu; \n real<lower=0> tau;\n vector[J] eta;\n}\ntransformed parameters {\n vector[J] theta;\n theta = mu + tau * eta;\n}\nmodel {\n target += normal_lpdf(eta | 0, 1);\n target += normal_lpdf(y | theta, sigma);\n}"
attr(,"model_name2")
[1] "schools"
A readable version can be printed using cat
:
cat(code)
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
real mu;
real<lower=0> tau;
vector[J] eta;
}
transformed parameters {
vector[J] theta;
theta = mu + tau * eta;
}
model {
target += normal_lpdf(eta | 0, 1);
target += normal_lpdf(y | theta, sigma);
}
The get_inits
function returns initial values as a list with one component per chain. Each component is itself a (named) list containing the initial values for each parameter for the corresponding chain:
inits <- get_inits(fit)
inits_chain1 <- inits[[1]]
print(inits_chain1)
$mu
[1] -1.158445
$tau
[1] 1.869233
$eta
[1] 1.64329656 -1.78660708 -1.09556178 0.71384694 -0.03934986 -1.86726194
[7] 0.54581501 -0.63981707
$theta
[1] 1.9132580 -4.4980294 -3.2063051 0.1759005 -1.2319994 -4.6487921
[7] -0.1381902 -2.3544123
The get_seed
function returns the (P)RNG seed as an integer:
print(get_seed(fit))
[1] 864024368
The get_elapsed_time
function returns a matrix with the warmup and sampling times for each chain:
print(get_elapsed_time(fit))
warmup sample
chain:1 0.034945 0.032241
chain:2 0.036331 0.031823
chain:3 0.031634 0.028942
chain:4 0.030773 0.030251