Simulation-based calibration(SBC) is a technique used to validate and calibrate Bayesian estimation methods and models (Talts et al. 2018). SBC checks whether a model correctly accounts for uncertainty and whether the posterior distributions it produces are consistent with the true data-generating process. First we outline how to perform SBC in theory, after which we describe our experiences with SBC in EMC2, which will also illustrate how to perform SBC for your own model.
Here we briefly outline the steps to performing SBC, for more detail we refer the reader to Talts et al. (2018).
EMC2 allows for both hierarchical and non-hierarchical estimation. Given the desirable statistical and estimation properties of hierarchical models, we normally recommend to use hierarchical models. However, given that estimating hundreds of models is quite time consuming and practically unfeasible without a computational server, we recommend running SBC on non-hierarchical models for typical applications. Here we use the LBA and DDM to illustrate non-hierarchical SBC. Later we’ll also perform hierarchical SBC for the LBA, DDM, RDM and LNR. First we clear our workspace and load EMC2
rm(list = ls())
library(EMC2)
Before we start with the SBC, we want to give a quick note on data compression. By default EMC2 compresses RTs to bins of 20ms, since typical lab set-ups do not allow response times to be measured with a higher resolution. However, generated data is not subject to the same physical constraints. Therefore, and to avoid biases, compression is turned off. To turn on compression regardless, you can set compression = TRUE
, and set the resolution using rt_resolution = 0.02
(for example) in the run_sbc
function, which will speed up computation but potentially introduces biases.
To run SBC we specify a design and a prior for the model. EMC2 assumes normal priors on the parameters for non-hierarchical models. Thus, some parameters are transformed to better match the normal prior and to satisfy the assumption of the normal distribution to have full support on the real line. To check the parameter transformations we use ?LBA
.
matchfun <- function(d) d$S == d$lR
design_LBA <- design(factors=list(subjects=1,S=c("left", "right")),
Rlevels = c("left", "right"),
matchfun = matchfun,
formula =list(v~lM,B~1, t0~1, sv~1, A~1),
constants=c(sv=log(1)),
model = LBA)
prior_LBA <- prior(design_LBA, type = "single",
pmean = c(1.3, .7, log(.8), log(.2), log(.3)),
psd = c(.2, .1, .1, .05, .05))
Now we plot the prior, with map = TRUE
(the default) to see the implied prior on the transformed parameters.
plot_prior(prior_LBA, design_LBA)
Next we can simply call run_sbc
with our specified design and prior. We can use ?run_sbc
to see the description of the function arguments. The trials argument refers to the number of trials per cell. Since we have an 2-level S factor in our design (and all other designs used), we will thus simulate 200 trials per data set. Also note that by default run_sbc will estimate till the point-scale reduction factor is smaller than 1.1 and the minimum effective sample size of any of the estimated parameters is larger than 100.
SBC_LBA_single <- run_sbc(design_LBA, prior_LBA, replicates = 500, trials = 100, plot_data = FALSE,
iter = 1000, n_post = 1000, fileName = "SBC_data/SBC_LBA_single.RData",
cores_per_chain = 30)
In non-hierarchical models, setting cores_per_chain = 30
will estimate 30 simulated data-sets simultaneously. Which is feasible on the computational server we used. But given that the LBA is so quick to compute, non-hierarchical SBC is also feasible on a personal computer. It is recommended to specify fileName
to store intermediary results in case of crashes.
EMC2 also comes with plotting functions that allow us to visualize the results. First we can plot the standard histograms, with a number of equally spaced bins between 0 and 1 of the normalized rank statistic.
plot_sbc_hist(SBC_LBA_single, bins = 10)
Here the gray lines indicate the 2.5, 50 and 97.5% percentile of where n samples from a uniform distribution are expected to fall. As we can see most of the bins are are nicely within these bounds and no clear pattern emerges. However, this visualization very much depends on bin size, as pointed out by Talts et al. (2018). Therefore EMC also comes with plots that visualize the difference in the empirical cumulative density function (ecdf) and the cumulative density function of a uniform distribution.
plot_sbc_ecdf(SBC_LBA_single)
In these dinosaur eggs the blue area visualizes the 95% area of expected cumulative values. As with real eggs, the observed line, (i.e. the cracks) should obviously lie somewhere on the egg. For an explanation and guide to interpretation to these plots we recommend the following article https://hyunjimoon.github.io/SBC/articles/rank_visualizations.html. In the remainder of this article we’ll use these plots to visualize SBC results.
Next we test non-hierarchical for the DDM as well. Initial results actually showed bias in the estimation of non-decision time and non-decision time variability. Whereas we first based our DDM implementation on the rtdists package (Singmann et al. 2022), the bias pushed us to test with alternative packages, after which we stumbled on the WienR package (Hartmann and Klauer 2023). Which not only boasts faster likelihood evaluations, but also no longer shows bias in the SBC results. Hence in the newest CRAN release we adapted our C++ code to instead use the WienR C++ routines.
For the SBC itself we used the following prior and design settings, again being mindful of the required transformations.
#?DDM
design_DDM <- design(factors=list(subjects=1,S=c("left", "right")),
Rlevels = c("left", "right"),
formula =list(v~1,a~1, t0~1, s~1, Z~1, sv~1, SZ~1, st0 ~ 1),
constants=c(s=log(1)),
model = DDM)
prior_DDM <- prior(design_DDM, type = "single",
pmean = c(1, log(.8), log(.3), qnorm(.5), log(.1), qnorm(.05), log(.05)),
psd = c(.15, .15, .1, .05, .15, .15, .15))
SBC_DDM_single <- run_sbc(design_DDM, prior_DDM, replicates = 500, trials = 100,
fileName = "SBC_data/SBC_DDM_single.RData", cores_per_chain = 30)
Next we plot the ECDF difference plot again.
plot_sbc_ecdf(SBC_DDM_single)