# (PART) Evidence synthesis and measurements with error {-}
# \index{Meta-analysis} Meta-analysis and \index{Measurement-error model} measurement error models {#ch-remame}
In this chapter, we introduce two relatively underutilized modeling approaches that are potentially very important for cognitive science: meta-analysis and measurement-error models.
Meta-analysis can be very informative when carrying out systematic reviews, and measurement-error models are able to take into account uncertainty in one's dependent or independent variable (or both). What's common to these two classes of model is that they both assume that the $n$-th measured data point $y_n$ has a location parameter, say $\zeta_n$ (pronounced *zeta en*), that is measured with some uncertainty that can be represented by the standard error $SE_n$ of the measurement $y_n$:
\begin{equation}
y_n \sim \mathit{Normal}(\zeta_n, SE_n)
\end{equation}
In both classes of model, the goal is to obtain a posterior distribution of a latent parameter $\zeta$ which is assumed to generate the $\zeta_n$, with some standard deviation $\tau$. The parameter $\tau$ quantifies the noise in the measurement process or the between-study variability in a meta-analysis.
\begin{equation}
\zeta_n \sim \mathit{Normal}(\zeta, \tau)
\end{equation}
The main parameter of interest is usually $\zeta$, but the posterior distributions of $\tau$ and $\zeta_n$ can also be informative. The above model specification should remind you of the hierarchical models we saw in earlier chapters.
## Meta-analysis
Once a number of studies have accumulated on a particular topic, it can be very informative to synthesize the data. Here we present a commonly used approach: a random-effects meta-analysis.
### A meta-analysis of similarity-based interference in sentence comprehension
The model is set up as follows. For each study $n$, let effect$_n$ be the effect of interest, and let $SE_n$ be the \index{Standard error of the effect} standard error of the effect. A concrete example of a recent meta-analysis is the effect of \index{Similarity-based interference} similarity-based interference in sentence comprehension [@JaegerEngelmannVasishth2017]; when two nouns are more similar to each other, there is greater processing difficulty (i.e., longer \index{Reading time} reading times in milliseconds) when an attempt is made to retrieve one of the nouns to complete a linguistic dependency (such as a subject-verb dependency). The estimate of the effect and its standard error is the information we have from each study $n$.
First, load the data, and add an id variable that identifies each experiment.
```{r, message = FALSE}
data("df_sbi")
(df_sbi <- df_sbi %>%
   mutate(study_id = 1:n()))
```
The effect sizes and standard errors were estimated from published summary statistics in the respective articles. In some cases, this involved a certain amount of guesswork; the details are documented in the online material accompanying @JaegerEngelmannVasishth2017.
We begin with the assumption that there is a true (unknown) effect $\zeta_n$ that lies behind each of these studies. Each of the observed effects has an uncertainty associated with it, $SE_n$. We can therefore assume that each observed effect, effect$_n$, is generated as follows:
\begin{equation}
\text{effect}_n \sim \mathit{Normal}(\zeta_n,SE_n)
\end{equation}
Each study is assumed to have a different true effect $\zeta_n$ because each study will have been carried out under different conditions: in a different lab with different protocols and workflows, with different subjects, possibly with different languages, with slightly different experimental designs, etc.^[In the current example, the dependent variable is either self-paced reading time or first-pass reading time from eyetracking. A possible criticism here is that these two different dependent measures should not appear in the same meta-analysis. Despite these being quite different dependent variables, the simplifying assumption we made was that the dependent measure, which is in milliseconds in both the self-paced reading studies and eyetracking studies, is a difference of means between two (sets of) conditions, and therefore gives an estimate of the effect of interest. An alternative approach would have been to standardize the effect size (Cohen's d) in each study; but this would be difficult to carry out in this example as much of the published work did not provide the original data; in many of the cases, only summary statistics from the publication were available. Such standardized effect sizes would also be open to criticism, because the effect size in self-paced reading and eyetracking may also not be comparable.]
Further, each of the true underlying effects $\zeta_n$ has behind it some true unknown value $\zeta$. The parameter $\zeta$ represents the underlying effect of similarity-based interference across experiments. Our goal is to obtain the posterior distribution of this overall effect.
We can write the above statement as follows:
\begin{equation}
\zeta_n \sim\mathit{Normal}(\zeta,\tau)
\end{equation}
$\tau$ is the between-study standard deviation; this expresses the assumption that there will be some variability between the true effects $\zeta_n$.
To summarize the model:
- effect$_n$ is the observed effect (in this example, in milliseconds) in the $n$-th study.
- $\zeta_n$ is the true (unknown) effect in each study.
- $\zeta$ is the true (unknown) effect of the experimental manipulation, namely, the similarity-based interference effect.
- Each $SE_n$ is estimated from the standard error available from study $n$.
- The parameter $\tau$ represents \index{Between-study standard deviation} between-study standard deviation.
We can construct a \index{Hierarchical model} hierarchical model as follows:
\begin{equation}
\begin{aligned}
\text{effect}_n \sim & \mathit{Normal}(\zeta_n, SE_n) \quad n=1,\dots, N_{studies}\\
\zeta_n \sim & \mathit{Normal}(\zeta, \tau) \\
\zeta \sim & \mathit{Normal}(0,100)\\
\tau \sim & \mathit{Normal}_+(0,100)
\end{aligned}
(\#eq:ma0)
\end{equation}
The priors are based on domain knowledge; it seems reasonable to allow the effect to range a priori from $-200$ to $+200$ ms with probability $95$%, and a $\mathit{Normal}(0,100)$ prior encodes exactly this ($95$% of its mass lies within $\pm 1.96 \times 100 \approx \pm 200$ ms). Of course, a sensitivity analysis is necessary (but skipped here).
This model can be implemented in \index{brms} `brms` in a relatively straightforward way as shown below. We show the Stan version later in the chapter (section \@ref(sec-stanma)); the Stan version presents some interesting challenges that can be useful for the reader interested in deepening their Stan modeling knowledge.
#### `brms` version of the meta-analysis model {#sec-brmsmeta}
First, define the priors:
```{r}
priors <- c(prior(normal(0, 100), class = Intercept),
            prior(normal(0, 100), class = sd))
```
Fit the model as follows. Because of our relatively uninformative priors and the small number of data points, the models in this chapter require us to tune the `control` parameter, increasing \index{\texttt{adapt\_delta}} `adapt_delta` and \index{\texttt{max\_treedepth}} `max_treedepth`.
```{r fitsbi, message = FALSE, eval = !file.exists("dataR/fit_sbi.RDS")}
fit_sbi <- brm(effect | resp_se(`SE`, sigma = FALSE) ~
                 1 + (1 | study_id),
               data = df_sbi,
               prior = priors,
               control = list(adapt_delta = 0.99, max_treedepth = 10))
```
```{r, echo= FALSE, eval = TRUE}
if (!file.exists("dataR/fit_sbi.RDS")) {
  saveRDS(fit_sbi, "dataR/fit_sbi.RDS")
} else {
  fit_sbi <- readRDS("dataR/fit_sbi.RDS")
}
```
The posteriors of $\zeta$ and $\tau$ are summarized below as `Intercept` and `sd(Intercept)`.
```{r, eval = FALSE}
fit_sbi
```
```{r, echo = FALSE}
short_summary(fit_sbi)
```
The `sigma` parameter does not play any role in this model, but appears in the `brms` output anyway. In the model specification, `sigma` was explicitly removed by writing `sigma = FALSE`. For this reason, we can ignore that parameter in the model summary output above. Online section \@ref(app-sigmatrue) explains what happens if we set `sigma = TRUE`.
As theory predicts, the overall effect from these studies has a positive sign.
One advantage of such a meta-analysis is that the posterior can now be used as an informative prior for a future study; this is especially important when doing an analysis using Bayes factors. Using the meta-analysis posterior as a prior in this way allows the researcher to build on what is known so far from published studies.
Another interesting by-product of a random-effects meta-analysis is the possibility of displaying a \index{Forest plot} forest plot (Figure \@ref(fig:forest)). A forest plot shows the meta-analytic estimate (the parameter `b_Intercept` in `brms`) alongside the original estimates effect$_n$ (and their SE$_n$) and the posterior distributions of the $\zeta_n$ for each study (we reconstruct these estimates by adding `b_Intercept` to the parameters starting with `r_` in `brms`). The original estimates are the ones fed to the model as data and the posterior distributions of the $\zeta_n$ are calculated, as in previous hierarchical models, after the information from all studies is pooled together. The $\zeta_n$ estimates are shrunken estimates of each study's (unknown) true effect, shrunken towards the grand mean $\zeta$, and weighted by the standard error observed in each study $n$. The $\zeta_n$ for a particular study is shrunk more towards the grand mean $\zeta$ when the study's standard error is large (i.e., when the estimate is very imprecise). The code below shows how to build a forest plot step by step.
First, change the format of the data so that it looks like the output of `brms`:
```{r}
df_sbi <- df_sbi %>%
  mutate(Q2.5 = effect - 2 * SE,
         Q97.5 = effect + 2 * SE,
         Estimate = effect,
         type = "original")
```
Extract the meta-analytical estimate:
```{r}
df_Intercept <- posterior_summary(fit_sbi,
                                  variable = c("b_Intercept")) %>%
  as.data.frame() %>%
  mutate(publication = "M.A. estimate", type = "")
```
For the pooled estimated effect (or fitted value) of each individual study, we need the sum of the meta-analytical estimate (the intercept) and the corresponding by-study adjustment. Obtain this with the `fitted()` function:
```{r}
df_model <- fitted(fit_sbi) %>%
  # Convert matrix to data frame:
  as.data.frame() %>%
  # Add a column to identify the estimates,
  # and another column to identify the publication:
  mutate(type = "adjusted",
         publication = df_sbi$publication)
```
Bind the observed effects, the meta-analytical estimate, and the fitted values of the studies together, and plot the data:
(ref:forest) Forest plot showing the original and the adjusted estimates computed from each study from the random-effects meta-analysis. The error bars on the original estimates show 95% confidence intervals, and those on the adjusted estimates show 95% credible intervals.
```{r forest, fig.cap = "(ref:forest)"}
# Bind the original estimates, the adjusted estimates,
# and the meta-analysis estimate:
bind_rows(df_sbi, df_model, df_Intercept) %>%
  # Plot:
  ggplot(aes(x = Estimate,
             y = publication,
             xmin = Q2.5,
             xmax = Q97.5,
             color = type)) +
  geom_point(position = position_dodge(.5)) +
  geom_errorbarh(position = position_dodge(.5)) +
  # Add the meta-analytic estimate and credible interval:
  geom_vline(xintercept = df_Intercept$Q2.5,
             linetype = "dashed",
             alpha = .3) +
  geom_vline(xintercept = df_Intercept$Q97.5,
             linetype = "dashed",
             alpha = .3) +
  geom_vline(xintercept = df_Intercept$Estimate,
             linetype = "dashed",
             alpha = .5) +
  scale_color_discrete(breaks = c("adjusted", "original"))
```
It is important to keep in mind that a meta-analysis is always going to yield \index{Biased estimate} biased estimates as long as we have \index{Publication bias} publication bias: if a field has a tendency to allow only "big news" studies to be published, then the literature that will appear in the public domain will be biased, and any meta-analysis based on such information will be biased. Despite this limitation, a meta-analysis is still a useful way to synthesize the known evidence; one just has to remember that the estimate from the meta-analysis is almost certain to be biased.
#### Stan version of the meta-analysis model {#sec-stanma}
Even though `brms` can handle meta-analyses, fitting them directly in Stan gives us more flexibility, which might be necessary in some cases. As a first attempt, we could build a model that closely follows the formal specification given in Equation \@ref(eq:ma0).
```{r, echo = FALSE}
ma0 <- system.file("stan_models",
                   "meta-analysis0.stan",
                   package = "bcogsci")
```
```{stan output.var = "ma0_stan", code = readLines(ma0), tidy = TRUE, comment="", eval = FALSE, cache = FALSE, cache.lazy = FALSE}
```
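The file `meta-analysis0.stan` ships with the `bcogsci` package. As a rough sketch (the packaged file may differ in minor details such as comments or declarations), a direct transcription of Equation \@ref(eq:ma0) into Stan, using the data names passed below (`N`, `effect`, `SE`) and the parameter names `zeta`, `tau`, and `zeta_n` reported later, looks like this:

```stan
data {
  int<lower=1> N;            // number of studies
  vector[N] effect;          // observed effects
  vector<lower=0>[N] SE;     // standard errors of the effects
}
parameters {
  real zeta;                 // overall effect
  real<lower=0> tau;         // between-study standard deviation
  vector[N] zeta_n;          // true effect of each study
}
model {
  // Priors (tau is truncated at zero by its declaration):
  zeta ~ normal(0, 100);
  tau ~ normal(0, 100);
  // Centered parameterization:
  zeta_n ~ normal(zeta, tau);
  effect ~ normal(zeta_n, SE);
}
```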
Fit the model as follows:
```{r ma0stan, message=FALSE, results="hide"}
ma0 <- system.file("stan_models",
                   "meta-analysis0.stan",
                   package = "bcogsci")
ls_sbi <- list(N = nrow(df_sbi),
               effect = df_sbi$effect,
               SE = df_sbi$SE,
               study_id = df_sbi$study_id)
fit_sbi0 <- stan(ma0,
                 data = ls_sbi,
                 control = list(adapt_delta = 0.999, max_treedepth = 12))
```
We see that there are warnings. As discussed in section \@ref(sec-uncorrstan), we can use pairs plots to uncover pathologies in the sampling. Here, we see that the samples of `zeta` and `tau` are highly correlated:
```{r, warning = FALSE, fig.height = 3.5}
pairs(fit_sbi0, pars = c("zeta", "tau"))
```
We face a problem similar to the one in section \@ref(sec-uncorrstan): the sampler cannot properly explore the neck of the funnel-shaped posterior because of the strong correlation between the parameters. The solution is, as in section \@ref(sec-uncorrstan), a non-centered parameterization. Rewrite Equation \@ref(eq:ma0) as follows:
\begin{equation}
\begin{aligned}
z_n & \sim \mathit{Normal}(0, 1)\\
\zeta_n &= z_n \cdot \tau + \zeta \\
\text{effect}_n & \sim \mathit{Normal}(\zeta_n, SE_n)\\
\zeta &\sim \mathit{Normal}(0,100)\\
\tau &\sim \mathit{Normal}_+(0,100)
\end{aligned}
(\#eq:ma1)
\end{equation}
This works because if $Y \sim \mathit{Normal}(0, 1)$, then $X = a + Y \cdot b$ is distributed as $\mathit{Normal}(a, b)$. You can revisit section \@ref(sec-uncorrstan) for more details.
Translate Equation \@ref(eq:ma1) into Stan code as follows in `meta-analysis1.stan`:
```{r, echo = FALSE}
ma1 <- system.file("stan_models",
                   "meta-analysis1.stan",
                   package = "bcogsci")
```
```{stan output.var = "ma1_stan", code = readLines(ma1), tidy = TRUE, comment="", eval = FALSE, cache = FALSE, cache.lazy = FALSE}
```
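As before, here is a sketch of the non-centered model of Equation \@ref(eq:ma1); the packaged `meta-analysis1.stan` may differ in minor details, such as where `zeta_n` is declared:

```stan
data {
  int<lower=1> N;
  vector[N] effect;
  vector<lower=0>[N] SE;
}
parameters {
  real zeta;
  real<lower=0> tau;
  vector[N] z;               // auxiliary standard normal parameter
}
transformed parameters {
  // Non-centered parameterization of the by-study effects:
  vector[N] zeta_n = z * tau + zeta;
}
model {
  zeta ~ normal(0, 100);
  tau ~ normal(0, 100);
  z ~ std_normal();
  effect ~ normal(zeta_n, SE);
}
```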
The model now converges, with estimates virtually identical to those of the `brms` model.
```{r ma1stan, message=FALSE, results="hide",eval = !file.exists("dataR/fit_sbi1.RDS") }
ma1 <- system.file("stan_models",
                   "meta-analysis1.stan",
                   package = "bcogsci")
fit_sbi1 <- stan(ma1,
                 data = ls_sbi,
                 control = list(adapt_delta = 0.999,
                                max_treedepth = 12))
```
```{r, echo = FALSE}
if (!file.exists("dataR/fit_sbi1.RDS")) {
  saveRDS(fit_sbi1, "dataR/fit_sbi1.RDS")
} else {
  fit_sbi1 <- readRDS("dataR/fit_sbi1.RDS")
}
```
```{r}
print(fit_sbi1, pars = c("zeta", "tau"))
```
We can also reparameterize the model in a slightly different way. If we set $U_{n} \sim\mathit{Normal}(0, SE_n)$, then
\begin{equation}
\text{effect}_n = U_n + \zeta_n
\end{equation}
Then, given that $\zeta_n \sim \mathit{Normal}(\zeta, \tau)$, and that the sum of two independent normally distributed random variables is itself normally distributed with variance equal to the sum of the two variances,
\begin{equation}
\text{effect}_n \sim \mathit{Normal}(\zeta, \sqrt{SE_n^2 + \tau^2}) (\#eq:ma2-again2)
\end{equation}
See online section \@ref(app-sigmatrue) if it's not clear why this reparameterization works.
This is equivalent to the `brms` model with `sigma = TRUE`. Parameterizing the model in this way means that we lose the possibility of estimating the posterior of the true effect $\zeta_n$ of each individual study.
Write this in Stan as follows; this code is available in the file `meta-analysis2.stan` within the `bcogsci` package:
```{r, echo = FALSE}
ma2 <- system.file("stan_models",
                   "meta-analysis2.stan",
                   package = "bcogsci")
```
```{stan output.var = "ma2_stan", code = readLines(ma2), tidy = TRUE, comment="", eval = FALSE, cache = FALSE, cache.lazy = FALSE}
```
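A sketch of a direct transcription of Equation \@ref(eq:ma2-again2) follows; again, the packaged `meta-analysis2.stan` may differ in minor details:

```stan
data {
  int<lower=1> N;
  vector[N] effect;
  vector<lower=0>[N] SE;
}
parameters {
  real zeta;
  real<lower=0> tau;
}
model {
  zeta ~ normal(0, 100);
  tau ~ normal(0, 100);
  // The by-study effects are integrated out: the measurement error
  // and the between-study variability combine into a single scale.
  effect ~ normal(zeta, sqrt(square(SE) + square(tau)));
}
```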
Fit the model:
```{r ma2sbi2, message=FALSE, results="hide", cache.lazy = FALSE, cache = FALSE}
ma2 <- system.file("stan_models",
                   "meta-analysis2.stan",
                   package = "bcogsci")
fit_sbi2 <- stan(ma2,
                 data = ls_sbi,
                 control = list(adapt_delta = 0.9))
```
```{r}
print(fit_sbi2, pars = c("zeta", "tau"))
```
This summary could be reported in an article by displaying the posterior means and 95% credible intervals of the parameters.
## \index{Measurement-error model} Measurement-error models
Measurement-error models deal with the situation where some predictor or the dependent variable, or both, are observed with measurement error. This measurement error could arise because a variable is an average of several measurements (in which case its standard error can also be estimated), or because we know that our measurement is noisy due to limitations of our equipment (e.g., delays in the signal from the keyboard to the motherboard, impedance in the electrodes of an EEG system, etc.).
### Accounting for measurement error in individual differences in working memory capacity and reading fluency
As a motivating example, consider the following data from @NicenboimEtAlCogSci2018.
For each subject, we have the partial-credit unit (PCU) scores of an operation span task as a measure of their \index{Working memory capacity} working memory capacity [@conway2005working], along with their \index{Standard error} standard errors. In addition, the reading fluency of each subject is calculated from a separate set of data, based on the mean reading speeds (characters/second) in a \index{Rapid automatized naming task} rapid automatized naming task [RAN, @DenklaRudel1976]; the standard error of the reading speed is also available.
Of interest here is the extent of the association between working memory capacity (measured as PCU) and \index{Reading fluency} reading fluency (measured as reading speed in 50 characters per second). We avoid making any causal claims: It could be that our measure of working memory capacity really affects reading fluency or it could be the other way around. A third possibility is that there is a third variable (or several) that affects both reading fluency and working memory capacity. A treatment of causal inference in Bayesian models can be found in chapters 5 and 6 of @mcelreath2015statistical.
```{r}
data("df_indiv")
df_indiv
```
At first glance, we see a relationship between mean PCU scores and mean reading speed; see Figure \@ref(fig:relmeanVOTvdur). However, this relationship seems to be driven by two extreme data points on the top left corner of the plot.
```{r relmeanVOTvdur, fig.cap = "The relationship between (centered) mean PCU scores and mean reading speed.", message = FALSE, fold = TRUE, fig.height = 3.5}
df_indiv <- df_indiv %>%
  mutate(c_mean_pcu = mean_pcu - mean(mean_pcu))
ggplot(df_indiv, aes(x = c_mean_pcu, y = mean_rspeed)) +
  geom_point() +
  geom_smooth(method = "lm")
```
A simple linear model shows a somewhat weak association between mean reading speed and centered mean PCU. The priors are relatively arbitrary, but they have the right order of magnitude, given that reading speeds are quite small (well below 1).
```{r votbrms, message = FALSE, eval = TRUE}
df_indiv <- df_indiv %>%
  mutate(c_mean_pcu = mean_pcu - mean(mean_pcu))
priors <- c(prior(normal(0, 0.5), class = Intercept),
            prior(normal(0, 0.5), class = b),
            prior(normal(0, 0.5), class = sigma))
fit_indiv <- brm(mean_rspeed ~ c_mean_pcu,
                 data = df_indiv,
                 family = gaussian(),
                 prior = priors)
```
```{r, eval = FALSE}
fit_indiv
```
```{r, echo = FALSE}
short_summary(fit_indiv)
```
```{r}
# Proportion of samples below zero
(Pb <- mean(as_draws_df(fit_indiv)$b_c_mean_pcu < 0))
```
Figure \@ref(fig:meplotstanplot)(a) shows the posterior distribution of the slope in this model. Most of the probability mass is negative (`r Pb*100`%), suggesting that a better PCU score is associated with slower reading speed rather than faster; that is, that a larger working memory capacity is associated with less reading fluency. This is not a very intuitive result, and it could be that it is driven by the two extreme data points. Rather than removing these data points, we'll examine what happens when the uncertainty of the measurements is taken into account.
Taking this measurement uncertainty into account is important: in many practical research problems, researchers take average measurements like these and examine the correlation between them. However, each of those data points is measured with some error (uncertainty), and this error is ignored when we work with the averaged values. Ignoring this uncertainty leads to over-enthusiastic inferences. A measurement-error model solves this issue.
The measurement-error model is stated as follows. There is assumed to be a true unobserved value $y_{n,TRUE}$ for the dependent variable, and a true unobserved value $x_{n,TRUE}$ for the predictor, where $n$ indexes the observations.
The observed values $y_n$ and the predictor $x_n$ are assumed to be generated with some error:
\begin{equation}
\begin{aligned}
y_n &\sim\mathit{Normal}(y_{n,TRUE},SE_{y_n}) \\
x_n &\sim\mathit{Normal}(x_{n,TRUE},SE_{x_n})
\end{aligned}
\end{equation}
The regression is fit to the (unknown) *true* values of the dependent and independent variables:
\begin{equation}
y_{n,TRUE} \sim\mathit{Normal}(\alpha + \beta x_{n,TRUE},\sigma)
(\#eq:masquare)
\end{equation}
In addition, there is an unknown standard deviation (standard error) of the latent unknown means that generate the underlying PCU means. That is, we assume that each of the observed centered PCU scores is normally distributed with an underlying mean $\chi$ and a standard deviation $\tau$. This is very similar to the meta-analysis situation we saw earlier, $\zeta_n \sim\mathit{Normal}(\zeta,\tau)$, where $\zeta_n$ was the location parameter of each study, $\zeta$ was the (unknown) location parameter representing the effect of interest, and $\tau$ was the between-study variability.
\begin{equation}
x_{n,TRUE} \sim\mathit{Normal}(\chi,\tau)
\end{equation}
The goal of the modeling is to obtain posterior distributions for the intercept and slope $\alpha$ and $\beta$ (and the residual error standard deviation $\sigma$).
We need to decide on priors for all the parameters now. We use relatively vague priors, which can still be considered regularizing priors based on our knowledge of the order of magnitude of the measurements. In situations where not much is known about a research question, one could use such vague priors.
\begin{equation}
\begin{aligned}
\alpha &\sim\mathit{Normal}(0, 0.5)\\
\beta &\sim\mathit{Normal}(0, 0.5)\\
\chi &\sim\mathit{Normal}(0, 0.5)\\
\sigma &\sim\mathit{Normal}_+(0, 0.5)\\
\tau &\sim\mathit{Normal}_+(0, 0.5)
\end{aligned}
(\#eq:me)
\end{equation}
#### The `brms` version of the measurement error model
In `brms`, the model specification would be as follows:
```{r}
priors_me <- c(prior(normal(0, 0.5), class = Intercept),
               prior(normal(0, 0.5), class = b),
               prior(normal(0, 0.5), class = meanme),
               prior(normal(0, 0.5), class = sdme),
               prior(normal(0, 0.5), class = sigma))
```
Here, the parameters with classes \index{\texttt{meanme}} `meanme` and \index{\texttt{sdme}} `sdme` refer to the unknown mean and standard deviation (standard error) of the \index{Latent mean} latent unknown means that generate the underlying PCU means, $\chi$ and $\tau$ in Equation \@ref(eq:me). Once we decide on the priors, we use \index{\texttt{resp\_se}} `resp_se(.)` with `sigma = TRUE` (i.e., we don't estimate $y_{n,TRUE}$ explicitly), and we use `me(c_mean_pcu, se_pcu)` to indicate that the predictor `c_mean_pcu` is measured with error and that `se_pcu` is its SE.
```{r fitmemodel, message = FALSE, results = "hide"}
fit_indiv_me <- brm(mean_rspeed | resp_se(se_rspeed, sigma = TRUE) ~
                      me(c_mean_pcu, se_pcu),
                    data = df_indiv,
                    family = gaussian(),
                    prior = priors_me)
```
```{r, eval = FALSE}
fit_indiv_me
```
```{r, echo = FALSE}
short_summary(fit_indiv_me)
```
```{r}
# Proportion of samples below zero
# Parameter names can be found out with `variables(fit_indiv_me)`
(Pb_me <- mean(as_draws_df(fit_indiv_me)$bsp_mec_mean_pcuse_pcu < 0))
```
The posterior for the slope is plotted in Figure \@ref(fig:meplotstanplot)(b); this figure shows that the association between PCU scores and reading speed is much weaker once measurement error is taken into account: The posterior is much more uncertain (much more widely distributed) than in the simple linear model we fit above (compare Figure \@ref(fig:meplotstanplot)(b) with Figure \@ref(fig:meplotstanplot)(a)), and the direction of the association is now unclear, with `r round(Pb_me *100)`% of the probability mass below zero, rather than `r round(Pb *100)`%.
(ref:meplotstanplot) (a) Posterior distribution of the slope for the effect of centered mean PCU on mean reading speed (50 characters per second) in a model without measurement error (`fit_indiv`). (b) Posterior distribution of the slope for the same dependent variable (mean reading speed) and predictor (centered mean PCU) in a model that accounts for measurement error (`fit_indiv_me`).
```{r meplotstanplot, fold = TRUE, message = FALSE, fig.cap = "(ref:meplotstanplot)", fig.show = "hold", out.width = "49%", fig.width = 3.9, fig.height = 3.2}
# Plot a
mcmc_plot(fit_indiv,
          variable = "^b_c",
          regex = TRUE,
          type = "hist") +
  ggtitle("(a) No measurement error") +
  xlim(c(-0.05, 0.03))
# Plot b
mcmc_plot(fit_indiv_me, variable = "^bsp", regex = TRUE, type = "hist") +
  ggtitle("(b) With measurement error") +
  xlim(c(-0.05, 0.03))
```
Figure \@ref(fig:seerrors) visualizes the main reason why there is no clear association in the measurement-error analysis: the two points at the top left of the plot, which were driving the effect, have a very large SE for the measurement of reading speed. The code below produces Figure \@ref(fig:seerrors); it overlays 250 regression lines, corresponding to 250 samples of the intercept and slope from the posterior distribution, on the measurements of reading speed and PCU.
```{r seerrors, fig.cap = "The relationship between centered mean PCU scores and mean reading speed accounting for measurement error. The error bars represent two standard errors. The regression lines are produced with 250 samples of the intercept and slope from the posterior distribution.", message = FALSE, warning = FALSE, fig.height = 3.5}
df_reg <- as_draws_df(fit_indiv_me) %>%
  select(alpha = b_Intercept, beta = bsp_mec_mean_pcuse_pcu) %>%
  slice(1:250)
ggplot(df_indiv, aes(x = c_mean_pcu, y = mean_rspeed)) +
  geom_point() +
  geom_errorbarh(aes(xmin = c_mean_pcu - 2 * se_pcu,
                     xmax = c_mean_pcu + 2 * se_pcu),
                 alpha = .5, linetype = "dotted") +
  geom_errorbar(aes(ymin = mean_rspeed - 2 * se_rspeed,
                    ymax = mean_rspeed + 2 * se_rspeed),
                alpha = .5, linetype = "dotted") +
  geom_abline(aes(intercept = alpha, slope = beta),
              data = df_reg,
              alpha = .02)
```
Of course, the conclusion here cannot be that there is no association between PCU scores and reading speed. In order to claim the absence of an effect, we would need to use Bayes factors (see chapter \@ref(ch-bf)) or cross-validation (see chapter \@ref(ch-cv)).
#### The Stan version of the measurement error model {#sec-stanme}
As with the meta-analysis, the main difficulty in fitting measurement-error models directly in Stan is that we need to reparameterize the model to avoid correlations between samples of different parameters.
The two changes that we need to make to the parameterization of our model presented in Equation \@ref(eq:me) are the following.
1. Sample from an auxiliary parameter $z_n$ rather than directly from $x_{n,TRUE}$, as we did in Equation \@ref(eq:ma1):
\begin{equation}
\begin{aligned}
z_n & \sim\mathit{Normal}(0, 1)\\
x_{n,TRUE} &= z_n \cdot \tau + \chi \\
x_n & \sim \mathit{Normal}(x_{n,TRUE}, SE_{x_n})
\end{aligned}
\end{equation}
2. Don't model $y_{n,TRUE}$ explicitly as in Equation \@ref(eq:masquare); rather, take into account the SE and the variation of $y_{n,TRUE}$ in the following way:
\begin{equation}
y_{n} \sim\mathit{Normal}\left(\alpha + \beta x_{n,TRUE},\sqrt{SE_{y_n}^2 + \sigma^2}\right)
\end{equation}
We are now ready to write this in Stan; the code is in the model called `me.stan`:
```{r, echo = FALSE}
me <- system.file("stan_models",
                  "me.stan",
                  package = "bcogsci")
```
```{stan output.var = "me_stan", code = readLines(me), tidy = TRUE, comment="", eval = FALSE, cache = FALSE, cache.lazy = FALSE}
```
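As with the meta-analysis models, here is a sketch of what `me.stan` plausibly contains, combining the two reparameterization steps above with the priors in Equation \@ref(eq:me); the packaged file may differ in minor details (e.g., in the name of the transformed parameter, here called `x_true`):

```stan
data {
  int<lower=1> N;
  vector[N] y;               // mean reading speeds
  vector<lower=0>[N] SE_y;   // their standard errors
  vector[N] x;               // centered mean PCU scores
  vector<lower=0>[N] SE_x;   // their standard errors
}
parameters {
  real alpha;
  real beta;
  real chi;                  // mean of the true PCU scores
  real<lower=0> tau;         // SD of the true PCU scores
  real<lower=0> sigma;
  vector[N] z;               // auxiliary standard normal parameter
}
transformed parameters {
  // Change 1: non-centered parameterization of the true predictor:
  vector[N] x_true = z * tau + chi;
}
model {
  alpha ~ normal(0, 0.5);
  beta ~ normal(0, 0.5);
  chi ~ normal(0, 0.5);
  tau ~ normal(0, 0.5);
  sigma ~ normal(0, 0.5);
  z ~ std_normal();
  x ~ normal(x_true, SE_x);
  // Change 2: y_TRUE is integrated out:
  y ~ normal(alpha + beta * x_true, sqrt(square(SE_y) + square(sigma)));
}
```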
Fit the model:
```{r mestan, message=FALSE, results="hide", cache = FALSE, eval = !file.exists("dataR/fit_mindiv.RDS") }
me <- system.file("stan_models",
                  "me.stan",
                  package = "bcogsci")
ls_me <- list(N = nrow(df_indiv),
              y = df_indiv$mean_rspeed,
              SE_y = df_indiv$se_rspeed,
              x = df_indiv$c_mean_pcu,
              SE_x = df_indiv$se_pcu)
fit_indiv_me_stan <- stan(me, data = ls_me)
```
```{r readfitmindiv, echo= FALSE, cache = FALSE}
if (!file.exists("dataR/fit_mindiv.RDS")) {
  saveRDS(fit_indiv_me_stan, "dataR/fit_mindiv.RDS")
} else {
  fit_indiv_me_stan <- readRDS("dataR/fit_mindiv.RDS")
}
```
```{r}
print(fit_indiv_me_stan, pars = c("alpha", "beta", "sigma"))
```
The posterior distributions are similar to those that we obtained with `brms`.
## Summary
This chapter introduced two statistical tools that are potentially of great relevance to cognitive science: random-effects meta-analysis and measurement-error models. Despite the inherent limitations of meta-analysis, it should be used routinely to accumulate knowledge through systematic evidence synthesis. Measurement-error models can also prevent the over-enthusiastic conclusions that are often drawn from noisy data.
## Further reading
For some examples of Bayesian meta-analyses in psycholinguistics, see @VasishthetalPLoSOne2013, @JaegerEngelmannVasishth2017,
@NicenboimRoettgeretal, @NicenboimPreactivation2019, @BuerkiEtAl2020, @cox2022bayesian, and @Buerki2022. A frequentist meta-analysis of priming effects in psycholinguistics appears in @mahowald2016meta. @sutton2012evidence and @cochrane are two useful general introductions that discuss systematic reviews, meta-analysis, and evidence synthesis; these two references are from medicine, where meta-analysis is more widely used than in cognitive science. A potentially important article for meta-analysis introduces a methodology for modeling bias, to adjust for different kinds of bias in the data [@turner2008bias]. Meta-analyses have important limitations and should not be taken at face value; this point is discussed in, e.g., @spector1991potential. A practical book-length introduction to Bayesian meta-analysis is @grantmetaanalysis.