## 6.5 Mixed models

A mixed model is a regression model of observations that allows for random variation at two different levels. In this section we will focus on mixed models in an exponential family context. Mixed models can be considered in greater generality, but there is then little shared structure, and one has to deal with the models on a much more case-by-case basis.

In a mixed model we have variables \(y_j \in \mathcal{Y}\) and \(z \in \mathcal{Z}\) such that:

- the distribution of \(z\) is an exponential family with canonical parameter \(\theta_0\);
- *conditionally* on \(z\), the \(y_j\)s are independent with a distribution from an exponential family with sufficient statistics \(t_j(\cdot \mid z)\).

This definition emphasizes how the \(y_j\)s have variation at two levels.
There is variation in the underlying \(z\), which is the first level of variation
(often called the *random effect*),
and then there is variation among the \(y_j\)s given the \(z\), which is the
second level of variation. The mixed model is thus a special case of hierarchical models
(Bayesian networks with tree graphs), also known as *multilevel* models, with
only two levels. When we observe data from such a model
we typically observe independent replications, \((y_{ij})_{j=1, \ldots, m_i}\)
for \(i = 1, \ldots, n\), of the \(y_j\)s only. Note that we allow a different
number, \(m_i\), of \(y_j\)s for each \(i\).

The simplest class of mixed models is obtained by \(t_j = t\) not
depending on \(j\), and
\[t(y_j \mid z) = \left(\begin{array}{c} t_1(y_j) \\ t_2(y_j, z) \end{array} \right)\]
for some fixed maps \(t_1 : \mathcal{Y} \to \mathbb{R}^{p_1}\) and \(t_2 : \mathcal{Y} \times \mathcal{Z} \to \mathbb{R}^{p_2}\).
This is called a *random effects* model: there are no *fixed effects*, in
the sense that \(t_j\) does not depend on \(j\), and given the random effect
\(z\) the \(y_j\)s are i.i.d. The canonical parameters associated with such a model
are \(\theta_0\) that enters into the distribution of the random effect,
\(\theta_1 \in \mathbb{R}^{p_1}\) and \(\theta_2 \in \mathbb{R}^{p_2}\) that enter
into the conditional distribution of \(y_j\) given \(z\).

**Example 6.4** The special case of a *Gaussian, linear* random effects model is the model where
\(z\) is \(\mathcal{N}(0, 1)\) distributed, \(\mathcal{Y} = \mathbb{R}\)
(with base measure proportional to Lebesgue measure)
and the sufficient statistic is
\[t(y_j \mid z) = \left(\begin{array}{c} y_j \\ - y_j^2 \\ zy_j \end{array}\right).\]
There are no free parameters in the distribution of \(z\).

From Example 6.2
it follows that the conditional variance of \(y\) given \(z\) is
\[\sigma^2 = \frac{1}{2\theta_2}\]
and the conditional mean of \(y\) given \(z\) is
\[\frac{\theta_1 + \theta_3 z}{2 \theta_2} = \sigma^2 \theta_1 + \sigma^2 \theta_3 z.\]
Reparametrizing in terms of \(\sigma^2\), \(\beta_0 = \sigma^2 \theta_1\) and
\(\nu = \sigma^2 \theta_3\) we see how the conditional distribution of \(y_j\)
given \(z\) is \(\mathcal{N}(\beta_0 + \nu z, \sigma^2)\). From this it is clear
how the mixed model of \(y_j\) *conditionally on \(z\)* is a regression model. However,
we do not observe \(z\) in practice.
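As a quick numerical sketch of the reparametrization above (with illustrative parameter values \(\theta_1, \theta_2, \theta_3\) chosen here, not taken from the text), one can check that the conditional mean \((\theta_1 + \theta_3 z)/(2\theta_2)\) agrees with \(\beta_0 + \nu z\):

```python
# Illustrative canonical parameters (hypothetical values for the check)
theta1, theta2, theta3 = 1.0, 0.5, 2.0

# Reparametrization from Example 6.4
sigma2 = 1.0 / (2.0 * theta2)   # conditional variance of y given z
beta0 = sigma2 * theta1          # intercept
nu = sigma2 * theta3             # coefficient of the random effect z

# Conditional mean in canonical and reparametrized form
z = 0.7
mean_canonical = (theta1 + theta3 * z) / (2.0 * theta2)
mean_reparam = beta0 + nu * z
print(sigma2, beta0, nu)                  # 1.0 1.0 2.0
print(mean_canonical, mean_reparam)       # equal by construction
```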

Using the above distributional result we can see that the Gaussian random effects model of observations \(y_{ij}\) can equivalently be stated as

\[Y_{ij} = \beta_0 + \nu Z_i + \varepsilon_{ij}\]

for \(i = 1, \ldots, n\) and \(j = 1, \ldots, m_i\) where \(Z_1, \ldots, Z_n\) are i.i.d. \(\mathcal{N}(0, 1)\)-distributed and independent of \(\varepsilon_{11}, \varepsilon_{12}, \ldots, \varepsilon_{1m_1}, \ldots, \varepsilon_{nm_n}\) that are themselves i.i.d. \(\mathcal{N}(0, \sigma^2)\)-distributed.
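This two-level representation can be simulated directly. A minimal sketch (assuming numpy, with illustrative parameter values) also illustrates the implied covariance structure: observations sharing the same \(Z_i\) have covariance \(\nu^2\), while the marginal variance of each \(Y_{ij}\) is \(\nu^2 + \sigma^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000                           # number of groups i
m = rng.integers(2, 6, size=n)     # a different number m_i per group
beta0, nu, sigma = 1.0, 2.0, 0.5   # hypothetical parameter values

Z = rng.standard_normal(n)         # random effects, one per group
groups = []
for i in range(n):
    eps = rng.normal(0.0, sigma, size=m[i])   # i.i.d. N(0, sigma^2) errors
    groups.append(beta0 + nu * Z[i] + eps)    # Y_ij = beta0 + nu Z_i + eps_ij

# Marginal variance should be close to nu^2 + sigma^2 = 4.25
all_y = np.concatenate(groups)
marginal_var = all_y.var()

# Within-group covariance should be close to nu^2 = 4.0
pair_cov = np.mean([(y[0] - beta0) * (y[1] - beta0) for y in groups])
print(marginal_var, pair_cov)
```

The within-group dependence is exactly what distinguishes the mixed model from an ordinary regression with independent errors.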