# 6.8: Appendix - Deriving an OU model under stabilizing selection

- Page ID
- 21767

We can first consider the evolution of the trait on the “stem” branch, before the divergence of species A and B. We model stabilizing selection with a single optimal trait value, located at *θ*. An example of such a surface is plotted as Figure 6.3. We can describe fitness of an individual with phenotype *z* as:

\[W = e^{−γ(z − θ)^2} \label{6.11}\]

We have introduced a new variable, \(γ\), which captures the curvature of the selection surface. To simplify the calculations, we will assume that stabilizing selection is weak, so that *γ* is small.

We can use a Taylor expansion of this function to approximate Equation \ref{6.7} using a polynomial. Our assumption that \(γ\) is small means that we can ignore terms of order higher than *γ*^{2}:

\[ W = 1 − γ(z − θ)^2 \label{6.12}\]

This makes good sense, since a quadratic equation is a good approximation of the shape of a normal distribution near its peak. The mean fitness in the population is then:

$$ \begin{align} \bar{W} &= E[W] = E[1 - \gamma (z - \theta)^2] \\ &= E[1 - \gamma (z^2 - 2 z \theta + \theta^2)] \\ &= 1 - \gamma (E[z^2] - E[2 z \theta] + E[\theta^2]) \\ &= 1 - \gamma (\bar{z}^2 - V_z - 2 \bar{z} \theta + \theta ^2) \end{align} \label{6.13} $$

We can find the rate of change of fitness with respect to changes in the trait mean by taking the derivative of (6.9) with respect to $\bar{z}$:

$$ \frac{\partial \bar{W}}{\partial \bar{z}} = -2 \gamma \bar{z} + 2 \gamma \theta = 2 \gamma (\theta - \bar{z}) \label{6.14} $$

We can now use Lande’s (1976) equation for the dynamics of the population mean through time for a trait under selection:

$$ \Delta \bar{z} = \frac{G}{\bar{W}} \frac{\partial \bar{W}}{\partial \bar{z}} \label{6.15} $$

Substituting Equations \ref{6.9} and \ref{6.10} into Equation \ref{6.11}, we have:

$$ \Delta \bar{z} = \frac{G}{\bar{W}} \frac{\partial \bar{W}}{\partial \bar{z}} = \frac{G}{1 - \gamma (\bar{z}^2 - V_z - 2 \bar{z} \theta + \theta ^2)} 2 \gamma (\theta - \bar{z}) \label{6.16} $$

Then, simplifying further with another Taylor expansion, we obtain:

$$ \bar{z}' = \bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta \label{6.17} $$

Here, \(\bar{z}\) is the species’ trait value in the previous generation and \(\bar{z}'\) in the next, while G is the additive genetic variance in the population, *γ* the curvature of the selection surface, *θ* the optimal trait value, and *δ* a random component capturing the effect of genetic drift. We can find the expected mean of the trait over time by taking the expectation of this equation:

$$ E[\bar{z}'] = \mu_z' = \mu_z + 2 G \gamma (\theta - \mu_z) \label{6.18} $$

We can then solve this differential equations given the starting condition $\mu_z(0) = \bar{z}(0)$. Doing so, we obtain:

$$ \mu_z(t) = \theta + e^{-2 G t \gamma} (\bar{z}(0)-\theta) \label{6.19} $$

We can take a similar approach to calculate the expected variance of trait values across replicates. We use a standard expression for variance:

$$ \begin{array}{l} V_z' = E[\bar{z}'^2] + E[\bar{z}']^2 \\ V_z' = E[(\bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta)^2] - E[\bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta]^2 \\ V_z' = G/n + (1 - 2 G \gamma)^2 V_z \\ \end{array} \label{6.20} $$

If we assume that stabilizing selection is weak, we can simplify the above expression using a Taylor series expansion:

$$ V_z' = G/n + (1 - 4 G \gamma) V_z \label{6.23}$$

We can then solve this differential equation with starting point *V*_{z}(0)=0:

$$ V_z(t) = \frac{e^{-4 G t \gamma}-1}{4 n \gamma} \label{6.24} $$

Equations \ref{6.15} and \ref{6.20} are equivalent to a standard stochastic model for constrained random walks called an **Ornstein-Uhlenbeck process**. Typical Ornstein-Uhlenbeck processes have four parameters: the starting value (\(\bar{z}(0)\)), the optimum ( *θ*), the drift parameter ( *σ*^{2}), and a parameter describing the strength of constraints ( *α*). In our parameterization, \(\bar{z}(0)\) and *θ* are as given, *α* = 2*G*, and *σ*^{2} = *G*/*n*.

We now need to know how OU models behave when considered along the branches of a phylogenetic tree. In the simplest case, we can describe the joint distribution of two species, A and B, descended from a common ancestor, z. Using equation 6.17, expressions for trait values of species A and B are:

$$ \begin{array}{l} \bar{a}' = \bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta \\ \bar{b}' = \bar{b} + 2 G \gamma (\theta - \bar{b}) + \delta \\ \end{array} \label{6.25} $$

Expected values of these two equations give equations for the means, using equation 6.19:

$$ \begin{array}{l} \mu_a' = \mu_a + 2 G \gamma (\theta - \mu_a) \\ \mu_b' = \mu_b + 2 G \gamma (\theta - \mu_b) \\ \end{array} \label{6.27}$$

We can solve this system of differential equations, given starting conditions $\mu_a(0)=\bar{a}_0$ and $\mu_b(0)=\bar{b}_0$:

$$ \begin{array}{l} \mu_a'(t) = \theta + e^{-2 G t \gamma} (\bar{a}_0 - \theta) \\ \mu_b'(t) = \theta + e^{-2 G t \gamma} (\bar{b}_0 - \theta) \\ \end{array} \label{6.29} $$

However, we can also note that the starting value for both *a* and *b* is the same as the ending value for species *z* on the root branch of the tree. If we denote the length of that branch as *t*_{1} then:

$$ E[\bar{a}_0] = E[\bar{b}_0] = E[\bar{z}(t_1)] = e^{-2 G t_1 \gamma} (\bar{z}_0 - \theta) \label{6.31} $$

Substituting this into equations (6.25-26):

$$ \begin{array}{l} \mu_a'(t) = \theta + e^{-2 G \gamma (t_1 + t)} (\bar{z}_0 - \theta) \\ \mu_b'(t) = \theta + e^{-2 G \gamma (t_1 + t)} (\bar{z}_0 - \theta) \\ \end{array} \label{6.32}$$

Equations

We can calculate the expected variance across replicates of species A and B, as above:

$$ \begin{array}{l} V_a' = E[\bar{a}'^2] + E[\bar{a}']^2 \\ V_a' = E[(\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta)^2] + E[\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta]^2 \\ V_a' = G / n + (1 - 2 G \gamma)^2 V_a \\ \end{array} \label{6.34}$$

Similarly,

$$ \begin{array}{l} V_b' = E[\bar{b}'^2] + E[\bar{b}']^2 \\ V_b' = G / n + (1 - 2 G \gamma)^2 V_b \\ \end{array} \label{6.37} $$

Again we can assume that stabilizing selection is weak, and simplify these expressions using a Taylor series expansion:

$$ \begin{array}{l} V_a' = G / n + (1 - 4 G \gamma) V_a \\ V_b' = G / n + (1 - 4 G \gamma) V_b \\ \end{array} \label{6.39} $$

We have a third term to consider, the covariance between species A and B due to their shared ancestry. We can use a standard expression for covariance to set up a third differential equation:

$$ \begin{array}{l} V_{ab}' = E[\bar{a}' \bar{b}'] + E[\bar{a}'] E[\bar{b}'] \\ V_{ab}' = E[(\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta)(\bar{b} + 2 G \gamma (\theta - \bar{b}) + \delta)] + E[\bar{a} \\ + 2 G \gamma (\theta - \bar{a}) + \delta] E[\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta] \\ V_{ab}' = V_{ab} (1 - 2 G \gamma)^2 \end{array} \label{6.41} $$

We again use a Taylor series expansion to simplify:

\[V_{ab}′= − 4V_{ab}Gγ \label{6.44} \]

Note that under this model the covariance between A and B decreases through time following their divergence from a common ancestor.

We now have a system of three differential equations. Setting initial conditions *V*_{a}(0)=*V*_{a0}, *V*_{b}(0)=*V*_{b0}, and *V*_{ab}(0)=*V*_{ab0}, we solve to obtain:

$$ \begin{array}{l} V_a(t) = \frac{1 - e^{-4 G \gamma t}}{4 n \gamma} + V_{a0} \\ V_b(t) = \frac{1 - e^{-4 G \gamma t}}{4 n \gamma} + V_{b0} \\ V_{ab}(t) = V_{ab0} e^{-4 G \gamma t} \\ \end{array} \label{6.45} $$

We can further specify the starting conditions by noting that both the variance of A and B and their covariance have an initial value given by the variance of *z* at time *t*_{1}:

$$ V_{a0} = V_{ab0} = V_{ab0} = V_z(t_1) = \frac{e^{-4 G \gamma t_1}-1}{4 n \gamma} \label{6.48} $$

Substituting 6.44 into 6.41-43, we obtain:

$$ \begin{array}{l} V_a(t) = \frac{e^{-4 G \gamma (t_1 + t)} -1}{4 n \gamma} \\ V_b(t) = \frac{e^{-4 G \gamma (t_1 + t)} -1}{4 n \gamma} \\ V_{ab}(t) = \frac{e^{-4 G \gamma t} -e^{-4 G \gamma (t_1 + t)}}{4 n \gamma} \\ \end{array} \label{6.49} $$

Under this model, the trait values follow a multivariate normal distribution; one can calculate that all of the other moments of this distribution are zero. Thus, the set of means, variances, and covariances completely describes the distribution of A and B. Also, as *γ* goes to zero, the selection surface becomes flatter and flatter. Thus at the limit as *γ* approaches 0, these equations are equal to those for Brownian motion (see Chapter 4).

This quantitative genetic formulation – which follows Lande (1976) – is different from the typical parameterization of the OU model for comparative methods. We can obtain the “normal” OU equations by substituting *α* = 2*G**γ* and *σ*^{2} = *G*/*n*:

$$ \begin{array}{l} V_a(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (t_1 + t)} - 1) \\ V_b(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (t_1 + t)} - 1) \\ V_ab(t) = \frac{\sigma^2}{2 \alpha} e^{-2 \alpha t} (1-e^{-2 \alpha t_1}) \\ \end{array} \label{6.52} $$

These equations are mathematically equivalent to the equations in Butler et al. (2004) applied to a phylogenetic tree with two species.

We can easily generalize this approach to a full phylogenetic tree with \(n\) taxa. In that case, the \(n\) species trait values will all be drawn from a multivariate normal distribution. The mean trait value for species *i* is then:

$$ \mu_i(t) = \theta + e^{-2 G \gamma T_i}(\bar{z}_0 - \theta) \label{6.55} $$

Here *T*_{i} represents the total branch length separating that species from the root of the tree. The variance of species *i* is:

$$ V_i(t) = \frac{e^{-4 G \gamma T_i} - 1}{4 n \gamma} \label{6.56} $$

Finally, the covariance between species *i* and *j* is:

$$ V_{ij}(t) = \frac{e^{-4 G \gamma (T_i - s_{ij})}-e^{-4 G \gamma T_i}}{4 n \gamma} \label{6.57} $$

Note that the above equation is only true when *T*_{i} = *T*_{j} – which is only true for all *i* and *j* if the tree is ultrametric. We can substitute the normal OU parameters, *α* and *σ*^{2}, into these equations:

$$ \begin{array}{l} \mu_i(t) = \theta + e^{- \alpha T_i}(\bar{z}_0 - \theta) \\ V_i(t) = \frac{\sigma^2}{2 \alpha} e^{-2 \alpha T_i} - 1 \\ V_{ij}(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (T_i - s_{ij})}-e^{-2 \alpha T_i}) \\ \end{array} \label{6.58} $$