# 6.8: Appendix - Deriving an OU model under stabilizing selection

We can first consider the evolution of the trait on the “stem” branch, before the divergence of species A and B. We model stabilizing selection with a single optimal trait value, located at θ. An example of such a surface is plotted as Figure 6.3. We can describe fitness of an individual with phenotype z as:

$W = e^{−γ(z − θ)^2} \label{6.11}$

We have introduced a new variable, $$γ$$, which captures the curvature of the selection surface. To simplify the calculations, we will assume that stabilizing selection is weak, so that γ is small.

We can use a Taylor expansion of this function to approximate Equation \ref{6.7} using a polynomial. Our assumption that $$γ$$ is small means that we can ignore terms of order higher than γ2:

$W = 1 − γ(z − θ)^2 \label{6.12}$

This makes good sense, since a quadratic equation is a good approximation of the shape of a normal distribution near its peak. The mean fitness in the population is then:

\begin{align} \bar{W} &= E[W] = E[1 - \gamma (z - \theta)^2] \\ &= E[1 - \gamma (z^2 - 2 z \theta + \theta^2)] \\ &= 1 - \gamma (E[z^2] - E[2 z \theta] + E[\theta^2]) \\ &= 1 - \gamma (\bar{z}^2 - V_z - 2 \bar{z} \theta + \theta ^2) \end{align} \label{6.13}

We can find the rate of change of fitness with respect to changes in the trait mean by taking the derivative of (6.9) with respect to $\bar{z}$:

$$\frac{\partial \bar{W}}{\partial \bar{z}} = -2 \gamma \bar{z} + 2 \gamma \theta = 2 \gamma (\theta - \bar{z}) \label{6.14}$$

We can now use Lande’s (1976) equation for the dynamics of the population mean through time for a trait under selection:

$$\Delta \bar{z} = \frac{G}{\bar{W}} \frac{\partial \bar{W}}{\partial \bar{z}} \label{6.15}$$

Substituting Equations \ref{6.9} and \ref{6.10} into Equation \ref{6.11}, we have:

$$\Delta \bar{z} = \frac{G}{\bar{W}} \frac{\partial \bar{W}}{\partial \bar{z}} = \frac{G}{1 - \gamma (\bar{z}^2 - V_z - 2 \bar{z} \theta + \theta ^2)} 2 \gamma (\theta - \bar{z}) \label{6.16}$$

Then, simplifying further with another Taylor expansion, we obtain:

$$\bar{z}' = \bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta \label{6.17}$$

Here, $$\bar{z}$$ is the species’ trait value in the previous generation and $$\bar{z}'$$ in the next, while G is the additive genetic variance in the population, γ the curvature of the selection surface, θ the optimal trait value, and δ a random component capturing the effect of genetic drift. We can find the expected mean of the trait over time by taking the expectation of this equation:

$$E[\bar{z}'] = \mu_z' = \mu_z + 2 G \gamma (\theta - \mu_z) \label{6.18}$$

We can then solve this differential equations given the starting condition $\mu_z(0) = \bar{z}(0)$. Doing so, we obtain:

$$\mu_z(t) = \theta + e^{-2 G t \gamma} (\bar{z}(0)-\theta) \label{6.19}$$

We can take a similar approach to calculate the expected variance of trait values across replicates. We use a standard expression for variance:

$$\begin{array}{l} V_z' = E[\bar{z}'^2] + E[\bar{z}']^2 \\ V_z' = E[(\bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta)^2] - E[\bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta]^2 \\ V_z' = G/n + (1 - 2 G \gamma)^2 V_z \\ \end{array} \label{6.20}$$

If we assume that stabilizing selection is weak, we can simplify the above expression using a Taylor series expansion:

$$V_z' = G/n + (1 - 4 G \gamma) V_z \label{6.23}$$

We can then solve this differential equation with starting point Vz(0)=0:

$$V_z(t) = \frac{e^{-4 G t \gamma}-1}{4 n \gamma} \label{6.24}$$

Equations \ref{6.15} and \ref{6.20} are equivalent to a standard stochastic model for constrained random walks called an Ornstein-Uhlenbeck process. Typical Ornstein-Uhlenbeck processes have four parameters: the starting value ($$\bar{z}(0)$$), the optimum ( θ), the drift parameter ( σ2), and a parameter describing the strength of constraints ( α). In our parameterization, $$\bar{z}(0)$$ and θ are as given, α = 2G, and σ2 = G/n.

We now need to know how OU models behave when considered along the branches of a phylogenetic tree. In the simplest case, we can describe the joint distribution of two species, A and B, descended from a common ancestor, z. Using equation 6.17, expressions for trait values of species A and B are:

$$\begin{array}{l} \bar{a}' = \bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta \\ \bar{b}' = \bar{b} + 2 G \gamma (\theta - \bar{b}) + \delta \\ \end{array} \label{6.25}$$

Expected values of these two equations give equations for the means, using equation 6.19:

$$\begin{array}{l} \mu_a' = \mu_a + 2 G \gamma (\theta - \mu_a) \\ \mu_b' = \mu_b + 2 G \gamma (\theta - \mu_b) \\ \end{array} \label{6.27}$$

We can solve this system of differential equations, given starting conditions $\mu_a(0)=\bar{a}_0$ and $\mu_b(0)=\bar{b}_0$:

$$\begin{array}{l} \mu_a'(t) = \theta + e^{-2 G t \gamma} (\bar{a}_0 - \theta) \\ \mu_b'(t) = \theta + e^{-2 G t \gamma} (\bar{b}_0 - \theta) \\ \end{array} \label{6.29}$$

However, we can also note that the starting value for both a and b is the same as the ending value for species z on the root branch of the tree. If we denote the length of that branch as t1 then:

$$E[\bar{a}_0] = E[\bar{b}_0] = E[\bar{z}(t_1)] = e^{-2 G t_1 \gamma} (\bar{z}_0 - \theta) \label{6.31}$$

Substituting this into equations (6.25-26):

$$\begin{array}{l} \mu_a'(t) = \theta + e^{-2 G \gamma (t_1 + t)} (\bar{z}_0 - \theta) \\ \mu_b'(t) = \theta + e^{-2 G \gamma (t_1 + t)} (\bar{z}_0 - \theta) \\ \end{array} \label{6.32}$$
Equations

We can calculate the expected variance across replicates of species A and B, as above:

$$\begin{array}{l} V_a' = E[\bar{a}'^2] + E[\bar{a}']^2 \\ V_a' = E[(\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta)^2] + E[\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta]^2 \\ V_a' = G / n + (1 - 2 G \gamma)^2 V_a \\ \end{array} \label{6.34}$$

Similarly,

$$\begin{array}{l} V_b' = E[\bar{b}'^2] + E[\bar{b}']^2 \\ V_b' = G / n + (1 - 2 G \gamma)^2 V_b \\ \end{array} \label{6.37}$$

Again we can assume that stabilizing selection is weak, and simplify these expressions using a Taylor series expansion:

$$\begin{array}{l} V_a' = G / n + (1 - 4 G \gamma) V_a \\ V_b' = G / n + (1 - 4 G \gamma) V_b \\ \end{array} \label{6.39}$$

We have a third term to consider, the covariance between species A and B due to their shared ancestry. We can use a standard expression for covariance to set up a third differential equation:

$$\begin{array}{l} V_{ab}' = E[\bar{a}' \bar{b}'] + E[\bar{a}'] E[\bar{b}'] \\ V_{ab}' = E[(\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta)(\bar{b} + 2 G \gamma (\theta - \bar{b}) + \delta)] + E[\bar{a} \\ + 2 G \gamma (\theta - \bar{a}) + \delta] E[\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta] \\ V_{ab}' = V_{ab} (1 - 2 G \gamma)^2 \end{array} \label{6.41}$$

We again use a Taylor series expansion to simplify:

$V_{ab}′= − 4V_{ab}Gγ \label{6.44}$

Note that under this model the covariance between A and B decreases through time following their divergence from a common ancestor.

We now have a system of three differential equations. Setting initial conditions Va(0)=Va0, Vb(0)=Vb0, and Vab(0)=Vab0, we solve to obtain:

$$\begin{array}{l} V_a(t) = \frac{1 - e^{-4 G \gamma t}}{4 n \gamma} + V_{a0} \\ V_b(t) = \frac{1 - e^{-4 G \gamma t}}{4 n \gamma} + V_{b0} \\ V_{ab}(t) = V_{ab0} e^{-4 G \gamma t} \\ \end{array} \label{6.45}$$

We can further specify the starting conditions by noting that both the variance of A and B and their covariance have an initial value given by the variance of z at time t1:

$$V_{a0} = V_{ab0} = V_{ab0} = V_z(t_1) = \frac{e^{-4 G \gamma t_1}-1}{4 n \gamma} \label{6.48}$$

Substituting 6.44 into 6.41-43, we obtain:

$$\begin{array}{l} V_a(t) = \frac{e^{-4 G \gamma (t_1 + t)} -1}{4 n \gamma} \\ V_b(t) = \frac{e^{-4 G \gamma (t_1 + t)} -1}{4 n \gamma} \\ V_{ab}(t) = \frac{e^{-4 G \gamma t} -e^{-4 G \gamma (t_1 + t)}}{4 n \gamma} \\ \end{array} \label{6.49}$$

Under this model, the trait values follow a multivariate normal distribution; one can calculate that all of the other moments of this distribution are zero. Thus, the set of means, variances, and covariances completely describes the distribution of A and B. Also, as γ goes to zero, the selection surface becomes flatter and flatter. Thus at the limit as γ approaches 0, these equations are equal to those for Brownian motion (see Chapter 4).

This quantitative genetic formulation – which follows Lande (1976) – is different from the typical parameterization of the OU model for comparative methods. We can obtain the “normal” OU equations by substituting α = 2Gγ and σ2 = G/n:

$$\begin{array}{l} V_a(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (t_1 + t)} - 1) \\ V_b(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (t_1 + t)} - 1) \\ V_ab(t) = \frac{\sigma^2}{2 \alpha} e^{-2 \alpha t} (1-e^{-2 \alpha t_1}) \\ \end{array} \label{6.52}$$

These equations are mathematically equivalent to the equations in Butler et al. (2004) applied to a phylogenetic tree with two species.

We can easily generalize this approach to a full phylogenetic tree with $$n$$ taxa. In that case, the $$n$$ species trait values will all be drawn from a multivariate normal distribution. The mean trait value for species i is then:

$$\mu_i(t) = \theta + e^{-2 G \gamma T_i}(\bar{z}_0 - \theta) \label{6.55}$$

Here Ti represents the total branch length separating that species from the root of the tree. The variance of species i is:

$$V_i(t) = \frac{e^{-4 G \gamma T_i} - 1}{4 n \gamma} \label{6.56}$$

Finally, the covariance between species i and j is:

$$V_{ij}(t) = \frac{e^{-4 G \gamma (T_i - s_{ij})}-e^{-4 G \gamma T_i}}{4 n \gamma} \label{6.57}$$

Note that the above equation is only true when Ti = Tj – which is only true for all i and j if the tree is ultrametric. We can substitute the normal OU parameters, α and σ2, into these equations:

$$\begin{array}{l} \mu_i(t) = \theta + e^{- \alpha T_i}(\bar{z}_0 - \theta) \\ V_i(t) = \frac{\sigma^2}{2 \alpha} e^{-2 \alpha T_i} - 1 \\ V_{ij}(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (T_i - s_{ij})}-e^{-2 \alpha T_i}) \\ \end{array} \label{6.58}$$