6.7: Appendix - Deriving an OU model under stabilizing selection

Last updated
Save as PDF

Page ID: 21767

$ \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } $ $ \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} $$\newcommand{\id}{\mathrm{id}}$ $ \newcommand{\Span}{\mathrm{span}}$ $ \newcommand{\kernel}{\mathrm{null}\,}$ $ \newcommand{\range}{\mathrm{range}\,}$ $ \newcommand{\RealPart}{\mathrm{Re}}$ $ \newcommand{\ImaginaryPart}{\mathrm{Im}}$ $ \newcommand{\Argument}{\mathrm{Arg}}$ $ \newcommand{\norm}[1]{\| #1 \|}$ $ \newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $ \newcommand{\Span}{\mathrm{span}}$ $\newcommand{\id}{\mathrm{id}}$ $ \newcommand{\Span}{\mathrm{span}}$ $ \newcommand{\kernel}{\mathrm{null}\,}$ $ \newcommand{\range}{\mathrm{range}\,}$ $ \newcommand{\RealPart}{\mathrm{Re}}$ $ \newcommand{\ImaginaryPart}{\mathrm{Im}}$ $ \newcommand{\Argument}{\mathrm{Arg}}$ $ \newcommand{\norm}[1]{\| #1 \|}$ $ \newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $ \newcommand{\Span}{\mathrm{span}}$$\newcommand{\AA}{\unicode[.8,0]{x212B}}$

We can first consider the evolution of the trait on the “stem” branch, before the divergence of species A and B. We model stabilizing selection with a single optimal trait value, located at θ. An example of such a surface is plotted as Figure 6.3. We can describe fitness of an individual with phenotype z as:

\[W = e^{−γ(z − θ)^2} \label{6.11}\]

We have introduced a new variable, $γ$, which captures the curvature of the selection surface. To simplify the calculations, we will assume that stabilizing selection is weak, so that γ is small.

We can use a Taylor expansion of this function to approximate Equation \ref{6.7} using a polynomial. Our assumption that $γ$ is small means that we can ignore terms of order higher than γ²:

\[ W = 1 − γ(z − θ)^2 \label{6.12}\]

This makes good sense, since a quadratic equation is a good approximation of the shape of a normal distribution near its peak. The mean fitness in the population is then:

\[ \begin{align} \bar{W} &= E[W] = E[1 - \gamma (z - \theta)^2] \\ &= E[1 - \gamma (z^2 - 2 z \theta + \theta^2)] \\ &= 1 - \gamma (E[z^2] - E[2 z \theta] + E[\theta^2]) \\ &= 1 - \gamma (\bar{z}^2 - V_z - 2 \bar{z} \theta + \theta ^2) \end{align} \label{6.13} \]

We can find the rate of change of fitness with respect to changes in the trait mean by taking the derivative of (6.9) with respect to $\bar{z}$:

\[ \frac{\partial \bar{W}}{\partial \bar{z}} = -2 \gamma \bar{z} + 2 \gamma \theta = 2 \gamma (\theta - \bar{z}) \label{6.14} \]

We can now use Lande’s (1976) equation for the dynamics of the population mean through time for a trait under selection:

\[ \Delta \bar{z} = \frac{G}{\bar{W}} \frac{\partial \bar{W}}{\partial \bar{z}} \label{6.15} \]

Substituting Equations \ref{6.9} and \ref{6.10} into Equation \ref{6.11}, we have:

\[ \Delta \bar{z} = \frac{G}{\bar{W}} \frac{\partial \bar{W}}{\partial \bar{z}} = \frac{G}{1 - \gamma (\bar{z}^2 - V_z - 2 \bar{z} \theta + \theta ^2)} 2 \gamma (\theta - \bar{z}) \label{6.16} \]

Then, simplifying further with another Taylor expansion, we obtain:

\[ \bar{z}' = \bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta \label{6.17} \]

Here, $\bar{z}$ is the species’ trait value in the previous generation and $\bar{z}'$ in the next, while G is the additive genetic variance in the population, γ the curvature of the selection surface, θ the optimal trait value, and δ a random component capturing the effect of genetic drift. We can find the expected mean of the trait over time by taking the expectation of this equation:

\[ E[\bar{z}'] = \mu_z' = \mu_z + 2 G \gamma (\theta - \mu_z) \label{6.18} \]

We can then solve this differential equations given the starting condition $\mu_z(0) = \bar{z}(0)$. Doing so, we obtain:

\[ \mu_z(t) = \theta + e^{-2 G t \gamma} (\bar{z}(0)-\theta) \label{6.19} \]

We can take a similar approach to calculate the expected variance of trait values across replicates. We use a standard expression for variance:

\[ \begin{array}{l} V_z' = E[\bar{z}'^2] + E[\bar{z}']^2 \\ V_z' = E[(\bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta)^2] - E[\bar{z} + 2 G \gamma (\theta - \bar{z}) + \delta]^2 \\ V_z' = G/n + (1 - 2 G \gamma)^2 V_z \\ \end{array} \label{6.20} \]

If we assume that stabilizing selection is weak, we can simplify the above expression using a Taylor series expansion:

\[ V_z' = G/n + (1 - 4 G \gamma) V_z \label{6.23}\]

We can then solve this differential equation with starting point V_z(0)=0:

\[ V_z(t) = \frac{e^{-4 G t \gamma}-1}{4 n \gamma} \label{6.24} \]

Equations \ref{6.15} and \ref{6.20} are equivalent to a standard stochastic model for constrained random walks called an Ornstein-Uhlenbeck process. Typical Ornstein-Uhlenbeck processes have four parameters: the starting value ($\bar{z}(0)$), the optimum ( θ), the drift parameter ( σ²), and a parameter describing the strength of constraints ( α). In our parameterization, $\bar{z}(0)$ and θ are as given, α = 2G, and σ² = G/n.

We now need to know how OU models behave when considered along the branches of a phylogenetic tree. In the simplest case, we can describe the joint distribution of two species, A and B, descended from a common ancestor, z. Using equation 6.17, expressions for trait values of species A and B are:

\[ \begin{array}{l} \bar{a}' = \bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta \\ \bar{b}' = \bar{b} + 2 G \gamma (\theta - \bar{b}) + \delta \\ \end{array} \label{6.25} \]

Expected values of these two equations give equations for the means, using equation 6.19:

\[ \begin{array}{l} \mu_a' = \mu_a + 2 G \gamma (\theta - \mu_a) \\ \mu_b' = \mu_b + 2 G \gamma (\theta - \mu_b) \\ \end{array} \label{6.27}\]

We can solve this system of differential equations, given starting conditions $\mu_a(0)=\bar{a}_0$ and $\mu_b(0)=\bar{b}_0$:

\[ \begin{array}{l} \mu_a'(t) = \theta + e^{-2 G t \gamma} (\bar{a}_0 - \theta) \\ \mu_b'(t) = \theta + e^{-2 G t \gamma} (\bar{b}_0 - \theta) \\ \end{array} \label{6.29} \]

However, we can also note that the starting value for both a and b is the same as the ending value for species z on the root branch of the tree. If we denote the length of that branch as t₁ then:

\[ E[\bar{a}_0] = E[\bar{b}_0] = E[\bar{z}(t_1)] = e^{-2 G t_1 \gamma} (\bar{z}_0 - \theta) \label{6.31} \]

Substituting this into equations (6.25-26):

\[ \begin{array}{l} \mu_a'(t) = \theta + e^{-2 G \gamma (t_1 + t)} (\bar{z}_0 - \theta) \\ \mu_b'(t) = \theta + e^{-2 G \gamma (t_1 + t)} (\bar{z}_0 - \theta) \\ \end{array} \label{6.32}\[
Equations

We can calculate the expected variance across replicates of species A and B, as above:

\[ \begin{array}{l} V_a' = E[\bar{a}'^2] + E[\bar{a}']^2 \\ V_a' = E[(\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta)^2] + E[\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta]^2 \\ V_a' = G / n + (1 - 2 G \gamma)^2 V_a \\ \end{array} \label{6.34}\]

Similarly,

\[ \begin{array}{l} V_b' = E[\bar{b}'^2] + E[\bar{b}']^2 \\ V_b' = G / n + (1 - 2 G \gamma)^2 V_b \\ \end{array} \label{6.37} \]

Again we can assume that stabilizing selection is weak, and simplify these expressions using a Taylor series expansion:

\[ \begin{array}{l} V_a' = G / n + (1 - 4 G \gamma) V_a \\ V_b' = G / n + (1 - 4 G \gamma) V_b \\ \end{array} \label{6.39} \]

We have a third term to consider, the covariance between species A and B due to their shared ancestry. We can use a standard expression for covariance to set up a third differential equation:

\[ \begin{array}{l} V_{ab}' = E[\bar{a}' \bar{b}'] + E[\bar{a}'] E[\bar{b}'] \\ V_{ab}' = E[(\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta)(\bar{b} + 2 G \gamma (\theta - \bar{b}) + \delta)] + E[\bar{a} \\ + 2 G \gamma (\theta - \bar{a}) + \delta] E[\bar{a} + 2 G \gamma (\theta - \bar{a}) + \delta] \\ V_{ab}' = V_{ab} (1 - 2 G \gamma)^2 \end{array} \label{6.41} \]

We again use a Taylor series expansion to simplify:

\[V_{ab}′= − 4V_{ab}Gγ \label{6.44} \]

Note that under this model the covariance between A and B decreases through time following their divergence from a common ancestor.

We now have a system of three differential equations. Setting initial conditions V_a(0)=V_a0, V_b(0)=V_b0, and V_ab(0)=V_ab0, we solve to obtain:

\[ \begin{array}{l} V_a(t) = \frac{1 - e^{-4 G \gamma t}}{4 n \gamma} + V_{a0} \\ V_b(t) = \frac{1 - e^{-4 G \gamma t}}{4 n \gamma} + V_{b0} \\ V_{ab}(t) = V_{ab0} e^{-4 G \gamma t} \\ \end{array} \label{6.45} \]

We can further specify the starting conditions by noting that both the variance of A and B and their covariance have an initial value given by the variance of z at time t₁:

\[ V_{a0} = V_{ab0} = V_{ab0} = V_z(t_1) = \frac{e^{-4 G \gamma t_1}-1}{4 n \gamma} \label{6.48} \]

Substituting 6.44 into 6.41-43, we obtain:

\[ \begin{array}{l} V_a(t) = \frac{e^{-4 G \gamma (t_1 + t)} -1}{4 n \gamma} \\ V_b(t) = \frac{e^{-4 G \gamma (t_1 + t)} -1}{4 n \gamma} \\ V_{ab}(t) = \frac{e^{-4 G \gamma t} -e^{-4 G \gamma (t_1 + t)}}{4 n \gamma} \\ \end{array} \label{6.49} \]

Under this model, the trait values follow a multivariate normal distribution; one can calculate that all of the other moments of this distribution are zero. Thus, the set of means, variances, and covariances completely describes the distribution of A and B. Also, as γ goes to zero, the selection surface becomes flatter and flatter. Thus at the limit as γ approaches 0, these equations are equal to those for Brownian motion (see Chapter 4).

This quantitative genetic formulation – which follows Lande (1976) – is different from the typical parameterization of the OU model for comparative methods. We can obtain the “normal” OU equations by substituting α = 2Gγ and σ² = G/n:

\[ \begin{array}{l} V_a(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (t_1 + t)} - 1) \\ V_b(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (t_1 + t)} - 1) \\ V_ab(t) = \frac{\sigma^2}{2 \alpha} e^{-2 \alpha t} (1-e^{-2 \alpha t_1}) \\ \end{array} \label{6.52} \]

These equations are mathematically equivalent to the equations in Butler et al. (2004) applied to a phylogenetic tree with two species.

We can easily generalize this approach to a full phylogenetic tree with $n$ taxa. In that case, the $n$ species trait values will all be drawn from a multivariate normal distribution. The mean trait value for species i is then:

\[ \mu_i(t) = \theta + e^{-2 G \gamma T_i}(\bar{z}_0 - \theta) \label{6.55} \]

Here T_i represents the total branch length separating that species from the root of the tree. The variance of species i is:

\[ V_i(t) = \frac{e^{-4 G \gamma T_i} - 1}{4 n \gamma} \label{6.56} \]

Finally, the covariance between species i and j is:

\[ V_{ij}(t) = \frac{e^{-4 G \gamma (T_i - s_{ij})}-e^{-4 G \gamma T_i}}{4 n \gamma} \label{6.57} \]

Note that the above equation is only true when T_i = T_j – which is only true for all i and j if the tree is ultrametric. We can substitute the normal OU parameters, α and σ², into these equations:

\[ \begin{array}{l} \mu_i(t) = \theta + e^{- \alpha T_i}(\bar{z}_0 - \theta) \\ V_i(t) = \frac{\sigma^2}{2 \alpha} e^{-2 \alpha T_i} - 1 \\ V_{ij}(t) = \frac{\sigma^2}{2 \alpha} (e^{-2 \alpha (T_i - s_{ij})}-e^{-2 \alpha T_i}) \\ \end{array} \label{6.58} \]