8.5: Exploring Mk - the "total garbage" test
One problem that arises sometimes in maximum likelihood optimization happens when instead of a peak, the likelihood surface has a long flat “ridge” of equally likely parameter values. In the case of the Mk model, it is common to find that all values of q greater than a certain value have the same likelihood. This is because above a certain rate, evolution has been so rapid that all traces of the history of evolution of that character have been obliterated. After this point, character states of each lineage are random, and have no relationship to the shape of the phylogenetic tree. Our optimization techniques will not work in this case because there is no value of q that has a higher likelihood than other values. Once we get onto the ridge, all values of q have the same likelihood.
For Mk models, there is a simple test that allows us to recognize when the likelihood surface has a long ridge and q values cannot be estimated. I like to call this test the “total garbage” test because it can tell you if your data are “garbage” with respect to historical inference; that is, your data have no information about historical patterns of trait change. In that case, one can predict states just as well by assigning each species a state at random.
To carry out the total garbage test, imagine that you are just drawing trait values at random. That is, each species has some probability \(p\) of having character state 0, and some probability \(1 - p\) of having state 1 (one can also generalize this test to multi-state models). This likelihood is easy to write down. For a tree of size \(n\), the probability of drawing \(n_0\) species with state 0 is:
\[L_{garbage} = p^{n_0}(1 − p)^{n − n_0} \label{8.2}\]
This equation gives the likelihood of the “total garbage” model for any value of \(p\). It is related to a binomial distribution (lacking only the factorial term). We also know from probability theory that the ML estimate of \(p\) is \(\hat{p} = n_0 / n\); substituting this value into the formula above gives the maximum likelihood of the garbage model.
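The garbage-model likelihood and its ML estimate can be sketched in a few lines of Python. This is a minimal illustration, not code from the text: the function names `garbage_log_likelihood` and `garbage_mle` are hypothetical, and the toy counts (7 of 10 species in state 0) are made up purely to show that the closed-form estimate agrees with a brute-force search over \(p\).

```python
import math


def garbage_log_likelihood(n0: int, n: int, p: float) -> float:
    """Log-likelihood of the garbage model: n0 species in state 0 and
    n - n0 in state 1, each species drawn independently with P(state 0) = p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return n0 * math.log(p) + (n - n0) * math.log(1.0 - p)


def garbage_mle(n0: int, n: int) -> float:
    """Closed-form ML estimate of p for the garbage model."""
    return n0 / n


# Toy counts, for illustration only (not data from the text).
n0, n = 7, 10
p_hat = garbage_mle(n0, n)

# Brute-force check: evaluate the log-likelihood on a grid of p values
# and confirm that the best grid point coincides with n0 / n.
grid = [i / 100 for i in range(1, 100)]
best_on_grid = max(grid, key=lambda p: garbage_log_likelihood(n0, n, p))

print("closed-form estimate:", p_hat)
print("best value on grid:  ", best_on_grid)
```

Working on the log scale avoids numerical underflow for large trees. Note that this sketch does not handle the edge cases where all species share one state (so that the estimate is exactly 0 or 1).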
Now consider the likelihood surface of the Mk model. When Mk likelihood surfaces have long ridges, they are nearly always for high values of q, and when the rate of character change is high, this model converges to our “drawing from a hat” (or “garbage”) model. The height of the likelihood ridge is exactly the value given by the garbage-model likelihood above.
Thus, one can compare the likelihood of our Mk model to the total garbage model. If the maximum likelihood value of q has the same likelihood as our garbage model, then we know that we are on a ridge of the likelihood surface and q cannot be estimated. We also have no ability to make any statements about the past evolution of our character; in particular, we cannot estimate ancestral character states with any precision. By contrast, if the likelihood of the Mk model is greater than that of the total garbage model, then our data contain some historical information. We can also make this comparison using AIC, considering the total garbage model as having a single parameter p.
For the squamates, we have \(n = 258\) and \(n_0 = 207\). We calculate \(p = n_0 / n = 207/258 = 0.8023256\). So the likelihood of our garbage model is
\[L_{garbage} = p^{n_0}(1 - p)^{n - n_0} = 0.8023256^{207}(1 - 0.8023256)^{51} = 1.968142 \times 10^{-56}.\]
This calculation is both easier and more useful, though, on a natural-log scale:
\[\ln L_{garbage} = n_0 \cdot \ln(p) + (n - n_0) \cdot \ln(1 - p) = 207 \cdot \ln(0.8023256) + 51 \cdot \ln(1 - 0.8023256) = -128.2677.\]
Compare this to the log-likelihood of our Mk model, \(\ln L = -80.487176\), and you will see that the garbage model is a terrible fit to these data. There is, in fact, some historical information about species' traits in our data.
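The squamate comparison can be reproduced, and extended to AIC, with a short Python sketch. The counts and the Mk log-likelihood are taken from the text; the helper names `garbage_max_log_likelihood` and `aic` are hypothetical. The sketch uses the standard definition AIC = 2k − 2 ln L and assumes that the Mk model fitted here has a single rate parameter q, so that both models have k = 1.

```python
import math


def garbage_max_log_likelihood(n0: int, n: int) -> float:
    """Maximized log-likelihood of the garbage model, evaluated at p = n0 / n."""
    p = n0 / n
    return n0 * math.log(p) + (n - n0) * math.log(1.0 - p)


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


# Values from the text: 258 squamate species, 207 of them in state 0,
# and the maximized log-likelihood of the Mk model.
n, n0 = 258, 207
lnL_mk = -80.487176

lnL_garbage = garbage_max_log_likelihood(n0, n)

# Assumption: both models have one free parameter (p for garbage, q for Mk).
aic_garbage = aic(lnL_garbage, k=1)
aic_mk = aic(lnL_mk, k=1)

print(f"lnL garbage = {lnL_garbage:.4f}")
print(f"lnL Mk      = {lnL_mk:.4f}")
print(f"AIC garbage = {aic_garbage:.3f}")
print(f"AIC Mk      = {aic_mk:.3f}")
print(f"delta AIC (garbage - Mk) = {aic_garbage - aic_mk:.3f}")
```

Because both models are assumed to have the same number of parameters, the AIC difference is simply twice the difference in log-likelihoods, and the Mk model is strongly preferred.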