7.2: Modeling the evolution of discrete states
So far, we have dealt only with continuously varying characters. However, many characters of interest to biologists are best defined as having a fixed number of discrete states. For limblessness in squamates, each species is either legless (state 0) or not (state 1). Some species might actually be considered “intermediate” (Brandley et al. 2008), but we will ignore those for now. We might have particular questions about the evolution of limblessness in squamates. For example, how many times has this character changed over the evolutionary history of squamates? How often does limblessness evolve? Do limbs ever re-evolve? Is the evolution of limblessness related to some other aspect of the lives of these reptiles?
We will consider discrete characters where each species might exhibit one of k states (in the limblessness example above, k = 2). For characters with more than two states, there is a key distinction between ordered and unordered characters. Ordered characters can be placed in a sequence such that transitions occur only between adjacent states. For example, I might include “intermediate” species that fall somewhere between limbed and limbless, such as the “mermaid skinks” (Sirenoscincus) from Madagascar, so called because they lack hind limbs (Figure 7.2; Moch and Senter 2011). An ordered model might allow transitions only between limbless and intermediate, and between intermediate and limbed; under such a model, it would be impossible to go directly from limbed to limbless without first becoming intermediate. For unordered characters, any state can change into any other state. In this chapter, I will focus mainly on unordered characters; we will return to ordered characters later in the book.
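To make the ordered/unordered distinction concrete, here is a minimal Python sketch that records which direct transitions each kind of model permits. The function name `allowed_transitions` and the three-state coding (0 = limbless, 1 = intermediate, 2 = limbed) are assumptions made for illustration; they do not come from any particular software package.

```python
def allowed_transitions(k, ordered):
    """Return a k x k table of booleans.

    Entry [i][j] is True when a direct change from state i to state j
    is permitted. Diagonal entries are False because staying in the
    same state is not a transition.
    """
    table = []
    for i in range(k):
        row = []
        for j in range(k):
            if i == j:
                row.append(False)
            elif ordered:
                # Ordered: only adjacent states can change into each other.
                row.append(abs(i - j) == 1)
            else:
                # Unordered: any state can change into any other state.
                row.append(True)
        table.append(row)
    return table


# Hypothetical three-state coding of the limb character.
states = ["limbless", "intermediate", "limbed"]
ordered_model = allowed_transitions(len(states), ordered=True)
unordered_model = allowed_transitions(len(states), ordered=False)

for name, model in [("ordered", ordered_model), ("unordered", unordered_model)]:
    print(name)
    for state, row in zip(states, model):
        print(f"  {state:>12}: {row}")
```

In the ordered table, the entries linking limbless and limbed are False, so a lineage must pass through the intermediate state; in the unordered table, every off-diagonal entry is True.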
Most work on the evolution of discrete characters on phylogenetic trees has focused on the evolution of gene or protein sequences. Gene sequences are made up of four character states (A, C, T, and G for DNA). Models of sequence evolution allow transitions among all of these states at specified rates, and may allow transition rates to vary across sites, among clades, or through time. A huge number of named models have been applied to this problem (e.g. Jukes-Cantor, JC; General Time-Reversible, GTR; and many more; Yang 2006), and a battery of statistical approaches is available to fit these models to data (e.g. Posada 2008).
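As a small illustration of what such a model looks like, the sketch below builds the rate matrix for the simplest case named above, Jukes-Cantor, in which every change between two different nucleotides happens at the same rate. The function name `jukes_cantor_rate_matrix` and the `rate` parameter are assumptions chosen for this example; the convention that each diagonal entry is the negative sum of the other entries in its row (so that rows sum to zero) is the standard one for continuous-time rate matrices.

```python
BASES = ["A", "C", "G", "T"]


def jukes_cantor_rate_matrix(rate=1.0):
    """Return the Jukes-Cantor rate matrix as a nested dict.

    q[i][j] is the instantaneous rate of change from base i to base j.
    All off-diagonal rates are equal; each diagonal entry is set so
    that its row sums to zero.
    """
    q = {}
    for i in BASES:
        q[i] = {}
        for j in BASES:
            if i == j:
                q[i][j] = -rate * (len(BASES) - 1)
            else:
                q[i][j] = rate
    return q


q = jukes_cantor_rate_matrix(rate=1.0)
for i in BASES:
    print(i, [q[i][j] for j in BASES])
```

More general models such as GTR relax the equal-rates assumption by giving different pairs of states their own rate parameters.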
Any discrete character can be modeled in much the same way as gene sequences. When considering phenotypic characters, we should keep in mind two main differences from the analysis of DNA sequences. First, arbitrary discrete characters may have any number of states, not just the four associated with DNA sequence data. Second, characters are typically analyzed independently, rather than by combining long sets of characters and assuming that they share the same model of change.