corHMM 2.1: Generalized hidden Markov models
James D. Boyko and Jeremy M. Beaulieu
The vignette comprises three sections, in which we demonstrate all new extensions as well as other new and useful features:
Background information
Section 1 Default use of corHMM
- 1.1: No hidden rate categories
- 1.2: Any number of hidden rate categories
Section 2 How to make and interpret custom models
- 2.1: Creating and using custom rate matrices
- 2.1.1: One rate category
- 2.1.2: Any number of rate categories
- 2.2: Some examples of “biologically informed” models
- 2.2.1: Ordered habitat change
- 2.2.2: Precursor model
- 2.2.3: Ontological relationship of multiple characters
Section 3 Estimating models when node states are fixed
- 3.1: Fixing a single node
- 3.2: Estimating rates under a parsimony reconstruction
- 3.3: Fixing nodes when the model contains hidden states
# Background information
The original version of corHMM contained a number of distinct functions
for conducting analyses of discrete morphological characters. This
included the corHMM() function for fitting a hidden rates model, which
uses “hidden” states as a means of allowing transition rates in a binary
character to vary across a tree. In reality, the hidden rates model
falls within a general class of models, hidden Markov models (HMMs),
that may also be applied to multistate characters. So, whether the focal
trait is binary or contains multiple states, or whether the observed
states represent a set of binary and multistate characters, hidden
states can be applied as a means of allowing heterogeneity in the
transition model. Choosing a model specific to your question is of
utmost importance in any comparative method, and in this new version of
corHMM we provide users with the tools to create their own hidden Markov
models.
Before delving into this further it may be worth showing a little of what is underneath the hood. To begin, consider a single binary character with states 0 and 1. If the question was to understand the asymmetry in the transition between these two states, the model, Q, would be a simple 2x2 matrix,
\[ Q = \begin{bmatrix} - & q_{0 \rightarrow 1} \\ q_{1 \rightarrow 0} & - \\ \end{bmatrix} \]
This transition rate matrix is read as describing the transition rate from ROW to COLUMN. Thus, there are only two states, 0 and 1, and two transitions: 0 \(\rightarrow\) 1 and 1 \(\rightarrow\) 0. However, if we introduce a second binary character, the number of possible states you could observe expands to cover every combination of states of the two characters: you could observe 00, 01, 10, or 11. To accommodate this, we need to expand our model such that it becomes a 4x4 matrix,
\[ Q = \begin{bmatrix} - & q_{00 \rightarrow 01} & q_{00 \rightarrow 10} & q_{00 \rightarrow 11}\\ q_{01 \rightarrow 00} & - & q_{01 \rightarrow 10} & q_{01 \rightarrow 11}\\ q_{10 \rightarrow 00} & q_{10 \rightarrow 01} & - & q_{10 \rightarrow 11}\\ q_{11 \rightarrow 00} & q_{11 \rightarrow 01} & q_{11 \rightarrow 10} & -\\ \end{bmatrix} \]
This model is considerably more complex, as the number of transitions in the rate matrix goes from 2 to 12. However, with these models we often make the simplifying assumption that both characters cannot change at the same time. In other words, if a lineage is in state 00 it must first transition to either state 01 or 10 before transitioning to state 11. Therefore, we can simplify the matrix by removing these “dual” transitions from the model completely,
\[ Q = \begin{bmatrix} - & q_{00 \rightarrow 01} & q_{00 \rightarrow 10} & -\\ q_{01 \rightarrow 00} & - & - & q_{01 \rightarrow 11}\\ q_{10 \rightarrow 00} & - & - & q_{10 \rightarrow 11}\\ - & q_{11 \rightarrow 01} & q_{11 \rightarrow 10} & -\\ \end{bmatrix} \]
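The step from 12 transitions to 8 can be checked with a few lines of base R. This is only an illustrative sketch: it does not use any corHMM function, and the object names (`full`, `single`, `n_changes`) are our own. Each free rate is given an integer index and `NA` marks a cell with no free parameter (the diagonal and, in the reduced model, the “dual” transitions).

```r
# State labels for two binary characters
states <- c("00", "01", "10", "11")
k <- length(states)

# Full model: every off-diagonal cell gets its own rate index
full <- matrix(NA_integer_, k, k, dimnames = list(from = states, to = states))
full[row(full) != col(full)] <- seq_len(k * (k - 1))

# Count how many of the two characters differ between two state labels
n_changes <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}
changes <- outer(states, states, Vectorize(n_changes))

# Reduced model: keep only transitions that change exactly one character,
# then renumber the remaining rates consecutively
single <- full
single[changes != 1] <- NA
single[!is.na(single)] <- seq_len(sum(!is.na(single)))

n_full <- sum(!is.na(full))      # free rates in the full model
n_single <- sum(!is.na(single))  # free rates once dual transitions are removed

print(single)
```

Rows are the starting state and columns the destination, matching the ROW to COLUMN reading of Q above.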
What we just described is the popular model of Pagel (1994), which tests for correlated evolution between two binary characters. One thing that is not obvious, however, is that the states in the model need not be represented as combinations of binary characters. For example, the focal trait may be made up of two characters, say flower color (red or blue) and the presence or absence of petals. One could just as well code it as a single multistate character, where 1=red petals, 2=red with no petals (i.e., sepals are red), 3=blue petals, and 4=blue with no petals (i.e., sepals are blue). The model would then be,
\[ Q = \begin{bmatrix} - & q_{1 \rightarrow 2} & q_{1 \rightarrow 3} & q_{1 \rightarrow 4}\\ q_{2 \rightarrow 1} & - & q_{2 \rightarrow 3} & q_{2 \rightarrow 4}\\ q_{3 \rightarrow 1} & q_{3 \rightarrow 2} & - & q_{3 \rightarrow 4}\\ q_{4 \rightarrow 1} & q_{4 \rightarrow 2} & q_{4 \rightarrow 3} & -\\ \end{bmatrix} \]
Notice it is the same as before, but the states are transformed from binary combinations to a multistate character. As before, we may assume that the two underlying characters cannot change at the same time and remove the “dual” transitions,
\[ Q = \begin{bmatrix} - & q_{1 \rightarrow 2} & q_{1 \rightarrow 3} & -\\ q_{2 \rightarrow 1} & - & - & q_{2 \rightarrow 4}\\ q_{3 \rightarrow 1} & - & - & q_{3 \rightarrow 4}\\ - & q_{4 \rightarrow 2} & q_{4 \rightarrow 3} & -\\ \end{bmatrix} \]
Again, exactly the same.
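To see why the two codings are equivalent, the flower example can be written out in base R. This is a sketch under our own naming (`flowers`, `allowed`); it only assumes the state numbering given in the text (1=red petals, 2=red with no petals, 3=blue petals, 4=blue with no petals) and does not call corHMM.

```r
# One row per multistate code; petals varies fastest so that the row order
# matches the numbering used in the text
flowers <- expand.grid(
  petals = c("petals", "no petals"),
  color = c("red", "blue"),
  stringsAsFactors = FALSE
)
flowers$state <- seq_len(nrow(flowers))

# Number of underlying characters (color, petals) that differ between states
n_diff <- outer(flowers$color, flowers$color, "!=") +
  outer(flowers$petals, flowers$petals, "!=")

# A transition is kept only if exactly one underlying character changes
allowed <- n_diff == 1
dimnames(allowed) <- list(
  from = as.character(flowers$state),
  to = as.character(flowers$state)
)

print(flowers)
print(allowed)
```

The cells that come out `FALSE` off the diagonal are 1 ↔ 4 and 2 ↔ 3, the same positions that were removed as 00 ↔ 11 and 01 ↔ 10 in the binary coding.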
The updated version of corHMM() now lets users transform a set of
characters into a single multistate character. This means that two
characters need not have the same number of character states: one trait
could have four states, and the other could be binary. We also allow any
model to be expanded to accommodate an arbitrary number of hidden
states. Thus, corHMM() is completely flexible and naturally contains the
capabilities of rayDISC() and corDISC(), standalone functions in
previous versions of corHMM that may have been mistaken as different
“methods.” As this vignette will show, they are indeed nested within a
broader class of HMMs.
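As a rough illustration of what transforming a set of characters into a single multistate character involves, the base R sketch below combines a four-state trait and a binary trait. The data are made up, the names (`dat`, `all_labels`, `combined`, `n_rate_cat`) are hypothetical, and the numbering of the combined states is our own assumption; how corHMM() itself numbers them is not shown in this section.

```r
# Hypothetical data: trait A has four states, trait B is binary
dat <- data.frame(
  species = paste0("sp", 1:6),
  A = c(1, 2, 3, 4, 1, 3),
  B = c(0, 1, 0, 1, 1, 0)
)

# Every combination that could be observed: 4 x 2 = 8 combined states
all_combos <- expand.grid(B = 0:1, A = 1:4)
all_labels <- paste(all_combos$A, all_combos$B, sep = "_")

# Recode each species as a single multistate character (1 to 8)
dat$combined <- match(paste(dat$A, dat$B, sep = "_"), all_labels)

# With hidden rate categories, each observed state is duplicated once per
# category, so the model's state space grows multiplicatively
n_rate_cat <- 2  # hypothetical choice
n_model_states <- length(all_labels) * n_rate_cat

print(dat)
```

The point of the sketch is only that the two traits need not have the same number of states: the combined character simply has as many states as there are combinations.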