# corHMM 2.1: Generalized hidden Markov models

#### James D. Boyko and Jeremy M. Beaulieu

This vignette has three sections, in which we demonstrate the new extensions as well as other new and useful features:

**Background information**

**Section 1 Default use of corHMM**

- 1.1: No hidden rate categories
- 1.2: Any number of hidden rate categories

**Section 2 How to make and interpret custom models**

- 2.1: Creating and using custom rate matrices
  - 2.1.1: One rate category
  - 2.1.2: Any number of rate categories
- 2.2: Some examples of “biologically informed” models
  - 2.2.1: Ordered habitat change
  - 2.2.2: Precursor model
  - 2.2.3: Ontological relationship of multiple characters

**Section 3 Estimating models when node states are fixed**

- 3.1: Fixing a single node
- 3.2: Estimating rates under a parsimony reconstruction
- 3.3: Fixing nodes when the model contains hidden states

# Background information

The original version of `corHMM` contained a number of distinct
functions for conducting analyses of discrete morphological characters.
This included the `corHMM()` function for fitting a hidden rates model,
which uses “hidden” states as a means of allowing transition rates in a
binary character to vary across a tree. In reality, the hidden rates
model falls within a general class of models, hidden Markov models
(HMM), that may also be applied to multistate characters. So, whether
the focal trait is binary or contains multiple states, or whether the
observed states represent a set of binary and multistate characters,
hidden states can be applied as a means of allowing heterogeneity in the
transition model. Choosing a model specific to your question is of
utmost importance in any comparative method, and in this new version of
`corHMM` we provide users with the tools to create their own hidden
Markov models.

Before delving into this further it may be worth showing a little of
what is underneath the hood. To begin, consider a single binary
character with states *0* and *1*. If the question was to
understand the asymmetry in the transition between these two states, the
model, **Q**, would be a simple 2x2 matrix,

\[
Q=
\begin{bmatrix}
- & q_{0 \rightarrow 1} \\
q_{1 \rightarrow 0} & - \\
\end{bmatrix}
\]

This *transition rate matrix* is read as describing the transition rate
*from* ROW *to* COLUMN. Thus, there are only two states, 0 and 1, and
two transitions: from 0 \(\rightarrow\) 1 and from 1 \(\rightarrow\) 0.
However, if we introduce a second binary character, the number of
possible states you *could* observe expands to account for all
combinations of states between the two characters: you could observe
*00*, *01*, *10*, or *11*. To accommodate this, we need to expand our
model such that it becomes a 4x4 matrix,

\[ Q = \begin{bmatrix} - & q_{00 \rightarrow 01} & q_{00 \rightarrow 10} & q_{00 \rightarrow 11}\\ q_{01 \rightarrow 00} & - & q_{01 \rightarrow 10} & q_{01 \rightarrow 11}\\ q_{10 \rightarrow 00} & q_{10 \rightarrow 01} & - & q_{10 \rightarrow 11}\\ q_{11 \rightarrow 00} & q_{11 \rightarrow 01} & q_{11 \rightarrow 10} & -\\ \end{bmatrix} \]
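As a rough illustration of how the state space grows, the following base R sketch enumerates the combined states of two binary characters and gives every off-diagonal cell its own parameter index. The object names (`combos`, `states`, `Q_index`) are ours for illustration and are not part of the `corHMM` interface, and the row-by-row numbering is an arbitrary choice.

```r
# Enumerate the combined states of two binary characters (illustrative only).
# expand.grid() varies its first column fastest, so listing the second
# character first gives the order used in the text: 00, 01, 10, 11.
combos <- expand.grid(char2 = 0:1, char1 = 0:1)
states <- paste0(combos$char1, combos$char2)

# Give each off-diagonal cell (from ROW to COLUMN) its own parameter index,
# numbering row by row; the diagonal is left as NA.
n <- length(states)
Q_index <- matrix(NA_integer_, n, n, dimnames = list(from = states, to = states))
k <- 0L
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) {
      k <- k + 1L
      Q_index[i, j] <- k
    }
  }
}

print(Q_index)
n_rates <- sum(!is.na(Q_index))  # n * (n - 1) separately indexed rates
```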

This model is considerably more complex, as the number of transitions
in the rate matrix goes from 2 to 12. However, with these models we
often make the simplifying assumption that both characters cannot
change at the same time. In other words, a lineage in state *00* must
first transition to either state *01* or *10* before transitioning to
state *11*. Therefore, we can simplify the matrix by removing these
“dual” transitions from the model completely,

\[ Q = \begin{bmatrix} - & q_{00 \rightarrow 01} & q_{00 \rightarrow 10} & -\\ q_{01 \rightarrow 00} & - & - & q_{01 \rightarrow 11}\\ q_{10 \rightarrow 00} & - & - & q_{10 \rightarrow 11}\\ - & q_{11 \rightarrow 01} & q_{11 \rightarrow 10} & -\\ \end{bmatrix} \]
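One way to see which cells survive this simplification is to count how many characters differ between the source and target states, keeping only single-character changes. The following is a minimal base R sketch; the names (`states`, `allowed`) are illustrative and are not part of `corHMM`.

```r
# Combined states of two binary characters, in the order used in the text.
states <- c("00", "01", "10", "11")
n <- length(states)

# allowed[i, j] is TRUE when moving from state i to state j changes exactly
# one of the two characters; diagonal and "dual" cells stay FALSE.
allowed <- matrix(FALSE, n, n, dimnames = list(from = states, to = states))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    from_chars <- strsplit(states[i], "")[[1]]
    to_chars <- strsplit(states[j], "")[[1]]
    allowed[i, j] <- sum(from_chars != to_chars) == 1
  }
}

print(allowed)
n_rates <- sum(allowed)  # number of transitions kept in the model
```

The four cells that drop out are 00 to 11, 11 to 00, 01 to 10, and 10 to 01, which leaves 8 of the original 12 transitions.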

What we just described is the popular model of Pagel (1994), which
tests for correlated evolution between two binary characters. One thing
that is not obvious, however, is that the states in the model need not
be represented as combinations of binary characters. For example, the
focal data may consist of two characters, say flower color (red or
blue) and whether petals are present or absent. One could just code
these as a single multistate character, where *1*=red petals,
*2*=red with no petals (i.e., sepals are red), *3*=blue
petals, and *4*=blue with no petals (i.e., sepals are blue). The
model would then be,

\[ Q = \begin{bmatrix} - & q_{1 \rightarrow 2} & q_{1 \rightarrow 3} & q_{1 \rightarrow 4}\\ q_{2 \rightarrow 1} & - & q_{2 \rightarrow 3} & q_{2 \rightarrow 4}\\ q_{3 \rightarrow 1} & q_{3 \rightarrow 2} & - & q_{3 \rightarrow 4}\\ q_{4 \rightarrow 1} & q_{4 \rightarrow 2} & q_{4 \rightarrow 3} & -\\ \end{bmatrix} \]

Notice it is the same as before, but the states are transformed from binary combinations to a multistate character. As before, we may assume that transitions in two states cannot occur at the same time and remove the “dual” transitions.

\[ Q = \begin{bmatrix} - & q_{1 \rightarrow 2} & q_{1 \rightarrow 3} & -\\ q_{2 \rightarrow 1} & - & - & q_{2 \rightarrow 4}\\ q_{3 \rightarrow 1} & - & - & q_{3 \rightarrow 4}\\ - & q_{4 \rightarrow 2} & q_{4 \rightarrow 3} & -\\ \end{bmatrix} \]

Again, exactly the same.
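The recoding itself is simple bookkeeping. The base R sketch below maps each combination of two binary characters onto the multistate codes used above. The 0/1 coding of color and petals, the taxon names, and the small data frame are all invented for illustration.

```r
# Hypothetical coding: color 0 = red, 1 = blue; petals 0 = present, 1 = absent.
# Each two-character combination maps to one multistate code, as in the text:
# 1 = red petals, 2 = red no petals, 3 = blue petals, 4 = blue no petals.
recode <- c("00" = 1L, "01" = 2L, "10" = 3L, "11" = 4L)

# Made-up observations for five taxa (not real data).
dat <- data.frame(
  taxon  = paste0("sp", 1:5),
  color  = c(0L, 0L, 1L, 1L, 0L),
  petals = c(0L, 1L, 0L, 1L, 1L)
)

# Paste the two characters together and look up the single multistate code.
dat$multistate <- unname(recode[paste0(dat$color, dat$petals)])
print(dat)
```

Because only the labels change, the “dual” transitions between states 1 and 4 and between states 2 and 3 are the same cells as those between *00* and *11* and between *01* and *10*.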

The updated version of `corHMM()` now lets users transform a set of
characters into a *single* multistate character. This means that two
characters need not have the same number of character states; that is,
one trait could have four states, and the other could be binary. We
also allow any model to be expanded to accommodate an arbitrary number
of hidden states. Thus, `corHMM()` is completely flexible and naturally
contains the capabilities of `rayDISC()` and `corDISC()`, standalone
functions in previous versions of `corHMM` that may have been mistaken
as different “methods.” As this vignette will show, they are indeed
nested within a broader class of HMMs.