Transforming the Variance-Covariance Matrix

Pagel's branch-length transformations

Pagel developed three transformations that modify the variance-covariance matrix $\mathbf{C}$ to test for deviations from Brownian motion.
Each transformation has a parameter that, when equal to 1, recovers the original BM tree.
The three transformations:
- $\lambda$ (lambda) — phylogenetic signal
- $\delta$ (delta) — tempo of evolution
- $\kappa$ (kappa) — speciation vs. gradual change

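Pagel's $\lambda$ multiplies all off-diagonal elements of $\mathbf{C}$ (the branch lengths shared by each pair of tips) by a value from 0 to 1.
This compresses the internal (deeper) branches while leaving the diagonal elements, the root-to-tip distances, unchanged.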
Interpretation:
- $\lambda = 1$: no change — trait evolves as expected under BM on the true tree.
- $\lambda = 0$: star phylogeny — every tip is effectively independent (no phylogenetic signal).
$\lambda$ is often used to measure phylogenetic signal: how well does the phylogeny predict trait similarity among species?
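The $\lambda$ transformation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `lambda_transform` and the three-tip example matrix are assumptions made for this sketch.

```python
def lambda_transform(C, lam):
    """Return a copy of C with every off-diagonal entry multiplied by lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda is assumed to lie in [0, 1]")
    n = len(C)
    return [[C[i][j] if i == j else lam * C[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical tree ((A:1,B:1):1,C:2): A and B share one unit of history.
C_example = [[2.0, 1.0, 0.0],
             [1.0, 2.0, 0.0],
             [0.0, 0.0, 2.0]]

C_half = lambda_transform(C_example, 0.5)  # shared A-B history is halved
C_star = lambda_transform(C_example, 0.0)  # star phylogeny: only the diagonal remains
```

The diagonal is never touched, so each tip keeps its total variance. Only the covariance between tips shrinks as $\lambda$ approaches 0.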

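Pagel's $\delta$ raises all elements of $\mathbf{C}$ to the power $\delta$. Because the elements of $\mathbf{C}$ are the heights of shared ancestors above the root, this rescales node heights.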
Interpretation:
- $\delta = 1$: unchanged BM tree.
- $\delta < 1$: node heights are rescaled so that deep branches become longer relative to shallow ones.
- $\delta > 1$: node heights are rescaled so that shallow branches become longer relative to deep ones.
Designed to capture variation in rates of evolution through time:
- $\delta < 1$ suggests trait evolution was faster early in the phylogeny.
- $\delta > 1$ suggests trait evolution accelerated more recently.
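A small sketch of the $\delta$ transformation, assuming a hypothetical three-tip tree whose total depth is scaled to 1 (the function name and the matrix are illustrative, not taken from any package):

```python
def delta_transform(C, delta):
    """Raise every element of C to the power delta (delta assumed positive)."""
    if delta <= 0.0:
        raise ValueError("delta is assumed to be positive")
    return [[value ** delta for value in row] for row in C]


# Hypothetical tree ((A:0.5,B:0.5):0.5,C:1.0) with total depth 1.
C_example = [[1.0, 0.5, 0.0],
             [0.5, 1.0, 0.0],
             [0.0, 0.0, 1.0]]

C_early = delta_transform(C_example, 0.5)  # A-B ancestor height rises to about 0.71
C_late = delta_transform(C_example, 2.0)   # A-B ancestor height drops to 0.25
```

On this unit-depth tree, $\delta = 0.5$ lengthens the deep branch leading to the A-B ancestor at the expense of the two tip branches, while $\delta = 2$ does the opposite. The absolute direction of change depends on the units of the tree, but the relative weighting of deep versus shallow branches does not.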

Pagel's $\kappa$ raises all branch lengths to the power $\kappa$.
Its effect on $\mathbf{C}$ depends on $\kappa$ and on the number of branches between the root and the MRCA of each pair of tips.
Interpretation:
- $\kappa = 1$: standard BM tree.
- $\kappa = 0$: all branch lengths become 1 — only topology matters.
Values of $\kappa$ near 0 are interpreted as a model where characters change mostly at speciation events (punctuated evolution) rather than gradually along branches.
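The following sketch applies the $\kappa$ transformation to a hypothetical set of branch lengths. The dictionary layout, the branch names, and both helper functions are assumptions chosen for illustration.

```python
def kappa_transform(branch_lengths, kappa):
    """Raise every branch length to the power kappa (kappa assumed >= 0)."""
    if kappa < 0.0:
        raise ValueError("kappa is assumed to be non-negative")
    return {branch: length ** kappa for branch, length in branch_lengths.items()}


def root_to_tip(path, branch_lengths):
    """Sum the lengths of the branches on a root-to-tip path."""
    return sum(branch_lengths[branch] for branch in path)


# Hypothetical tree ((A:1,B:1)AB:3,C:4); each branch is named after the node it leads to.
lengths = {"AB": 3.0, "A": 1.0, "B": 1.0, "C": 4.0}
paths = {"A": ["AB", "A"], "B": ["AB", "B"], "C": ["C"]}

speciational = kappa_transform(lengths, 0.0)  # every branch now has length 1
heights = {tip: root_to_tip(path, speciational) for tip, path in paths.items()}
```

With $\kappa = 0$ the root-to-tip distance of each tip equals the number of branches on its path (2 for A and B, 1 for C), so expected trait variance depends only on how many speciation events separate a tip from the root.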

Rate Variation Across Clades

BM assumes $\sigma^2$ is constant across the whole tree. What if this is wrong?
These methods ask whether specific clades have different $\sigma^2$.
Three approaches:
1. PIC-based rate test: compare magnitudes of squared contrasts between clades.
2. ML + AIC: fit multi-rate BM models with separate $\sigma^2$ for each clade; compare via AIC.
3. Bayesian MCMC: each branch can have its own rate, with posteriors identifying rate shifts.
Note: methods typically require pre-defined groups suspected of having different rates.
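For the ML + AIC approach, the comparison step uses $\mathrm{AIC} = 2k - 2\ln L$, where $k$ is the number of free parameters. The sketch below uses made-up log-likelihoods purely to show the arithmetic. The parameter counts assume that single-rate BM estimates $\sigma^2$ and a root state, and that the two-rate model adds one extra $\sigma^2$.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def delta_aic(scores):
    """Difference between each model's AIC and the lowest AIC."""
    best = min(scores.values())
    return {model: score - best for model, score in scores.items()}


# Made-up log-likelihoods, for illustration only (not results from real data).
fits = {
    "single-rate BM": {"lnL": -52.0, "k": 2},  # sigma^2 and root state
    "two-rate BM": {"lnL": -48.5, "k": 3},     # two sigma^2 values and root state
}

scores = {model: aic(fit["lnL"], fit["k"]) for model, fit in fits.items()}
differences = delta_aic(scores)
```

The model with the lowest AIC has a difference of 0. The extra rate parameter is only favoured if it improves the log-likelihood by more than the penalty it incurs.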

The Ornstein-Uhlenbeck (OU) model is a Brownian-like model in which the trait is drawn towards a fitness optimum $\theta$.
It is often interpreted as stabilizing selection, but more precisely it models a trait tracking the movement of an adaptive optimum.
Relative to BM, it adds a pull parameter $\alpha$ that determines how strongly the trait is attracted to $\theta$: $$dX = \alpha(\theta - X)\,dt + \sigma\,dW$$ where $W$ is a standard Wiener process (the BM component).
Key properties:
- $\alpha = 0$: pure Brownian motion.
- Large $\alpha$: trait tightly constrained near $\theta$.
- Variance reaches a stationary value $\sigma^2/(2\alpha)$ rather than growing indefinitely.
The OU model can be fitted within ML or Bayesian frameworks to test whether stabilizing selection fits the data better than BM.
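The key properties above follow from the standard closed-form mean and variance of the OU process, $E[X(t)] = \theta + (X_0 - \theta)e^{-\alpha t}$ and $\mathrm{Var}[X(t)] = \frac{\sigma^2}{2\alpha}(1 - e^{-2\alpha t})$. A minimal sketch, with illustrative function names, for a single lineage starting at $X_0$:

```python
import math


def ou_mean(x0, theta, alpha, t):
    """Expected trait value after time t, starting from x0."""
    return theta + (x0 - theta) * math.exp(-alpha * t)


def ou_variance(sigma2, alpha, t):
    """Trait variance after time t; reduces to BM (sigma2 * t) when alpha is 0."""
    if alpha == 0.0:
        return sigma2 * t
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x
    return sigma2 / (2.0 * alpha) * -math.expm1(-2.0 * alpha * t)


bm_like = ou_variance(1.0, 0.0, 3.0)        # grows linearly with time
stationary = ou_variance(1.0, 2.0, 100.0)   # approaches sigma2 / (2 * alpha)
```

For long times the variance levels off at $\sigma^2/(2\alpha)$, whereas with $\alpha = 0$ it keeps growing in proportion to $t$, as under BM.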

Early burst (EB) models describe adaptive radiation: some phenomenon creates an opportunity (migration, evolution of a new trait, extinction of competitors).
Taxa rapidly evolve and diversify to fill available niches, then slow down as niches fill up.
These models start from BM and attach a decay parameter $r$ that changes the evolutionary rate exponentially with time $t$ since the root: $$\sigma^2(t) = \sigma^2_0 \, e^{rt}$$ where $\sigma^2_0$ is the rate at the root and $r < 0$ produces the "early burst" pattern.
Evolution starts rapidly near the root and gradually slows to the rate at the tips.
Can be compared to BM using AIC or Bayes Factors.
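Integrating the rate function gives the variance a lineage accumulates between the root and time $t$: $\int_0^t \sigma^2_0 e^{rs}\,ds = \sigma^2_0 (e^{rt} - 1)/r$. The sketch below evaluates both quantities. The function names are illustrative, and $t$ is assumed to be measured forward from the root.

```python
import math


def eb_rate(sigma2_0, r, t):
    """Instantaneous rate sigma^2(t) at time t since the root."""
    return sigma2_0 * math.exp(r * t)


def eb_accumulated_variance(sigma2_0, r, t):
    """Variance accumulated from the root to time t; equals BM when r is 0."""
    if r == 0.0:
        return sigma2_0 * t
    return sigma2_0 * math.expm1(r * t) / r


rate_at_root = eb_rate(1.0, -1.0, 0.0)
rate_later = eb_rate(1.0, -1.0, 2.0)              # lower than at the root when r < 0
early_burst = eb_accumulated_variance(1.0, -1.0, 1.0)
brownian = eb_accumulated_variance(1.0, 0.0, 1.0)
```

With $r < 0$ most of the variance is accumulated close to the root, so the total after a given time is smaller than under BM with the same starting rate.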

Models of punctuated evolution: long periods of relative stasis punctuated by brief periods of intense change.
These periods of change are often associated with adaptive radiations or shifts in selective regime.
The punctuated periods can follow one of three regimes:
- Random
- Fixed interval
- Associated with changes in other traits on the tree
These models can identify parts of the tree that are evolving differently from the rest.
Approach is similar to the multi-rate methods described earlier.

Discrete traits have fixed states (e.g., 0/1, red/yellow/blue, legs/no legs), so continuous-trait models like BM do not apply to them.
The Mk model (Lewis, 2001) provides a general approach to modelling discrete character evolution on phylogenies.
Properties:
- Unordered: traits can change to any other state without intermediate steps.
- Transitions are independent across branches.
- All transition rates are equal (in the simplest version).
Extended versions relax the equal-rates assumption.
The equal-rates Mk model is analogous to the Jukes-Cantor model for DNA, generalized to discrete characters with any number of states.
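For the equal-rates version, the probability of ending a branch in each state has a closed form. The sketch below assumes $k$ states and a rate $q$ of change to each specific other state (so the total rate of leaving a state is $(k-1)q$). The function name and this parameterization are choices made for illustration.

```python
import math


def mk_transition_matrix(k, q, t):
    """Transition probabilities over a branch of length t under equal-rates Mk.

    P[i][j] is the probability of being in state j at the end of the branch,
    given state i at the start.
    """
    decay = math.exp(-k * q * t)
    stay = 1.0 / k + (k - 1.0) / k * decay   # same state at both ends
    move = (1.0 - decay) / k                 # a specific different state
    return [[stay if i == j else move for j in range(k)] for i in range(k)]


P_short = mk_transition_matrix(3, 0.5, 0.01)   # close to the identity matrix
P_long = mk_transition_matrix(3, 0.5, 100.0)   # every entry close to 1/3
```

On very long branches the starting state is forgotten and every state becomes equally likely, which mirrors the behaviour of the Jukes-Cantor model with its four nucleotide states.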

Simulating a discrete character on a tree under the Mk model:
1. Assign a state to the root (equal probability of all states, or from observed proportions).
2. Pass that state to the start of each daughter branch.
3. For each branch, "roll the dice" to determine whether the state changes (the probability depends on branch length and rate).
4. Repeat down the tree to the tips.
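The four steps can be sketched as a short recursive simulation. The tree layout `(name, branch_length, children)`, the function names, and the equal-rates parameterization (rate `q` to each other state) are assumptions of this sketch. Instead of tracking individual change events, it draws the state at the end of each branch directly from the Mk transition probabilities, which allows for multiple changes along a branch.

```python
import math
import random


def stay_probability(k, q, t):
    """Probability that a branch of length t ends in the state it started in."""
    return 1.0 / k + (k - 1.0) / k * math.exp(-k * q * t)


def simulate_down(node, start_state, k, q, rng, tips):
    """Steps 2-4: evolve the state along this branch, then recurse to daughters."""
    name, length, children = node
    state = start_state
    if rng.random() >= stay_probability(k, q, length):
        # The branch ends in a different state; all other states are equally likely.
        state = rng.choice([s for s in range(k) if s != start_state])
    if not children:
        tips[name] = state
    for child in children:
        simulate_down(child, state, k, q, rng, tips)


def simulate_tips(tree, k, q, seed=None):
    """Step 1: draw the root state uniformly, then simulate towards the tips."""
    rng = random.Random(seed)
    root_state = rng.randrange(k)
    tips = {}
    for child in tree[2]:
        simulate_down(child, root_state, k, q, rng, tips)
    return root_state, tips


# Hypothetical tree ((A:1,B:1)AB:1,C:2).
tree = ("root", 0.0, [("AB", 1.0, [("A", 1.0, []), ("B", 1.0, [])]),
                      ("C", 2.0, [])])
root_state, tip_states = simulate_tips(tree, k=2, q=0.5, seed=1)
```

Passing a `seed` makes a run reproducible. Setting `q` to 0 switches off all change, so every tip inherits the root state.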
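The same process defines the likelihood of the observed tip data under the model.
Because ancestral states are unobserved, the likelihood must sum over every possible combination of states at the internal nodes, which quickly becomes impractical to enumerate. Felsenstein's pruning algorithm works backwards from the tips to the root, making the likelihood calculation tractable.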
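A compact sketch of the pruning idea for the equal-rates Mk model follows. The tree layout `(name, branch_length, children)`, the function names, the rate parameterization, and the uniform prior on the root state are all assumptions made for this illustration.

```python
import math
from itertools import product


def mk_prob(i, j, k, q, t):
    """Probability of ending in state j after a branch of length t, starting in i."""
    decay = math.exp(-k * q * t)
    return 1.0 / k + (k - 1.0) / k * decay if i == j else (1.0 - decay) / k


def partial_likelihoods(node, tip_states, k, q):
    """Likelihood of the data below this node, for each possible state at the node."""
    name, _length, children = node
    if not children:
        # A tip: the observed state has likelihood 1, every other state 0.
        return [1.0 if s == tip_states[name] else 0.0 for s in range(k)]
    result = [1.0] * k
    for child in children:
        below = partial_likelihoods(child, tip_states, k, q)
        t = child[1]  # length of the branch leading to this child
        for s in range(k):
            result[s] *= sum(mk_prob(s, e, k, q, t) * below[e] for e in range(k))
    return result


def mk_likelihood(tree, tip_states, k, q):
    """Total likelihood, assuming each root state has prior probability 1/k."""
    return sum(partial_likelihoods(tree, tip_states, k, q)) / k


# Hypothetical tree ((A:1,B:1)AB:1,C:2) with a binary character.
tree = ("root", 0.0, [("AB", 1.0, [("A", 1.0, []), ("B", 1.0, [])]),
                      ("C", 2.0, [])])
likelihood = mk_likelihood(tree, {"A": 0, "B": 0, "C": 1}, k=2, q=0.5)

# Sanity check: the likelihoods of all possible tip patterns sum to 1.
total = sum(mk_likelihood(tree, dict(zip("ABC", states)), 2, 0.5)
            for states in product(range(2), repeat=3))
```

Each internal node is visited once and only needs the partial likelihoods of its daughters, so the cost grows linearly with the number of nodes instead of exponentially with the number of ancestral-state combinations.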

SSE (state-dependent speciation and extinction) models test for correlations between trait evolution and speciation/extinction rates.
An "alphabet soup" of methods:
- BiSSE: Binary State Speciation and Extinction
- HiSSE: Hidden State SSE — allows for hidden (unmeasured) states
- MuSSE: Multiple State SSE
When you are interested in whether a particular trait affects the rate of diversification and/or extinction, SSE models provide a formal framework.
Caution: these models can be sensitive to model violations and have known issues with false positives.

Model	What it captures	Key parameter(s)
Pagel's $\lambda$	Phylogenetic signal	$\lambda \in [0,1]$
Pagel's $\delta$	Tempo of evolution	$\delta$ (node height scaling)
Pagel's $\kappa$	Punctuated vs. gradual change	$\kappa$ (branch length scaling)
OU	Stabilizing selection	$\alpha$ (pull strength), $\theta$ (optimum)
Early Burst	Adaptive radiation	$r$ (rate decay)
Mk	Discrete character evolution	Transition rate(s)
SSE (BiSSE, HiSSE, ...)	Trait-dependent diversification	State-specific speciation ($\lambda$) and extinction ($\mu$) rates; this $\lambda$ is not Pagel's $\lambda$

If you do not know a priori which clade has a different evolutionary rate, what are the obstacles and risks of testing every clade?
Do you think it is a good idea to link punctuated evolution of one trait to changes in other traits?
For what reasons might Pagel's statistics be less popular today than in the past?
Under the early burst model, why would we expect evolutionary rates to slow down over time?