Last week in the lab, one line of R fit Brownian motion to finch beaks. This lecture is what that line assumes.
In the lab you ran lines like:
Both rest on a single assumption: that the trait performs a random walk in continuous time along the branches of the tree. That random walk is Brownian motion (BM).
BM is the null model of trait evolution: the baseline that the contrast methods of last lecture assume, and the reference that every model next lecture is compared against.
One question to keep in mind:
When a trait "wanders" with no destination, what pattern does it leave at the tips of a tree, and how do we read the rate of wandering back out?
Three steps in this lecture
Step 2 is exactly the $\mathbf{C}$ you met in the comparative-methods lecture, now derived from first principles.
Independent lineages from a common ancestor at value 0.
The grey band is the theoretical $\pm 2\sqrt{\sigma^2 t}$ envelope. The spread grows with $\sqrt{t}$; the mean stays put unless you add a trend $\mu$.
1. $\operatorname{E}[X(t)] = X(0)$. The expected value never moves from the starting value; the walk has no destination.
2. $\operatorname{Var}[X(t)]=\sigma^2 t$. Long branches accumulate more change; the standard deviation fans out as $\sqrt{t}$.
3. Change over one interval is normally distributed and independent of change over any non-overlapping interval. Non-overlapping branches therefore evolve independently.
Property 3 is what makes independent contrasts work: differences between sister taxa are independent normal draws.
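The first two properties, and the $\pm 2\sqrt{\sigma^2 t}$ envelope, can be checked by simulation. The following is a minimal Python sketch, not the lab's R code; the function name `simulate_bm` and the values of `sigma2`, `t_total` and the number of walks are illustrative assumptions.

```python
import math
import random
import statistics

def simulate_bm(x0, sigma2, t_total, n_steps, rng):
    """Return the end value of one Brownian-motion walk of duration t_total."""
    dt = t_total / n_steps
    sd_step = math.sqrt(sigma2 * dt)  # each increment ~ Normal(0, sigma2 * dt)
    x = x0
    for _ in range(n_steps):
        x += rng.gauss(0.0, sd_step)  # increments are independent draws
    return x

rng = random.Random(1)
x0, sigma2, t_total = 0.0, 0.5, 4.0  # illustrative values, not from the lecture
ends = [simulate_bm(x0, sigma2, t_total, 50, rng) for _ in range(5000)]

mean_end = statistics.fmean(ends)
var_end = statistics.pvariance(ends)

# Fraction of walks that finish inside the +/- 2*sqrt(sigma2 * t) envelope.
half_width = 2 * math.sqrt(sigma2 * t_total)
inside = sum(abs(x - x0) <= half_width for x in ends) / len(ends)

print(f"mean of end values:     {mean_end:.3f} (theory: {x0})")
print(f"variance of end values: {var_end:.3f} (theory: {sigma2 * t_total})")
print(f"share inside envelope:  {inside:.3f} (theory: about 0.954)")
```

Because the end value is normal with variance $\sigma^2 t$, roughly 95% of walks should land inside the envelope, whatever the step size used in the simulation.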
From a single walk to a tree of correlated walks
Each lineage does its own BM down the branches. Sister lineages share their path until they split, then diverge.
Time runs left to right; the vertical axis is the trait value. Colour marks the two deep clades.
Notice how the two same-coloured tips in each "cherry" end up close together: shared ancestry makes related species resemble one another, even with no selection.
$\mathbf{C}_{ij}$ = shared branch length from the root to the common ancestor of $i$ and $j$
| A | B | C | D |
Diagonal = root-to-tip length (times $\sigma^2$, the variance of that tip). Off-diagonal = shared path length (times $\sigma^2$, the covariance between two tips). A and B share the most history; D shares none with the other three.
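To make the definition of $\mathbf{C}$ concrete, here is a small Python sketch that sums shared branch lengths from the root. The tree and its branch lengths are hypothetical, chosen only to match the description above (A and B are sisters, D splits off at the root); they are not the values behind the lecture's figure, and `root_paths` and `vcv` are illustrative helper names.

```python
# Hypothetical tree in Newick form: (((A:1,B:1):1,C:2):1,D:3);
# Each node is (name, branch_length_above_node, children).
tree = ("root", 0.0, [
    ("ABC", 1.0, [
        ("AB", 1.0, [("A", 1.0, []), ("B", 1.0, [])]),
        ("C", 2.0, []),
    ]),
    ("D", 3.0, []),
])

def root_paths(node, path=()):
    """Map each tip name to the tuple of (node, length) branches from root to tip."""
    name, length, children = node
    path = path + ((name, length),)
    if not children:
        return {name: path}
    out = {}
    for child in children:
        out.update(root_paths(child, path))
    return out

def vcv(tree):
    """Return tip names and C, where C[i, j] is the shared root-to-ancestor length."""
    paths = root_paths(tree)
    tips = sorted(paths)
    C = {}
    for i in tips:
        for j in tips:
            shared = 0.0
            for a, b in zip(paths[i], paths[j]):
                if a != b:        # the two paths have diverged
                    break
                shared += a[1]    # add the length of a branch both tips share
            C[i, j] = shared
    return tips, C

tips, C = vcv(tree)
for i in tips:
    print(i, [C[i, j] for j in tips])
```

For this hypothetical tree every diagonal entry is 3 (the root-to-tip length), $C_{AB}=2$, $C_{AC}=C_{BC}=1$, and every off-diagonal entry involving D is 0. Multiplying $\mathbf{C}$ by $\sigma^2$ gives the covariance matrix of the tip values under BM.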
Three routes to $\sigma^2$: contrasts, maximum likelihood, REML
The data are simulated under a known true rate; we then plot the log-likelihood as a function of the candidate rate $\sigma^2$.
The MLE (red) is the rate that makes the observed contrasts most probable. With few contrasts it scatters widely around the truth; with many it homes in.
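The behaviour described here can be reproduced in a short Python sketch. It assumes the contrasts are already standardised, so that each is an independent draw from a normal distribution with mean 0 and variance $\sigma^2$; the true rate, the grid of candidate rates and the function names are illustrative choices, not the settings used for the lecture's plot.

```python
import math
import random

def loglik(sigma2, contrasts):
    """Log-likelihood of a rate, given standardised contrasts ~ Normal(0, sigma2)."""
    m = len(contrasts)
    ss = sum(u * u for u in contrasts)
    return -0.5 * m * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

def rate_estimate(contrasts):
    """Closed-form maximiser of loglik: the mean squared contrast."""
    return sum(u * u for u in contrasts) / len(contrasts)

rng = random.Random(7)
true_sigma2 = 1.0                                # illustrative true rate
grid = [0.05 * k for k in range(1, 101)]         # candidate rates 0.05 .. 5.0

for m in (5, 50, 500):                           # number of contrasts
    u = [rng.gauss(0.0, math.sqrt(true_sigma2)) for _ in range(m)]
    best_on_grid = max(grid, key=lambda s2: loglik(s2, u))
    print(f"m={m:4d}  closed form={rate_estimate(u):.3f}  grid maximum={best_on_grid:.2f}")
```

The estimate from 5 contrasts can land far from the true rate, while the estimate from 500 sits close to it. One caveat on naming: maximising the likelihood of the $n-1$ contrasts divides the sum of squares by $n-1$, which coincides with the REML estimate; the maximum-likelihood estimate from the full tip data divides by $n$ instead.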
Fitting BM to a single continuous trait (here, log body size on a lizard phylogeny) returns four numbers that summarise the whole fit:
| output | meaning |
|---|---|
| sigsq | $\hat{\sigma}^2$, the rate of evolution |
| z0 | $\hat{X}(0)$, the estimated root state |
| lnL | maximised log-likelihood |
| AIC / AICc | fit penalised by parameter count, for model comparison |
Two free parameters here ($\sigma^2$ and $X(0)$), so $\mathrm{AIC}=2\cdot 2 - 2\ln L$.
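The arithmetic behind the last row of the table is short enough to write out. This Python sketch uses a hypothetical log-likelihood and sample size, not values from the lizard fit, and assumes the usual small-sample correction for AICc with $n$ taken as the number of tips.

```python
def aic(lnL, k):
    """Akaike information criterion for k free parameters."""
    return 2 * k - 2 * lnL

def aicc(lnL, k, n):
    """AIC with the small-sample correction; requires n > k + 1."""
    return aic(lnL, k) + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical fit: lnL = -10 on a tree with 20 tips, k = 2 (sigma^2 and X(0)).
lnL, k, n = -10.0, 2, 20
print(aic(lnL, k))                 # 2*2 - 2*(-10) = 24.0
print(round(aicc(lnL, k, n), 2))   # 24 + 12/17, about 24.71
```

The correction term shrinks as $n$ grows, so AIC and AICc agree on large trees and differ most on small ones.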