The Coalescent Approximation Revisited

Probability that $k$ sampled lineages all have distinct parents in the previous generation (population size $N$):
\begin{align*} 1-p_{c}(k) &= \left(1-\frac{1}{N}\right)\left(1-\frac{2}{N}\right)\ldots\left(1-\frac{k-1}{N}\right)\\ & = 1 - \frac{1}{N}\sum_{j=1}^{k-1}j + O(N^{-2}) \end{align*}
Assumption: $k\ll N$ so that $p_c(k)\simeq \binom{k}{2}/N$. (No multifurcation.)
Probability of coalescence after $m$ generations:
$$P(m)=(1-p_c(k))^m p_c(k)$$
Since $N\gg k$, $p_c(k)\ll 1$ and thus $P(m)\simeq e^{-mp_c(k)}p_c(k)$. Converting to continuous time $t=mg$, where $g$ is the generation time, gives
$$P(t)=e^{-(t/g)p_c(k)}p_c(k)\frac{dm}{dt}=e^{-t\binom{k}{2}/Ng}\binom{k}{2}\frac{1}{Ng}$$
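
The quality of the approximation $p_c(k)\simeq\binom{k}{2}/N$ can be checked numerically. The following is a minimal sketch; the function names and the example values of $k$, $N$ and $g$ are illustrative choices, not part of the original slides.

```python
import math


def p_coal_exact(k, N):
    """Exact probability that at least two of k lineages share a parent
    in the previous generation of a population of size N."""
    distinct = 1.0
    for j in range(1, k):
        distinct *= 1.0 - j / N
    return 1.0 - distinct


def p_coal_approx(k, N):
    """Leading-order approximation binom(k, 2) / N, valid for k << N."""
    return math.comb(k, 2) / N


def waiting_time_density(t, k, N, g=1.0):
    """Exponential density P(t) of the time to the next coalescence,
    with generation time g (hypothetical default of 1)."""
    rate = math.comb(k, 2) / (N * g)
    return rate * math.exp(-rate * t)


if __name__ == "__main__":
    for k, N in [(5, 1000), (5, 20)]:
        print(k, N, p_coal_exact(k, N), p_coal_approx(k, N))
```

With $k=5$ the two values agree closely for $N=1000$ but differ noticeably for $N=20$, which illustrates why the assumption $k\ll N$ matters.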

The linear birth-death-sampling process

Generalization, by Tanja Stadler (2009), of the Yule model for speciation to include extinction and sampling.
Also applied to populations, in particular measurably evolving pathogens.

Equivalent to a population evolving under the following (forward-time) reactions: \begin{align*} X & \overset{\lambda}{\longrightarrow} 2X\\ X & \overset{\mu}{\longrightarrow} 0 \end{align*} In addition, a linear sampling process with rate $\psi$ generates samples stochastically, but does not otherwise affect the population. (No implicit removal on sampling!)
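
The reactions above can be simulated forward in time with a standard Gillespie-style algorithm. This is a minimal sketch under stated assumptions: the function name `simulate_bds`, the starting population `n0`, the time horizon `t_max` and the seed are all hypothetical choices, and only population size and sampling times are tracked, not the tree itself.

```python
import random


def simulate_bds(lam, mu, psi, t_max, n0=1, seed=None):
    """Simulate the linear birth-death-sampling process up to time t_max.

    Returns (trajectory, sample_times), where trajectory is a list of
    (time, population size) pairs and sample_times lists sampling events.
    Sampling does not remove the sampled individual.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    trajectory = [(0.0, n0)]
    sample_times = []
    per_capita = lam + mu + psi
    while n > 0 and per_capita > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(n * per_capita)
        if t >= t_max:
            break
        u = rng.random() * per_capita
        if u < lam:            # birth: X -> 2X
            n += 1
            trajectory.append((t, n))
        elif u < lam + mu:     # death: X -> 0
            n -= 1
            trajectory.append((t, n))
        else:                  # sampling: population unchanged
            sample_times.append(t)
    return trajectory, sample_times


if __name__ == "__main__":
    traj, samples = simulate_bds(2.0, 1.0, 0.5, t_max=3.0, seed=1)
    print(len(traj), "population changes,", len(samples), "samples")
```

Because all three rates are linear in the population size, the event type can be drawn from the per-capita rates alone, independently of $n$.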

The reconstructed birth-death-sampling tree

Stadler (2010)

Left-hand tree is full tree, right-hand tree is equivalent "reconstructed" tree.
How can we compute the probability of such a tree under the BDS process?

Flavour of derivation

Let $p_0(t)$ be the probability that an individual alive at time $t$ (measured backwards from the present) has no sampled descendants. Then: \begin{align*} p_0(t+\Delta t) \simeq{} & p_0(t)(1-\Delta t\,(\lambda + \mu + \psi))\\ & + \Delta t\,\mu + \Delta t\,\lambda p_0(t)^2 \end{align*} and so $$\dot{p}_0(t) = -(\lambda + \psi + \mu)p_0(t) + \mu + \lambda p_0(t)^2$$
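
The equation for $p_0$ can be integrated numerically. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme and takes the quadratic term with the positive sign that the discrete-time balance implies ($+\lambda p_0^2$). The initial condition $p_0(0)=1$ (no sampling exactly at the present), the step count and the parameter values are assumptions made for illustration.

```python
import math


def p0_rhs(p, lam, mu, psi):
    """Right-hand side of dp0/dt = -(lam + mu + psi) p0 + mu + lam p0^2."""
    return -(lam + mu + psi) * p + mu + lam * p * p


def integrate_p0(t_end, lam, mu, psi, p_init=1.0, steps=10000):
    """Integrate p0 from 0 to t_end with classical RK4 (assumed p0(0) = 1)."""
    h = t_end / steps
    p = p_init
    for _ in range(steps):
        k1 = p0_rhs(p, lam, mu, psi)
        k2 = p0_rhs(p + 0.5 * h * k1, lam, mu, psi)
        k3 = p0_rhs(p + 0.5 * h * k2, lam, mu, psi)
        k4 = p0_rhs(p + h * k3, lam, mu, psi)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


def p0_limit(lam, mu, psi):
    """Smaller root of lam p^2 - (lam + mu + psi) p + mu = 0, the stable
    fixed point that p0 approaches for large t when psi > 0."""
    c = lam + mu + psi
    return (c - math.sqrt(c * c - 4.0 * lam * mu)) / (2.0 * lam)


if __name__ == "__main__":
    print(integrate_p0(50.0, 2.0, 1.0, 0.5), p0_limit(2.0, 1.0, 0.5))
```

Two sanity checks follow directly from the equation: with $\psi=0$ nothing is ever sampled and $p_0$ stays at 1, and with $\psi>0$ it decays from 1 towards the smaller root of the quadratic.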

Let $g_e(t)$ be the probability that the sampled tree below time $t$ on edge $e$ evolved as observed. Then: $$\dot{g}_e(t) = -(\lambda + \psi + \mu)g_e(t) + 2\lambda p_0(t)g_e(t)$$ where

$$g_e(s)=\left\{\begin{array}{ll} \lambda g_{e_1}(s_1)g_{e_2}(s_2) & \text{ if $e$ has two sampled desc.}\\ \psi g_{e_1}(s_1) & \text{ if $e$ has one sampled desc.}\\ \psi p_0(s_1) & \text{ if $e$ has no sampled desc.} \end{array}\right.$$
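
Along a single edge, $g_e$ can be propagated together with $p_0$ by integrating both equations jointly. The following sketch assumes the edge runs from time `t_start` to `t_end` with no events in between, that the values `p_start` and `g_start` at `t_start` are supplied by the caller (for example from the boundary conditions above), and that $p_0$ obeys $\dot{p}_0 = -(\lambda+\mu+\psi)p_0 + \mu + \lambda p_0^2$. The function names are hypothetical.

```python
import math


def rhs(p, g, lam, mu, psi):
    """Coupled right-hand sides for p0 and g_e."""
    c = lam + mu + psi
    dp = -c * p + mu + lam * p * p
    dg = (-c + 2.0 * lam * p) * g
    return dp, dg


def integrate_edge(t_start, t_end, p_start, g_start, lam, mu, psi, steps=2000):
    """Propagate (p0, g_e) along one edge with classical RK4."""
    h = (t_end - t_start) / steps
    p, g = p_start, g_start
    for _ in range(steps):
        k1p, k1g = rhs(p, g, lam, mu, psi)
        k2p, k2g = rhs(p + 0.5 * h * k1p, g + 0.5 * h * k1g, lam, mu, psi)
        k3p, k3g = rhs(p + 0.5 * h * k2p, g + 0.5 * h * k2g, lam, mu, psi)
        k4p, k4g = rhs(p + h * k3p, g + h * k3g, lam, mu, psi)
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        g += h * (k1g + 2.0 * k2g + 2.0 * k3g + k4g) / 6.0
    return p, g


if __name__ == "__main__":
    print(integrate_edge(0.0, 2.0, 1.0, 1.0, lam=1.5, mu=0.5, psi=0.2))
```

A full tree likelihood would apply this edge by edge from the tips towards the root, combining values at each node with the boundary conditions; that traversal is not shown here.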

Comparison with the coalescent

Stadler et al. (2015)

Birth-death trees comparable to coalescent trees under exponential growth.
Coalescent approximation breaks down when population sizes are small compared to ancestral lineage count.
Figure on right shows CDFs for coalescent times under different models including coalescent (blue) and birth-death model (black).

Non-linear birth-death tree priors

If the complete population trajectory (the sequence of birth and death events) is known, the probability of the tree is easy to compute:

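Here $N$ is the population size and $k$ the number of tree lineages at a birth event; $p_c=1/\binom{N}{2}$ is the probability that a particular pair of lineages coalesces at that event, and $p_{nc}=1-\binom{k}{2}/\binom{N}{2}$ is the probability that no pair of the $k$ lineages coalesces.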
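
As a rough illustration of how these two factors combine, the sketch below accumulates the log-probability contributed by the birth events of a known trajectory. It is an assumption-laden simplification: each event is given as a hypothetical tuple `(N, k, coalesces)`, $N$ is taken to be the population size from which the pair is drawn, and any contributions from death or sampling events are not modeled.

```python
import math


def log_tree_prob_given_births(birth_events):
    """Sum the log of p_c or p_nc over the birth events of a trajectory.

    birth_events: iterable of (N, k, coalesces) with N >= 2 the population
    size, k <= N the number of tree lineages, and coalesces True if the
    tree has a coalescence at this birth event.
    """
    logp = 0.0
    for N, k, coalesces in birth_events:
        pairs_total = math.comb(N, 2)
        if coalesces:
            # A specific pair of lineages is the one involved in the birth.
            logp += math.log(1.0 / pairs_total)
        else:
            # None of the binom(k, 2) lineage pairs is involved.
            p_nc = 1.0 - math.comb(k, 2) / pairs_total
            if p_nc <= 0.0:
                return float("-inf")
            logp += math.log(p_nc)
    return logp


if __name__ == "__main__":
    events = [(10, 3, False), (9, 3, True), (8, 2, False), (7, 2, True)]
    print(log_tree_prob_given_births(events))
```

When every individual in the population is a tree lineage ($k=N$), a birth event without a coalescence is impossible and the function returns negative infinity.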

Use particle filtering to simulate trajectories conditioned on the tree and to estimate the tree probability.

Non-linear birth-death tree priors

Using this in MCMC lets us jointly infer the tree and the nonlinear birth-death trajectory:

Summary

Birth-death processes (and their corresponding branching processes) can be used to describe the generative process behind tree formation.
In certain limits they yield the coalescent distribution for the sampled genealogy.
Stadler et al. have shown that they can be used directly and without approximation to derive model-based tree priors that:
- properly describe the effects of stochastic fluctuations in small populations, and
- explicitly take sampling processes into consideration.
Extension to nonlinear birth-death process descriptions requires numerical methods such as particle filtering or numerical integration of master equations.

Further reading