General and affine gap penalties
Lecturer: Alexei Drummond
Global alignment using general gap scores
- We can also easily modify the Needleman–Wunsch algorithm
to compute optimal alignments with general gap scoring.
- Let $s(x_i , y_i)$ be the substitution score for characters $x_i$ and $y_i$.
- If $x_1, x_2, \dots , x_n$ and $y_1, y_2, \dots, y_m$ are two strings then we can
compute the best alignment $F(n,m)$ using:
\begin{align*}
F(i,j) = \max\left\{\begin{array}{l}
F(i-1,j-1) + s(x_i,y_j)\\
F(i,j-k) + \gamma(k), \; k= 1,\dots,j\\
F(i-k,j) + \gamma(k), \; k= 1,\dots,i
\end{array}\right.
\end{align*}
$$F(k,0) = F(0,k) = \gamma(k)$$
But, the running time is now cubic: $\Theta(nm(n + m))$ !
Global alignment using affine gap scores
An efficient set of recursions to compute affine gap
scores for strings $x_1,x_2,\dots,x_n$ and $y_1,y_2,\dots,y_m$ is now
$\max\{M(n,m),I_x(n,m),I_y(n,m)\}$ where:
\begin{align*}
M(i,j) = \max\left\{\begin{array}{l}
M(i-1,j-1) + s(x_i,y_j)\\
I_x(i-1,j-1) + s(x_i,y_j)\\
I_y(i-1,j-1) + s(x_i,y_j)
\end{array}\right.
\end{align*}
\begin{align*}
I_x(i,j) = \max\left\{\begin{array}{l}
M(i-1,j) - d\\
I_x(i-1,j) - e
\end{array}\right.
\end{align*}
\begin{align*}
I_y(i,j) = \max\left\{\begin{array}{l}
M(i,j-1) - d\\
I_y(i,j-1) - e
\end{array}\right.
\end{align*}
$$I_x(k,0) = I_y(0,k) = \gamma(k) = -d - (k-1)e,\;\; k>0$$