| Algorithms | 
| Longest Common Subsequence | 
Another problem that has a dynamic solution is that of finding the longest common subsequence.
This is a good example of the technique of dynamic programming, which is the following very simple idea: start with a recursive algorithm for the problem, which may be inefficient because it calls itself repeatedly on a small number of subproblems. Simply remember the solution to each subproblem the first time you compute it, then after that look it up instead of recomputing it. The overall time bound then becomes (typically) proportional to the number of distinct subproblems rather than the larger number of recursive calls.
As we'll see, there are two ways of doing dynamic programming, top down and bottom-up. The top down (memoizing) method is closer to the original recursive algorithm, so easier to understand, but the bottom up method is usually a little more efficient.
Why might we want to solve the longest common subsequence problem? There are several motivating applications.
(As an aside, it is natural to define a similar longest common substring problem, asking for the longest substring that appears in two input strings. This problem can be solved in linear time using a data structure known as the suffix tree but the solution is extremely complicated.)
Let's start with some simple observations about the LCS problem. If we have two strings, say "nematode knowledge" and "empty bottle", we can represent a subsequence as a way of writing the two so that certain letters line up:
    n e m a t o d e     k n o  w  l e d g e
      | |   |        |      |     | |
      e m p t y           b o t t l e
    If we draw lines connecting the letters in the first string to the
    corresponding letters in the second, no two lines cross (the top and
    bottom endpoints occur in the same order, the order of the letters
    in the subsequence). Conversely any set of lines drawn like this,
    without crossings, represents a subsequence.
    From this we can observe the following simple fact: if the two strings start with the same letter, it's always safe to choose that starting letter as the first character of the subsequence. This is because, if you have some other subsequence, represented as a collection of lines as drawn above, you can "push" the leftmost line to the start of the two strings, without causing any other crossings, and get a representation of an equally-long subsequence that does start this way.
On the other hand, suppose that, like the example above, the two first characters differ. Then it is not possible for both of them to be part of a common subsequence - one or the other (or maybe both) will have to be removed.
Finally, observe that once we've decided what to do with the first characters of the strings, the remaining subproblem is again a longest common subsequence problem, on two shorter strings. Therefore we can solve it recursively.
Rather than finding the subsequence itself, it turns out to be more efficient to find the length of the longest subsequence. Then in the case where the first characters differ, we can determine which subproblem gives the correct solution by solving both and taking the max of the resulting subsequence lengths. Once we turn this into a dynamic program we will see how to get the sequence itself.
These observations give us the following, very inefficient, recursive algorithm.
Recursive LCS:
    int lcs_length(char * A, char * B)
    {
	if (*A == '\0' || *B == '\0') return 0;
	else if (*A == *B) return 1 + lcs_length(A+1, B+1);
	else return max(lcs_length(A+1,B), lcs_length(A,B+1));
    }
    This is a correct solution but it's very time consuming. For
    example, if the two strings have no matching characters, so the last
    line always gets executed, the the time bounds are binomial
    coefficients, which (if m=n) are close to 2n .
    The dynamic programming solution is to check whenever we want to solve a subproblem, whether we've already done it before. If so we look up the solution instead of recomputing it. Implemented in the most direct way, we just add some code to our recursive algorithm to do this look up -- this "top down", recursive version of dynamic programming is known as "memoization".
In the LCS problem, subproblems consist of a pair of suffixes of the two input strings. To make it easier to store and look up subproblem solutions, I'll represent these by the starting positions in the strings, rather than (as I wrote it above) character pointers.
Recursive LCS with indices:
    char * A;
    char * B;
    int lcs_length(char * AA, char * BB)
    {
	A = AA; B = BB;
	return subproblem(0, 0);
    }
    int subproblem(int i, int j)
    {
	if (A[i] == '\0' || B[j] == '\0') return 0;
	else if (A[i] == B[j]) return 1 + subproblem(i+1, j+1);
	else return max(lcs_length(i+1, j), subproblem(i, j+1));
    }
    Now to turn this into a dynamic programming algorithm we need only
    use an array to store the subproblem results. When we want the
    solution to a subproblem, we first look in the array, and check if
    there already is a solution there. If so we return it; otherwise we
    perform the computation and store the result. In the LCS problem, no
    result is negative, so we'll use -1 as a flag to tell the algorithm
    that nothing has been stored yet.
    Memoizing LCS:
    char * A;
    char * B;
    array L;
    int lcs_length(char * AA, char * BB)
    {
	A = AA; B = BB;
	allocate storage for L;
	for (i = 0; i <= m; i++)
	    for (j = 0; j <= m; j++)
		L[i,j] = -1;
	return subproblem(0, 0);
    }
    int subproblem(int i, int j)
    {
	if (L[i,j] < 0) {
	    if (A[i] == '\0' || B[j] == '\0') L[i,j] = 0;
	    else if (A[i] == B[j]) L[i,j] = 1 + subproblem(i+1, j+1);
	    else L[i,j] = max(subproblem(i+1, j), subproblem(i, j+1));
	}
	return L[i,j];
    }
    Time analysis: each call to subproblem takes constant time. We call
    it once from the main routine, and at most twice every time we fill
    in an entry of array L. There are (m+1)(n+1) entries, so the total
    number of calls is at most 2(m+1)(n+1)+1 and the time is O(mn).
    As usual, this is a worst case analysis. The time might sometimes be better, if not all array entries get filled out. For instance if the two strings match exactly, we'll only fill in diagonal entries and the algorithm will be fast.
Iterative LCS:
    int lcs_length(char * A, char * B)
    {
	allocate storage for array L;
	for (i = m; i >= 0; i--)
	    for (j = n; j >= 0; j--)
	    {
		if (A[i] == '\0' || B[j] == '\0') L[i,j] = 0;
		else if (A[i] == B[j]) L[i,j] = 1 + L[i+1, j+1];
		else L[i,j] = max(L[i+1, j], L[i, j+1]);
	    }
	return L[0,0];
    }
    Advantages of this method include the fact that iteration is usually
    faster than recursion, we don't need to initialize the matrix to all
    -1's, and we save three if statements per iteration since we don't
    need to test whether L[i,j], L[i+1,j], and L[i,j+1] have already
    been computed (we know in advance that the answers will be no, yes,
    and yes). One disadvantage over memoizing is that this fills in the
    entire array even when it might be possible to solve the problem by
    looking at only a fraction of the array's cells.
        sequence S = empty;
    i = 0;
    j = 0;
    while (i < m && j < n)
    {
	if (A[i]==B[j])
	{
	    add A[i] to end of S;
	    i++; j++;
	}
	else if (L[i+1,j] >= L[i,j+1]) i++;
	else j++;
    }
    Let's see an example of this. Here's the array for the earlier
    example:
            n  e  m  a  t  o  d  e  _  k  n  o  w  l  e  d  g  e
    e   7  7  6  5  5  5  5  5  4  3  3  3  2  2  2  1  1  1  0
    m   6  6  6  5  5  4  4  4  4  3  3  3  2  2  1  1  1  1  0
    p   5  5  5  5  5  4  4  4  4  3  3  3  2  2  1  1  1  1  0
    t   5  5  5  5  5  4  4  4  4  3  3  3  2  2  1  1  1  1  0
    y   4  4  4  4  4  4  4  4  4  3  3  3  2  2  1  1  1  1  0
    _   4  4  4  4  4  4  4  4  4  3  3  3  2  2  1  1  1  1  0
    b   3  3  3  3  3  3  3  3  3  3  3  3  2  2  1  1  1  1  0
    o   3  3  3  3  3  3  3  3  3  3  3  3  2  2  1  1  1  1  0
    t   3  3  3  3  3  2  2  2  2  2  2  2  2  2  1  1  1  1  0
    t   3  3  3  3  3  2  2  2  2  2  2  2  2  2  1  1  1  1  0
    l   2  2  2  2  2  2  2  2  2  2  2  2  2  2  1  1  1  1  0
    e   1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  0
        0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    (you can check that each entry is computed correctly from the
    entries below and to the right.)
    To find the longest common subsequence, look at the first entry L[0,0]. This is 7, telling us that the sequence has seven characters. L[0,0] was computed as max(L[0,1],L[1,0]), corresponding to the subproblems formed by deleting either the "n" from the first string or the "e" from the second. Deleting the "n" gives a subsequence of length L[0,1]=7, but deleting the "e" only gives L[1,0]=6, so we can only delete the "n". Now let's look at the entry L[0,1] coming from this deletion. A[0]=B[1]="e" so we can safely include this "e" as part of the subsequence, and move to L[1,2]=6. Similarly this entry gives us an "m" in our sequence. Continuing in this way (and breaking ties as in the algorithm above, by moving down instead of across) gives the common subsequence "emt ole".
So we can find longest common subsequences in time O(mn). Actually, if you look at the matrix above, you can tell that it has a lot of structure -- the numbers in the matrix form large blocks in which the value is constant, with only a small number of "corners" at which the value changes. It turns out that one can take advantage of these corners to speed up the computation.
       n e m a t o d e _ k n o w l e d g e
    e  o-o o-o-o-o-o-o o-o-o-o-o-o-o o-o-o o
       |  \| | | | |  \| | | | | |  \| |  \|
    m  o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | |  \| | | | | | | | | | | | | | | |
    p  o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | | | | | | | | | | | | | | | | | | |
    t  o-o-o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | | | |  \| | | | | | | | | | | | | |
    y  o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | | | | | | | | | | | | | | | | | | |
    _  o-o-o-o-o-o-o-o-o o-o-o-o-o-o-o-o-o-o
       | | | | | | | |  \| | | | | | | | | |
    b  o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | | | | | | | | | | | | | | | | | | |
    o  o-o-o-o-o-o o-o-o-o-o-o o-o-o-o-o-o-o
       | | | | |  \| | | | |  \| | | | | | |
    t  o-o-o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | | | |  \| | | | | | | | | | | | | |
    t  o-o-o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o
       | | | |  \| | | | | | | | | | | | | |
    l  o-o-o-o-o-o-o-o-o-o-o-o-o-o o-o-o-o-o
       | | | | | | | | | | | | |  \| | | | |
    e  o-o o-o-o-o-o-o o-o-o-o-o-o-o o-o-o o
       |  \| | | | |  \| | | | | |  \| |  \|
       o o o o o o o o o o o o o o o o o o o
    Then if you look at any path in the graph, the diagonal edges form a
    subsequence of the two strings. Conversely, if you define the
    horizontal and vertical edges to have length zero, and the diagonal
    edges to have length one, the longest common subsequence corresponds
    to the longest path from the top left corner to one of the bottom
    right vertices. This graph is acyclic, so we can compute longest
    paths in time linear in the size of the graph, here O(mn).
    Where did these edge lengths come from? They're just the amount by which the LCS length increases compared to the length at the corresponding subproblem. If A[i]=B[j], then L[i,j]=L[i+1,j+1]+1, and we use that last "+1" as the edge length. Otherwise, L[i,j]=max(L[i+1,j],L[i,j+1])+0, so we use zero as the edge length.
Cormen, Section 16.3
       Lecture notes by Kirk Pruhs, University of
        Pittsburgh
       Pseudo-code from John Stasko's notes for CS3158 at
        Georgia Tech 
| Key terms |