Algorithms: Matrix Chain Multiplication

Algorithms

Matrix Chain Multiplication

Problem

We are given a sequence of matrices to multiply: A₁ A₂ A₃ ... A_n Matrix multiplication is associative, so A₁ ( A₂ A₃ ) = ( A₁ A₂ ) A₃ that is, we can can generate the product in two ways.

The cost of multiplying an nxm by an mxp one is O(nmp) (or O(n³) for two nxn ones). A poor choice of parenthesization can be expensive: eg if we have

Matrix	Rows	Columns
A₁	10	100
A₂	100	5
A₃	5	50

the cost for ( A₁ A₂ ) A₃ is

A₁A₂	10x100x5	= 5000	=> A₁ A₂ (10x5)
(A₁A₂) A₃	10x5x50	= 2500	=> A₁A₂A₃ (10x50)
Total		7500

but for A₁ ( A₂ A₃ )

A₂A₃	100x5x50	= 25000	=> A₂A₃ (100x50)
A₁(A₂A₃)	10x100x50	= 50000	=> A₁A₂A₃ (10x50)
Total		75000

Clearly demonstrating the benefit of calculating the optimum order before commencing the product calculation!

Optimal Sub-structure

As with the optimal binary search tree, we can observe that if we divide a chain of matrices to be multiplied into two optimal sub-chains: (A₁ A₂ A₃ ... A_j) (A_j+1 ... A_n ) then the optimal parenthesisations of the sub-chains must be composed of optimal chains. If they were not, then we could replace them with cheaper parenthesisations.

This property, known as optimal sub-structure is a hallmark of dynamic algorithms: it enables us to solve the small problems (the sub-structure) and use those solutions to generate solutions to larger problems.

For matrix chain multiplication, the procedure is now almost identical to that used for constructing an optimal binary search tree. We gradually fill in two matrices,

one containing the costs of multiplying all the sub-chains. The diagonal below the main diagonal contains the costs of all pair-wise multiplications: cost[1,2] contains the cost of generating product A₁A₂, etc. The diagonal below that contains the costs of triple products: eg cost[1,3] contains the cost of generating product A₁A₂A₃, which we derived from comparing cost[1,2] and cost[2,3], etc.
one containing the index of last array in the left parenthesisation (similar to the root of the optimal sub-tree in the optimal binary search tree, but there's no root here - the chain is divided into left and right sub-products), so that best[1,3] might contain 2 to indicate that the left sub-chain contains A₁A₂ and the right one is A₃ in the optimal parenthesisation of A₁A₂A₃.

As before, if we have n matrices to multiply, it will take O(n) time to generate each of the O(n²) costs and entries in the best matrix for an overall complexity of O(n³) time at a cost of O(n²) space.

Key terms

optimal sub-structure: a property of optimization problems in which the sub-problems which constitute the solution to the problem itself are themselves optimal solutions to those sub-problems. This property permits the construction of dynamic algorithms to solve the problem.

Continue on to Longest Common Subsequence