Introduction
Consider the matrix
The first few powers are
The Fibonacci sequence is the sequence you get by starting with \(0, 1\) and after that always forming the next number by adding the two previous ones: \(F_0, F_1, F_2, F_3, F_4, F_5, F_6, F_7, ...\) = \(0, 1, 1, 2, 3, 5, 8, 13, ...\).
The matrix powers are generating the Fibonacci sequence:
So if there were a way to compute the \(\nth\) power of that matrix "directly", that would also be a way to compute the \(\nth\) Fibonacci number "directly", i.e. without computing all the preceding Fibonacci numbers en route.
How can we do this? To state the problem in a different way, we need to construct a new matrix that performs exactly the same transformation as \(A^n\), but which somehow does the exponentiation step "in one go" rather than by multiplying \(A\) with itself \(n\) times.
Solution outline
Matrices represent linear transformations, so we can talk about them as taking in some vector and producing some other vector. The approach we're going to take is to re-express the \(A^n\) transformation as follows:
- Convert the input vector to its representation in an alternative basis which uses the eigenvectors as the basis vectors (it's called an "eigenbasis").
- In this alternative basis, compute the new position of the vector after carrying out the \(A^n\) transformation.
- Convert the resulting vector back to its representation in our original basis.
I.e., we're going to compute the overall transformation as this product of matrices (remember that one reads these things right-to-left):
The crux of all this is that the exponentiation is efficient in the eigenbasis. That's because, in the eigenbasis, the transformation is just stretching space in the directions of the two basis vectors. So to do the transformation \(n\) times in the eigenbasis, you just stretch by the stretch-factor raised to the \(\nth\) power, rather than doing \(n\) matrix multiplications.
Solution details
Let's suppose we've already found the eigenvectors, and that there are two of them, and that we've arranged them as the two columns of a matrix \(V\). \(V\) holds the basis vectors of the alternative basis, and therefore we know from the change of basis notes that \(V\) is the matrix that takes as input a vector expressed in the alternative basis and outputs its representation in our basis.
So, step (3) is done by \(V\), and step (1) is done by \(V^{-1}\), and the matrix performing all three steps is going to look like
OK, so what is the matrix in the middle? The change of basis notes tell us that we can compute it as
In other words the matrix in the middle is
and the entire transformation is
Put back into words, that's
Recall that above we observed that the \(\nth\) power of \(A\) is a matrix with the nth Fibonacci number in its bottom left and top right entries. So the following tasks remain:
- Find the eigenvectors and put them in a matrix \(V\).
- Find the inverse of \(V\).
- Compute the matrix product \(V^{-1}AV\).
- Compute the result of raising that to the \(\nth\) power.
- Plug the result of that into the overall expression.
- Take the entry in the bottom left or top right (they should be the same!).
The result should be an expression giving the \(\nth\) Fibonacci number as a function of \(n\). It should be possible to give as input to that function the number one million, and have it output the one millionth Fibonacci number directly, without it having to go through the preceding 999,999 Fibonacci numbers.
The answer without showing the calculations
Does this actually work?
Yes.
from math import sqrt def fib(n): return ( ( (1 + sqrt(5))**n - (1 - sqrt(5))**n ) / float(2**n * sqrt(5))) for i in range(10): print i, fib(i) 0 0.0 1 1.0 2 1.0 3 2.0 4 3.0 5 5.0 6 8.0 7 13.0 8 21.0 9 34.0
History
The formula is known as Binet's formula (1843) but was apparently known to Euler, Daniel Bernoulli and de Moivre more than a century earlier. It can be derived without using linear algebra techniques; I don't know when the style of proof attempted here would first have been done. The result can be written as
where \(\phi = \frac{1+\sqrt{5}}{2}\) is the golden ratio.
Calculations
1. Find the eigenvectors
We follow the textbook approach: We have
An eigenvector \(v\) satisfies \(Av = \lambda v\) for some scalar \(\lambda\). That equation can be rearranged as follows
which means that the matrix \(A - \lambda I\) is a transformation that takes some non-zero vector \(\vec v\) to the zero vector (i.e. it has a non-empty "null space"). This means that the transformation cannot be reversed, i.e. the matrix has no inverse, i.e. its determinant is zero. So, use that last fact to find the eigenvectors \(\lambda\):
Using the quadratic formula we have \(a=1, b=-1, c=-1\) and
which are the two eigenvalues.
To find eigenvectors associated with the eigenvalues, go back to the equations
Let an eigenvector \(v\) be \(\scvec{v_1}{v_2}\). The matrix equation corresponds to this system of equations:
From the first equation we have \(v_2 = \lambda v_1\). There are infinitely many eigenvectors (a line of them) associated with any given eigenvalue, so we can pick an arbitrary value for \(v_1\). If we choose \(v_1=2\) then we have eigenvectors \(\scvec{2}{1+\sqrt 5}\) and \(\scvec{2}{1-\sqrt 5}\). The matrix containing the eigenvectors is
2. Find inverse of \(V\)
The inverse of a 2x2 matrix is given by
where \(\text{det} = ad - cb\). Therefore
3. Find the matrix product \(V^{-1}AV\)
Before we get lost in the calculation, let's remember what this is. It's a matrix that does the \(A\) transformation, but in the coordinate system defined by \(A\)'s eigenvectors. So, the resulting matrix must do nothing other than stretch space in the direction of one or both basis vectors in that coordinate system. That's because (1) we represent a transformation with a matrix saying where each of the basis vectors are taken to, (2) the definition of an eigenvector of a transformation is that it is a vector which is simply stretched by the transformation with no change in direction, therefore (3) if the eigenvectors are the basis vectors, then the matrix representing the transformation must just stretch space in the two directions. A matrix which stretches space in the direction of the basis vectors looks like \(\smat{a}{0}{0}{b}\), i.e. it is diagonal. Therefore, \(V^{-1}AV\) must be diagonal.
4. Compute \((V^{-1}AV)^n\)
The matrix is diagonal so this is straightforward. Note that this is the whole point of converting to the eigenbasis: the exponentiation at this step just involves the usual operations of raising scalar numbers to a power; no need to multiply matrices together. A computer will be able to compute the \(\nth\) power of a diagonal matrix much faster than that of a non-diagonal matrix.