Solving Recurrence Relations
Want to share your content on python-bloggers? click here.
Introduction
A neat bit of “engineering mathematics” is solving recurrence relations. The solution method falls out of the notation itself, and harkens back to a time where formal sums were often used in place of vector subscript notation.
Unfortunately the variety of such solutions is small enough to allow teaching by memorization. In this note I try to avoid memorization, and motivate the methodology. I feel this is facilitated by separating a number of often smeared together representations (formulas, sequences, vectors, linear checks, characteristic polynomial, and polynomial check families) into distinct realizations. We are also going to emphasize calculating and confirming claims in Python.
The problem
A simple form of the recurrence problem is to write down a general solution for a subscripted family of linear equations such as the following
Fn+2 = Fn+1 + Fn
where n
is a subscript varying over all positive integers.
Such a relation or equation can arise in number of situations or applications:
- Time series analysis.
- Estimating run times of algorithms.
- Combinatorics.
The above example is in fact the Fibonacci sequence.
The question is: if we are given initial conditions F1 = 1
and F2 = 1
, what is Fn
(for general non-negative integer n
)?
In this case the Wikipedia solution is Fn = (r1n - r2n)/sqrt(5)
where:
r1 = (1 + sqrt(5))/2
r2 = (1 - sqrt(5))/2
Natural questions include:
- Why is the solution in this form?
- How do you find
r1
andr2
?
We will set up some notation and then solve a few examples.
Vector space notation
Let’s formalize our notation a bit.
First let’s settle on working over the vector space of all infinite sequences of real numbers with non-negative subscripts. This is just saying we consider the infinite sequence F = (F1, F2, F3, ...)
we are solving for as one of many possible sequences. We can use “R[Z+]
” as the group-ring style naming of this vector space. Now consider any such vector v
that obeys the recurrence equations:
vn+2 = vn+1 + vn for n = 1, 2, ...
Let “S
” denote the subset of R[Z+]
that obey all of the above linear recurrence checks. We claim a few things about S
(the set of solutions to our current example system):
- The all zeroes vector is in
S
. SoS
is non-empty. - If
v
obeys all of the above constraints then so doesc v
for any constant c. - If
u
andv
obey all of the above constraints then so doesu + v
. - By induction, all
vn
are completely determined by the valuesv1
andv2
.
The first three observations tell us S
is a vector subspace of R[Z+]
. The fourth observation tells us this vector subspace is 2 dimensional, so any solution can be written as the linear combination of two basis solutions.
For notational convenience we will associate functions with vectors in R[Z+]
. For a function f()
let [f(n) | n = 1, 2, ...]
denote the graph or vector (f(1), f(2), f(3), ...)
. In particular we are interested in the vectors [r1n | n = 1, 2, ...] = (r1, r12, r13, ...)
and [r2n | n = 1, 2, ...] = (r2, r22, r23, ...)
(for the previously defined r1
and r2
). We claim [r1n | n = 1, 2, ...]
and [r2n | n = 1, 2, ...]
both obey the above linear check conditions.
# define [r1**n | n = 1, 2, ...] = f_r_1
import numpy as np
r_1 = (1 + np.sqrt(5))/2
f_r_1 = np.asarray([r_1**n for n in range(1, 11)])
f_r_1
array([ 1.61803399, 2.61803399, 4.23606798, 6.85410197, 11.09016994, 17.94427191, 29.03444185, 46.97871376, 76.01315562, 122.99186938])
# check f_r_1 obeys specified Fibonacci relations
assert np.all([
np.abs(f_r_1[n+2] - (f_r_1[n+1] + f_r_1[n])) < 1e-8
for n in range(2, len(f_r_1) - 2)
])
# define [r2**n | n = 1, 2, ...] = f_r_2
r_2 = (1 - np.sqrt(5))/2
f_r_2 = np.asarray([r_2**n for n in range(1, 11)])
f_r_2
array([-0.61803399, 0.38196601, -0.23606798, 0.14589803, -0.09016994, 0.05572809, -0.03444185, 0.02128624, -0.01315562, 0.00813062])
# check f_r_2 obeys specified Fibonacci relations
assert np.all([
np.abs(f_r_2[n+2] - (f_r_2[n+1] + f_r_2[n])) < 1e-8
for n in range(2, len(f_r_2) - 2)
])
# generate a bit of the Fibonacci sequence using
# the definitional recurrence F[n+2] = F[n+1] + F[n]
f = [1, 1]
for i in range(8):
n = len(f) - 2
f.append(f[n + 1] + f[n])
f
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
# Wikipedia claimed solution
prediction = (f_r_1 - f_r_2) / np.sqrt(5)
prediction
array([ 1., 1., 2., 3., 5., 8., 13., 21., 34., 55.])
# confirm claimed combination matches Fibonacci sequence
assert np.max(np.abs(
np.asarray(f) - prediction
)) < 1e-8
# we can also solve for the mixture coefficients by
# treating our sequences as column vectors and asking
# a solver how to write the vector f as a linear
# combination of the vectors f_r_1 and f_r_2.
#
# As a linear system in the first 2 terms this looks like:
#
# f[0] = soln[0] * f_r_1[0] + soln[1] * f_r_1[0]
# f[1] = soln[0] * f_r_1[1] + soln[1] * f_r_1[1]
#
# And such a linear system is translated into
# matrix notation as follows:
soln = np.linalg.solve(
np.asarray([
[f_r_1[0], f_r_2[0]],
[f_r_1[1], f_r_2[1]],
]),
[
f[0],
f[1],
]
)
soln
array([ 0.4472136, -0.4472136])
# our (identical) derived solution
prediction2 = (
soln[0] * f_r_1
+ soln[1] * f_r_2
)
prediction2
array([ 1., 1., 2., 3., 5., 8., 13., 21., 34., 55.])
# confirm claimed combination matches Fibonacci sequence
assert np.max(np.abs(
np.asarray(f) - prediction2
)) < 1e-8
The neat trick
The core of the solution follows from a neat trick: replace the subscripts with superscripts. This is very powerful trick. Let’s see that in action.
We gamble and hope some of the solutions are of the following simple form: [rn | n = 1, 2, ...] = (r, r2, r3, ...)
, where r
is a to be determined (possibly complex) number.
Our claim is: if the number r
is a solution to the polynomial equation (in x
)
x2 = x + 1
then v = [rn | n = 1, 2, ...]
satisfies
vn+2 = vn+1 + vn for n = 1, 2, ...
.
The polynomial is called “the characteristic polynomial” of the linear recurrence checks.
The “trick” to this is: if x = r
is a root of, or satisfies, x2 = x + 1
, then it also satisfies xn+2 = xn+1 + xn
(which is equivalent to xn x2 = xn x + xn 1
) for all n ≥ 0
. So rn+2 = rn+1 + rn
for all n ≥ 0
which is exactly the claim v = [rn | n = 1, 2, ...]
is a solution to all of the subscripted v
-checks. It pays to think of the characteristic polynomial p(x)
as shorthand for the family of “check polynomials” xn p(x) for n = 0, 1, 2, ...
. However, for some problems we will need to appeal directly to the family of check polynomials.
In our case, the roots, or solutions, to this polynomial equation are the roots x2 - x - 1 = 0
. By the quadratic formula:
r1 = (1 + sqrt(5))/2
r2 = (1 - sqrt(5))/2
This gives us two linearly independent solutions to the recurrence check equations: [r1n | n = 1, 2, ...]
and [r2n | n = 1, 2, ...]
. These solutions are enough to form a basis for the solution space S
, so we know F
is some linear combination of [r1n | n = 1, 2, ...]
and [r2n | n = 1, 2, ...]
. And we have already showed how to find the linear combination by setting up linear equations to match the first two entries of the vector.
A harder example
Suppose we want to solve the recurrence:
Wn+2 = 6 Wn+1 - 9 Wn for n = 1, 2, ...
The above are the “W
checks” we now want to satisfy. A solution to these is a specific vector of values we can substitute in for the W
‘s, such that none of the equations are false.
We will try the earlier solution strategy. We are then interested in roots of the corresponding characteristic polynomial:
x2 - 6 x + 9 = 0
The above polynomial factors into (x - 3)2
. So r1 = r2 = 3
. So we know [3n | n = 1, 2, ...]
is a solution to the W
checks.
The space of solutions is again 2 dimensional. So to parameterize over all the possible solutions, we need a second linear independent solution. The question then is: how do we find a second linear independent solution?
Dealing with repeated roots
When our characteristic polynomial p(x)
has repeated roots, the characteristic polynomial is not expressive enough to represent some solutions. However, the corresponding family of check polynomials xn p(x) for n = 0, 1, ...
are rich enough to find the extra solutions. We will use that when a polynomial p(x)
has a repeated root (that is: for some number r
, the fact that (x-r)2
divides into p(x)
with no remainder), then p(x)
shares that root with p'(x)
(where p'(x)
is the derivative of p(x)
with respect to x
).
Take the earlier “W
check” polynomials p(x) = xn+2 - 6 xn+1 + 9 xn
. Define the polynomial y(x) = x p'(x) = (n+2) xn+2 - 6 (n+1) xn+1 + 9 n xn
. y(x)
itself is (by inspection) the check polynomials corresponding to the following (new) linear recurrence checks:
(n+2) Yn+2 = 6 (n+1) Yn+1 - 9 (n) Yn for n = 1, 2, ...
As y(3) = 0
(true because 3
is a double root of p(x)
) we know [3n | n = 1, 2, ...]
is a solution to the new Y
linear recurrence checks. Substitute this valid Y
solution [3n | n = 1, 2, ...]
into the Y
checks to derive the following family of (true or valid) equations.
((n+2) 3n+2) = 6 ((n+1) 3n+1) - 9 ((n) 3n) for n = 1, 2, ...
Notice the above is saying [n 3n | n = 1, 2, ...]
obeys the earlier W
linear recurrence problem. We have our desired additional solution.
Confirming the last claim
Frankly, this sort of argument doesn’t fully sink in until one confirms some examples and calculations. The derivation is a bit of “proof by change of notation”, which is never very satisfying.
So: let’s confirm some calculations claimed in the example to try to build some familiarity with the items discussed.
# show down p(x) zeros out at x=3
r = 3
evals_p = [
r**(n+2) - 6 * r**(n+1) + 9 * r**n
for n in range(10)]
evals_p
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
assert np.all(np.abs(evals_p)) < 1e-8
# show down y(x) = x p'(x) zeros out at x=3
r = 3
evals_y = [
r * ((n+2) * r**(n+1) - 6 * (n+1) * r**(n) + 9 * n * r**(n-1))
for n in range(10)]
evals_y
[0.0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
assert np.all(np.abs(evals_y)) < 1e-8
# write down the first p = [3**n | n = 1, 2, ...] solution
p = np.asarray([
r**n for n in range(1, 11)
])
p
array([ 3, 9, 27, 81, 243, 729, 2187, 6561, 19683, 59049])
# confirm W checks on p'
assert np.all([
p[n+2] - 6 * p[n+1] + 9 * p[n] == 0
for n in range(1, len(p) - 2)])
# confirm Y checks on p
# this works as W is the "Y" checks for p'
assert np.all([
(n+2) * p[n+2] - 6 * (n+1) * p[n+1] + 9 * n * p[n] == 0
for n in range(1, len(p) - 2)])
In general
The general solution strategy is as follows.
For a general homogeneous linear recurrence of the form:
vn+k = ck-1 vn+k-1 + ... + c0 vn
Move to the characteristic polynomial equation:
xk = ck-1 xk-1 + ... + c0
We can generate a basis for all solutions as v = [ nk rn | n = 1, 2, ...]
where r
is a root of the characteristic polynomial, and k
is a non-negative integer smaller than the degree of multiplicity of the root r
. Once we have enough linear independent solutions, we can write any other solution as a linear combination of what we have.
We call all of the above the “swap subscripts (general and powerful) to powers (specific and weak)” strategy (though there are obviously a few more steps and details to the method).
Conclusion
Our solution strategy was exchanging powerful subscripts (which can implement any integer keyed lookup table) with less powerful superscripts (that can only express powers). We can lift any solution found in the weaker world (power series) to the more general one (subscripted sequences). We don’t always find enough power series solutions, and in that case we transform the problem to find more solutions to modified polynomials. The trick is to track the details of how the transforms operate on both our vector solutions and check polynomials.
The above system is general, it can solve a lot of problems. We concentrated on calculating over vector values. Related methods include the umbral calculus, the study of shift operators, and the Laplace transform.
(The source code for this worksheet can be found here)
Want to share your content on python-bloggers? click here.