2nd Difference Derivation & Proof

1. Introduction

During a math class in junior high school, I was introduced to a formula for finding the n^th term of a certain sequence with a constant second difference that went something like:

T_n = 1/2(n)² - 1/2(n) + 1, for a sequence 1, 2, 4, 7 ... and n >= 1.

The teacher never really explained it though, so it was just a mysterious formula to me at the time.

Later in college, I was inspired by a textbook I was reading (Chapter 1 of Concrete Mathematics by Graham, Knuth, and Patashnik, on recurrent problems) and remembered the mysterious formula, so I tried to derive it myself to better understand it.

2. What is a 2^nd Difference?

Say there is a sequence 1, 2, 4, 7 .... You may have come across quite a few of these types of sequences on IQ tests and whatnot, and it is easy to find that the fifth term is 11 and so on.

However, it is obvious at a glance that the sequence does not follow the familiar formulas for finding the n^th term of arithmetic or geometric sequences, so we cannot yet find an arbitrary n^th term of the sequence without first knowing the previous terms. Finding the general closed-form solution for this will be the goal of this derivation.

The difference or ratio between every term and its subsequent term is not constant, unlike what is needed for a + (n - 1)d or ar^(n-1) to work properly. Instead, the difference (which we will call the ‘first difference’) increases by a constant amount as the sequence goes on.

We can see this clearly if we construct a new sequence from the first differences:

T₂ - T₁ = 2 - 1 = 1;

T₃ - T₂ = 4 - 2 = 2;

T₄ - T₃ = 7 - 4 = 3;

…

so the sequence of first differences is:

1, 2, 3 ....

Unlike the original sequence, it is easy to see that this sequence has a constant difference of one, so the n^th term can be found using the familiar formula (albeit with some modifications to notation):

(d₁)_n = a₂ + (n - 1)d₂, for n >= 1,

where (d₁)_n is the n^th first difference, a₂ is the first term of the sequence of first differences (a₁ would be the first term of the original sequence), and d₂ is the constant difference between each term in the sequence of first differences (which we will call the ‘second difference’).
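As a quick sanity check, here is a minimal Python sketch that builds the first and second differences of the example sequence. The helper name `differences` is my own choice for illustration, not something from a library.

```python
def differences(seq):
    """Return the differences between each term and the term after it."""
    return [b - a for a, b in zip(seq, seq[1:])]

terms = [1, 2, 4, 7, 11]
first = differences(terms)    # sequence of first differences
second = differences(first)   # sequence of second differences

print(first)   # [1, 2, 3, 4]
print(second)  # [1, 1, 1]
```

The second differences are all equal, which is exactly what "constant second difference" means here.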

3. Derivation of the Formula

Taking inspiration from how we intuitively found that 11 was the fifth term in the original sequence earlier, we model the sequence as a recurrence problem:

T₁ = a₁;

T_n = T_n-1 + (d₁)_n-1, for n >= 2.

Note that it is (d₁)_n-1 and not (d₁)_n because the first term of the sequence of first differences is

(d₁)₁ = T₂ - T₁ = 2 - 1 = 1,

so it is

(d₁)_n = T_n+1 - T_n,

and not

(d₁)_n = T_n - T_n-1.
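The recurrence can be sketched directly in Python. This is only an illustration under the definitions above; the function names and the 1-based indexing helpers are my own assumptions.

```python
def first_difference(n, a2, d2):
    """(d1)_n = a2 + (n - 1) * d2, for n >= 1."""
    return a2 + (n - 1) * d2

def term_by_recurrence(n, a1, a2, d2):
    """T_1 = a1; T_n = T_(n-1) + (d1)_(n-1), for n >= 2."""
    t = a1
    for k in range(2, n + 1):
        # Step from T_(k-1) to T_k by adding the (k-1)-th first difference.
        t += first_difference(k - 1, a2, d2)
    return t

print([term_by_recurrence(n, 1, 1, 1) for n in range(1, 6)])  # [1, 2, 4, 7, 11]
```

Note the `k - 1` inside the loop: it mirrors the (d₁)_n-1 indexing discussed above.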

Moving on: we expand (or “unfold”) the recursive formula to try to spot a pattern we could use to find a closed-form solution:

T_n = T_n-1 + (d₁)_n-1

= T_n-2 + (d₁)_n-2 + (d₁)_n-1

= T_n-3 + (d₁)_n-3 + (d₁)_n-2 + (d₁)_n-1.

…

Although not particularly useful by itself, there is an easy to spot pattern:

T_n = T_n-k + (d₁)_n-k + (d₁)_n-(k-1) + (d₁)_n-(k-2) + … + (d₁)_n-2 + (d₁)_n-1, for 1 <= k < n.

Now if we unfold all the way (k = n - 1):

T_n = T_n-(n-1) + (d₁)_n-(n-1) + (d₁)_n-(n-2) + (d₁)_n-(n-3) + … + (d₁)_n-2 + (d₁)_n-1

= T₁ + (d₁)₁ + (d₁)₂ + (d₁)₃ + … + (d₁)_n-2 + (d₁)_n-1

= a₁ + (d₁)₁ + (d₁)₂ + (d₁)₃ + … + (d₁)_n-2 + (d₁)_n-1.
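The fully unfolded form says that T_n is the first term plus the first (n - 1) first differences. A small Python sketch of that sum, with hypothetical function names of my own, looks like this:

```python
def first_difference(k, a2, d2):
    """(d1)_k = a2 + (k - 1) * d2, for k >= 1."""
    return a2 + (k - 1) * d2

def term_by_unfolding(n, a1, a2, d2):
    """T_n = a1 + (d1)_1 + (d1)_2 + ... + (d1)_(n-1)."""
    return a1 + sum(first_difference(k, a2, d2) for k in range(1, n))

print([term_by_unfolding(n, 1, 1, 1) for n in range(1, 6)])  # [1, 2, 4, 7, 11]
```

For n = 1 the sum is empty, so the function returns a₁, matching T₁ = a₁.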

Substituting the formula for finding the n^th term in the sequence of first differences:

T_n = a₁ + (a₂ + (1 - 1)d₂) + (a₂ + (2 - 1)d₂) + (a₂ + (3 - 1)d₂) + … + (a₂ + ((n - 1) - 1)d₂)

= a₁ + a₂ + (1 - 1)d₂ + a₂ + (2 - 1)d₂ + … + a₂ + ((n - 2) - 1)d₂ + a₂ + ((n - 1) - 1)d₂.

Since it goes from (d₁)₁ to (d₁)_n-1, there will be (n - 1) number of a₂’s on the right-hand side of the equation:

T_n = a₁ + (n - 1)a₂ + (1 - 1)d₂ + (2 - 1)d₂ + … + ((n - 2) - 1)d₂ + ((n - 1) - 1)d₂.

Factoring d₂ out of the remaining open-form part, we can start to spot a pattern:

T_n = a₁ + (n - 1)a₂ + ((1 - 1) + (2 - 1) + … + ((n - 2) - 1) + ((n - 1) - 1))d₂.

Again, because it goes from (d₁)₁ to (d₁)_n-1, there will be (n - 1) number of (-1)’s:

T_n = a₁ + (n - 1)a₂ + ((1 + 2 + 3 + … + (n - 3) + (n - 2) + (n - 1)) + (n - 1)(-1))d₂.

Now the pattern is easier to see. The first group is the sum of the integers from 1 to (n - 1), which we will write as S_n-1:

T_n = a₁ + (n - 1)a₂ + (S_n-1 + (n - 1)(-1))d₂

= a₁ + (n - 1)a₂ + (S_n-1 - (n - 1))d₂.
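The regrouping of the d₂ coefficient is easy to get wrong by one, so here is a hedged Python check of that step alone. The names `S` and `coefficient_of_d2` are assumptions of mine, and `S` is computed term by term rather than with any closed form.

```python
def S(m):
    """Sum of the integers from 1 to m, added up term by term."""
    return sum(range(1, m + 1))

def coefficient_of_d2(n):
    """(1 - 1) + (2 - 1) + ... + ((n - 1) - 1), the factor multiplying d2."""
    return sum(k - 1 for k in range(1, n))

# The regrouped form S_(n-1) - (n - 1) should agree for every n we try.
for n in range(1, 20):
    assert coefficient_of_d2(n) == S(n - 1) - (n - 1)

print(coefficient_of_d2(5), S(4) - 4)  # 6 6
```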

Luckily for us, the great mathematician Gauss had already found a closed-form solution for the sum of integers from 1 to n, namely S_n = 1/2(n)(n + 1), apparently when he was around 9 years old (though the story is dubious). Using it with (n - 1) in place of n:

T_n = a₁ + (n - 1)a₂ + (1/2(n - 1)((n - 1) + 1) - (n - 1))d₂

= a₁ + (n - 1)a₂ + (1/2(n - 1)(n) - (n - 1))d₂

= a₁ + (n - 1)a₂ + 1/2(n - 2)(n - 1)d₂.

The above is the general closed-form formula for the n^th term of a sequence with a constant second difference. We can test it against the earlier sequence that we used as an example (1, 2, 4, 7 ...):

T_n = a₁ + (n - 1)a₂ + 1/2(n - 2)(n - 1)d₂

= 1 + (n - 1)(1) + 1/2(n - 2)(n - 1)(1), for a₁ = 1, a₂ = 1, d₂ = 1

= 1 + (n - 1) + 1/2(n - 2)(n - 1)

= n + 1/2(n - 2)(n - 1)

= n + 1/2(n² - 3n + 2)

= n + 1/2n² - 3/2(n) + 1

= 1/2(n)² - 1/2(n) + 1.

Which matches what I wrote in the introduction. You can try out a few examples to check for any counter-examples, but you will not know for certain if it will always be correct unless you prove it, which is what we will do in the next section.
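Trying out examples by hand gets tedious, so here is a minimal Python sketch that compares the closed form against the recurrence for a few parameter choices. The function names and the test parameters are my own, and `Fraction` is used only to keep the 1/2 exact.

```python
from fractions import Fraction

def closed_form(n, a1, a2, d2):
    """T_n = a1 + (n - 1) a2 + 1/2 (n - 2)(n - 1) d2."""
    return a1 + (n - 1) * a2 + Fraction((n - 2) * (n - 1), 2) * d2

def example_formula(n):
    """T_n = 1/2 n^2 - 1/2 n + 1, the formula from the introduction."""
    return Fraction(n * n, 2) - Fraction(n, 2) + 1

def by_recurrence(n, a1, a2, d2):
    """T_1 = a1; each later term adds the next first difference."""
    t = a1
    for k in range(1, n):
        t += a2 + (k - 1) * d2
    return t

for a1, a2, d2 in [(1, 1, 1), (3, -2, 5), (0, 7, -4)]:
    assert all(closed_form(n, a1, a2, d2) == by_recurrence(n, a1, a2, d2)
               for n in range(1, 50))

assert all(example_formula(n) == closed_form(n, 1, 1, 1) for n in range(1, 50))
print([int(closed_form(n, 1, 1, 1)) for n in range(1, 8)])  # [1, 2, 4, 7, 11, 16, 22]
```

A check like this can only fail to find counter-examples in the range tested; it does not replace a proof.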

4. Proof by Mathematical Induction

Continuing from the previous section: the derivation makes sense, and we can try a few examples, but how do we know for sure that there is no counter-example somewhere we did not check? We need to be mathematically rigorous to show our formula will always work.

Fortunately, there is a convenient method called “proof by induction” that fits nicely into our sequences situation. If we show that our formula is true for a base case (n = 1), and also that it is true for any n >= 2 whenever it is true for (n - 1), we can confidently conclude that the formula is true for all positive integers n.

4.1 Base Case

Plugging n = 1 into the formula:

T₁ = a₁ + (1 - 1)a₂ + 1/2(1 - 2)(1 - 1)d₂

= a₁ + (0)a₂ + 1/2(-1)(0)d₂

= a₁ + 0 + 0

= a₁.

This matches our definition T₁ = a₁, so the formula is true for n = 1.

4.2 Induction Step

Let us first go back to our definitions:

T₁ = a₁;

T_n = T_n-1 + (d₁)_n-1, for n >= 2;

(d₁)_n = a₂ + (n - 1)d₂, for n >= 1.

The formula we derived and want to prove is

T_n = a₁ + (n - 1)a₂ + 1/2(n - 2)(n - 1)d₂.

Assume the formula holds for T_n-1 (the induction hypothesis). Substituting it for T_n-1 in the recurrence T_n = T_n-1 + (d₁)_n-1, for n >= 2:

T_n = a₁ + ((n - 1) - 1)a₂ + 1/2((n - 1) - 2)((n - 1) - 1)d₂ + (d₁)_n-1

= a₁ + (n - 2)a₂ + 1/2(n - 3)(n - 2)d₂ + (d₁)_n-1

= a₁ + (n - 2)a₂ + 1/2(n - 3)(n - 2)d₂ + (a₂ + ((n - 1) - 1)d₂)

= a₁ + (n - 2)a₂ + 1/2(n - 3)(n - 2)d₂ + (a₂ + (n - 2)d₂)

= a₁ + (n - 2)a₂ + a₂ + 1/2(n - 3)(n - 2)d₂ + (n - 2)d₂

= a₁ + (n - 1)a₂ + 1/2(n - 3)(n - 2)d₂ + (n - 2)d₂

= a₁ + (n - 1)a₂ + (1/2(n - 3) + 1)(n - 2)d₂

= a₁ + (n - 1)a₂ + ((n - 3)/2 + 2/2)(n - 2)d₂

= a₁ + (n - 1)a₂ + (((n - 3) + 2)/2)(n - 2)d₂

= a₁ + (n - 1)a₂ + ((n - 1)/2)(n - 2)d₂

= a₁ + (n - 1)a₂ + 1/2(n - 1)(n - 2)d₂

= a₁ + (n - 1)a₂ + 1/2(n - 2)(n - 1)d₂.

That matches our formula, and hence proves that the formula holds for n whenever it holds for (n - 1).
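The algebra in the induction step can also be spot-checked numerically. The sketch below is not a proof, only a check that the closed form for (n - 1) plus (d₁)_n-1 equals the closed form for n over a grid of values; the function names and the chosen test values are my own assumptions.

```python
from fractions import Fraction
from itertools import product

def closed_form(n, a1, a2, d2):
    """T_n = a1 + (n - 1) a2 + 1/2 (n - 2)(n - 1) d2."""
    return a1 + (n - 1) * a2 + Fraction((n - 2) * (n - 1), 2) * d2

def first_difference(n, a2, d2):
    """(d1)_n = a2 + (n - 1) * d2."""
    return a2 + (n - 1) * d2

def induction_step_holds(n, a1, a2, d2):
    """Check closed_form(n - 1) + (d1)_(n-1) == closed_form(n) for one n >= 2."""
    lhs = closed_form(n - 1, a1, a2, d2) + first_difference(n - 1, a2, d2)
    return lhs == closed_form(n, a1, a2, d2)

values = [-3, 0, 1, Fraction(5, 2)]
ok = all(induction_step_holds(n, a1, a2, d2)
         for n in range(2, 30)
         for a1, a2, d2 in product(values, repeat=3))
print(ok)  # True
```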

4.3 Conclusion

Combining the base case with the induction step, we get that the formula is true for n = 2 (as it is true for n = 1), which in turn shows that it is true for n = 3 (as it is true for n = 2), and so on. This chain continues indefinitely, so our formula holds for every positive integer n. QED.

5. Conclusion

So that's the derivation and proof of the formula for finding the n^th term of a sequence with a constant second difference. Hopefully after reading through all that, you have gained a better understanding of how sequences with 2^nd differences work.