Not quite the bisection method

In various first year maths courses here, the students learn the "bisection method" for finding zeros of continuous functions. (A zero of a function is a number that makes the answer of the function come out to zero – it's therefore also a point where the graph of the function crosses the x-axis.) It's based on the Intermediate Value Theorem, which basically says that if the function is below zero at one spot and above zero at another, then is must be equal to zero somewhere in between. Here's how the process goes:

  1. Find a point where the function is below zero – we'll call it a – and a point where the curve is above zero – we'll call it b – and then the actual zero of the function must be somewhere between a and b.
  2. Divide the interval from a to b in half – we'll call this centre point c.
  3. Now put c into the formula to find out if the function is above or below zero there.
  4. If it's above zero, then the function goes from below zero to above zero between a and c and so the zero must be between a and c. If it's below zero, then the function goes from below zero to above zero between c and b, and so the zero must be between c and b.
  5. Now you have a new interval where you know there's a zero, so divide this interval in half ...

And the process continues on and on until you either actually find the zero, or you get as close as you need to be.

It's a good method, but the problem I have with it is that the numbers you end up working with are much much more precise than you need!

Let me explain what I mean with an example:

Suppose you want to estimate a zero of the function f(x) = x2-2 to two decimal places. Well, we need a point where the function is above zero and a point where it's below zero. The numbers 1 and 2 ought to do. Now watch:

f(1) is below zero and f(2) is above zero
So the zero is between 1 and 2

The midpoint of 1 and 2 is 1.5, and f(1.5) is above zero
So the zero is between 1 and 1.5

The midpoint of 1 and 1.5 is 1.25, and f(1.25) is below zero
So the zero is between 1.25 and 1.5

The midpoint of 1.25 and 1.5 is 1.375, and f(1.375) is below zero
So the zero is between 1.375 and 1.5

The midpoint of 1.375 and 1.5 is 1.4375, and f(1.4375) is above zero
So the zero is between 1.375 and 1.4375

The midpoint of 1.375 and 1.4375 is 1.40625, and f(1.40625) is below zero
So the zero is between 1.40625 and 1.4375

The midpoint of 1.40625 and 1.4375 is 1.421875, and f(1.421875) is above zero
So the zero is between 1.40625 and 1.421875

The midpoint of 1.40625 and 1.421875 is 1.4140625, and f(1.4140625) is below zero
So the zero is between 1.4140625 and 1.421875

The midpoint of 1.4140625 and 1.421875 is 1.41796875, and f(1.41796875) is above zero
So the zero is between 1.4140625 and 1.41796875

The midpoint of 1.4140625 and 1.41796875 is 1.416015625, and f(1.416015625) is above zero
So the zero is between 1.4140265 and 1.416015625

The midpoint of 1.4140265 and 1.416015625 is 1.4150390625, and f(1.4150390625) is above zero
So the zero is between 1.4140625 and 1.4150390625

The midpoint of 1.4140625 and 1.4150390625 is 1.41455078125, and f(1.41455078125) is above zero
So the zero is between 1.4140625 and 1.41455078125

Any number between these two will round off to two decimal places as 1.41 so our zero will also round off to two decimal places as 1.41.

But look at the numbers we were working with: the final numbers had at least seven decimal place precision, and we only needed two decimal place accuracy!!! I don't know about you, but it seems a bit silly to me!

So here's my idea: why don't we do something in the same spirit, but only use numbers that give us the precision we want? Let's think: anything between 1.405 and 1.415 will round off to two decimal places as 1.41, so instead of exactly dividing the intervals in half, let's just use answers with a 5 in the third decimal place. Let's try again with this focus in mind:

f(1) is below zero and f(2) is above zero
So the zero is between 1 and 2

A number between 1 and 2 is 1.5, and f(1.5) is above zero
So the zero is between 1 and 1.5

A number between 1 and 1.5 is 1.25, and f(1.25) is below zero
So the zero is between 1.25 and 1.5

A number between 1.25 and 1.5 is 1.375, and f(1.375) is below zero
So the zero is between 1.375 and 1.5

A number between 1.375 and 1.5 is 1.435, and f(1.435) is above zero
So the zero is between 1.375 and 1.435

A number between 1.375 and 1.435 is 1.405, and f(1.405) is below zero
So the zero is between 1.405 and 1.435

A number between 1.405 and 1.435 is 1.425, and f(1.425) is above zero
So the zero is between 1.405 and 1.425

A number between 1.405 and 1.415 is 1.415, and f(1.415) is above zero
So the zero is between 1.405 and 1.415

Any number between these two will round off to two decimal places as 1.41 so our zero is 1.41 to two decimal places.

See? Isn't that nicer?

I like it anyway.


This comment was left on the original blog post: 

David Roberts 10 May 2011:
This is a really good approach. It also demonstrates the flexibility one has in choosing terms in a sequence that approximate a real number. The midpoint algorithm is really just a demonstration of the sandwich theorem, and in the absence of any other information about picking a point in an interval to define the next term in the sequence, the midpoint is as good as any (think of it as having an uninformative prior, for example). But if you have some idea about the sort of numbers you want to pick (preference towards 3dp numbers ending in a 5 in this case), then this gives a much faster approach to the final answer.

Tagged in Isnt maths cool