# Error and Convergence
There are two main types of error. The absolute error is defined as the distance from the root, and for a bracketing method it is bounded by the width of the bracket, $|x_1 - x_2|$. This gives one way to terminate, and it is often the type of error a user wants. To guarantee this error bound is halved each iteration, the arithmetic mean (AM) is used.
Although the absolute error is simple to understand, it does not actually match the structure of floating-point numbers and misses the concept of significant figures. The relative error is defined as the absolute error divided by the absolute value of the root, and is estimated by $|x_1 - x_2| \,/\, (0.5\,|x_1 + x_2|)$. There are several interpretations of the relative error. One interpretation is that it measures the number of significant figures to which the root is found. Another, more intuitive, interpretation is that it measures how close the ratio $x_1 / x_2$ is to 1. To guarantee this error is halved (sort of), the geometric mean (GM) is used.
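As a minimal sketch of these two pairings (the helper names below are hypothetical, not part of pyroot's API), the termination quantities and their matching bisection points look like this:

```python
import math

def am(x1, x2):
    # Arithmetic mean: guaranteed to halve the bracket width
    # |x1 - x2|, i.e. the bound on the absolute error.
    return 0.5 * (x1 + x2)

def gm(x1, x2):
    # Geometric mean: (roughly) halves the bracket width in
    # log-space, i.e. the bound on the relative error.
    # Assumes x1 and x2 are nonzero and have the same sign.
    return math.copysign(math.sqrt(x1 * x2), x1)

def absolute_error_bound(x1, x2):
    return abs(x1 - x2)

def relative_error_estimate(x1, x2):
    return abs(x1 - x2) / (0.5 * abs(x1 + x2))
```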
Let $d_n$ be the number of accurate digits on the $n$th iteration. The order of convergence may be defined as

$$\text{order} = \lim_{n \to \infty} d_n^{1/n},$$

which is often interpreted as an approximation of the form

$$d_{n+1} \approx \text{order} \times d_n.$$
For example, if the order of convergence for a method is 1.500, then it can be expected that the method eventually gains 50% more accurate digits on every iteration.
In terms of the absolute error, the order is often interpreted as an approximation of the form

$$|\varepsilon_{n+1}| \approx |\varepsilon_n|^{\text{order}}.$$
All methods have an order of convergence of at least 1.000, meaning they are guaranteed to converge.
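Given a sequence of absolute errors, the order can be estimated directly from the relation $|\varepsilon_{n+1}| \approx |\varepsilon_n|^{\text{order}}$. A small self-contained sketch (a hypothetical helper, not pyroot code):

```python
import math

def estimate_order(errors):
    # Estimate the order of convergence from successive absolute
    # errors via |eps_{n+1}| ~ |eps_n|**order, i.e.
    # order ~ log|eps_{n+1}| / log|eps_n|.
    return [math.log(e1) / math.log(e0)
            for e0, e1 in zip(errors, errors[1:])]

# A Newton-like sequence with eps_{n+1} = eps_n**2:
print(estimate_order([1e-1, 1e-2, 1e-4, 1e-8, 1e-16]))
# -> values approaching 2.0, i.e. quadratic convergence
```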
Assuming the root is simple, the orders of convergence for each of the methods in pyroot are as follows:
| Method | Order | Root of |
|---|---|---|
| Bisection | 1.000 | $x - 1$ |
| False Position | 1.414 | $x^2 - 2$ |
| Newton* | 2.291 | $x^{15} - 63x^{10} - 3x^5 + 1$ |
| Muller | 1.731 | $x^{12} - 9x^8 + 1$ |
| Dekker | 1.618 | $x^2 - x - 1$ |
| Brent | 1.731 | $x^{12} - 9x^8 + 1$ |
| Chandrupatla | 1.731 | $x^{12} - 9x^8 + 1$ |
| Chandrupatla-Quadratic | 1.731 | $x^{12} - 9x^8 + 1$ |
| Chandrupatla-Mixed | 1.731 | $x^{12} - 9x^8 + 1$ |
*Note: Newton's method requires the derivative fprime. If evaluating fprime is counted as an extra function evaluation, and the order is measured per function evaluation rather than per iteration, Newton's method drops to an order of ≈ 1.513.
For their original variants (as bracketing methods), without the pyroot.solver implementation, the orders of convergence are as follows:
| Method | Order | Root of |
|---|---|---|
| Bisection | 1.000 | $x - 1$ |
| False Position | 1.000 | $x - 1$ |
| Newton* | 2.000 | $x^2 - 2$ |
| Muller | 1.618 | $x^2 - x - 1$ |
| Dekker | 1.618 | $x^2 - x - 1$ |
| Brent | 1.618 | $x^2 - x - 1$ |
| Chandrupatla | 1.618 | $x^2 - x - 1$ |
| Chandrupatla-Quadratic | N/A | Experimental |
| Chandrupatla-Mixed | N/A | Experimental |
The significant decrease in performance is due to the possibility that one of the bracketing points remains stuck. When this occurs, the interpolation methods degenerate into a secant approximation using two converging points on one side of the root.
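The stuck-endpoint behavior is easy to reproduce with plain (unmodified) false position; the sketch below is illustrative only and is not the pyroot implementation:

```python
def plain_false_position(f, x1, x2, iterations=8):
    # Unmodified false position: interpolate through the bracket,
    # then keep whichever endpoint preserves the sign change.
    f1, f2 = f(x1), f(x2)
    for _ in range(iterations):
        x = x2 - f2 * (x2 - x1) / (f2 - f1)
        fx = f(x)
        if f1 * fx < 0:
            x2, f2 = x, fx  # root is in [x1, x]
        else:
            x1, f1 = x, fx  # root is in [x, x2]
        print(x1, x2)

# On the convex f(x) = x**2 - 2 over [0, 2], every new point lands
# on the same side of the root, so x2 stays stuck at 2.0 while x1
# creeps toward sqrt(2).
plain_false_position(lambda x: x * x - 2.0, 0.0, 2.0)
```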
The rate of convergence is a refinement of the order of convergence. While the order of convergence describes the first term in the error, the rate of convergence describes the second term, in the refined approximation formulas

$$d_{n+1} \approx \text{order} \times d_n - \log(\text{rate}),$$

$$\varepsilon_{n+1} \approx \text{rate} \times \varepsilon_n^{\text{order}},$$

where the base of the log depends on the number system used for $d_n$ (e.g. 2 for binary, 10 for decimal), i.e. $d_n = -\log_{\text{base}} |\varepsilon_n|$.
The rate of convergence is particularly relevant when the order is 1, because then the rate alone controls how fast the method converges. Outside of this case, however, the rate of convergence is usually very complicated and hardly worth a mention, and so it is omitted here.
A brief analysis of the orders of convergence shown in the tables above is provided here.
Let us denote the following variables:
- $x_\star$ denotes the root.
- $\hat{x}_\star$ denotes the next estimate of the root.
- $x_1$ denotes the opposing estimate of the root.
- $x_2$ denotes the last estimate of the root.
- $x_3$ denotes the point last removed from $x_{[1,2]}$.
- $\varepsilon_i$ denotes $x_\star - x_i$, the signed error of $x_i$.
- $\hat{\varepsilon}_i$ denotes $\hat{x}_\star - x_i$, the approximate signed error of $x_i$.
- $d_i$ denotes $-\log|\varepsilon_i|$, the digits of accuracy of $x_i$.
- $\hat{d}_i$ denotes $-\log|\hat{\varepsilon}_i|$, the estimated digits of accuracy of $x_i$.
On every iteration, bisection halves the width of the bracket, and hence the bound on the error. Thus, it is easy to see that the order is 1 and the rate is 0.5.
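This is straightforward to verify numerically; a minimal sketch using the test function $f(x) = x - 1$ from the tables:

```python
f = lambda x: x - 1.0
x1, x2 = 0.0, 1.7  # any bracket around the root
for _ in range(8):
    mid = 0.5 * (x1 + x2)
    if f(x1) * f(mid) < 0:
        x2 = mid
    else:
        x1 = mid
    # The bracket width (the absolute error bound) is exactly
    # halved each iteration: order 1, rate 0.5.
    print(abs(x2 - x1))
```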
Newton's interpolation formula provides an easy way to understand the error behavior of the interpolation methods. It gives us the approximation

$$f(x_\star) = f[i] + f[i,j]\,\varepsilon_i + f[i,j,k]\,\varepsilon_i \varepsilon_j + O(\varepsilon_i \varepsilon_j \varepsilon_k),$$

where $f[i]$, $f[i,j]$, and $f[i,j,k]$ are divided differences depending on their respective $x_i$. Since $f(x_\star) = 0$, this inverts to provide the following estimates for the root (depending on the points used):
Using $x_{[1,2]}$:
- $f(x_\star) = f[2] + f[1,2]\,\varepsilon_2 + O(\varepsilon_1 \varepsilon_2)$.
- $\hat{x}_\star = x_\star + O(\varepsilon_1 \varepsilon_2)$.

Using $x_{[2,3]}$:
- $f(x_\star) = f[2] + f[2,3]\,\varepsilon_2 + O(\varepsilon_2 \varepsilon_3)$.
- $\hat{x}_\star = x_\star + O(\varepsilon_2 \varepsilon_3)$.

Using $x_{[1,2,3]}$:
- $f(x_\star) = f[2] + f[2,3]\,\varepsilon_2 + f[1,2,3]\,\varepsilon_2 \varepsilon_3 + O(\varepsilon_1 \varepsilon_2 \varepsilon_3)$.
- $\hat{x}_\star = x_\star + O(\varepsilon_1 \varepsilon_2 \varepsilon_3)$.
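As a concrete check of the $x_{[1,2]}$ case, the sketch below (illustrative, not pyroot code) applies a secant step, which is exactly the inversion of the truncated expansion through $x_{[1,2]}$, to $f(x) = x^2 - 2$ and confirms that the error of $\hat{x}_\star$ scales like $\varepsilon_1 \varepsilon_2$:

```python
import math

f = lambda x: x * x - 2.0
root = math.sqrt(2.0)

for h in (1e-1, 1e-2, 1e-3):
    x1, x2 = root + h, root - 2.0 * h  # straddle the root
    f1, f2 = f(x1), f(x2)
    # Secant step: invert f(x*) ~ f[2] + f[1,2] * eps2.
    x_hat = x2 - f2 * (x2 - x1) / (f2 - f1)
    eps1, eps2 = root - x1, root - x2
    # Error model: x_hat - root ~ C * eps1 * eps2, where for this f
    # C = f''/(2 f') at the root = 1/(2*sqrt(2)) ~ 0.354.
    print((x_hat - root) / (eps1 * eps2))
```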
Now we may construct the orders of convergence for the methods using polynomial interpolation (a slightly modified but equivalent result follows from inverse polynomial interpolation as well).
Furthermore, by extending these expansions by one extra point, we may also make note of the sign of the error, which is crucial for knowing whether $\hat{x}_\star$ will replace $x_1$ or $x_2$. In fact, we can argue that in the worst case it always replaces $x_2$, assuming $f[\ldots]$ does not change sign, which is asymptotically the case except when higher-order derivatives vanish at the root.
The false position method uses only $x_{[1,2]}$. Expanding one step further, we may see that we have

- $f(x_\star) = f[2] + f[1,2]\,\varepsilon_2 + f[1,2,3]\,\varepsilon_1 \varepsilon_2 + O(\varepsilon_1 \varepsilon_2 \varepsilon_3)$.
- $\hat{x}_\star = x_\star + C\,\varepsilon_1 \varepsilon_2 + O(\varepsilon_1 \varepsilon_2 \varepsilon_3)$.
Assuming $C$ does not asymptotically change sign (which holds unless $f''(x)$ happens to change sign at $x_\star$), we may see that $\varepsilon_1 \varepsilon_2$ is always negative, since $x_1$ and $x_2$ lie on opposite sides of the root. This means the error of $\hat{x}_\star$ always has the same sign, and $f(\hat{x}_\star)$ always has a fixed sign. This should intuitively match a sketch of the false position method.

Since the sign of the error asymptotically never changes, $\hat{x}_\star$ must always replace $x_2$, leaving $x_1$ constant. Since the error term is $C\,\varepsilon_1 \varepsilon_2$, the previous iteration gives $\varepsilon_2 \approx C\,\varepsilon_1 \varepsilon_3$, meaning the error decreases by a factor of roughly $C\,\varepsilon_1$ every iteration, giving an order of convergence of 1.
We now formalize this in a format easier to analyze. Throwing away constants and working with the $d_i$, we can express $d_1$, $d_2$, and $d_3$ in terms of their previous iterations:

- $d_1$ remains the same.
- $d_2$ becomes the sum of the previous $d_1$ and $d_2$ (i.e. $\varepsilon_2 := \varepsilon_1 \varepsilon_2$).
- $d_3$ becomes the previous $d_2$.
Expressed as a matrix, this is the same as multiplying by

```
[[1, 0, 0],
 [1, 1, 0],
 [0, 1, 0]]
```

whose largest eigenvalue is 1, the order of this method.
The pyroot implementation is the same as the original, except that after 3 iterations of failing to update $x_1$, it forces an update by modifying the value of $f[1,2]$. This asymptotically increases the error of the next iteration from $O(\varepsilon_1 \varepsilon_2)$ to $O(\varepsilon_2)$, but it forces the sign of the error to change, and hence updates $x_1$. This leads, on every 4th iteration, to the following matrix:
```
[[0, 1, 0],  # The new x1 is the previous x2.
 [0, 1, 0],  # The new x2 has O(e2) error.
 [1, 0, 0]]  # The last value removed is the previous x1.
```

Multiplying 1 iteration of this with 3 iterations of the other gives
```
# SWAP @ KEEP @ KEEP @ KEEP =
[[3, 1, 0],
 [3, 1, 0],
 [1, 0, 0]]
```

which has a largest eigenvalue of 4. Thus, the order of convergence is $4^{1/4} \approx 1.414$.
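These eigenvalue claims are quick to confirm with numpy (an illustrative check, not part of pyroot):

```python
import numpy as np

KEEP = np.array([[1, 0, 0],   # an iteration that keeps x1
                 [1, 1, 0],
                 [0, 1, 0]])
SWAP = np.array([[0, 1, 0],   # the forced update of x1
                 [0, 1, 0],
                 [1, 0, 0]])

cycle = SWAP @ KEEP @ KEEP @ KEEP
print(cycle)  # [[3 1 0], [3 1 0], [1 0 0]]

# Largest eigenvalue over the 4-iteration cycle, and the
# resulting per-iteration order of convergence:
lam = max(abs(np.linalg.eigvals(cycle)))
print(lam, lam ** (1 / 4))  # 4.0, ~1.414
```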
Dekker's method is a particularly interesting one. Rather than using only $x_{[1,2]}$, it uses both $x_{[1,2]}$ and $x_{[2,3]}$. Specifically, it uses $x_2$ together with the previous $x_2$, which may have become either $x_1$ or $x_3$. As such, each iteration really involves just $\hat{x}_\star$ and $x_2$, with the matrix

```
[[1, 1],
 [1, 0]]
```

which has largest eigenvalue $(1 + \sqrt{5})/2 \approx 1.618$, the order of this method.
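The same quick check applies to Dekker's matrix (again illustrative, not pyroot code):

```python
import numpy as np

# Digit-growth matrix for Dekker's method.
M = np.array([[1, 1],
              [1, 0]])
print(max(np.linalg.eigvals(M)))  # (1 + sqrt(5))/2 ~ 1.618
```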
The pyroot implementation is the same as the original, except it forces sign changes if they do not occur.