Optimization

Root-finding with the secant method (findroot)

The function findroot locates a root of a given function using the secant method by default. A simple example is to compute pi as the root of sin(x) closest to x = 3:

>>> from mpmath import *
>>> mp.dps = 30
>>> print findroot(sin, 3)
3.14159265358979323846264338328

The secant method can be used to find complex roots of analytic functions, although it must in that case generally be given a nonreal starting value (or else it will never leave the real line):

>>> mp.dps = 15
>>> print findroot(lambda x: x**3 + 2*x + 1, j)
(0.226698825758202 + 1.46771150871022j)

A good initial guess for the location of the root is required for the method to be effective, so it is somewhat more appropriate to think of the secant method as a root-polishing method than as a root-finding method. When the rough location of the root is known, the secant method can refine it to very high precision in only a few steps. If the root is a first-order root, only roughly log(prec) iterations are required. (The secant method is far less efficient for multiple roots.) It may be worthwhile to compute an initial approximation to the root with a machine-precision solver (for example, one of SciPy’s many solvers) and then refine it to high precision using mpmath’s findroot.
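
For instance, a minimal sketch of this workflow, assuming SciPy is installed (scipy.optimize.brentq is one of its standard double-precision solvers), might look as follows:

>>> from scipy.optimize import brentq
>>> x0 = brentq(lambda x: x**3 - 2, 1, 2)   # machine-precision approximation
>>> mp.dps = 50
>>> print findroot(lambda x: x**3 - 2, x0)  # polish to 50 digits
1.2599210498948731647672106072782283505702514647015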

Neat examples

A nice application is to compute nontrivial roots of the Riemann zeta function with many digits (good initial values are needed for convergence):

>>> mp.dps = 30
>>> print findroot(zeta, 0.5+14j)
(0.5 + 14.1347251417346937904572519836j)

The secant method can also be used as an optimization algorithm, by passing it the derivative of a function. The following example locates the positive minimum of the gamma function:

>>> mp.dps = 20
>>> print findroot(lambda x: diff(gamma, x), 1)
1.4616321449683623413

Finally, a useful application is computing inverse functions, such as the Lambert W function, which is the inverse of w*exp(w). Here the first term of the solution’s asymptotic expansion (log(1+x) in the code below) serves as the initial value. In basic cases, this gives identical results to mpmath’s builtin lambertw function:

>>> def lambert(x):
...     return findroot(lambda w: w*exp(w) - x, log(1+x))
...
>>> mp.dps = 15
>>> print lambert(1), lambertw(1)
0.567143290409784 0.567143290409784
>>> print lambert(1000), lambertw(1000)
5.2496028524016 5.2496028524016

Options

Strictly speaking, the secant method requires two initial values. By default, you only have to provide the first point x0; the solver automatically sets the second point (somewhat arbitrarily) to x0 + 1/4. Providing the second point manually can help in some cases if the secant method fails to converge. If you provide more than one point, be sure to pass them as a tuple (or a list).
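
For example, a small sketch with both starting points supplied explicitly (the particular values are arbitrary):

>>> mp.dps = 15
>>> print findroot(cos, (1, 2))
1.5707963267949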

By default, findroot performs a maximum of 30 steps (when using the secant method), which can be increased or decreased using the maxsteps keyword argument. You can pass findroot the option verbose=True to show detailed progress. The keyword tol specifies the maximum tolerated error for the solution. The solver will iterate until the error estimate falls below tol or maxsteps is reached.
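
For instance, a sketch that tightens the tolerance and raises the step limit (the particular values are only illustrative):

>>> mp.dps = 15
>>> print findroot(sin, 3, tol=1e-10, maxsteps=50)
3.14159265358979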

Usually the secant method will just work, but you can use other solvers if it fails. This is done using the solver keyword. A solver is a class that can be called with (f, x0, **kwargs) and returns an iterator yielding tuples, each consisting of an approximate solution and an estimated error. You can therefore implement your own solver, but there are already many available in mpmath; see the mpmath.optimization module.
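
As a rough sketch of what a custom solver can look like, here is a simple bisection solver following the calling convention described above. The details of the protocol (such as the maxsteps attribute and the starting points arriving as a sequence) are assumptions that may differ between mpmath versions:

>>> class Bisection:
...     maxsteps = 100  # assumed to be consulted by findroot
...     def __init__(self, f, x0, **kwargs):
...         self.f = f
...         self.a, self.b = x0  # expects two bracketing starting points
...     def __iter__(self):
...         f, a, b = self.f, self.a, self.b
...         while True:
...             m = (a + b) / 2
...             if f(a) * f(m) <= 0:
...                 b = m
...             else:
...                 a = m
...             yield m, abs(b - a)  # approximation and estimated error
...
>>> mp.dps = 15
>>> print findroot(sin, (3, 4), solver=Bisection)
3.14159265358979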

Multiple roots

For multiple roots all methods of the Newtonian family (including secant) converge slowly. Consider this example:

>>> f = lambda x: (x - 1)**99
>>> findroot(f, 0.9, verify=False)
mpf('0.91807354244492868')

Even for a very close starting point, the secant method converges very slowly. Use verbose=True to observe this.

It’s possible to modify Newton’s method to make it converge regardless of the root’s multiplicity.

>>> findroot(f, -10, solver='mnewton')
mpf('1.0')

This variant uses the first and second derivatives of the function, which is not very efficient.
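
If the derivatives are available in closed form, you can supply them yourself instead of having them approximated numerically. A hedged sketch, assuming the solver accepts them through the df and d2f keyword arguments:

>>> findroot(f, -10, solver='mnewton',
...          df=lambda x: 99*(x - 1)**98,
...          d2f=lambda x: 9702*(x - 1)**97)
mpf('1.0')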

Alternatively you can use an experimental Newtonian solver that keeps track of the speed of convergence and accelerates it using Steffensen’s method if necessary.

>>> findroot(f, -10, solver='anewton', verbose=True)
x: -9.88888888888888888889
error: 0.111111111111111111111
converging slowly
x: -9.77890011223344556678
error: 0.10998877665544332211
converging slowly
x: -9.67002233332199662166
error: 0.108877778911448945119
converging slowly
accelerating convergence
x: -9.5622443299551077669
error: 0.107778003366888854764
converging slowly
x: 0.99999999999999999214
error: 10.562244329955107759
x: 1.0
error: 7.8598304758094664213e-18
mpf('1.0')

It’s possible to determine the multiplicity of a root from its derivatives, using the function multiplicity.

>>> multiplicity(lambda x: sin(x) - 1, pi/2)
2

If you do not specify the derivatives, they will be calculated numerically, which can be quite inefficient. For this reason, multiplicity gives up after evaluating 10 derivatives by default.

Accelerated convergence

You can use Steffensen’s method to accelerate a fixpoint iteration that converges linearly (or slower).

x* is a fixpoint of the iteration x_{k+1} = phi(x_k) if x* = phi(x*). For phi(x) = x**2 there are two fixpoints: 0 and 1.

Let’s try Steffensen’s method:

>>> f = lambda x: x**2
>>> from mpmath.optimization import steffensen
>>> F = steffensen(f)
>>> for x in [0.5, 0.9, 2.0]:
...     fx = Fx = x
...     for i in xrange(10):
...         try:
...             fx = f(fx)
...         except OverflowError:
...             pass
...         try:
...             Fx = F(Fx)
...         except ZeroDivisionError:
...             pass
...         print '%20g  %20g' % (fx, Fx)
                0.25                  -0.5
              0.0625                   0.1
          0.00390625            -0.0011236
        1.52588e-005          1.41691e-009
        2.32831e-010         -2.84465e-027
        5.42101e-020          2.30189e-080
        2.93874e-039          -1.2197e-239
        8.63617e-078                     0
        7.45834e-155                     0
        5.56268e-309                     0
                0.81               1.02676
              0.6561               1.00134
            0.430467                     1
            0.185302                     1
           0.0343368                     1
          0.00117902                     1
        1.39008e-006                     1
        1.93233e-012                     1
        3.73392e-024                     1
        1.39421e-047                     1
                   4                   1.6
                  16                1.2962
                 256               1.10194
               65536               1.01659
        4.29497e+009               1.00053
        1.84467e+019                     1
        3.40282e+038                     1
        1.15792e+077                     1
        1.34078e+154                     1
        1.34078e+154                     1

Unmodified, the iteration converges only towards 0. Modified, it not only converges much faster; it even converges to the repelling fixpoint 1.

Complex roots

For complex roots it’s recommended to use Muller’s method, as it converges very fast even for real starting points.

>>> findroot(lambda x: x**4 + x + 1, (0, 1, 2), solver='muller')
mpc(real='0.72713608449119684', imag='0.93409928946052944')

Finding all roots of a polynomial (polyroots)

The function polyroots computes all n real or complex roots of an n-th degree polynomial. It will, for example, successfully compute the two real roots of 3x^2 - 7x + 2:

>>> mp.dps = 15
>>> for r in polyroots([3,-7,2]):
...     print r
...
0.333333333333333
2.0

or the three roots of x^3 - x^2 - 14x + 24:

>>> polyroots([1,-1,-14,24])
[mpf('-4.0'), mpf('2.0'), mpf('3.0')]

The following example computes all the 5th roots of unity; i.e. the roots of x^5 - 1:

>>> mp.dps = 20
>>> for r in polyroots([1, 0, 0, 0, 0, -1]):
...     print r
...
1.0
(-0.8090169943749474241 + 0.58778525229247312917j)
(-0.8090169943749474241 - 0.58778525229247312917j)
(0.3090169943749474241 + 0.95105651629515357212j)
(0.3090169943749474241 - 0.95105651629515357212j)

Although all roots are internally calculated using complex arithmetic, any root found to have an imaginary part smaller than the estimated numerical error is truncated to a real number. Real roots are placed first in the returned list, sorted by value. The remaining complex roots are sorted by their real parts so that conjugate roots end up next to each other.

Provided there are no repeated roots, polyroots can typically compute all roots with high precision:

>>> mp.dps = 70
>>> for r in polyroots([1, 0, -10, 0, 1]):
...     print r
...
-3.146264369941972342329135065715570445512477129187328701232486717442666
-0.3178372451957822447257576172961742883731333784334325548791272414612005
0.3178372451957822447257576172961742883731333784334325548791272414612005
3.146264369941972342329135065715570445512477129187328701232486717442666
>>>
>>> print sqrt(3) + sqrt(2)
3.146264369941972342329135065715570445512477129187328701232486717442665
>>> print sqrt(3) - sqrt(2)
0.3178372451957822447257576172961742883731333784334325548791272414612005

Ill-conditioned polynomials

If a root with multiplicity M is present, it can typically only be computed accurately to dps/M digits (and sometimes fewer, due to slow convergence). For example, a triple root can be expected to be computed to roughly 5 accurate digits at standard 15-digit precision:

>>> mp.dps = 15
>>> P = [1,3,3,1]
>>> roots = polyroots(P)
>>> for r in roots:
...     print r
...
(-1.00000007263211 - 1.9031226565599e-7j)
(-0.999999852005337 + 2.06642147088233e-7j)
(-1.00000162463939 - 1.50181300875847e-6j)

In these cases the modified Newton method is useful to refine the roots to high accuracy:

>>> mp.dps = 50
>>> f = extraprec(30)(lambda x: polyval(P, x))
>>> for r in roots:
...     print findroot(f, r, solver='mnewton')
...
(-1.0 + 0.0j)
(-1.0 + 3.0385816786431355991914375262598262893507090667584e-64j)
(-1.0 + 0.0j)

The extra precision provided by extraprec might be needed to ensure accurate evaluation of the polynomial near the multiple root.

An example of an extremely ill-conditioned polynomial is Wilkinson’s polynomial which has roots at the integers 1 through 20.

>>> mp.dps = 15
>>> W = [2432902008176640000, -8752948036761600000, 13803759753640704000,
...     -12870931245150988800, 8037811822645051776, -3599979517947607200,
...     1206647803780373360, -311333643161390640, 63030812099294896,
...     -10142299865511450, 1307535010540395, -135585182899530,
...     11310276995381, -756111184500, 40171771630, -1672280820, 53327946,
...     -1256850, 20615, -210, 1][::-1]
>>> roots = polyroots(W)
>>> for r in roots:
...     print r
...
1.0
2.0
3.0
4.00000000000184
4.99999999994666
6.00000000011394
7.00000001128901
7.99999998684772
8.99999991431934
9.99999897717367
10.9999998763699
11.9999947045394
13.0000024841578
13.9999973809886
15.0000157682936
16.000000498559
17.0000017669947
18.0000008200121
18.999999711431
20.0000000125722

All roots have been separated out, but polyroots fails to converge to high accuracy. The roots can, however, be polished effectively using the secant method:

>>> for r in roots:
...     print findroot(extraprec(50)(lambda x: polyval(W, x)), r)
...
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
10.0
11.0
12.0
13.0
14.0
15.0
16.0
17.0
18.0
19.0
20.0

The extra precision ensures that the evaluation of the polynomial is accurate.