Emerging Tools for Experimental Mathematics

Jonathan M. Borwein and Robert M. Corless

1 Introduction and ``Warm-up''

If I can give an abstract proof of something, I'm reasonably happy. But if I can get a concrete, computational proof and actually produce numbers I'm much happier. I'm rather an addict of doing things on the computer, because that gives you an explicit criterion of what's going on. I have a visual way of thinking, and I'm happy if I can see a picture of what I'm working with. -John Milnor, []

Using mostly elementary examples, we discuss the use of some recent and emerging tools for experimental mathematics. The tools discussed include so-called ``inverse symbolic computation'', using lattice reduction algorithms such as ``LLL'' and ``PSLQ'', and Sloane and Plouffe's integer sequence lookup program. We concentrate on `computer-assisted discovery' of mathematical results, but a little `computer-assisted proof' creeps in as well. This paper is not a tutorial on how lattice basis reduction algorithms such as LLL or PSLQ actually work; rather, we discuss some of the ways these tools can be used to generate conjectures, and for that, a detailed understanding of the underlying algorithms is not necessary. We do hope, however, to convey some appreciation of their power.

We begin with some warm-up examples, using the Inverse Symbolic Calculator (ISC); http:// www.cecm.sfu.ca/ MRG/ INTERFACES.html. The basic idea is simple: given the first few decimal digits of some real number, we want the ISC to guess a formula for what it ``really'' is.

For example, if we input 3.14626436994198, and click on ``simple lookup'' (the default) and ``Run'', the ISC tells us that

...
3146264369941972 = (0405) 1/abs(-sr(3)+sr(2))
Your value of 314626436994198 would be here.
3146264469611207 = (0192) (5^(1/2)+4)/(exp(1/2)+1/3)
...

This has correctly identified our input as 1/(�3-�2) = �3+�2, by table lookup. Using the ``integer relation'' option would get us, instead, the error message that we need at least 16 digits, and then when we change the final 8 to 72, the following answer appears:

K = 3.146264369941972 gave the following results:
                                                      2    4
        K satisfies the following polynomial, 1 - 10 X  + X

together with some negative results about combinations of other constants.

Now consider a second ``warm-up''. If we input the number K, computed from the infinite product

K =

�
n � 2

n²-1

n²+1

= .2720290549821331� ,

then the ``simple lookup'' fails to tell us anything; the ``integer relations'' option tells us that it is not a simple combination of a few specific constants; but the ``smart lookup'' tells us that K/2 = p/(exp(-p)-exp(p)). This is actually wrong-it's got the wrong sign, possibly because signs are ignored for the ISC-but the digits are correctly identified. K is indeed equal to p/sinh(p).

As a final warm-up, consider the following two infinite products:

�
k � 1

(1 +

k²

)²

(1+

k²

)

1.84893618285824448�

(1)

�
k � 1

(1 +

k²

k³

)²

(1+

k²

k³

)

1.797439588835227�

(2)

Simple lookup, smart lookup, and integer relations as embodied in the ISC all fail to tell us anything about these numbers. In fact, (1) is

�
k � 1

(1 +

k²

)²

(1+

k²

)

3�2

cosh²(p

�3

)

sinh(p �2)

but this is a strange enough formula that we aren't surprised that the ISC can't identify it. We do not know any closed form expression for (2), however.

The ``generalized expansions'' option guesses that there is a simple generating function for the ``egyptian fraction'' of the reciprocal of (1), namely

x(69x-2)

47x-1

but this is incorrect, and it is easy to disprove this conjecture by computing the series expansion

x(69x-2)

47x-1

= 2 x+

�
k � 2

(25·47^k-2) x^k

(3)

and evaluating the rational number that is the ``egyptian fraction'' defined by the coefficients of the series (3):

�
k � 2

25·47^k-2

311

575

= .540869565� ,

(4)

whereas the reciprocal of (1) starts off as .5408515498�, which differs from 311/575 after the fourth decimal place. Similarly, the ISC's generalized expansions return an incorrect egyptian fraction for the reciprocal of (2).

An `egyptian fraction' is just an ordinary rational written as a sum of reciprocals of natural numbers without repeated entries in the sum. The egyptian fraction (4) corresponding to the generating function (3) is a geometric series.

So, we have seen examples where the ISC tells us something useful, tells us something incorrect, and tells us nothing.

2 A connection between the Lambert W function and Stirling's formula for n!

We now look at a more interesting example, using the online version of the Encyclopedia of Integer Sequences [] (http:// www.research.att.com/ _[ \tilde]njas/ sequences/).

The Lambert W function satisfies

W(x) e^W(x) = x .

(5)

See [] for a survey of properties and applications of W, together with some of its history; [] explores various series for W including the one we discuss in this section. We give a short introduction to this function in Appendix .

Figure

Figure 1: The real branches of the function W(x) that satisfies WexpW = x.

There is a branch point of W at x = -1/e, where W(x) = -1. See Figure 1, which can be produced in MAPLE by the command

   > plot( [ t*exp(t), t, t=-5..1 ], -1..3, -4..1 );

The two real-valued branches of W are denoted W₀(x) and W_-1(x); we also refer to W₀(x) as the principal branch. We wish to know more about the function near the branch point at x = -1/e. After various experiments, we decide to compute the series of

W₀(-e^-1-z²/2)

in MAPLE. We get, very quickly, that W₀(-exp(-1-z²/2)) =

-1 + z-

z²+

z³+

270

z⁴+

4320

z⁵-

17010

z⁶

139

5443200

z⁷-

204120

z⁸-

571

2351462400

z⁹+O(z¹⁰) .

(6)

As our first real example of using a new tool, we look up the sequence of denominators 1, 3, 36, 270, 4320, �, in []. [Sometimes denominators have nontrivial common factors with numerators. Cancellation of these common factors will make any ``guessing'' procedure more difficult.] We find the sequence immediately, and the Encyclopedia gives a reference to the delightful paper [], which does not mention W or refer to any papers on W, or indeed even use it explicitly. Thus, [] would not easily be found by a normal citation search. We find out in [] that equation (6) gives coefficients needed in Stirling's formula for n!, which begins

n! ~

�

2pn

nⁿ e^-n

�
�
�

12 n

288 n²

139

51840 n³

571

2488320 n⁴

�
�
�

n⁵

�
�
�

The connection we discover (without doing any work ourselves) is that if

W₀(-e^-1-z²/2) =

�
k � 0

(-1)^k-1 a_k z^k ,

then

n! ~

�

2pn

nⁿ e^-n

�
k � 0

1·3 ·5 �(2k+1)

n^k

a_2k+1 ,

and moreover there is a lovely (and useful!) recurrence relation for the a_k's, namely a₀ = 1, a₁ = 1, and

a_n =

(n+1)a₁

�
�

a_n-1 -

n-1
�
k = 2

ka_ka_n+1-k

�
�

3 Riemann Surfaces

Tools such as MATLAB and MAPLE permit easy and accurate visualization of Riemann surfaces for elementary functions [,]. Our qualitative understanding of even extremely basic mathematical building blocks can thus be affected by mathematical software tools. See [] for more discussion of visualization in general; here we concentrate on a simple technique for visualization of Riemann surfaces, namely to make 3-d plots of �f(z) or �f(z).

It is necessary to prove something about this technique-namely, that it really gives us a good picture of the Riemann surface and not just a 3-d plot of the imaginary part (or the real part) of the function involved. This is pursued in more detail in [], but the key point is that given w = u + iv = f(z) = f(x + iy), then we get an accurate Riemann surface by plotting, say, (x,y,v) if and only if the missing piece of information (here, u) is completely determined once x, y, and v are given. This is simple, if not quite obvious: once we have a smooth three-dimensional surface, each point of which can be associated with a unique value (i.e. ordered pair) of the map z � w = f(z), then we have a representation of the Riemann surface of f.

This exact association is not automatic. For example, if w = ln(z) and we plot (x,y,u), then we do not get a picture of the Riemann surface for logarithm, because the branch of v = �(w) = arg (z) is not determined from u = ln(x²+y²)/2, x, and y. If we plot (x,y,v), of course, we do recover the classical picture of the Riemann surface for ln(z), because given x, y, and v we can easily find u.

The following short piece of MAPLE code shows how to graph the Riemann surface for the Lambert W function. We urge you to try the following computation, because the dynamic coloured picture you get is much more easily understood than the static black-and-white image in Figure . We also urge you to try your hand at your own functions; many others are graphed in [] and [].

w : = u + I v

z : = (u + I v) e^{(u + I v)}

u e^u cos(v) - v e^u sin(v) + I (v e^u cos(v) + u e^u sin(v))

x : = u e^u cos(v) - v e^u sin(v)

y : = v e^u cos(v) + u e^u sin(v)

   > plot3d([x,y,v], u=-6..1,
            v=-5..5, axes=FRAME, 
            orientation=[-110,73], 
            labels=["x","y","v "], 
            style=PATCHNOGRID,
            colour=u,
            view=[-1..1,-1..1,-5..5],
            grid=[50,50]);

See Figure for a static representation of the results of this plot command.

Figure Figure 2: The Riemann surface for the Lambert W function.

3.1 1-1 correspondence proof

Given x, y, and v, we have to solve for u. Of course, one takes the existence of (u,v) for a given (x,y) for granted here; for the Lambert W function, a proof can be found in []. We have

(u+iv)e^u+iv = x + iy ,

which gives

ue^u + ive^u = (x+iy)e^-iv = (x+iy)(cosv - isinv) ;

therefore,

ue^u + ive^u = (xcosv + ysinv) + i (ycosv - xsinv) .

If v � 0, and moreover ycosv - xsinv � 0, then dividing the real part by the imaginary part gives u in terms of x, y, and v:

u =

v (x cosv + y sinv)

y cosv - xsinv

This solution is unique. Investigation of the exceptional conditions v = 0 or y cosv - xsinv = 0 leads to uexpu = x, which has two solutions if and only if -1/e � x < 0, in the case v = 0, and to the singular condition u = -� and x = y = 0.

This is precisely what we observe in the graph: two sheets intersect only if -1/e � x < 0 (note that the colours are different and hence the corresponding sheets on the Riemann surface do not ``really'' intersect), and all sheets have a singularity at the origin, except the central one, which contains v = 0. This is as good a representation of the Riemann surface for the Lambert W function as can be produced in three dimensions.

However, Figure 2 is nowhere near as intelligible as the live MAPLE plot. On a PC, the use of OpenGL by MAPLE allows the plot to be rotated by direct mouse control. This helps to give a good sense of what the surface is really like, in three dimensions.

4 Dynamical systems, numerical analysis, and formal power series

In this section we give a brief overview of a surprising connection between numerical analysis of dynamical systems and formal power series. We begin with a simple question: what, exactly, does the fixed time step forward Euler numerical method do to the solution of the simple initial value problem

= y²

(7)

with y(0) = y₀? The numerical procedure is just

y_n+1 = y_n + hy_n�

(8)

for integer n � 0, where y_n� = y_n² and h > 0 is the chosen time step.

It turns out to be useful to rescale y and t so that v = hy and t = ht, giving

= v² ,

(9)

and (8) becomes

v_n+1 = v_n + v_n² .

(10)

We may then rephrase our question to ask instead what the relationship between v_n and v(t) is.

The point of view taken in [] is that of backward error analysis. That is, instead of asking for the difference between v(n) and v_n, we ask instead if there is another differential equation, say

= B(w)w² ,

(11)

whose solution interpolates v_n. That is, we impose the conditions w(0) = v₀ and w(t+1) = w(t) + w(t)² (cf. (10)), and see if we can find such a function B(w).

We do this not so we can improve the behaviour of Euler's method for this problem, but rather so that we may understand what Euler's method has done to the problem; for by understanding the function B(w) we will learn something about Euler's method, by comparing (9) to (11).

It turns out that we can use the method of modified equations [] to find as many terms of the Taylor series for B(w) as we desire. When we compute the modified equation for (7) to (say) fifth order, we get

�
�
�

1 - w +

w² -

w³ +

124

w⁴ -

1256

w⁵

�
�
�

w² .

(12)

Now we see the sequence 1, -1, 3, -16, 124, -1256 appearing. This is sequence M3024 in [], which points us directly to the very beautiful and useful paper [].

We find in that paper that if

B(w) =

�
n � 0

c_n wⁿ ,

(13)

then

c_n =

n-1

n-1
�
i = 1

�
�
�

n-i + 1
i+1

�
�
�

c_n-i ,

and this, combined with the functional equation

B(w) =

(1+w)²

1+2w

B(w+w²)

(which can be iterated to give us two converging infinite products for B), allows us to write an efficient program to evaluate B(w). We can show that B(w) has a pole at w = -1/2; by mapping backwards, solving w + w² = -1/2, we find two more (complex) poles. Iterating this process finds an infinite number of complex poles, approaching the Julia set for the map v � v+v² arbitrarily closely; see Figure .

Figure Figure 3: The first 16000 poles of B(v), approaching the Julia set of v � v + v².

The Julia set itself approaches the origin arbitrarily closely. That is, there are poles arbitrarily close to the point of expansion of the series given for B. Thus the series (13) diverges-but, nonetheless, it can be used to evaluate B(w) for w close enough to zero, using MAPLE's built-in sequence acceleration techniques. This is precisely where the convergent infinite products are slow, and hence the series is useful. See [] for details.

But more to the point, in [], G. Labelle completely solves the problem of interpolating discrete dynamical systems with continuous dynamical systems, in the domain of formal power series. The mathematical language, however, is quite different from that used in the numerical analysis world. As an example, in [] the `modified equation' is termed an `infinitesimal generator' for the discrete dynamical system. Therefore, simple subject searches might not find []. Indeed, a combinatorics journal seems an unlikely place to find the solution of a problem in the numerical analysis of dynamical systems, but the Encyclopedia of Integer Sequences provides a way to search the `knowledge database' that is keyed on the examples, or the concrete results, of papers-not the jargon. This, if you like, is a new kind of search tool.

5 An integer-relation example

The following is taken from []. As a didactic example, suppose that we are interested in finding the value of the definite integral

V =

�
�

�

0

�x ln⁵ x

(1-x)⁵

dx ,

(14)

and that we suspect that V could be expressed as a polynomial in p², of low degree, with short rational coefficients.

Such a conjecture might arise naturally from consideration of

�
�

�

0

�xln² x

(1-x)²

2p²

�
�

�

0

�xln³ x

(1-x)³

p²(p²-12)

�
�

�

0

�xln⁴ x

(1-x)⁴

p²(p²-12) ,

for example, and we may suppose that these values are known already, for the sake of argument. One can use the ``mellin'' routine of the ``inttrans'' package in MAPLE to evaluate all these (and V) symbolically-so this example is really just expository.

To be explicit, we conjecture that

V = r₁ + r₂ p² + r₃ p⁶ + r₄ p⁶ ,

where all the r_i are short rational numbers. Instead of trying to derive the coefficients of the this polynomial analytically, we can use numerical approximation and a lattice basis reduction algorithm, the LLL algorithm given in [], to identify the coefficients heuristically. In an ideal world, we would then know what we had to prove, and, knowing that, would find the proof easier.

We give a short overview of using the LLL algorithm to find integer relations. Suppose that we have a finite set B of n-dimensional linearly independent vectors with rational entries. We call the set

L =

�
�
�

�
v � B

r_v v

�
�
�

r_v � \mathbb Z

�
�
�

``the lattice spanned by B". We say that the lattice has dimension n, and that B is a basis for the lattice. There may be many other bases for the lattice, and we often want to find particular bases with nice properties. For many applications, and in particular for finding integer relations, what we would really like to have is ``the basis with the shortest Euclidean length". Unfortunately, the problem of determining whether one has the shortest basis may be NP-complete []. But finding a short basis is often just as helpful, and the LLL algorithm [] can, in polynomial time, find ``relatively" short vectors; guaranteed, in fact, to be of length at most 2^n-1l, where l the shortest possible. In practice the LLL algorithm often returns vectors much better than this bound.

To proceed in MAPLE, we choose a large constant C and form the following matrix, and use the lattice reduction subroutine.

B : = [[1, 0, 0, 0, 0, 0, C·1],

[0, 1, 0, 0, 0, 0, C·p²],

[0, 0, 1, 0, 0, 0, C·p⁴],

[0, 0, 0, 1, 0, 0, C·p⁶],

[0, 0, 0, 0, 1, 0, C·p⁸],

[0, 0, 0, 0, 0, 1, C·V]]

proc () � end

We work to 30 digits for this example. In general, one has to experiment to find how many digits one needs.

Digits : = 30

We compute an approximation for the value V that we wish to identify, and approximations of the quantities that we wish to relate to V.

V : =

�
�

�

0

�x ln(x)⁵

(1 - x)⁵

lastcol : = [1., 9.86960440108935861883449099988,

97.4090910340024372364403326888, 961.389193575304437030219443653,

9488.53101607057400712857550392, -16.6994737192290704961872434007]

We now choose a large constant C. We use the size of C to penalize vectors that do not combine to zero.

C : = 1000000000000000

We construct the rows of the matrix that we need, as follows.

   >  for i to 6 do 
   >    row.i := [ seq(0, j=1..7) ]: 
   >    row.i[i] := 1:
   >    row.i[7] := C* lastcol[i]:
   >  od:

B : = [[1, 0, 0, 0, 0, 0, .1000000000000000 10¹⁶],

[0, 1, 0, 0, 0, 0, .986960440108935861883449099988 10¹⁶],

[0, 0, 1, 0, 0, 0, .974090910340024372364403326888 10¹⁷],

[0, 0, 0, 1, 0, 0, .961389193575304437030219443653 10¹⁸],

[0, 0, 0, 0, 1, 0, .948853101607057400712857550392 10¹⁹],

[0, 0, 0, 0, 0, 1, -.166994737192290704961872434007 10¹⁷]]

Now we call the ``lattice'' routine to compute a short basis for the set generated by these rows.

[[0, 120, 140, -15, 0, 24, .622 10^-11],

[-16743, 51, 156, 10, -1, -55, 6738.90916826007994],

[35146, -443, -57, -16, -1, 21, 19729.34720281002100],

[6349, -2221, 94, 2, 0, -269, -7554.67120587589348],

[-2452, -99, 8, -3, 2, 805, 5948.36266979182662],

[32181, 345, 9, -11, -1, 982, -19383.09100001444674]]

(15)

All of these new basis vectors are of the form

�
�

r₁, r₂, r₃, r₄, r₅, r₆, C

6
�
i = 1

r_i a_i

�
�

where the r_i are integers. This is because each new vector is an integer linear combination of rows of the initial matrix. Because the initial matrix was an augmented identity matrix, the coefficients of the requisite integer combination show up in the result. Because we chose C to be so large, looking for a short vector in this space really biases the search towards places where the integer linear combination of the final column is zero, if there are any. Hence we suspect, from the first row of (15), that

120p² + 140p⁴ -15p⁶ + 24V = 0

V =

p²(3p⁴ - 28p² - 24) .

Issuing the following MAPLE command lends credence to our suspicion.

-.8 10^-97

There is a simpler Web-based implementation, which uses the ``EZface" to emulate a more comprehensive GNU MP implementation of this method. Go to http:// www.cecm.sfu.ca/ MRG/ INTERFACES.html, click on ``EZface", and type in the following:

lindep([1., 
9.86960440108935861883449099988, 
97.4090910340024372364403326888, 
961.389193575304437030219443653, 
9488.53101607057400712857550392, 
-16.6994737192290704961872434007])

Then, select 30 digits of precision, and click ``evaluate". Very quickly, the vector

0, -120., -140., 15., 0, -24.

is returned-voilá, our integer relation.

Issuing the command ``lindep" calls a subroutine that looks for short integer linear dependencies among the given vector of numbers. Again its results are to be considered as ``possible relations", to be proved later.

The LLL approach is very powerful and, if used with imagination, offers rich possibilities for discovery.

One danger, indeed a `mathematical sin', is indicated by the following quotes:

``Number theory is full of things that everyone knows are true, but no one is able to prove.'' -Peter Borwein

``I have absolutely no interest in proving things that I know are true.'' -Henry Abarbanel, at a dynamical systems workshop at Penn State in 1994.

We discuss `redemption' of sins such as this in the final section.

Numerical instability in the LLL algorithm may cause difficulty, as well. Here we have simply worked to ``enough'' digits to mitigate its effects-that is, we are trying to buy more accuracy by paying for more precision. This is often expensive, and PSLQ, discussed in Section , is better.

6 How solvable is `solvable'?

This example is also taken from []. The following problem arises when thinking about modular (theta) functions; see []. If we define

a(q)

: =

�
m,n � \mathbb Z

q^m²+mn+n²

b(q)

: =

�
m,n � \mathbb Z

w^n-mq^{m² + mn+n²}

c(q)

: =

�
m,n � \mathbb Z

q^{(n+1/3)² + (n+1/3)(m+1/3) + (m+1/3)²}

where w = exp(2pi/3), then we have

a³ = b³ + c³

and a lovely parameterization of the ₂F₁ hypergeometric function []:

�
�
�
�
�
�

�
�
�

c³

a³

�
�
�
�
�
�

= a .

RMC-removed reference to CL. Is is necessary? Should we put it back? Choosing q = exp(-2p�[(N/3)]) for N � \mathbb Q, it can be shown that s_N : = c/a is an algebraic number expressible by radicals; see []. If N is a positive integer, then s_N is called the Nth cubic singular value. What can we discover computationally about s_N? For example, can we determine radical formulae for the higher order cubic singular values?

The following observations help the efficiency of the computations. It is known that

a(q)

q₃(q)q₃(q³)+q₂(q)q₂(q³)

b(q)

(3a(q³)-a(q))/2

c(q)

(a(q^1/3)-a(q))/2 ,

where

q₂(q) =

�
n � \mathbb Z

q^(n+1/2)² and q₃(q) =

�
n � \mathbb Z

q^n²

are the classical theta functions. The lacunarity of these series allows for very rapid computation.

6.1 A useful transformation

A further transformation, which makes the as yet unknown minimal polynomial simpler, is useful. After examining the patterns in the first few cases s₁, s₂, s₃, �, and using the analogous classical quadratic singular values (where one sees the forms 4k_N²(1-k_N²) or (1-k²_N)/2k_N depending on the parity of N), the authors of [] thought to look at

G_N : =

�
�
�

- s_N

�
�
�

N \not � 0 mod 3

g_N : =

3s_N

1-s_N³

N � 0 mod 3 ;

the minimal polynomial for G_N or g_N then has lower degree than the minimum polynomial for s_N. This makes the polynomial easier to find by the PSLQ algorithm.

The PSLQ algorithm (see []) and the LLL algorithm can both be used to find integer relations (and hence minimal polynomials for an algebraic number a, by looking for an integer relation among 1, a, �, a^m). However, PSLQ can also produce negative results. If PSLQ fails to find an integer relation, then one can usually say that there is no such relation with coefficients less than a computable bound, effectively proving that there is no ``simple'' relation of the guessed form.

The authors of [] used these ideas to `decode' the numerical values of s_N into radical form, up to N = 100, and some values beyond, such as N = 110 and 154. They used a variety of strategies to verify the results; some ingenuity was necessary in order to extract the radicals. For N < 53, they computed P_N, the minimal polynomial for G_N or g_N; they then tried factoring P_N over different quadratic number fields until they got a factor of degree 4 or less, which they solved in radicals. This approach failed at N = 53, where they first had to use a special MAPLE program for finding a radical for any solvable quintic. (See ftp:// calfor.lip6.fr:/ pub/ softwares/ Maple/ quinticV2.gz.) The radical returned for N = 53 has over 7500 symbols in it. Kevin Hare at the CECM refined it to an equivalent but simpler radical with `only' 860 symbols. MAPLE was able to verify symbolically that this simpler radical solved P₅₃. In general, determining that a symbolic equation is indeed zero is, in certain classes of expressions, computationally undecidable [].

Indeed, the point of this whole exercise was to determine how good both MAPLE's symbolic tools and PSLQ's numerical ones were on ``grand challenge'' examples. Experience with exercises such as this have led to improvements in both tools.

If symbolic verification is not possible, reassurance that the results are correct can often be obtained by using Klein's absolute invariant []

J₂(x) =

(1-x²(1-x²))³

x⁴(1-x²)²

and its cubic counterpart

J₃(x) =

(9-8x³)³

x⁹(1-x³)

If our computed s_N is correct, then it is related to the (known) classical singular value k_3N by

J₂(k_3N) = J₃(s_N) .

(16)

The identity (16) can be derived from Proposition 5.8 in []. It can be checked symbolically in MAPLE for the radicals arising in the cases N � 10. For larger N, some human intervention is required. For N = 70, the verification requires use of k₂₁₀, the computation of which Hardy called ``one of the most striking of Ramanujan's results" []. We note that purely numerical computation, together with analytic reasoning about such computation (some of which is automatable) can be used to verify the results. Standard irrational number theoretic techniques allow one to show that either J₂(k₂₁₀) = J₃(S) or | J₂(k₂₁₀) - J₃(S)| > 10^-6400 , where S is our heuristically guessed radical formula for s₇₀. Given this knowledge, a few minutes of CPU time establishes that | J₂(k₂₁₀) - J₃(S)| < 10^-6400 , and thus J₂(k₂₁₀) = J₃(S).

7 Final Vignettes

Integer relation algorithms have already helped to discover many new results. We list a few of these, again taken from []. The number of such results continues to climb. We have to tell the algorithms what kind of relationship to look for, but, given that, the algorithms allow previously impossible jumps.

7.1 Zeta value series

The formula for z(3) used by Apéry to prove that z(3) is irrational, namely

z(3) =

�
k � 1

(-1)^k+1

k³

�
�
�

2k
k

�
�
�

has no analog for z(2n+1) with n � 2. It can be shown using PSLQ (or more simply the Euclidean algorithm since there are only two unknown integers) that if a formula like

z(5) =

�
k � 1

(-1)^k+1

k⁵

�
�
�

2k
k

�
�
�

exists, then the integers p and q are larger than 10³⁰⁰.

There is a similar but more complicated formula for z(5), due to Koecher, that does suggest generalization, however. Borwein and Bradley used an LLL algorithm to determine the new coefficients []. They found that

z(7) =

�
k � 1

(-1)^k+1

k⁷

�
�
�

2k
k

�
�
�

�
k � 1

(-1)^k+1

k³

�
�
�

2k
k

�
�
�

k-1
�
j = 1

j⁴

and they discovered similar formulas for z(4n+3) for 2 � n � 10 that involve linear combinations of sums of the form

H_m,n =

�
k � 1

(-1)^k+1

k^4m+3

�
�
�

2k
k

�
�
�

k-1
�
j = 1

j^4(n-m)

and multiple dimensional analogues. They conjectured the following generating function:

�
n � 0

z(4n+3)z⁴ⁿ

�
k � 1

k³(1-z⁴/k⁴)

�
k � 1

(-1)^k+1

k³

�
�
�

2k
k

�
�
�

(1-z⁴/k⁴)

k-1
�
m = 1

1+4z⁴/m⁴

1-z⁴/m⁴

(17)

where the final infinite sum is quite unexpected. However, from the first ten cases it was apparent that the series had the form

�
k � 1

(-1)^k+1

k³

�
�
�

2k
k

�
�
�

(1-z⁴/k⁴)

P_k(z)

for as yet undetermined P_k; and there were abundant data to compute

P_k(z) =

k-1
�
m = 1

1+4z⁴/m⁴

1-z⁴/m⁴

They reduced the conjectured formula to an equivalent finite sum

n
�
k = 1

�
�
�

2k
k

�
�
�

k²

4n⁴+k⁴

k-1
�
j = 1

n⁴-j⁴

4n⁴+j⁴

n²

(18)

(1 � n < �) that was subsequently proved by Almkvist and Granville []. Series expansion of the finite products in (17) gives a rapidly converging series for any z(4n+3), which someone may be able to use to settle the question of the irrationality of the z values.

7.2 Independent computation of digits of p

The following formula, discovered using the PSLQ algorithm, allows rapid computation of hexadecimal digits of p independently of previous digits []:

p =

�
k � 0

�
�
�

8k+1

8k+4

8k+5

8k+6

�
�
�

(19)

Bailey, Borwein, and Plouffe knew that a fast algorithm would result from a formula of this form, and deliberately used a computer search to find it; some have called this approach mathematical reverse engineering. Once known, the formula can be proved very concisely by a human [19]. Interestingly, the following MAPLE session shows that it can now be proved almost automatically, too.

p : =

�
�
k = 0

(

)^k (4

8 k + 1

- 2

8 k + 4

8 k + 5

8 k + 6

)

The following shows a temporary increase in complexity. This phenomenon is called ``intermediate expression swell''.

hypergeom([1,

], [

)

8192

hypergeom([2,

], [

)

123669

40960

819

40960

___
�241

1504

36855

(

391

8192

___
�241

) hypergeom([2,

], [

)

(

151

240

___
�241

) (

151

240

___
�241

)

1920

(

391

8192

___
�241

) hypergeom([3,

], [

)

(

151

240

___
�241

) (

151

240

___
�241

) (

511819

8192

1309

8192

___
�241

)

Looking at those conjugate radicals in the denominators suggests expansion-this step is natural but not automatic.

hypergeom([1,

], [

)

271

39312

hypergeom([2,

], [

)

20944

hypergeom([3,

], [

)

(20)

As an aside, equation (20) is an interesting identity itself. In the notation of [], it implies (once the proof is completed) that

�
�
�
�
�
�
�
�
�
�

�
�
�

�
�
�
�
�
�
�
�
�
�

271

39312

�
�
�
�
�
�
�
�
�
�

�
�
�

�
�
�
�
�
�
�
�
�
�

20944

�
�
�
�
�
�
�
�
�
�

�
�
�

�
�
�
�
�
�
�
�
�
�

The next step the mechanical proof of the Bailey-Borwein-Plouffe p formula simplifies the hypergeometric functions:

�2(ln(1 -

�2) - ln(1 +

�2) +

�2 ln(

) - �2 arctan(1) - 2 arctan(

�2)

�2 ln(

) - �2 arctan(

)) + ln(

) - ln(

) +

�2(ln(1 -

�2) - ln(1 +

�2)

�2 ln(

) + �2 arctan(1) - 2 arctan(

�2) +

�2 ln(

) + �2 arctan(

)) + ln(

)

- ln(

) + 2 arctan(

)

The next step is not necessary, but it slows down the computation so we can see that many of the terms in the above formula simply cancel.

p+ 2 arctan(

) + 2 arctan(

)

Now, finally, our answer is plain:

A somewhat more efficient version of (19) was discovered by Fabrice Bellard. This has led Colin Percival, an undergraduate student at Simon Fraser, to design an ingenious parallel internet computation of staggeringly high order hexadecimal digits of p. Details may be found at http:// www.cecm.sfu.ca/ projects/ pihex/: the five trillionth bit of p is `0'.

7.3 Fast series for the Catalan constant

Consider the Catalan constant, which can be defined by

G =

�
k � 0

(-1)^k

(2k+1)²

(21)

or alternatively by

G = -

�
�

p/4

0

logtanq dq = -

�
�

1

0

logu

1+u²

du .

This is perhaps the simplest constant whose irrationality is still unsettled.

Ramanujan discovered the following series for G, which converges much more quickly than (21) []:

G =

log(2+�3)+

�
k � 0

(2k+1)²

�
�
�

2k
k

�
�
�

(22)

After many false starts, David Bradley found a new family of series that includes (22). One member of this family is

G =

log

�
�
�
�
�
�

10+

�

50-22�5

10-

�

50-22�5

�
�
�
�
�
�

�
k � 0

L_2k+1

(2k+1)²

�
�
�

2k
k

(23)

where the Lucas numbers L_n are given by L_n = L_n-1 + L_n-2 with L₀ = 2, L₁ = 1.

The general formula for Bradley's family of series is proved using certain identities among log tangent integrals. For example, (23) is proved using

�
�

p/4

0

log(tanq) dq = 5

�
�

3p/20

0

log(tanq) dq-5

�
�

p/20

0

log(tanq) dq .

This identity was discovered by an LLL integer relation algorithm. It turns out to be quite easy to search for such relations among log tangent integrals, whereas looking for resummations of the original series (by LLL) is quite difficult.

David Broadhurst has, in his pursuit of new insights for theoretical physics, computationally probed more of these constants []. Based on an extraordinary blend of intuition, methodical use of PSLQ, and computer-assisted proofs, he was led to remarkable binary identities for polylogarithmic constants such as z(3), z(5), and Catalan's constant. His formula for Catalan's constant is:

�
�
i = 0

16ⁱ

�
�
�

(8 i + 1)²

(8 i + 2)²

(8 i + 3)²

(8 i + 5)²

(8 i + 6)²

(8 i + 7)²

�
�
�

�
�
i = 0

16ⁱ

�
�
�

(8 i + 1)²

(8 i + 2)²

(8 i + 3)²

(8 i + 5)²

128

(8 i + 6)²

512

(8 i + 7)²

�
�
�

Thus, digits of both G and p may be computed in the same fashion, and we might hope that the formula sheds some light on the normality of Catalan's constant. [Recall that a number is `normal' if its digits occur with equal frequency.]

8 Sin, Redemption and Cautionary Tales

Experimental mathematics cannot supplant rigorous mathematics. Dropping the latter for the former would indeed be a `sin'. We have seen at least one example of a false computer generated conjecture-namely the egyptian fractions example in Section 1-and we could come up with many more []. Experimental mathematics is, however, a good supplement to rigorous mathematics. It can enrich our subject and, when used with discipline, can significantly assist mathematical discovery. We have also seen examples where the computer can assist with the proof.

Jon, eliminating this example, which I still find somewhat weak (especially when the point about proof has already been made), would also eliminate a reference and save some much-needed space. As a final demonstration, consider the power series

J(x) =

�
n₁ > n₂ > 0

x^n₁

n₁² n₂

In [], a functional relation was sought in pursuit of a proof of the identity J(1) = 8J(-1). For 0 � x � 1,

J(x)

�
�

x

0

ln²(1-t)

dt = z(3)+

ln² (1-x)ln(x)

ln(1-x)polylog(2,1-x)-polylog(3,1-x) .

It can be shown that

J(-x) = -J(x)+

J(x²)+J

�
�
�

x+1

�
�
�

(x+1)²

�
�
�

(24)

This relation was found, once the ingredients were determined by inspection, by evaluating (24) (actually, a version of it with undetermined coefficients) at a random point and then using LLL. Another successful strategy is to evaluate each J function at enough specific values of x to enable one to solve linear equations for the unknown coefficients.

If L(x) and R(x) denote the left-hand and the right-hand sides of (24), respectively, then computer manipulations (under the assumption 0 < x < 1) show that dL/dx = dR/dx: mechanically differentiating both sides and using ``simplify'' reduces the difference between the two to zero. Observing that L(0) = R(0) = 0 completes a proof of (24).

8.1 Knowing `the answer' might limit us

We are all familiar with examples of the value of `doing things ourselves'. It is now trivial in most computer algebra systems (CAS) to compute very large values of the partition function with little or no thought, directly from the generating function

P(q) =

�
�
n = 1

(1-qⁿ)

The well-known exact finite series for values of the partition function, due to Radamacher [], and its wonderful infinite, asymptotic precursor due to Ramanujan and Hardy, might well have seemed less worthy of discovery, had CAS been available then. We must be careful to ensure that our use of new tools neither limits us to what they can find for us nor supresses our interest in things easily computed.

This really will require attention: for example, the authors of [] report in their conclusions that had they been aware of the answers in the Encyclopedia, they might not have bothered to prove what they did-and their results went beyond those in the Encyclopedia!

9 Concluding Remarks

It's not over yet. The merging of text and tools that can be anticipated over the next few years will make an enormous difference-we can expect greater insight while reading mathematical materials, and easier access to yet more powerful tools-but we make no detailed predictions, because the most significant, qualitative, changes to the work environment are by their nature unexpected.

Acknowledgements

This work was supported by NSERC. David Boyd, Mark Giesbrecht, David Jeffrey, and Glen Ord were all helpful, but the contributions of Petr Lisonek were exceptional. Many good suggestions from the referees and editor, for which we are grateful, were used to improve the paper.

A The Lambert W function in brief

If you have used MAPLE to solve transcendental equations, you may already have encountered the Lambert W function, defined by (5). The history and some of the properties of this remarkable function are described in []. This function provides a beautiful new look at much of undergraduate mathematics, in addition to some new puzzles of intrinsic interest.

Here are some of the elementary properties of W.

On 0 � x < � there is one real-valued branch W(x) � 0 (see Figure 1). On -1/e < x < 0 there are two real-valued branches. We call the branch that has W(0) = 0 the principal branch. On this branch, it is easy to see that W(e) = W(1·e¹) = 1.
The derivative of W can be found by implicit differentiation to be

d
dx
W(x)

=

1
(1+W(x))e^W(x)

=

W(x)
(1+W(x))x

where the second formula follows on using expW(x) = x/W(x), and holds if x\not = 0. We may use the first formula to find the value of the derivative at x = 0, and we see the singularity is just a removable one.
The function y = W( expz) satisfies

y + logy = z .

This function appears, for example, in convex optimization. Consider the convex conjugate, f^*(s) = sup_r rs-f(r), of the function f (r) = rln(r/(1-r))-r. Calculation shows that f^*(s) is just W(exps).

W(x) has a Taylor series about x = 0 with rational coefficients. Similarly, W(expz) has a Taylor series with rational coefficients about z = 1. MAPLE computes the first few terms to be

W(e^z)

: =

1 +

(z - 1) +

(z - 1)² -

192

(z - 1)³ -

3072

(z - 1)⁴

61440

(z - 1)⁵-

1474560

(z - 1)⁶ -

41287680

(z - 1)⁷

2447

1321205760

(z - 1)⁸ -

16811

47563407360

(z - 1)⁹

15551

1902536294400

(z - 1)¹⁰ + O((z - 1)¹¹)

Here is an exact formula for the coefficients of the nth derivative of W(expz), in terms of second-order Eulerian numbers �n \atopwithdelims < > k � []. This formula comes from the following expression for the nth derivative of W(expz), which is stated in []. Once the answer is known, the proof is an easy induction, which we leave for the reader.

The derivatives of W(expz) are

dⁿ

dzⁿ

W(e^z) =

q_n(W(e^z))

( 1 + W(e^z) )^2n-1

(25)

where q_n(w) is a polynomial of degree n satisfying the recurrence relation

q_n+1(w) = -(2n-1)wq_n(w) + (w+w²)q_n�(w) , n > 1

(26)

and having the explicit expression

q_n(w) =

n-1
�
k = 0

�

n-1\atopwithdelims < > k

�

(-1)^k w^k+1 .

(27)

If n = 1 we have q₁(w) = w, and it is convenient to put q₀(w) = w/(1+w); this isn't a polynomial, but it makes things work out right. This means that our series for W(expz) about z = 1 is just

W(e^z) =

�
n � 0

q_n(1)

n! 2^2n-1

(z-1)ⁿ .

(28)

Jonathan M. Borwein,
Centre for Experimental and Constructive Mathematics, Simon Fraser University, Vancouver, B.C., Canada, V5A 1S6, jborwein@cecm.sfu.ca	Robert M. Corless,
Department of Applied Mathematics, University of Western Ontario, London, Ontario, Canada, N6A 5B7, Rob.Corless@uwo.ca

File translated from T_EX by T_TH, version 2.20.
On 17 May 2000, 07:26.