(01/06/2011, 03:02 AM)bo198214 Wrote: Because it is not theoretically safe.
There is no proof that the Dmitrii's (and also Sheldon's) procedure of computing the sexp even converges.
There is however a theoretic uniqueness criterion and a proven procedure that exactly produces the corresponding sexp/slog. I found that out recently and give links/references perhaps later. However Dmitrii's and also Sheldon's way is simpler to calculate and seem to compute exactly this unique sexp/slog.
At long last, I think I am ready to begin posting on
why the Kneser.gp algorithm I wrote converges, and perhaps how to go about proving that the algorithm converges to the unique sexp function. This post will give another overview of the sexp and riemaprx routines and how they are generated. Then I will consider some modifications and simplifications to the definition, that I believe can eventually be proven to converge. The basic idea is to show that the error term, going from one iteration to the next, is highly predictable and linear, because for small errors, all of the steps in the kneser.gp algorithm are linear. It turns out the individual terms in the next error estimate can be very accurately calculated by a dot product, of the current error terms, with each terms error equation.
update the general dot product can be written as matrix multiplication by an error matrix. But, even assuming linearity, I still need to prove the terms in the error matrix itself converge.
\( \text{sexp}(z)=\text{SexpExact}(z)+ \text{error}(z) \)
\( \text{error}(z)= \sum_{n=0}^{\infty}c_n\times z^n = \text{sexp}(z)-\text{SexpExact}(z) \)
But first, I need to give an overview the kneser.gp algorithm. I am starting with some ideas
I posted earlier. The basic idea using base e here, where L is the fixed point such that \( L=\exp(L) \), \( L\approx 0.318+1.317i \), is that if
\( f(z)=L+\delta \) and \( f(z+1)=\exp(f(z))=L\exp(\delta) \approx L+L\delta \) and \( f(z+2)=\exp(\exp(f(z))) \approx L+L^2\delta \) etc.
This can be used to develop a complex valued entire superfunction such that \( \text{superf}(z+1)=\exp(\text{superf}(z)) \) for all values of z. \( \text{superf}(z) = \lim_{n \to \infty} \exp^{[n]}(L + L^{z-n}) \)
The problem is that the superf is complex valued, not real valued. A 1-cyclic mapping is used to convert this function to an analytic real valued tetration. The 1-cyclic theta mapping is equivalent to the Riemann mapping in Kneser's algorithm. However, the 1-cyclic theta function has a very nasty singularity at the integer values. Theta(z) converges for imag(z)>0. Theta(z) also converges at the real axis, as long as z is not an integer. All of the terms in theta(z) exponentially decay, as imag(z) increases.
\( \theta(z)=\sum_{n=0}^{\infty}a_n\times \e^{(2\pi n i z)} \)
\( \text{sexp}(z)=\text{superf}(z+\theta(z)) \)
For imag(z)<0, the schwarz reflection can be used to generate sexp(z). For integers values z>-2, sexp(z) is defined, since the superf(z) cancels the theta(z) singularity.
After generating an initial sexp estimate as a seed, the algorithm consists of three steps. The following diagram hopefully helps to explain the algorithm. The value of 0.12i is somewhat arbitrarily chosen, to maximize convergence while avoiding the singularity in theta(z).
1) Rieman approximation update, 1-cyclic fourier transform from -0.5+0.12i to 0.5+0.12i.
The Rieman aproximation update is done from the sexp algorithm, along the green "unit length" line.
2) sexp approximation update, along the purple "Kneser>0.12i" path, which is concatenated with the exp(f(x)) and the log(f(x)), and the "Kneser<0.12i" line, to generate a full circle for the Cauchy integral.
3) recenter and renormalization, setting a0=1, and renormalizing a1 and a2.
There are a few differences between this "idealized" algorithem, and the actual implemented algorithm
1) The idealized algorithm uses continuous integrals, instead of discreet algorithms.
2) The idealized functions all have an infinite number of terms in their taylor/fourier series. The idealized superfunction is exact.
3) There are differences in the recenter/renormalize routine, although for base(e), the idealized algorithms behaves nearly identically to the kneser.gp code algorithm.
4) The actual routine adjusts the iteration counts and the number of terms in the series as we go, which is unnecessary in the idealized routine. Also, the actual routine has some boundary checking, to truncate fourier/sexp series when exponential decay stops. The boundary checks also add in calculating the values to use in the next iteration of the loop.
\( \text{sexp}(z)=\sum_{n=0}^{\infty} a_n \times z^n \)
In the idealized routine recentering simply sets the sexp approximation taylor series term for a0=1. And then the renormalization is modifying a1 and a2, such that exp(sexp(-0.5+0.12i))=sexp(0.5+0.12i). Because sexp(z) is real valued at the real axis, two terms are require, a1 and a2. We are also solving the complex conjugate equation exp(sexp(-0.5-0.12i))=sexp(0.5-0.12i). The recentering corrects small errors in each iteration, to guarantee that sexp(0)=1. The renormalization of a1 and a2 dramatically improves convergence. It guarantees that the fourier series generated for theta(z) is for continuous function, which halves the number of iteration loops required for convergence. The combined effect of recentering and renorm are that three degrees of freedom are lost, in that sexp(a0,a1,a2) become dependent on the rest of the sexp taylor series terms. It is asserted that, for small enough error terms, the a1 and a2 error terms are a linear dot product of generating the a1 and a2 deltas independently, for each of the a3,a4,a5,a6.... terms. This shouldn't be too difficult to rigorously prove.
Earlier, I asserted that given the error term for sexp(z), the error term for the next loop iteration of sexp(z) can be approximated very well by a linear dot product. Specifically, as the error terms go to zero, the following equation for the next sequential error term becomes more and more exact. This can be represented by an error term matrix, where the ith row of the matrix represents the next iteration error terms, d1..dn, generated by one ci term, in isolation.
\( \text{sexperror}_{k}(z)= \sum_{n=0}^{\infty}c_n\times z^n = \text{sexpsexp}_{k}(z)-\text{SexpExact}(z) \)
\( \text{sexperror}_{k+1}(z)= \sum_{n=0}^{\infty}d_n\times z^n = \text{sexp}_{k+1}(z)-\text{SexpExact}(z) \)
Then, the sexperror_k terms can be put in a one row matrix. k is for the kth iteration. And the sexperror_k+1 can be predicted by multiplying the 1 row current error matrix by an nxn row errorpredict matrix.
\( \text{sexperror_k}(1,j)=c_j \)
\( \text{sexperror_{k+1}}=\text{sexperror_k}\times\text{errorpredict} \)
Here, each row of the error predict is for sexperror generated specifically by the ith term, which for small values of errors, are all linearly independent. For rows 1 and 2, the errorpredict is all zeros.
Consider an initial sexp estimation, where c3=0.00001. Then
\( \text{sexp}_{k}(z)= \text{SexpExact}(z)-c3*z^3 \)
Why start with c3? Because if we consider an error term in c0, c1, or c2, than the renormalization and recentering will have completely eliminated it! The error term c0=0 by definition since we set a0=1. If there are no errors for c3, c4, c5... then we only have errors in c1 and c2. Assuming these errors are small, the renormalization routine will exactly correct errors in c1 and c2.
Now, lets walk through the algorithm step by step. First, initialize sexp. a0=1, a1=1. Now, solve for a1 and a2 with the renormalization step. Then we have a0=1, a=1.137484... a2=0.275598.... This gives us exp(sexp(-0.5+0.12i))=sexp(0.5+0.12i). Here is a plot of the initial sexp(z) funtion, from (-0.6+0.12i to 0.6+0.12i). Here, sexp(real(z)>0.5) is defined as exp(sexp(z), and sexp(real(z)<-0.5) is defined as log(sexp). We plot the initial sexp error term, sexp(z)-sexpexact(z), red is real, and green is imaginary. At imag(z)=0.12i, the error term is continuous, but the derivative is discontinuous at real(z)=-0.5,0.5.
Next, we generate the rieman approximation. See
this link for some background. We want to generate the next theta(z) approximation, from the current sexp(z). However, theta(z) is constrained to have only terms which decay to zero as imag(z) goes to infinity. We generate the fourier transform for theta(z) at imag(z)=0.12i. In the large "kneser mapping" diagram, this is the unit length.
\( \theta(z) \approx \text{isuperf}(\text{sexp}(z))-z \)
\( \theta(z)=\sum_{n=0}^{\infty}b_n\times \e^{(2\pi n i z)} \)
\( b_n=\int_{-0.5+0.12i}^{0.5+0.12i} \text{sexp(z)\times\exp(-2\pi n i z)\mathrm{d}z \)
The fourier result is not exactly the same as the sexp(z), since we only include terms with exp(2Pi n i z) and not terms with exp(-2Pi n i z), but this way we have theta(z) decaying to zero as imag(z) goes to infinity. Here is the graph of riemaprx(z) along the same -0.6+0.12i to 0.6+0.12i corrider, with the original sexp(z) graph also included. Notice that since we calculated the theta(z) terms at 0.12i, this must be subtracted with theta(z-0.12i).
\( \text{riemaprx}(z)=\text{superf}(z+\theta(z-0.12i)) \)
Riemaprx(z) is only defined for z>0.12i, since there is a discontinuity in the derivative of sexp(z) at 0.12i. This is a fairly mild discontinuity, compared to the discontinuity in the exact theta(z) function at integers of z. Also, because imag(z)>=0.12i, the much more difficult discontinuity is largely gone, and the terms for the exact theta(z) exponentially decay by \( \exp(-0.12*2*pi)\approx0.47049 \), which greatly improves the numeric stability of the algorithm, and greatly reduces the number of sample points required. With each subsequent iteration, theta(z) is still inexact, so eventually exponential decay stops, and the theta(z) error terms becomes more or less constant, slowly decaying as 1/n^2. Notice that if the sexp error term is small, and if the sexp error term scales by a constant, then the fourier error term will scale by the exact same constant since the isuperf is an analytic function, which can be linearly approximated at every point at 0.12i+real(z), and therefore there is a linear approximation of isuperf(sexpexact(z)+sexperror(z)).
Now, on to the next step, which is to generate the next sexp(z) iteration from the riemaprx(z) function. First, a graph of the unit circle error function for the riemaprx(z). Since riemaprx(z) is only valid for imag(z)>0.12i, if imag(z)<0.12i and real(z)<0, we substitute log(sexp(z+1)), and if imag(z)<0.12 and real(z)>0, we substitute exp(sexp(z-1)). In the "kneser mapping" diagram, this is exp(f(x)) and log(f(x)), which amounts to 8% of the circle. The rest of the circle is generated from the riemaprx(z).
Now, we graph the error term, of this riemaprx(z), for half a circle imag(z)>0. Red is real, and green is imaginary. Notice the jump discontinuity where the riemaprx(z) is spliced with sexp(z). For the jump discontinuity, we are substituting log(sexp(z+1)) or exp(sexp(z-1)), but the absolute value of z+1 or z-1, is always <= 0.12, so error contribution comes primarily from the error in the a1 term; errors in higher order terms are negligable. The other half of the circle is the mirror and complex conjugate of the first half, with the imag(z) negating. The right half of the graph shows the next iteration(sexp(z)), generated from the cauchy integral, which will be described below.
The jump discontinuity causes a corresponding ringing in the subsequent sexp(z) function, which is on the right hand side of the graph. We generate the a_n terms of the next sexp(z) function with a cauchy circular integral of riemaprxsplice, which splices the riemaprx on the unit circle, exactly as just described.
\( a_n=\frac{1}{2\pi}\int_{0}^{2\pi} \text{riemaprxSplice(\exp(z*i))\times\exp(-n i z)\mathrm{d}z \)
Then we modify a0, by setting a0=1. We also modify a1 and a2 so that exp(sexp(-0.5+0.12i))=sexp(0.5+0.12i). I calculate a1 and a2 by iterating on a 2x2 linear, calculating the complex slope with respect to a small change in both a1 and a2, and solving a linear equation, separately solving the real and imaginary. Then we have the next iteration of the sexp(z) function,
\( \text{sexp}(z)=\sum_{n=0}^{\infty} a_n \times z^n \)
The right hand side of the graph of shows the unit circle of this sexp(z) function, superimposed with the riemaprxsplice(z) function, with the jump discontinuity causes quite a bit of ringing in the sexp(z) function. Also, this sexp(z) will not converge to the riemaprx(z) function since we only generated the taylor series terms, and ignored the Laurent series terms. Over the unit circle, the sexp(z) function is perhaps 2x improved over the riemaprx(z) it is generated from.
The big improvement in error comes when we look at the next sexp(z) function over the unit length, from (-0.5+0.12i) to (0.5+0.12i), especially after modifying a1 and a2. The following error terms are given in binary bits of precision, versus the sexpexact function, at each step of the first loop. Notice the big jump when looking at the sexp over the unit length. On the unit length, abs(z)<0.515, so the contribution of the higher order Taylor series terms (and the ringing) quickly vanish.
Code:
Initial sexp error average, unit length, 6.687139553
riemaprx error average, unit length 8.847704235
riemaprx circle error average 9.109320182 sexp circle error average 9.92542267
sexp error avg w/out renorm 12.27520332 sexp error after renorm 15.03785673
First iteration series terms and error terms
a_n a_n - exact(a_n)
0 1 0.0000000000000
1 1.091637758 -0.0001295933525
2 0.2713792667 -0.0001039462271
3 0.2129680048 0.0005147566222
4 0.06984257853 0.0003022023935
5 0.04479767614 0.0005057240501
6 0.01505727568 0.0003205335857
7 0.009083675026 0.0004148932086
8 0.003070658851 0.0002741794525
9 0.001978461350 0.0003678300594
10 0.0007321945088 0.0002422672773
11 0.0006470674180 0.0003588863469
12 0.0003064663018 0.0002263716893
13 0.0004169618222 0.0003666706804
14 0.0002300482254 0.0002178644351
15 0.0003854247147 0.0003767591811
16 0.0002122859728 0.0002105981905
17 0.0003828593609 0.0003813661076
18 0.0002014534837 0.0002012547230
19 0.0003769660549 0.0003767051875
20 0.0001883143091 0.0001882995991
21 0.0003613445638 0.0003612977293
22 0.0001712539242 0.0001712554734
23 0.0003350887121 0.0003350799706
24 0.0001502941916 0.0001502953174
25 0.0002989277621 0.0002989260542
26 0.0001260102512 0.0001260106291
27 0.0002543645846 0.0002543642350
28 0.00009926540810 0.00009926551348
29 0.0002033800581 0.0002033799835
30 0.00007109256008 0.00007109258726
31 0.0001482578474 0.0001482578309
32 0.00004261012703 0.00004261013378
33 0.00009144118322 0.00009144117949
34 0.00001495134161 0.00001495134325
35 0.00003540230906 0.00003540230820
36 -0.00001079796348 -0.00001079796308
37 -0.00001747888857 -0.00001747888877
38 -0.00003365399895 -0.00003365399886
39 -0.00006502587372 -0.00006502587376
40 -0.00005278165669 -0.00005278165667
41 -0.0001053632837 -0.0001053632837
42 -0.00006753220679 -0.00006753220679
43 -0.0001369983921 -0.0001369983921
44 -0.00007747105358 -0.00007747105358
45 -0.0001588832568 -0.0001588832568
46 -0.00008239473860 -0.00008239473860
47 -0.0001704552051 -0.0001704552051
48 -0.00008233671901 -0.00008233671901
49 -0.0001716541556 -0.0001716541556
Over the course of subsequent iterations, the pattern continues more or less unchanged. The sexp(z) terms initially decay exponentially, following the exact sexp series. But then the jump discontinuity takes over. Theoretically, the jump discontinuity causes the error terms to linearly decay. Also, following the same argument (which I have to find a way to make more mathematically precise), the algorithm generating the subsequent sexp(z) series from an initial sexperror(z) is approximately linear if the inital sexperror is small. If the sexp error term is small, and if the sexp error term scales by a constant, then the fourier error term will scale by the exact same constant, and so will the subsequent sexp error term, since it can be linearly approximated at every point on the unit circle as a linear approximation of the initial sexp(z) function.
One thing to notice though, is that as the iterations increase, the sexp(z) will appear to have a radius of convergence of "2", but because of the jump discontinuity, the sexp(z) function carried out to infinite precision only has a convergence radius of "1". Nontheless, within that unit circle, sexp(z) will converge to sexpexact(z) since it is based on the superf(z+theta(z)), so long as the theta(z) error term goes to zero.
update I decided to combine these two sub-postings together. The rest of this post will discuss subsequent iterations, and will give some very accurate error term approximations, for sexperror(z) for subsequent iterations. If you have any suggestions, please feel free to give them.
The graphs for the second iteration looks much like the first iteration, with a similar graph for the sexperror and riemaprxerror over the unit length, as well as the error term over the unit half circle, for both the riemaprx and the subsequent sexp generated from it.
The sexperror terms can be arranged in a nxn errorpredict matrix, where the Cn row predicts the next iteration error for the C1..Cn terms. I calculated the matrix of error terms for rows C3 through C39, accurate to 10 significant digits. Then I used this matrix, along with the error terms from the 4th iteration of the sexp/riemaprx/recenter loop to calculate the error terms for the 5th iteration of the loop. The predicted results matched the actual 5th iteration error terms, accurate to approximately 10 significant digits. I continued, generating the error matrix iteratively from itself, independent of the main program, and the error matrix continued to track the taylor series error terms, accurate to 10 significant digits or so. The convergence pattern is highly predictable.
\( \text{sexperror_{k+1}}=\text{sexperror_k}\times\text{errorpredict} \)
This shows the sexp error terms as the iteration count increases, become more and more predictable, and convergence can be expected to continue, with each loop being approximately 8-9 binary digits more accurate than the preceeding loop, ad infinitum. Also, the algorithm predicts that for
any initial sexperror term series, if the series is reasonably accurate, then the next iteration of loop will be more accurate. Notice that the error term prediction only involves rows C3 through C39, and does not require the error term for rows C1 and C2, which are all zeros. After the renorm/recenter algorithm the error term for C1 and C2 become dependent on the rest of the error terms, and are no longer independent, so they are not required to predict the error terms for the next iteration of the loop. Also, row C0, by definition of the algorithm, always has an error term of zero.
Additional thoughts. To make a robust proof, assuming linearity, I still have to generate a theoretical bounds check on each and every element of the error matrix array, and use that to prove that the multiplying the current sexperror row by the error matrix converges. In practice, each row of the error matrix is more or less constant (or perhaps linearly decaying), with subsequent rows exponentially decaying by something around 0.51x the previous row. So I guess that would be the next thing to do, was rigorously charecterize the error matrix.
So now I need some help in making this argument rigorous enough, so that the kneser.gp algorithm can be considered theoretically safe in generating the base(e) sexp series. Any suggestions or feedback are welcome.
- Sheldon
Code:
4th iteration, sexp error term taylor series
1 -0.000000001075469975
2 -0.0000000007826197840
3 0.000000004315735948
4 0.000000002432801019
5 0.000000004149449793
6 0.000000002487202408
7 0.000000003441423260
8 0.000000002132079585
9 0.000000003118906531
10 0.000000001919654135
11 0.000000003103661577
12 0.000000001837440311
13 0.000000003214735346
14 0.000000001810026603
15 0.000000003333535841
16 0.000000001786482891
17 0.000000003396194025
18 0.000000001739481814
19 0.000000003371522504
20 0.000000001656244387
21 0.000000003247424309
22 0.000000001532675479
23 0.000000003023767726
24 0.000000001370142954
25 0.000000002708536107
26 0.000000001173640644
27 0.000000002315482636
28 0.0000000009506155242
29 0.000000001862457165
30 0.0000000007101124207
31 0.000000001370030261
32 0.0000000004620761944
33 0.0000000008602538359
34 2.167410733 E-10
35 0.0000000003555002171
36 -1.591932794 E-11
37 -1.226282102 E-10
38 -2.266859595 E-10
39 -0.0000000005543021885
5th iteration sexp error term taylor series
1 3.295111379 E-12
2 9.302899162 E-13
3 -1.418660932 E-11
4 -6.145664973 E-12
5 -1.020311859 E-11
6 -3.645891118 E-12
7 -7.469882010 E-12
8 -2.479534043 E-12
9 -6.680314967 E-12
10 -2.325652016 E-12
11 -6.895843918 E-12
12 -2.611733966 E-12
13 -7.449733231 E-12
14 -3.019268986 E-12
15 -7.998326000 E-12
16 -3.398626053 E-12
17 -8.374762272 E-12
18 -3.680220850 E-12
19 -8.500951760 E-12
20 -3.832040970 E-12
21 -8.346886626 E-12
22 -3.841958270 E-12
23 -7.912098675 E-12
24 -3.710199665 E-12
25 -7.216379344 E-12
26 -3.445742031 E-12
27 -6.294311964 E-12
28 -3.064213063 E-12
29 -5.191365035 E-12
30 -2.586353531 E-12
31 -3.960606380 E-12
32 -2.036664259 E-12
33 -2.659657522 E-12
34 -1.442094382 E-12
35 -1.347762623 E-12
36 -8.307314523 E-13
37 -8.296865343 E-14
38 -2.305033111 E-13
39 1.080527631 E-12
Predicted 5th iteration error term taylor series, predicted from the 4th iteration error terms, and a previously calculated 40x40 matrix of error terms. The results matched to about 10 significant digits.
1 3.295111376 E-12
2 9.302899153 E-13
3 -1.418660930 E-11
4 -6.145664967 E-12
5 -1.020311858 E-11
6 -3.645891116 E-12
7 -7.469882005 E-12
8 -2.479534042 E-12
9 -6.680314965 E-12
10 -2.325652015 E-12
11 -6.895843917 E-12
12 -2.611733966 E-12
13 -7.449733231 E-12
14 -3.019268986 E-12
15 -7.998326000 E-12
16 -3.398626053 E-12
17 -8.374762272 E-12
18 -3.680220850 E-12
19 -8.500951760 E-12
20 -3.832040970 E-12
21 -8.346886626 E-12
22 -3.841958270 E-12
23 -7.912098676 E-12
24 -3.710199665 E-12
25 -7.216379345 E-12
26 -3.445742031 E-12
27 -6.294311964 E-12
28 -3.064213063 E-12
29 -5.191365036 E-12
30 -2.586353531 E-12
31 -3.960606381 E-12
32 -2.036664259 E-12
33 -2.659657522 E-12
34 -1.442094382 E-12
35 -1.347762624 E-12
36 -8.307314523 E-13
37 -8.296865374 E-14
38 -2.305033110 E-13
39 1.080527631 E-12
C3 row of the error matrix, which predicts the contribution of the C3 error terms in generating the C1..C39 error terms for the next iteration.
1 0.0004158668408
2 0.0007652546543
3 -0.001369783854
4 -0.001355723324
5 -0.002350528845
6 -0.002190922611
7 -0.002224139382
8 -0.002066574360
9 -0.002014621753
10 -0.001819405279
11 -0.001907282376
12 -0.001609964771
13 -0.001866972251
14 -0.001436639663
15 -0.001842344037
16 -0.001278410345
17 -0.001800458227
18 -0.001120937765
19 -0.001724342464
20 -0.0009575219887
21 -0.001607345709
22 -0.0007867242389
23 -0.001449355120
24 -0.0006104054120
25 -0.001254554042
26 -0.0004324073328
27 -0.001030023011
28 -0.0002576568681
29 -0.0007847601244
30 -0.00009152060166
31 -0.0005289083786
32 0.00006068687160
33 -0.0002730922736
34 0.0001940902115
35 -0.00002782403210
36 0.0003045591924
37 0.0001970311403
38 0.0003889427536
39 0.0003927272261
C39 row of the error matrix, which predicts the contribution of the C39 error terms in generating the C1..C39 error terms for the next iteration.
1 -1.315196506 E-13
2 -5.567211000 E-14
3 5.610506314 E-13
4 2.651946239 E-13
5 3.653272482 E-13
6 1.593343988 E-13
7 1.953891883 E-13
8 8.197001612 E-14
9 1.087359427 E-13
10 4.525933580 E-14
11 7.263189721 E-14
12 3.085914608 E-14
13 5.988123250 E-14
14 2.623911289 E-14
15 5.643500666 E-14
16 2.533116569 E-14
17 5.587438699 E-14
18 2.545305691 E-14
19 5.539023132 E-14
20 2.545503371 E-14
21 5.381196359 E-14
22 2.486833914 E-14
23 5.072832864 E-14
24 2.353751317 E-14
25 4.609985430 E-14
26 2.146110320 E-14
27 4.008333166 E-14
28 1.872005195 E-14
29 3.294681964 E-14
30 1.544290502 E-14
31 2.502312381 E-14
32 1.178643454 E-14
33 1.667957158 E-14
34 7.922679766 E-15
35 8.294824033 E-15
36 4.028705972 E-15
37 2.390949837 E-16
38 2.776744909 E-16
39 -7.143399179 E-15