Posts: 1,214
Threads: 126
Joined: Dec 2010
02/16/2023, 10:26 AM
(This post was last modified: 02/16/2023, 10:30 AM by JmsNxn.)
Hey, everyone.
It's been pissing me off how slow Mike3's MakeGraph program runs, so I've decided to write a speed-up for it.
Now, traditionally, Mike3's program goes pixel by pixel and assigns a colour to each point. I am going to add a subroutine which splines together an n-by-k block of pixels from a linear approximation based on the pixel value at the origin of the block. This is much more in tune with standard graphing programs. I'd just like to have these algorithms live inside Pari/GP, rather than run Pari from a separate graphing program.
I'm still beta testing this, but the idea works on paper. The general plan is a modified version of Mike3's program where, instead of re-evaluating the function at each pixel, each 5-by-5 or 10-by-5 block of pixels is assigned a single value. This greatly speeds up the graphing.
If we assume func(z) = O(h(n)), then Mike3's program runs like O(n^2 h(n)) for an n-by-n image. I believe I can reduce this to O(n log(n) h(n)) without sacrificing too much quality.
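The core idea can be sketched in gp (all names here are hypothetical; this is a sketch of the block-sampling idea, not the actual program):

```gp
\\ Sketch of block sampling: evaluate func once per B x B block of pixels
\\ and reuse that value for every pixel in the block.
blockgraph(func, N, B) = {
  my(M = matrix(N, N));
  forstep(X = 1, N, B,
    forstep(Y = 1, N, B,
      my(v = func(X + I*Y));  \\ one evaluation per block
      for(i = X, min(X + B - 1, N),
        for(j = Y, min(Y + B - 1, N),
          M[i, j] = v))));
  M;
}
```

For an N x N image this calls func roughly (N/B)^2 times instead of N^2 times.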
I'm going to publish it soon, but it's proving pretty tricky to get everything right; plus, I suck with graphics. But theoretically this will diversify Mike3's program greatly.
.......................
Wish me luck
Posts: 903
Threads: 130
Joined: Aug 2007
(02/16/2023, 10:26 AM)JmsNxn Wrote: Wish me luck
Done!
Gottfried Helms, Kassel
Posts: 1,214
Threads: 126
Joined: Dec 2010
02/19/2023, 12:24 PM
(This post was last modified: 02/19/2023, 03:11 PM by JmsNxn.)
(02/16/2023, 03:08 PM)Gottfried Wrote: (02/16/2023, 10:26 AM)JmsNxn Wrote: Wish me luck
Done!
Lmao!
I'm a little stuck. But I have managed to replace the pixel-by-pixel run with \(n \times n\) pixel blocks, using the origin of each block as the sample point. So it makes the output a little blocky, but it still displays the function, especially if you just "lower the resolution" by a small amount. I'm having a lot of trouble writing the splining, but I think this is still an awesome tool, however it turns out. If Mike3's program grows like \(O(N)\) for \(N^2\) pixels, I've at least got it to \(O(N^{1-\delta})\) for some \(\delta > 0\) with \(\delta \approx 0\). The final graph is a bit more blocky depending on how big \(\delta\) is, but it's really not that noticeable. And it graphs way faster: literally quartering graph time, with only a little bit of "lowered resolution" on the graph.
Once I figure out how to write the splining efficiently, that is, without killing the \(O(N)\), we'll be coasting! Instead of setting each \(4 \times 4\) pixel block to the reference point's value, we spline towards the four corners' reference points... it'll look the exact same and run like \(O(N^{1-\delta})\) rather than \(O(N)\). It'll save us so much fucking time!
My solution so far is to calculate the reference point and its derivative, and then fill each 4x4 pixel block with the reference value plus the derivative applied to the offset about this point. I'm having a little trouble getting the code right tho. Lmao!
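A minimal gp sketch of that idea (hypothetical names; `df` is a supplied derivative of `f`, not something the actual program necessarily takes):

```gp
\\ Sketch: one sample f0 and one derivative d0 per B x B block; every other
\\ pixel in the block gets the first-order value f0 + d0*(h1 + I*h2),
\\ where (h1, h2) is the pixel's offset from the block's reference point.
taylorblock(f, df, N, B) = {
  my(M = matrix(N, N));
  forstep(X = 1, N, B,
    forstep(Y = 1, N, B,
      my(z0 = X + I*Y, f0 = f(z0), d0 = df(z0));
      for(i = X, min(X + B - 1, N),
        for(j = Y, min(Y + B - 1, N),
          M[i, j] = f0 + d0*((i - X) + I*(j - Y))))));
  M;
}
```

For example, `taylorblock(z -> sin(z), z -> cos(z), 500, 4)` would evaluate sin and cos only 125 x 125 times.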
I think I've nearly got it, though: we can "lower the resolution", which increases the speed of compiling the graph.
EDIT:
Okay, I've added the requirement that the function Mike3's program takes must be differentiable. This works fine for everything I want.
I just need to fix some dumb shit. But this produces "basically" Mike3's graph. And if you want, you can still just run his original graph; the program accounts for that. But if you want to sacrifice a bit of quality, you gain a bunch of speed!
I fucking did it! Just let me do more test runs and whatnot; I don't want errors or anything. I just want to hit all the fringe cases...
Posts: 1,214
Threads: 126
Joined: Dec 2010
Alright! I'm still looking into this; seeing if I can create speed ups, but for the moment everything is working.
Here is a graph of \(\beta(z)\) for \(0 \le \Re(z) \le 6\) and \(-3 \le \Im(z) \le 3\), for 300 by 300 pixels, using Mike3's program:
And here is the exact same graph when I set the "blur factor" to 1/2.
The second graph isn't drastically faster, but it is noticeably faster. When we get to something like 1000 by 1000 pixels, or hi-res graphs, my program is going to be considerably faster. I set the blur factor to 1/2 because I wanted to show the noticeable degradation. Normally I would suggest something like a 3/4 blur factor; that won't be as noticeable, and it will make the pixel blocks smaller.
Also, you can see it getting particularly pixelated near large values. I'm trying to think of a way around this, and I think I have one; but I'm scared it'll be useless, because that's just tetration and iterated exponentials.
But nonetheless...
Instead of graphing pixel by pixel, I have successfully implemented "pixel block" by "pixel block", where the "blur factor" determines the size of the block. And by choosing a smaller blur factor, we can noticeably reduce the run time.
For example, with 300x300 pixels and a 1/2 blur factor, we only need to evaluate \(\beta(z)\) 150x150 times rather than 300x300 times. And we aren't losing much detail, despite a bit of pixelation at large values. I think the speed makes it worth it.
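In general, for an \(N \times N\) image with blur factor \(R\), the number of function evaluations is
\[
(RN)^2, \quad\text{so here}\quad \left(300 \cdot \tfrac12\right)^2 = 150^2 = 22500 \;\text{ instead of }\; 300^2 = 90000,
\]
a four-fold reduction in calls to \(\beta\).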
I'm still working on optimizing it; but there's only one thing I see as a problem at the moment; and I'm just trying to find the best way to code it. Will keep you guys updated.
PM me if you are interested in beta testing. I don't want to post this until I have everything working!
Posts: 1,214
Threads: 126
Joined: Dec 2010
What would usually take 24 hrs was just done in about 4 hours.
This is a graph of:
\[
\beta(z) + \tau^{50}(z)
\]
For \(0 \le \Re(z) \le 2\) and \(-1 \le \Im(z) \le 1\). This is a 700x700 pixel graph, done with a 1/2 "blur factor", while cutting compile time to at least a fifth of what it was:
Posts: 1,214
Threads: 126
Joined: Dec 2010
02/23/2023, 02:36 PM
(This post was last modified: 02/23/2023, 03:52 PM by JmsNxn.)
Let's assume we have a stack of size PIXELS BY PIXELS, where X and Y run through PIXELS BY PIXELS.
Let's call WRITE(PIXELS BY PIXELS) the operation which runs through X and Y and writes each value (X, Y) to a file.
Let us call the time expended to do this E = O(PIXELS BY PIXELS).
When running Mike3's program, we not only have a stack of size PIXELS BY PIXELS, and not only do we run through PIXELS BY PIXELS: we evaluate the function at each pixel. Instead of a fast run through X and Y, we have to assign a value F(X,Y), which can take far more computation time. If F is a tetration-type function, it takes a long time to evaluate these values...
This creates a far longer expended time than E, especially because F(X,Y) sits at the bottom of the for-loops.
The goal of my program is simple. We create a stack of size PIXELS*R BY PIXELS*R, where R is between 0 and 1 and PIXELS*R is a natural number. My code excels in the fact that:
OVERALL RUNTIME = E + STACK(PIXELS*R BY PIXELS*R)
So, if we set R = 0.5 and grow PIXELS BY PIXELS, we lower the OVERALL RUNTIME relative to Mike3's. But we still have to account for the initial write time E.
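In symbols, writing \(c_w\) for the cost of writing one pixel and \(c_f\) for the cost of one evaluation of F (these constants are my notation, not from the program), the model is
\[
\text{OVERALL RUNTIME} \approx c_w \cdot \text{PIXELS}^2 + c_f \cdot (R \cdot \text{PIXELS})^2,
\]
so when \(c_f \gg c_w\), as for tetration-type functions, the second term dominates and shrinking \(R\) pays off almost quadratically.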
Instead of running:

O(PIXELS BY PIXELS evaluations of F(X,Y))

which is a for loop over X = 0..PIXELS and Y = 0..PIXELS where we evaluate F(X,Y) and then write it, we instead run:

E + O(PIXELS*R BY PIXELS*R evaluations of F(X,Y))

where X = 0..PIXELS*R and Y = 0..PIXELS*R and we evaluate F(X,Y); then we write PIXELS BY PIXELS. That is, we have an offset of E amount of time to write PIXELS BY PIXELS using the values in the registry.
The hypothesis of this code is that \(F(X + h_1, Y + h_2) \approx F(X, Y) + \mathrm{D}F(X, Y)\,(h_1 + i h_2)\).
The reason large values produce more pixelated output is that this approximation becomes less sharp there. But nonetheless, we still get a very good approximation, and the speed-ups are worth it. Pixelation is a small price to pay, especially considering the following.
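Concretely, the first-order hypothesis drops a second-order term: for analytic \(F\),
\[
F(X + h_1, Y + h_2) = F(X, Y) + \mathrm{D}F(X, Y)(h_1 + i h_2) + O\!\left(|h_1 + i h_2|^2 \sup|\mathrm{D}^2 F|\right),
\]
and for tetration-type functions the second derivative blows up at large values, which is exactly where the blocks become visible.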
Here is a 1000 by 1000 pixel graph of BETA(z), the same BETA as before. This graph would normally take about 68 hours; I have done it in 2 hours, with the "blur factor" R = 0.5.
This means I have spent the CPU time of 500 by 500 evaluations of F(X,Y), plus the time it takes to WRITE 1000 by 1000 pixels (E = O(1000 BY 1000)). And instead of 68 hours of CPU time, it's 2 hours.
This graph is the same \(\beta\) function as above, done for \(0 \le \Re(z) \le 4\) and \(-2 \le \Im(z) \le 2\)... You can't even tell it's blurred.
The run time for my program is:
E + O(F(PIXELS*R,PIXELS*R))
Rather than:
O(F(PIXELS,PIXELS))
If that makes sense... This might seem like nothing, but when you need to spend a long time calculating each value F(X,Y), my code slashes the time, by cutting the number of calls to F to a quarter (at R = 0.5)...
Posts: 1,214
Threads: 126
Joined: Dec 2010
02/23/2023, 04:30 PM
(This post was last modified: 02/23/2023, 04:49 PM by JmsNxn.)
Just for shits and giggles, here is \(f(z) = \sin(z)\), for \(-1 \le \Re(z) \le 1\) and \(-1 \le \Im(z) \le 1\), as a 500 by 500 pixel graph. But let's only choose 125 x 125 reference points, which is a "blur factor" of 0.25.
We have to run:

E = for X: for Y: write(X,Y);

for all X, Y in 500 by 500, which is our unavoidable base cost. Then:

P = for X = 0..500*0.25: for Y = 0..500*0.25: F(X,Y)

where X only ends up doing 125 loops, and the same for Y, because 500*0.25 = 125. So we write 500x500 times, but we only calculate F 125x125 times.
Then the total time is just:
E + P
Which, with large graphs and expensive functions, reduces times tenfold!!!!! Whether we lose accuracy or quality of the graph is a purist's question; everything looks good enough!
This costs E = WRITE(500 BY 500) time, plus O(125 BY 125) evaluations of F.
And god it's fast as fuck, and there's zero visible difference.
The code is as simple as:
P =
For X=0, 125:
For Y=0, 125:
F(X,Y)
E =
For X = 0, 500:
For Y=0, 500:
WRITE(X,Y)
Total time to compile is:
E + P
Mike3's time would be written:
MIKE =
For X=0,500:
For Y=0, 500:
P = F(X,Y);
WRITE(P);
Which causes 500 BY 500 evaluations of F(X,Y), rather than my modest 125 BY 125, and the graphs are indistinguishable... What I call a "blur factor" isn't a JPEG blur; the splining and graphing are indistinguishable. So if you take a nice function like F(X,Y) = sin(X + iY), everything behaves perfectly, and even a blur factor of 0.25 keeps everything beautiful!
Posts: 903
Threads: 130
Joined: Aug 2007
(02/23/2023, 04:30 PM)JmsNxn Wrote: E =
For X = 0, 500:
For Y=0, 500:
WRITE(X,Y)
(...)
Mike3's time would be written:
MIKE =
For X=0,500:
For Y=0, 500:
P = F(X,Y);
WRITE(P);
(...)
Hmm, I cannot go deep into it. But when I optimized Mike3's procedure, my first rewrite was to replace "write(element)" by something like "write(vector of elements)", because the Pari/GP-internal implementation of the element-wise "write()" is extremely time-consuming. So, for instance, it would be better to put the elements into a string variable using "Str()", and then deliver the whole line to the file with one (of these time-consuming) "write()" calls. You might even construct the file image as one complete string, including the "\n" linefeeds, and deliver it all with a single "write()" call.
In the new versions of Pari/GP there is, moreover, a low-level "filewrite()" which has a cache in the background and performs much faster than any procedure that calls "write()" in a loop.
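A gp sketch along these lines (the function name, the CSV-style row format, and the file handling are my own choices here, not Mike3's; "filewrite()" needs a recent Pari/GP):

```gp
\\ Buffered output: build each row as one string, then emit it with the
\\ low-level filewrite() instead of calling write() once per element.
dumpmatrix(mat, name) = {
  my(f = fileopen(name, "w"),
     rows = matsize(mat)[1], cols = matsize(mat)[2]);
  for(r = 1, rows,
    my(st = Str(mat[r, 1]));
    for(c = 2, cols, st = Str(st, ",", mat[r, c]));
    filewrite(f, st));  \\ one write per row, not one per pixel
  fileclose(f);
}
```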
Something more perhaps ... but have not much time at the moment...
Gottfried Helms, Kassel
Posts: 1,214
Threads: 126
Joined: Dec 2010
(02/23/2023, 05:08 PM)Gottfried Wrote: (...)
Hmm, I cannot go deep into it. But when I optimized Mike3's procedure, my first rewrite was to replace "write(element)" by something like "write(vector of elements)", because the Pari/GP-internal implementation of the element-wise "write()" is extremely time-consuming. So, for instance, it would be better to put the elements into a string variable using "Str()", and then deliver the whole line to the file with one (of these time-consuming) "write()" calls. You might even construct the file image as one complete string, including the "\n" linefeeds, and deliver it all with a single "write()" call.
In the new versions of Pari/GP there is, moreover, a low-level "filewrite()" which has a cache in the background and performs much faster than any procedure that calls "write()" in a loop.
Something more perhaps ... but have not much time at the moment...
Okay, I believe I understand what you are saying, Gottfried.
An even faster speed-up would be to take

Xn = X[j+1], X[j+2], ..., X[j+n]
Yn = Y[j+1], Y[j+2], ..., Y[j+n]

and string them together, as a speed-up on the WRITE function Pari has. Instead of writing:

for k = 1, n: WRITE(X[j+k]);

we instead write:

WRITE(Xn)

The point is, I have made my code too slow because it runs the WRITE command too many times, and running the WRITE command takes time. You are saying: build the string first, then reduce the number of times we need to run the WRITE command.
I never thought of that before. My code is more of a mathematical speed-up: if you assume that WRITE takes O(1) time, then my code is mathematically faster. But WRITE has real overhead of its own. So you're definitely right!!!!!
We compile the values into a larger string and write less often. And this increases the speed of the program!
Posts: 903
Threads: 130
Joined: Aug 2007
02/23/2023, 07:41 PM
(This post was last modified: 02/23/2023, 07:46 PM by Gottfried.)
(02/23/2023, 07:08 PM)JmsNxn Wrote: (...) We compile the values into a larger string and write less often. And this increases the speed of the program!
Yes, for a vector of values:
st=""; for(c=1,cols, st=Str(st, X[ c ])); write(st);
or for a rowsbycols array
st=""; for(r=1,rows, st=Str(st,X[r,1]; for(c=2,cols, st=Str(st, ",",X[r,c])); st=Str(st,"\n")); write(st);
This should be considerably faster than all mathematical optimizations...
Perhaps a "strjoin()" on a vector of numerical values might even be faster than doing the loop of "for(c=1,cols, st=Str(st,",",<numerical value>))". You should test it.
Moreover, the new "filewrite()" procedures more or less do the above implicitly (via caches in main memory), while "write()" does the full procedure of checking whether the file exists, allocating a write string, finding the file pointer position, advancing it, writing to the filesystem, waiting for confirmation, closing the file descriptor, etc. etc.
Gottfried Helms, Kassel
