Perl tweaks, also PDL version of benchmark #1

Open: wants to merge 25 commits into base: master

@mohawk2 mohawk2 commented Sep 28, 2021

Apart from whitespace changes (to ignore them, view the diff at https://github.com/Fourmilab/floating_point_benchmarks/pull/1/files?diff=unified&w=1), this uses Time::HiRes to do the timing rather than just saying when to stop a timer, and splits the paraxial and non-paraxial surface code into separate functions.
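For readers unfamiliar with Time::HiRes, the timing approach can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `workload` and `time_it` are hypothetical names, and the square-root loop merely stands in for the ray trace.

```perl
use strict;
use warnings;
use Time::HiRes ();    # core module; provides a sub-second time()

# Hypothetical stand-in for the benchmark's ray-trace loop.
sub workload {
    my ($iterations) = @_;
    my $sum = 0;
    $sum += sqrt($_) for 1 .. $iterations;
    return $sum;
}

# Run a code reference and return (elapsed seconds, its result).
sub time_it {
    my ($code) = @_;
    my $start   = Time::HiRes::time();
    my $result  = $code->();
    my $elapsed = Time::HiRes::time() - $start;
    return ($elapsed, $result);
}

my ($elapsed, $result) = time_it(sub { workload(100_000) });
printf "Time taken: %.6f\n", $elapsed;
```

Because the script measures its own run, nobody has to start and stop a stopwatch by hand, which matters most for short runs.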

mohawk2 commented Sep 28, 2021

By the way, the comparative benchmarks in https://www.fourmilab.ch/scanalyzer/archives/2021/09/floating-point-benchmark-raku-perl-6-language-added.html show Perl results for 5.8. Having just tried both 5.8 and 5.32 with the high-res timer, 5.32 seems to be up to 10% faster on average than 5.8. Worth a re-run?

mohawk2 commented Sep 29, 2021

Notes for posterity: the telescope article is from Amateur Telescope Making Vol 3, available in DjVu format from https://b-ok.cc/book/449242/42c54c?id=449242; the J.H. Wyld chapter is on p581 (p588 of the PDF).

mohawk2 commented Sep 30, 2021

I have now added a second script to the Perl directory, with a PDL "PP" function. It takes about twice as long as pure Perl, but such a small input size is not where PDL shines. PDL would pay off when tracing rays through many more than 4 surfaces, or with a large number of different ray heights. That could then also benefit from automatically using multiple cores, aka "pthreading" (as documented at https://metacpan.org/dist/PDL/view/Basic/Pod/ParallelCPU.pod).

mohawk2 commented Oct 2, 2021

With the latest commit, the for-loop is eliminated; instead, a "dummy dimension" is added to the 3rd and 4th parameters, sized so that the calculations are done the given number of times. The resulting speed seems comparable to C, which isn't surprising, as C is where most of the effort happens. This model also shows speed improvements with the parallel-processing feature:

$ perl src/perl/fbench-pdl.pl 10000000
Name "PDL::BIGPDL" used only once: possible typo at src/perl/fbench-pdl.pl line 227.
Ready to begin John Walker's floating point accuracy
and performance benchmark.  10000000 iterations will be made.

Measured run time in seconds should be divided by 10000
to normalise for reporting results.  For archival results,
adjust iteration count so the benchmark runs about five minutes.

Time taken: 26.041648
Divided by 10000 = 0.0026041648

No errors in results.
$ PDL_AUTOPTHREAD_TARG=4 PDL_AUTOPTHREAD_SIZE=0 perl src/perl/fbench-pdl.pl 10000000
Name "PDL::BIGPDL" used only once: possible typo at src/perl/fbench-pdl.pl line 227.
Ready to begin John Walker's floating point accuracy
and performance benchmark.  10000000 iterations will be made.
[snip]
Time taken: 14.400384
Divided by 10000 = 0.0014400384

No errors in results.
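To illustrate the "dummy dimension" idea in isolation, here is a minimal sketch, assuming PDL is installed. It is not the code in fbench-pdl.pl: `$height` and the square-root expression are hypothetical stand-ins for the real ray-trace parameters and PP function.

```perl
use strict;
use warnings;
use PDL;

my $n = 100_000;    # number of repetitions, replacing the Perl for-loop

# Hypothetical single input value (a 0-dimensional ndarray).
my $height = pdl(4.0);

# dummy(0, $n) presents the same value as a virtual dimension of size $n,
# so any operation on it is carried out $n times in compiled code.
my $repeated = $height->dummy(0, $n);

# Stand-in for the real calculation; runs once per element of the dummy dim.
my $result = sqrt($repeated * $repeated + 1);

printf "Elements computed: %d\n", $result->nelem;
```

Run as-is this uses one core. Setting PDL_AUTOPTHREAD_TARG and PDL_AUTOPTHREAD_SIZE in the environment, as in the second run above, asks PDL to split such a dimension across pthreads without any change to the script.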

@mohawk2 mohawk2 changed the title Perl tweaks Perl tweaks, also PDL version of benchmark Oct 2, 2021