-
Notifications
You must be signed in to change notification settings - Fork 582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zoltan: HSFC curve options #13880
base: develop
Are you sure you want to change the base?
Zoltan: HSFC curve options #13880
Conversation
Signed-off-by: Leo Kotipalo <leo.kotipalo@helsinki.fi>
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
@lkotipal : Thanks for the PR and the very detailed explanation! There has not been any development in the Zoltan space-filling curves for many years. This looks like a nice addition! I have a couple questions:
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great! The new options will be useful. In particular, the "beta" curve seems to give better results (at least in some cases). We appreciate you contributing this code to Zoltan.
The failing tests do not seem to be related to this PR. Trying to run them again. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
8 similar comments
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Looks like a permissions issue? |
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
Using Repos:
Pull Request Author: lkotipal |
FYI, there is a limited list of who can do the Pretest-Inspections. @egboman should be on the list but I went ahead and added the Pretest-Inspection label. |
Thanks, @ccober6 . I didn't realize I should do that. |
Yeah, there are times when Pretest-Inspected label needs to be added, when the contributor is not on the pretest inspection list. |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
|
Not sure why some tests failed, looks like they didn't even build correctly? |
Maybe DevOps folks can help. @trilinos/framework |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
Using Repos:
Pull Request Author: lkotipal |
Sorry, I don't understand. What kind of approval is needed? There is still a checkmark by my code review, so I believe it is still valid. Same for Curt. |
In kicked off all AT2 jobs again. Let's see if that does it. |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
|
Failed again... Will keep trying! |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
Using Repos:
Pull Request Author: lkotipal |
It appears to be valid build errors in the Intel and Clang configurations. Here's a link to the Intel failures: https://trilinos-cdash.sandia.gov/viewBuildError.php?buildid=2171929 |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: PR_gcc-openmpi-openmp
Jenkins Parameters
Build InformationTest Name: PR_clang
Jenkins Parameters
Build InformationTest Name: PR_cuda
Jenkins Parameters
Build InformationTest Name: PR_intel
Jenkins Parameters
Build InformationTest Name: PR_cuda-uvm
Jenkins Parameters
|
After looking at the latest test failures, there are some issues that need to be fixed. Turns out clang and the Intel compilers are more picky than gcc. Unfortunately, the cdash output is internal to Sandia. I will copy-paste some output below. @lkotipal , looks like minor errors so I hope you have a chance to fix this and update the PR. Thanks! Failing tests (clang): Zoltan | Error building packages/zoltan/src/CMakeFiles/zoltan.dir/hsfc/hsfc_hilbert.c.o Source: packages/zoltan/src/hsfc/hsfc_hilbert.c Source: packages/zoltan/src/hsfc/hsfc_box_assign.c Source: packages/zoltan/src/hsfc/hsfc.c |
@trilinos/Zoltan @egboman
Motivation
Zoltan’s “Octree” Hilbert curve is an implementation of a three-dimensional curve canonized in most implementations. However, it is not unique among space filling curves; Haverkort (2017) describes up to ten million depending on desired properties. Of particular interest to load balancing is Lp-dilation, measuring the maximum Lp-distance between points in space divided by the difference between their Hilbert coordinates. Two particular curves known as the hyperorthogonal curves have optimal locality: Alfa has optimal L∞-dilation, while Beta has optimal L1- and L2-dilation.
This PR implements several curves with the base pattern Ca00 shared with the Octree curve, with state tables calculated adapting the method described by Lawder & King (2001). These include Butz (should effectively be a re-implementation of Octree), Harmonious, Sasburg, Base Camp, and the hyperorthogonal curves. The Z-curve is also implemented in two and three dimensions, mostly as a control. The choice of curve is controlled by the user parameter
CURVE
, defaulting to the Hilbert curve in two dimensions and the Octree curve in three dimensions, preserving previous behavior. The generation code and state tables (as C arrays) can be found here.Addresses issue #13879
Testing
Default behaviour (Hilbert curve and Octree partitioning in 2D and 3D respectively) should be exercised by existing tests.
No tests for different curve implementations yet, but functionality and performance have been tested on Vlasiator with promising results; a production scale simulation shows around a 14% improvement on the best curve found (Beta) over the original Octree curve in spatial translation for a 4% improvement in overall simulation time (shown on the two box plots below). We expect these results to be applicable for similar workloads, i.e. Eulerian grids where data needs to be communicated along coordinate axes. In general we expect one of the hyperorthogonal curves to typically be optimal for any workload depending on the communication patterns of the program.
Additional Information
Sources cited: