Nexus: add support for Summit #1394

Merged
merged 3 commits into from
Feb 19, 2019

Conversation

jtkrogel
Contributor

Add a Summit machine class and the specializations needed to submit and monitor jobs on Summit.

When typical Nexus job options are used (i.e. nodes/threads), maximal filling of the node is assumed and jsrun is configured accordingly. GPU usage is on by default (one GPU per resource set is assumed), but can be turned off by supplying gpus=0 to a job. If a user supplies options to jsrun, Nexus generates no options of its own (the user must supply everything).

Tests are added for rendered jsrun commands.

@ghost ghost assigned jtkrogel Feb 19, 2019
@ghost ghost added the in progress label Feb 19, 2019
@qmc-robot

Can one of the maintainers verify this patch?

@@ -2259,6 +2259,14 @@ def machines():
('stampede2' , 'n2_t2' ) : 'ibrun -n 68 -o 0 test.x',
('stampede2' , 'n2_t2_e' ) : 'ibrun -n 68 -o 0 test.x',
('stampede2' , 'n2_t2_p2' ) : 'ibrun -n 4 -o 0 test.x',
('summit' , 'n1' ) : 'jsrun -a 21 -r 2 -b rs -c 21 -d packed -n 2 -g 0 test.x',
Contributor

Could you rearrange the order as
-n 2 -r 2 -c 21 -g 0 -a 21 -b rs -d packed
i.e. total number of resource sets (-n), resource set configuration (-r, -c, -g), number of MPI ranks per resource set (-a), and MPI binding and affinity (-b, -d)?

Contributor Author

The options are stored in a dictionary (hash table) and so are not ordered. The write order may come out differently each time. Correct function does not depend on order.
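
Since the option order is not fixed, a test on a rendered command could compare the options as a set of flag/value pairs rather than as a raw string. The following is only a sketch of that idea, not the test code in this PR: the helper names parse_launch_command and same_command are hypothetical, and it assumes every option is a "-x value" pair, as in the renderings shown in the diff.

```python
import shlex


def parse_launch_command(command):
    """Split a launcher command into (launcher, options dict, trailing args).

    Hypothetical helper: assumes each option is a '-x value' pair.
    """
    tokens = shlex.split(command)
    launcher, rest = tokens[0], tokens[1:]
    options = {}
    i = 0
    # Consume flag/value pairs until the first token that is not a flag.
    while i + 1 < len(rest) and rest[i].startswith('-'):
        options[rest[i]] = rest[i + 1]
        i += 2
    return launcher, options, rest[i:]


def same_command(a, b):
    """True if two commands differ at most in the order of their options."""
    return parse_launch_command(a) == parse_launch_command(b)


rendered = 'jsrun -a 21 -r 2 -b rs -c 21 -d packed -n 2 -g 0 test.x'
reordered = 'jsrun -n 2 -r 2 -c 21 -g 0 -a 21 -b rs -d packed test.x'
print(same_command(rendered, reordered))
```

Here the rendering from the diff and the ordering requested in the review compare as equal, while a command with a different value for any flag would not.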

@@ -2259,6 +2259,14 @@ def machines():
('stampede2' , 'n2_t2' ) : 'ibrun -n 68 -o 0 test.x',
('stampede2' , 'n2_t2_e' ) : 'ibrun -n 68 -o 0 test.x',
('stampede2' , 'n2_t2_p2' ) : 'ibrun -n 4 -o 0 test.x',
('summit' , 'n1' ) : 'jsrun -a 21 -r 2 -b rs -c 21 -d packed -n 2 -g 0 test.x',
('summit' , 'n1_g6' ) : 'jsrun -a 7 -r 6 -b rs -c 7 -d packed -n 6 -g 1 test.x',
Contributor

This configuration assigns 7 MPI tasks to each GPU. Is this your intention?

Contributor Author

This is just the rendering that comes out with nodes=1,gpus=6. I expect a user will more likely provide something like nodes=1,threads=7,gpus=6, which gives 1 MPI rank per GPU and 7 threads per MPI rank.
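
The arithmetic behind these renderings can be sketched as follows. This is not the Nexus implementation, only a hypothetical reconstruction that reproduces the two jsrun lines in the diff and the threads=7 case described here. The function names, the constant of 42 usable cores per node (inferred from 2*21 = 6*7 = 42), and the rule of two resource sets per node when no GPUs are used are all assumptions read off those renderings.

```python
# Assumption: 42 usable cores per node, inferred from the two renderings
# in the diff (2 resource sets * 21 cores, 6 resource sets * 7 cores).
CORES_PER_NODE = 42


def summit_jsrun_options(nodes, gpus, threads=1):
    """Hypothetical mapping from nodes/gpus/threads to jsrun options."""
    # Assumption: one resource set per GPU; with no GPUs, two per node.
    rs_per_node = gpus if gpus > 0 else 2
    cores_per_rs = CORES_PER_NODE // rs_per_node
    return {
        '-n': nodes * rs_per_node,      # total number of resource sets
        '-r': rs_per_node,              # resource sets per node
        '-c': cores_per_rs,             # cores per resource set
        '-g': 1 if gpus > 0 else 0,     # GPUs per resource set
        '-a': cores_per_rs // threads,  # MPI ranks per resource set
        '-b': 'rs',                     # binding
        '-d': 'packed',                 # launch distribution
    }


def render(options, executable='test.x'):
    """Join the options into a jsrun command line."""
    flags = ' '.join('{} {}'.format(k, v) for k, v in options.items())
    return 'jsrun {} {}'.format(flags, executable)


print(render(summit_jsrun_options(nodes=1, gpus=0)))
print(render(summit_jsrun_options(nodes=1, gpus=6)))
print(render(summit_jsrun_options(nodes=1, gpus=6, threads=7)))
```

Under these assumptions, nodes=1,gpus=6 fills each 7-core resource set with 7 single-threaded ranks (-a 7), whereas adding threads=7 leaves one rank per resource set (-a 1), i.e. one rank per GPU.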

@ghost ghost assigned ye-luo Feb 19, 2019
@ye-luo
Contributor

ye-luo commented Feb 19, 2019

Okay to test

@ye-luo ye-luo merged commit 3268f08 into QMCPACK:develop Feb 19, 2019
@ghost ghost removed the in progress label Feb 19, 2019
@jtkrogel jtkrogel deleted the nx_summit branch September 16, 2019 12:13