Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement unpack_at for JGF #775

Merged
merged 11 commits into from
Jan 22, 2021
Merged

Implement unpack_at for JGF #775

merged 11 commits into from
Jan 22, 2021

Conversation

milroy
Copy link
Member

@milroy milroy commented Nov 26, 2020

This PR implements the unpack_at() function in the JGF reader as discussed in issue #767. The work in this PR is a necessary component of grow with the RV1/JGF reader and complements PR #773. It also adds a subcommand to the resource-query utility to enable testing of unpack_at(). Once more of the infrastructure is in place, we can add unpack_at() tests to flux-ion-resource as well.

Unpack_at() requires a vertex attachment point and a fully-qualified JGF subgraph as inputs. In this context, fully qualified means that the subgraph contains the root vertex and all intermediate vertices between the root and the new vertices to be attached. In other words, every vertex in the subgraph has a path to the root. The implementation is robust and doesn't assume any ordering of the vertices or edges. It iterates through the JGF subgraph vertices and adds vertices not present in the resource graph, keeping track of the added vertex ids with an std::unordered_set. When iterating through the subgraph edges, it only adds edges where either the source or target vertices (or both) have been added, ensuring that pre-existing edges aren't re-added to the resource graph.

Note: as indicated in the conversation below, this functionality is currently experimental as resource graph growth causes a resize of the boost vecS vertex container type. Resizing the vecS results in lost job allocations and reservations as there is no copy constructor for planner. Furthermore, to initialize the pruning filters of the new JGF subgraph, the implementation developed here reinitializes the entire resource graph. We intend to add a performance optimizing PR to improve the pruning filter initialization in the future.

This PR addresses and solves issues #782 and #783.

@milroy
Copy link
Member Author

milroy commented Nov 26, 2020

Note that I will be adding additional tests to the testsuite as this PR evolves.

@milroy milroy force-pushed the unpack-at branch 2 times, most recently from f8b4c50 to 97b91cc Compare November 26, 2020 01:50
@milroy
Copy link
Member Author

milroy commented Dec 2, 2020

@dongahn I've been creating more tests for CI and have a question about handling containment subsystem in edges.

I've been assuming that the JGF subgraph to be attached will have both contains and in edges. However, if I pass a JGF with bidirectional edges to update attach, the traverser update here:

if ( (rc = ctx->traverser->run (str, ctx->writers, rd, id, at, d)) != 0) {

results in a segfault. Is this expected behavior? If so, should I handle it in resource-query by passing two JGFs to update attach: one with contains and in edges for unpack_at (), and one with only contains edges for the traverser->run () step?

@dongahn
Copy link
Member

dongahn commented Dec 2, 2020

@milroy: As we discussed, I am going to have a review pass with a goal to land this right after PR #773. Could you remove [WIP] label when you are ready?

@dongahn
Copy link
Member

dongahn commented Dec 2, 2020

If so, should I handle it in resource-query by passing two JGFs to update attach: one with contains and in edges for unpack_at (), and one with only contains edges for the traverser->run () step?

No you shouldn't have to. There seems to be a logic in run () that doesn't like a bidirectional edge. We should look into it.

@dongahn
Copy link
Member

dongahn commented Dec 2, 2020

If you pass a bidirectional JGF into the existing interface (plain update), does run() still crash?

@milroy
Copy link
Member Author

milroy commented Dec 3, 2020

If you pass a bidirectional JGF into the existing interface (plain update), does run() still crash?

Yes, it does crash:

$ resource/utilities/resource-query -L t/data/resource/jgfs/elastic/l2-full.json -F jgf -f jgf -S CA -P high
INFO: Loading a matcher: CA
resource-query> update allocate t/data/resource/jgfs/elastic/node-add-test.json 1 0 3600
Segmentation fault

@dongahn
Copy link
Member

dongahn commented Dec 3, 2020

Could you create an issue for this? Since this is a distinct problem, it would be ideal to address it as a separate PR then yours?

@milroy
Copy link
Member Author

milroy commented Dec 3, 2020

@dongahn no problem! I'll create an issue and get to work on a PR.

@milroy
Copy link
Member Author

milroy commented Dec 3, 2020

I'll create an issue and get to work on a PR.

However, fixing the issue enables CI testing for unpack_at (), so I can see an argument for fixing it in this PR.

@milroy
Copy link
Member Author

milroy commented Dec 3, 2020

@dongahn I opened issue #782 to discuss the segfault, but the fix appears to be a one liner. If that looks good to you I can add a commit to this PR.

@dongahn
Copy link
Member

dongahn commented Dec 3, 2020

Yes, that sounds great!

@milroy
Copy link
Member Author

milroy commented Dec 4, 2020

I ran into another problem with the traverser update step and created issue #783 for discussion. That issue needs to be solved before this PR can be reviewed.

@milroy milroy linked an issue Dec 10, 2020 that may be closed by this pull request
@milroy milroy changed the title [WIP] Implement unpack_at for JGF Implement unpack_at for JGF Dec 10, 2020
@milroy
Copy link
Member Author

milroy commented Dec 10, 2020

I've dropped the WIP tag and have added two CI tests. As soon as I add the remaining tests this PR will be ready for review.

@milroy
Copy link
Member Author

milroy commented Dec 11, 2020

This should be ready for your review, @dongahn and @SteVwonder.

@milroy
Copy link
Member Author

milroy commented Jan 18, 2021

I broke out a line-wrap change into a new commit and made some minor edits to commit messages.

Copy link
Member

@dongahn dongahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@milroy: this LGTM! I have added a few minor comments, which would be good to be addressed before this is landed. But since they are minor, I approve this PR.

@SteVwonder: would you want to take a look at this PR as well?

resource/utilities/command.cpp Outdated Show resolved Hide resolved
resource/readers/resource_reader_jgf.cpp Show resolved Hide resolved
resource/readers/resource_reader_jgf.cpp Outdated Show resolved Hide resolved
resource/readers/resource_reader_jgf.cpp Outdated Show resolved Hide resolved
resource/readers/resource_reader_jgf.cpp Outdated Show resolved Hide resolved
resource/readers/resource_reader_jgf.cpp Show resolved Hide resolved
resource/readers/resource_reader_jgf.hpp Outdated Show resolved Hide resolved
@milroy
Copy link
Member Author

milroy commented Jan 21, 2021

@dongahn, thanks for the helpful suggestions! I'll implement them and push the fixes. @SteVwonder let me know if you want to look this PR over as well.

@milroy milroy force-pushed the unpack-at branch 2 times, most recently from ea781aa to 458958e Compare January 21, 2021 07:32
@dongahn
Copy link
Member

dongahn commented Jan 21, 2021

@dongahn, thanks for the helpful suggestions! I'll implement them and push the fixes. @SteVwonder let me know if you want to look this PR over as well.

@milroy: very nicely done and those position us very well for ultimately elasticity! Thanks you.

@SteVwonder: it would be great to land this to our next release (Jan release). I believe this only needs just one more cursory release.

Copy link
Member

@SteVwonder SteVwonder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @milroy for putting this together. This is a huge leap in functionality. I'm super excited to see it in action :) Just two question below:

resource/readers/resource_reader_jgf.cpp Show resolved Hide resolved
t/t3028-resource-grow.t Show resolved Hide resolved
@milroy
Copy link
Member Author

milroy commented Jan 22, 2021

Thanks for the review and comments @SteVwonder! Let me know if you need more details and if you'd like me to make changes before this gets merged.

Copy link
Member

@SteVwonder SteVwonder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @milroy!

@milroy
Copy link
Member Author

milroy commented Jan 22, 2021

LGTM! Thanks @milroy!

Great, thanks to you and @dongahn! I'm going to rebase and set MWP.

milroy added 11 commits January 22, 2021 14:43
Add an attach command to the resource-query tool.
Attach takes a fully-qualified JGF subgraph as
an argument. After unpacking the subgraph,
the resource graph must be reinitialized
so that the pruning filters are properly
initialized with the new resources.
Since reinitialization creates new
subtree plans, allocated and reserved
jobs may be affected. The command
exits with an error if there are
currently allocated or reserved jobids.

The attach command also enables testing of
the unpack_at() functionality.
Add the ability to reinitialize the resource
graph. After attaching a JGF subgraph to the
original resource graph, the pruning filters
must be reinitialized with the new resources.

Add `reset_color()` to the DFU Traverser
Implementation Public API. This function
allows the DFU traverser to reset the
graph color before reinitializing the
resource graph. Without this function,
`dfu_impl_t::prime_pruning_filter()`
stops at the root vertex for a resource
graph that has been initialized.

Note that the reinitialization currently
traverses the entire resource graph. In the
future we will need to add an optimization
to only traverse and initialize the added
subgraph.
Fix a non-terminating traversal cycle for
depth-first traversal of a graph with
bidirectional edges. Upon reaching a leaf,
the depth-first traverser would traverse
the out edge back to the ancestor, which
in turn traverses back to the leaf in
a non-terminating cycle until producing
a segfault. Inserting a gray coloration
breaks the cycle.
Add the ability to add a JGF subgraph into
a resource graph by implementing the `unpack_at()`
function. The function takes a fully-qualified
JGF subgraph and an attachment point as inputs.
The JGF is said to be fully qualified in that
every vertex has a path to the root via
intermediate vertices. The root and intermediate
vertices are also in the JGF subgraph. Currently,
the vertex attachment point is not used by
`unpack_at()`, but could be used to speed
up the unpacking.

Remove the check for empty `is_roots` map
in `update_tgt_edge()`. The check causes a
spurious error when encountering an edge
into a root before the edge out of a root
in a JGF with bidirectional edges. We
cannot assume an edge iteration ordering,
so the check must be removed for robustness.

Refactor a commonly used sequence of code
into a function (`update_vmap`) for readibility.
Add input and attach JGFs to test
unpack_at. The vertices and edges of
the randomized attach test JGF subgraph
have been shuffled randomly to test the
robustness of `unpack_at()`.

Add a jobspec to allocate the resource
graph.
Add tests for `unpack_at()`. The tests
add a JGF subgraph with default
and then randomized vertex and edge
orderings. The tests check for
error conditions such as attempting
to add another graph root and
attaching a JGF subgraph with
outstanding job allocations or
reservations, as well as basic
checks for correctly formed
command arguments.
@dongahn
Copy link
Member

dongahn commented Jan 22, 2021

Great job @milroy and @SteVwonder. Thanks!

@codecov
Copy link

codecov bot commented Jan 22, 2021

Codecov Report

Merging #775 (d931055) into master (ae1628f) will increase coverage by 0.1%.
The diff coverage is 78.9%.

@@           Coverage Diff            @@
##           master    #775     +/-   ##
========================================
+ Coverage    72.4%   72.6%   +0.1%     
========================================
  Files          85      85             
  Lines        8876    8931     +55     
========================================
+ Hits         6433    6487     +54     
- Misses       2443    2444      +1     
Impacted Files Coverage Δ
resource/readers/resource_reader_jgf.hpp 100.0% <ø> (ø)
resource/traversers/dfu_impl.hpp 100.0% <ø> (ø)
resource/utilities/command.hpp 100.0% <ø> (ø)
resource/readers/resource_reader_jgf.cpp 69.4% <77.7%> (+1.9%) ⬆️
resource/utilities/command.cpp 75.2% <77.7%> (+0.5%) ⬆️
resource/traversers/dfu.cpp 85.9% <100.0%> (+0.1%) ⬆️
resource/traversers/dfu_impl.cpp 83.2% <100.0%> (+<0.1%) ⬆️
resource/traversers/dfu_impl_update.cpp 77.7% <100.0%> (+<0.1%) ⬆️

@milroy milroy added the merge-when-passing mergify.io - merge PR automatically once CI passes label Jan 22, 2021
@mergify mergify bot merged commit a51b004 into flux-framework:master Jan 22, 2021
@milroy milroy deleted the unpack-at branch January 22, 2021 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-passing mergify.io - merge PR automatically once CI passes
Projects
None yet
3 participants