-
Notifications
You must be signed in to change notification settings - Fork 335
Refactor Force Atlas 2 #4969
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Force Atlas 2 #4969
Conversation
cpp/src/c_api/legacy_fa2.cpp
Outdated
namespace cugraph { | ||
namespace c_api { | ||
|
||
struct cugraph_clustering_result_t { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be left over from a copy/paste. It's not relevant here.
The new cugraph_layout_result_t
object would be defined here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defined cugraph_layout_result_t
instead
@@ -78,6 +78,7 @@ void force_atlas2(raft::handle_t const& handle, | |||
verbose, | |||
callback); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like an arbitrary change. I'd suggest deleting this line to eliminate the file change.
cpp/src/c_api/legacy_fa2.cpp
Outdated
|
||
cugraph::internals::GraphBasedDimRedCallback* callback = nullptr; | ||
|
||
//cugraph::legacy::GraphBasedDimRedCallback() // FIXME Add support for callback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anything in python use this callback feature?
I don't see any other implementations offering that feature. I'm guessing it was for debugging purposes? It would be nice to be able to just drop this feature... but we currently support it.
Thoughts @rlratzel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exposed in the current Python API too, meaning users could theoretically be using it now. However, I really doubt anyone is, and I'm in favor of making this a breaking change as a tradeoff for not having to support, test, and document it in the new PLC-based API.
I like the code and it could be useful for something, but it may mean that we can't release the GIL (which we've been wanting to do in more of our functions), and I suspect we could get most of the intended functionality with an option that enables debug prints if we really wanted to.
Maybe @hlinsen has more info about the intended use case though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an asked by @trxcllnt and @exactlyallan for dynamic viz. There is maybe some cyber use cases or demo that could need it. I think it is also exposed in umap for cuml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback @hlinsen . Do you have suggestions about ways to test FA2 apart from the smoke tests cuGraph and networkX have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some ideas:
- Checking that we converged by checking some of the values like speed, swinging values here
- Trying to make barnes hut deterministic and compare positions
- Check consistency in nearest neighbor nodes, look at a top k. Some times the layout is the same but shifted and I think nearest nodes should still be the same ones.
* @param [in] handle Handle for accessing resources | ||
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage | ||
* needs to be transposed | ||
* @param [out] pos Position of the vertices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that in the C++ code, pos is a 2-d array and assumes that the output is ordered by internal vertex ids.
Generally, the C API hides the renumbering of vertices from the user (we automatically renumber/unrenumber in the C API before calling the C++ API. As such, this really needs include the vertex ids.
I'd suggest creating a cugraph_layout_result_t
type and including vertex id, x position, y position so that you can easily create a data frame for the resulting python.
* needs to be transposed | ||
* @param [out] pos Position of the vertices | ||
* @param [in] max_iter Maximum number of iterations. Initial vertex positioning | ||
* @param [in] x_start Initial vertex positioning (x-axis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue for x_start
and y_start
. The values in the legacy C++ implementation are assumed to be in the order of the internal vertex ids... but the python code doesn't know this order.
I'd suggest using the cugraph_layout_result_t
object as input here. You can create one, passing in the vertex ids, x_start and y_start values and use it as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your latest refactoring you did not address this concern. The legacy C++ implementation assumes that x_start
and y_start
are sorted such that element 0 refers to the starting position of vertex 0 using the internal vertex id. If the graph was created with renumbering, there is no way for the caller to know what order to put x_start
and y_start
.
Simplest solution would be to add a vertices parameter (optional like x_start and y_start) that identified the vertex id for each position.
Then in the implementation you can make a copy of this data into rmm device vectors, renumber the vertices and then sort the vertices/x_start/y_start by vertex id. I think page rank can be used as a model for handling the intial guess.
Also, documentation for x_start and y_start (and the new vertices parameter) should indicate that they are optional. And there should be a CAPI_EXPECTS
call that verifies that if you specify x_start and y_start you must also specify vertices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x_start and y_start are optional. The documentation should reflect the following:
- You can pass NULL for x_start and y_start and no initial starting position will be implied
- If you pass a value for one you must pass a value for the other
Should also indicate that x_start and y_start the vertex order. You assume in the implementation that the values of x_start and y_start are sorted by vertex id, the documentation should make this clear.
* @param [in] verbose Output convergence info at each interation. | ||
* @param [in] do_expensive_check | ||
* A flag to run expensive checks for input arguments (if set to true) | ||
* @param [out] result Opaque object containing the clustering result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This result would be what pos
above is. I'd delete the pos
parameter above and move it to this position. By convention, the C API functions pass all input parameters before the output parameters.
cpp/src/c_api/legacy_fa2.cpp
Outdated
} else if constexpr (!std::is_same_v<edge_t, int32_t>) { | ||
unsupported(); | ||
} else { | ||
// FIXME: Does FA2 expects store_transposed == false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FA2 is expecting a legacy::GraphCOOView
, which you're creating below. there's no reason to transpose the graph, you should be able to use it exactly as is.
I'd take out this logic. You can then remove the annotation in the C API documentation that the graph object might be modified... because in the case of this algorithm it would not be.
* needs to be transposed | ||
* @param [out] pos Position of the vertices | ||
* @param [in] max_iter Maximum number of iterations. Initial vertex positioning | ||
* @param [in] x_start Initial vertex positioning (x-axis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your latest refactoring you did not address this concern. The legacy C++ implementation assumes that x_start
and y_start
are sorted such that element 0 refers to the starting position of vertex 0 using the internal vertex id. If the graph was created with renumbering, there is no way for the caller to know what order to put x_start
and y_start
.
Simplest solution would be to add a vertices parameter (optional like x_start and y_start) that identified the vertex id for each position.
Then in the implementation you can make a copy of this data into rmm device vectors, renumber the vertices and then sort the vertices/x_start/y_start by vertex id. I think page rank can be used as a model for handling the intial guess.
Also, documentation for x_start and y_start (and the new vertices parameter) should indicate that they are optional. And there should be a CAPI_EXPECTS
call that verifies that if you specify x_start and y_start you must also specify vertices.
legacy_coo_graph_view, | ||
pos.data(), | ||
max_iter_, | ||
x_start_ != nullptr ? x_start_->as_type<float>() : nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, this need to be sorted properly, so you'll need to make a copy and sort by vertex id after renumbering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After our discussion, I sort both the x_start
and y_start
by key where the key refers to the position of the number_map
cpp/src/c_api/legacy_fa2.cpp
Outdated
|
||
cugraph::internals::GraphBasedDimRedCallback* callback = nullptr; | ||
|
||
rmm::device_uvector<vertex_t> vertices(graph_view.local_vertex_partition_range_size(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this down to where you do x_axis and y_axis... I don't think you need this until constructing the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once the other review comments are addressed. I just had a few minor issues/questions.
Also, we'll need to make sure this is addressed before updating the cugraph Python API. Since this PR is adding to PLC, I don't believe the callback use cases using cugraph Python are affected (but we need to confirm).
cugraph_error_t, | ||
) | ||
from pylibcugraph._cugraph_c.array cimport ( | ||
cugraph_type_erased_device_array_t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is used in this file. Can it be removed?
|
||
cdef extern from "cugraph_c/layout_algorithms.h": | ||
########################################################################### | ||
# triangle_count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy/paste error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just curious about the legacy_
prefix for the file name. Is legacy referring to the C++ implementation not using primitives? If so, once it uses primitives, would this file need to be replaced? I figured this would just be called fa2.cpp
and only the C++ impl would change once primitives are used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is legacy referring to the C++ implementation not using primitives
I believe that is correct @ChuckHastings
If so, once it uses primitives, would this file need to be replaced? I figured this would just be called fa2.cpp and only the C++ impl would change once primitives are used.
Right. No changes to the C (except the file name), and PLC API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we targeted getting the API definition right for both legacy implementation and some future primitive based implementation. The implementation is currently solely a legacy implementation. The implementation of the C API will look totally different on a primitive based C++ implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of the C API will look totally different
However the function definition in the C API will most likely be unchanged right? and regarding C the implementation, we will remove the legacy COO logic and call the primitive based C++ implementation right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Function definition would be the same. The functor definition might be the same... the implementation inside the operator()
method would likely be totally different (simpler) as we wouldn't have to worry about legacy graph objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function definition would be the same.
ok, that's what I figured too. In that case, my preference would be to just call the file fa2.cpp
since the _legacy
prefix isn't needed and will be misleading later, but I don't feel strongly about it, and it can always be renamed later if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's helpful to have the legacy
in the name when we're looking over the codebase later. If the transition occurs as we anticipate we'll just do a rename and preserve the history when we transition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few documentation cleanup comments.
* needs to be transposed | ||
* @param [out] pos Position of the vertices | ||
* @param [in] max_iter Maximum number of iterations. Initial vertex positioning | ||
* @param [in] x_start Initial vertex positioning (x-axis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x_start and y_start are optional. The documentation should reflect the following:
- You can pass NULL for x_start and y_start and no initial starting position will be implied
- If you pass a value for one you must pass a value for the other
Should also indicate that x_start and y_start the vertex order. You assume in the implementation that the values of x_start and y_start are sorted by vertex id, the documentation should make this clear.
cugraph::c_api::cugraph_layout_result_t** result, | ||
cugraph_error_t** error) | ||
{ | ||
force_atlas2_functor functor(handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add CAPI_EXPECTS calls to validate that x_start and y_start are either both NULL or are both specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the docs and in case x_start and y_start are not provided, it is now mentioned in the documentation that those will be selected randomly internally.
/merge |
This PR exposes FA2 to the PLC API
closes #4881