Skip to content

Conversation

jnke2016
Copy link
Contributor

@jnke2016 jnke2016 commented Mar 12, 2025

This PR exposes FA2 to the PLC API

closes #4881

@jnke2016 jnke2016 requested a review from a team as a code owner March 12, 2025 12:21
@jnke2016 jnke2016 self-assigned this Mar 12, 2025
@jnke2016 jnke2016 added this to the 25.04 milestone Mar 12, 2025
@rlratzel rlratzel added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 12, 2025
@jnke2016 jnke2016 requested a review from a team as a code owner March 13, 2025 13:49
@github-actions github-actions bot added the CMake label Mar 13, 2025
@jnke2016 jnke2016 requested a review from a team as a code owner March 13, 2025 17:04
namespace cugraph {
namespace c_api {

struct cugraph_clustering_result_t {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be left over from a copy/paste. It's not relevant here.

The new cugraph_layout_result_t object would be defined here instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined cugraph_layout_result_t instead

@@ -78,6 +78,7 @@ void force_atlas2(raft::handle_t const& handle,
verbose,
callback);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like an arbitrary change. I'd suggest deleting this line to eliminate the file change.


cugraph::internals::GraphBasedDimRedCallback* callback = nullptr;

//cugraph::legacy::GraphBasedDimRedCallback() // FIXME Add support for callback
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anything in python use this callback feature?

I don't see any other implementations offering that feature. I'm guessing it was for debugging purposes? It would be nice to be able to just drop this feature... but we currently support it.

Thoughts @rlratzel?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exposed in the current Python API too, meaning users could theoretically be using it now. However, I really doubt anyone is, and I'm in favor of making this a breaking change as a tradeoff for not having to support, test, and document it in the new PLC-based API.

I like the code and it could be useful for something, but it may mean that we can't release the GIL (which we've been wanting to do in more of our functions), and I suspect we could get most of the intended functionality with an option that enables debug prints if we really wanted to.

Maybe @hlinsen has more info about the intended use case though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an asked by @trxcllnt and @exactlyallan for dynamic viz. There is maybe some cyber use cases or demo that could need it. I think it is also exposed in umap for cuml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @hlinsen . Do you have suggestions about ways to test FA2 apart from the smoke tests cuGraph and networkX have?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some ideas:

  • Checking that we converged by checking some of the values like speed, swinging values here
  • Trying to make barnes hut deterministic and compare positions
  • Check consistency in nearest neighbor nodes, look at a top k. Some times the layout is the same but shifted and I think nearest nodes should still be the same ones.

* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [out] pos Position of the vertices
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that in the C++ code, pos is a 2-d array and assumes that the output is ordered by internal vertex ids.

Generally, the C API hides the renumbering of vertices from the user (we automatically renumber/unrenumber in the C API before calling the C++ API. As such, this really needs include the vertex ids.

I'd suggest creating a cugraph_layout_result_t type and including vertex id, x position, y position so that you can easily create a data frame for the resulting python.

* needs to be transposed
* @param [out] pos Position of the vertices
* @param [in] max_iter Maximum number of iterations. Initial vertex positioning
* @param [in] x_start Initial vertex positioning (x-axis)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue for x_start and y_start. The values in the legacy C++ implementation are assumed to be in the order of the internal vertex ids... but the python code doesn't know this order.

I'd suggest using the cugraph_layout_result_t object as input here. You can create one, passing in the vertex ids, x_start and y_start values and use it as input.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With your latest refactoring you did not address this concern. The legacy C++ implementation assumes that x_start and y_start are sorted such that element 0 refers to the starting position of vertex 0 using the internal vertex id. If the graph was created with renumbering, there is no way for the caller to know what order to put x_start and y_start.

Simplest solution would be to add a vertices parameter (optional like x_start and y_start) that identified the vertex id for each position.

Then in the implementation you can make a copy of this data into rmm device vectors, renumber the vertices and then sort the vertices/x_start/y_start by vertex id. I think page rank can be used as a model for handling the intial guess.

Also, documentation for x_start and y_start (and the new vertices parameter) should indicate that they are optional. And there should be a CAPI_EXPECTS call that verifies that if you specify x_start and y_start you must also specify vertices.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x_start and y_start are optional. The documentation should reflect the following:

  • You can pass NULL for x_start and y_start and no initial starting position will be implied
  • If you pass a value for one you must pass a value for the other

Should also indicate that x_start and y_start the vertex order. You assume in the implementation that the values of x_start and y_start are sorted by vertex id, the documentation should make this clear.

* @param [in] verbose Output convergence info at each interation.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Opaque object containing the clustering result
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This result would be what pos above is. I'd delete the pos parameter above and move it to this position. By convention, the C API functions pass all input parameters before the output parameters.

} else if constexpr (!std::is_same_v<edge_t, int32_t>) {
unsupported();
} else {
// FIXME: Does FA2 expects store_transposed == false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FA2 is expecting a legacy::GraphCOOView, which you're creating below. there's no reason to transpose the graph, you should be able to use it exactly as is.

I'd take out this logic. You can then remove the annotation in the C API documentation that the graph object might be modified... because in the case of this algorithm it would not be.

* needs to be transposed
* @param [out] pos Position of the vertices
* @param [in] max_iter Maximum number of iterations. Initial vertex positioning
* @param [in] x_start Initial vertex positioning (x-axis)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With your latest refactoring you did not address this concern. The legacy C++ implementation assumes that x_start and y_start are sorted such that element 0 refers to the starting position of vertex 0 using the internal vertex id. If the graph was created with renumbering, there is no way for the caller to know what order to put x_start and y_start.

Simplest solution would be to add a vertices parameter (optional like x_start and y_start) that identified the vertex id for each position.

Then in the implementation you can make a copy of this data into rmm device vectors, renumber the vertices and then sort the vertices/x_start/y_start by vertex id. I think page rank can be used as a model for handling the intial guess.

Also, documentation for x_start and y_start (and the new vertices parameter) should indicate that they are optional. And there should be a CAPI_EXPECTS call that verifies that if you specify x_start and y_start you must also specify vertices.

legacy_coo_graph_view,
pos.data(),
max_iter_,
x_start_ != nullptr ? x_start_->as_type<float>() : nullptr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, this need to be sorted properly, so you'll need to make a copy and sort by vertex id after renumbering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After our discussion, I sort both the x_start and y_start by key where the key refers to the position of the number_map


cugraph::internals::GraphBasedDimRedCallback* callback = nullptr;

rmm::device_uvector<vertex_t> vertices(graph_view.local_vertex_partition_range_size(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this down to where you do x_axis and y_axis... I don't think you need this until constructing the result.

Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the other review comments are addressed. I just had a few minor issues/questions.

Also, we'll need to make sure this is addressed before updating the cugraph Python API. Since this PR is adding to PLC, I don't believe the callback use cases using cugraph Python are affected (but we need to confirm).

cugraph_error_t,
)
from pylibcugraph._cugraph_c.array cimport (
cugraph_type_erased_device_array_t,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is used in this file. Can it be removed?


cdef extern from "cugraph_c/layout_algorithms.h":
###########################################################################
# triangle_count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy/paste error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just curious about the legacy_ prefix for the file name. Is legacy referring to the C++ implementation not using primitives? If so, once it uses primitives, would this file need to be replaced? I figured this would just be called fa2.cpp and only the C++ impl would change once primitives are used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is legacy referring to the C++ implementation not using primitives

I believe that is correct @ChuckHastings

If so, once it uses primitives, would this file need to be replaced? I figured this would just be called fa2.cpp and only the C++ impl would change once primitives are used.

Right. No changes to the C (except the file name), and PLC API

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we targeted getting the API definition right for both legacy implementation and some future primitive based implementation. The implementation is currently solely a legacy implementation. The implementation of the C API will look totally different on a primitive based C++ implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of the C API will look totally different

However the function definition in the C API will most likely be unchanged right? and regarding C the implementation, we will remove the legacy COO logic and call the primitive based C++ implementation right

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Function definition would be the same. The functor definition might be the same... the implementation inside the operator() method would likely be totally different (simpler) as we wouldn't have to worry about legacy graph objects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function definition would be the same.

ok, that's what I figured too. In that case, my preference would be to just call the file fa2.cpp since the _legacy prefix isn't needed and will be misleading later, but I don't feel strongly about it, and it can always be renamed later if needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's helpful to have the legacy in the name when we're looking over the codebase later. If the transition occurs as we anticipate we'll just do a rename and preserve the history when we transition.

Copy link
Collaborator

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few documentation cleanup comments.

* needs to be transposed
* @param [out] pos Position of the vertices
* @param [in] max_iter Maximum number of iterations. Initial vertex positioning
* @param [in] x_start Initial vertex positioning (x-axis)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x_start and y_start are optional. The documentation should reflect the following:

  • You can pass NULL for x_start and y_start and no initial starting position will be implied
  • If you pass a value for one you must pass a value for the other

Should also indicate that x_start and y_start the vertex order. You assume in the implementation that the values of x_start and y_start are sorted by vertex id, the documentation should make this clear.

cugraph::c_api::cugraph_layout_result_t** result,
cugraph_error_t** error)
{
force_atlas2_functor functor(handle,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add CAPI_EXPECTS calls to validate that x_start and y_start are either both NULL or are both specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the docs and in case x_start and y_start are not provided, it is now mentioned in the documentation that those will be selected randomly internally.

@rlratzel
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit d7564af into rapidsai:branch-25.04 Mar 24, 2025
81 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cuGraph improvement Improvement / enhancement to an existing function non-breaking Non-breaking change python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose Force Atlas 2 through PLC
4 participants