Skip to content

Conversation

@giacomofiorin
Copy link
Member

@giacomofiorin giacomofiorin commented Jan 21, 2026

This PR contains several improvements:

  • Condense the computation of multiple variants of coordination numbers: all classes are now derived from coordnum with the exception of h_bond
  • Separate the pairlist update from the switching function computation
  • Skip computing the switching function if pairwise distance is too large for the given tolerance

Fixes #908

@giacomofiorin giacomofiorin marked this pull request as ready for review January 24, 2026 00:37
@giacomofiorin giacomofiorin changed the title WIP: Simplify and optimize coordination numbers Simplify and optimize coordination numbers Jan 24, 2026
@giacomofiorin giacomofiorin requested review from HanatoK and jhenin and removed request for HanatoK January 24, 2026 00:37
@giacomofiorin
Copy link
Member Author

giacomofiorin commented Jan 24, 2026

Side note: an additional simplification I wanted to add would be to change the h_bond class so that it can handle multiple pairs (i.e. not just a single pair). The alpha variable at that point could just use a single child h_bond object instead of multiple ones (and the same could be done for the angles, at which point it's clearly way out of scope for this PR).

Inlining the functions is required to avoid the performance loss after
refactoring.
Copy link
Member

@HanatoK HanatoK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Although I need to use inline for compute_pair_coordnum to restore the performance after refactoring.

@giacomofiorin
Copy link
Member Author

Looks good to me! Although I need to use inline for compute_pair_coordnum to restore the performance after refactoring.

Thanks! I have been removing inline from template functions because it's generally redundant. But reading a bit more, that generally happens when the template functions are fully specialized: here, we have now four nested levels of template functions calling one another...

Comment on lines +60 to +88
if (function_type() != "selfCoordNum") {

group2 = parse_group(conf, "group2");
if (!group2) {
return error_code | COLVARS_INPUT_ERROR;
}

if (int atom_number = cvm::atom_group::overlap(*group1, *group2)) {
error_code |= cvm::error("Error: group1 and group2 share a common atom (number: " +
cvm::to_str(atom_number) + ")\n",
COLVARS_INPUT_ERROR);
}

if (function_type() == "coordNum") {
get_keyval(conf, "group1CenterOnly", b_group1_center_only, group2->b_dummy);
get_keyval(conf, "group2CenterOnly", b_group2_center_only, group2->b_dummy);
}

if (function_type() == "groupCoord") {
// In groupCoord, these flags are hard-coded
b_group1_center_only = true;
b_group2_center_only = true;
}

size_t const group1_num_coords = b_group1_center_only ? 1 : group1->size();
size_t const group2_num_coords = b_group2_center_only ? 1 : group2->size();

num_pairs = group1_num_coords * group2_num_coords;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using different init() implementations for each derived class of coordnum?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although there are some sub-cases that differ, the initialization of the cutoffs, exponents and pairlist is shared.

int pairlist_freq = 100;

/// Pair list
std::vector<bool> pairlist;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using pairlist, the performance degrades a lot compared to the master branch. I suspect it is because that std::vector<bool> actually use bit fields. @giacomofiorin Have you compared the performance before and after this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, good to know.

My main goal was to clean up the code and condense code paths, so it should be easy to revert to a C-style array (at the price of higher memory footprint)

Copy link
Member

@HanatoK HanatoK Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible to change to vector<char> but it still doesn't solve the issue of performance loss. You could merge the PR as is if you don't think that the performance loss on CPU is a big issue...

Copy link
Member Author

@giacomofiorin giacomofiorin Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need more precise numbers for relevant use cases, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect number of pairs and doc error in selfCoordNum

4 participants