Improve Fluxion match performance #1007

milroy · 2023-02-08T08:24:29Z

As reported in issue #1001 flux mini submit experiences pronounced slowdown for most of the Fluxion match policies except first. With the debugging help of @grondo and @trws we traced one source to an un-inlined accumulate operation.
Adding the inline hint to the function allows the compiler to properly inline it and performance improves by a factor of >100 for all policies except lonodex and hinodex (or, e.g., lonode with --exclusive -N1.
Another hotspot we encountered in debugging is the accumulate operation in calc_multilevel_scores:

flux-sched/resource/policies/dfu_match_multilevel_id_impl.hpp

Line 71 in f7bdcb5

return std::accumulate (m_multilevel_scores.begin (),

This O(n) accumulate operation is called for each finish_vtx, and even worse, currently scales linearly with the number of matches. This is because the accumulated vector isn't cleared after each traversal. We can replace the accumulate operation by a single int64_t which stores the prefix value.

While these changes improve performance for all policies, lonodex and hinodex are still slow. Future PRs will further improve performance by optimizing graph filtration and lookups on edge iteration.

jameshcorbett

LGTM

Edit: sorry I didn't see all the slack conversation still ongoing, so review is a bit premature.

milroy · 2023-02-16T02:20:35Z

@grondo and @garlick : @trws, @tpatki, @JaeseungYeom, and Zeke discussed perf traces and further performance improvements today.

We decided that those improvements will take longer to implement and will create further PRs. The contributions here will speed up most policies and as such can be reviewed.

grondo

This LGTM and already tested without issue on hetchy.

FYI - I pulled these 2 patches into our latest RPM build for TOSS 4 so we'll get these into the next release.

Just noticed a formatting error in the 2nd commit message, but otherwise this should get merged, IMO.

Perhaps we should also reword the PR title before merging since the title will be included in release notes (since this doesn't fully fix the perf degradation)

resource/policies/dfu_match_multilevel_id.hpp

milroy · 2023-02-16T07:48:11Z

Perhaps we should also reword the PR title before merging since the title will be included in release notes (since this doesn't fully fix the perf degradation)

I'll rename to "Improve Fluxion match performance" since the changes improve performance for every policy I tried. Of course, the nodex policies are still slow and need to be improved further.

Problem: `perf` profiling together with a test replacement of `plus` with an in-place addition reveals that the `plus` function is not inlined by the compiler. Add inline keyword to coerce the compiler to inline the function and others in the namespace. The change improves match performance across a run of the issue reproducer by a factor of >100 for several policies.

Problem: the current design of the multilevel ID match policy makes use of a vector of factors (`m_multilevel_factors`) to keep track of all the added factors and accumulates them to generate a prefix for the overall vertex score. The O(n) operation is repeated for each vertex. Furthermore, the vector is not cleared between match traversals, so the operation becomes increasingly expensive. Replace the vector with a single `int64_t` which contains the sum of all vertex factors. Instead of popping the current vertex's factor from a vector, simply subtract its factor stored in `score_factor_t`. The change accelerates exclusive matching by a factor of about 6.

codecov · 2023-02-16T08:16:27Z

Codecov Report

Merging #1007 (7aeee01) into master (f7bdcb5) will decrease coverage by 0.1%.
The diff coverage is 90.9%.

❗ Current head 7aeee01 differs from pull request most recent head b21c605. Consider uploading reports for the commit b21c605 to get more accurate results

@@           Coverage Diff            @@
##           master   #1007     +/-   ##
========================================
- Coverage    74.4%   74.4%   -0.1%     
========================================
  Files          86      86             
  Lines        9429    9434      +5     
========================================
+ Hits         7019    7021      +2     
- Misses       2410    2413      +3

Impacted Files	Coverage Δ
resource/evaluators/fold.hpp	`14.2% <0.0%> (ø)`
resource/policies/dfu_match_multilevel_id.hpp	`66.6% <ø> (ø)`
resource/policies/dfu_match_multilevel_id_impl.hpp	`82.1% <100.0%> (-2.7%)`	⬇️

grondo · 2023-02-16T14:37:51Z

Thanks! Feel free to set MWP when you are ready.

milroy force-pushed the perf-fix branch from 4ff8507 to 9371810 Compare February 8, 2023 08:26

jameshcorbett approved these changes Feb 9, 2023

View reviewed changes

milroy force-pushed the perf-fix branch from 93241e6 to 7a4f59e Compare February 15, 2023 03:00

milroy changed the title ~~WIP: Fix Fluxion match performance degradation~~ Fix Fluxion match performance degradation Feb 15, 2023

milroy force-pushed the perf-fix branch 3 times, most recently from 5b06eeb to 5177b81 Compare February 15, 2023 03:09

milroy requested a review from jameshcorbett February 16, 2023 02:14

milroy requested a review from grondo February 16, 2023 02:21

grondo approved these changes Feb 16, 2023

View reviewed changes

resource/policies/dfu_match_multilevel_id.hpp Show resolved Hide resolved

milroy changed the title ~~Fix Fluxion match performance degradation~~ Improve Fluxion match performance Feb 16, 2023

milroy force-pushed the perf-fix branch from 5177b81 to 2f8f723 Compare February 16, 2023 07:55

milroy added 2 commits February 15, 2023 23:58

milroy force-pushed the perf-fix branch from 2f8f723 to b21c605 Compare February 16, 2023 07:59

milroy added the merge-when-passing mergify.io - merge PR automatically once CI passes label Feb 16, 2023

mergify bot merged commit 35d3c96 into flux-framework:master Feb 16, 2023

milroy deleted the perf-fix branch February 16, 2023 18:12

milroy mentioned this pull request Feb 23, 2023

Job submission slows down on Hetchy #1001

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Fluxion match performance #1007

Improve Fluxion match performance #1007

milroy commented Feb 8, 2023 •

edited

Loading

jameshcorbett left a comment •

edited

Loading

milroy commented Feb 16, 2023

grondo left a comment

milroy commented Feb 16, 2023

codecov bot commented Feb 16, 2023

grondo commented Feb 16, 2023

Improve Fluxion match performance #1007

Improve Fluxion match performance #1007

Conversation

milroy commented Feb 8, 2023 • edited Loading

jameshcorbett left a comment • edited Loading

Choose a reason for hiding this comment

milroy commented Feb 16, 2023

grondo left a comment

Choose a reason for hiding this comment

milroy commented Feb 16, 2023

codecov bot commented Feb 16, 2023

Codecov Report

grondo commented Feb 16, 2023

milroy commented Feb 8, 2023 •

edited

Loading

jameshcorbett left a comment •

edited

Loading