Skip to content

Conversation

gojakuch
Copy link
Collaborator

@gojakuch gojakuch commented Aug 6, 2025

This PR continues on the work done in PIPECG by making it fulfill its initial purpose of reducing the number of dot products in the algorithm by merging them into a singe dot operation.

@gojakuch gojakuch added the 1:ST:WIP This PR is a work in progress. Not ready for review. label Aug 6, 2025
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:reference This is related to the reference module. type:solver This is related to the solvers mod:hip This is related to the HIP module. labels Aug 6, 2025
@pratikvn pratikvn self-requested a review August 7, 2025 07:53
Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

part 1/2. Still need to look at the algorithm itself.

@gojakuch gojakuch force-pushed the feat/pipe-cg-dotmerge branch 2 times, most recently from f24265c to 23a35a3 Compare August 9, 2025 19:23
@gojakuch
Copy link
Collaborator Author

gojakuch commented Aug 9, 2025

Addressed the suggestions yesterday, rebased now.
It fails the test comparing the reference implementation to omp on my machine, which is very odd, as they seem to do the same thing. the reference implementation is now totally correct though, it passes the tests. somehow smth is wrong with step_1, still debugging it

@gojakuch gojakuch force-pushed the feat/pipe-cg-dotmerge branch from 23a35a3 to 4c7a3d5 Compare August 15, 2025 18:34
@gojakuch gojakuch requested a review from pratikvn August 15, 2025 18:36
@gojakuch gojakuch added 1:ST:ready-for-review This PR is ready for review 1:ST:run-full-test and removed 1:ST:WIP This PR is a work in progress. Not ready for review. labels Aug 18, 2025
@gojakuch gojakuch changed the title Merge the dot products in PIPECG Merge dot products in PIPECG Aug 18, 2025
@gojakuch gojakuch force-pushed the feat/pipe-cg-dotmerge branch from b3aee64 to c2be7cb Compare August 30, 2025 20:10
Copy link

codecov bot commented Sep 2, 2025

Codecov Report

❌ Patch coverage is 95.87629% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.89%. Comparing base (36688a5) to head (7c24aea).
⚠️ Report is 4 commits behind head on develop.

Files with missing lines Patch % Lines
core/solver/pipe_cg.cpp 94.00% 3 Missing ⚠️
test/solver/solver.cpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1908      +/-   ##
===========================================
- Coverage    89.10%   88.89%   -0.22%     
===========================================
  Files          857      857              
  Lines        71806    71865      +59     
===========================================
- Hits         63982    63883      -99     
- Misses        7824     7982     +158     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general.

  • use submatrix creation
    Performance concern:
  • z1 z2 can be one vector after we have conjugate spmv
  • some access will be strided access


LocalVector* rw = this->template create_workspace_op<LocalVector>(
GKO_SOLVER_TRAITS::rw, conjoined_size);
auto r_unique = LocalVector::create(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use create_submatrix for right setup.
second one create array view include the non-existed memory

rw->get_values()),
b_stride * 2);
auto* r = r_unique.get();
auto w_unique = LocalVector::create(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here and following

beta, gko::detail::get_local(p), gko::detail::get_local(q),
gko::detail::get_local(f), gko::detail::get_local(g),
gko::detail::get_local(z), gko::detail::get_local(w),
gko::detail::get_local(z1), gko::detail::get_local(w),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note z1 is strided access which might lower your performance

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think there's a way to avoid this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently no under this storage. You need change z storage and it will also require change the other to fit the need

yhmtsai
yhmtsai previously requested changes Sep 3, 2025
Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! LGTM!

Copy link

@gojakuch gojakuch force-pushed the feat/pipe-cg-dotmerge branch from 7c24aea to 781631d Compare September 23, 2025 22:24
@gojakuch
Copy link
Collaborator Author

Nice work! LGTM!

@pratikvn thanks and also thanks for helping me! I rebased the PR

@yhmtsai
Copy link
Member

yhmtsai commented Sep 24, 2025

Is it ready to review? and do you have performance comparison?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:ready-for-review This PR is ready for review 1:ST:run-full-test mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. mod:reference This is related to the reference module. reg:testing This is related to testing. type:solver This is related to the solvers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants