-
Notifications
You must be signed in to change notification settings - Fork 99
Merge dot products in PIPECG #1908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
part 1/2. Still need to look at the algorithm itself.
f24265c
to
23a35a3
Compare
Addressed the suggestions yesterday, rebased now. |
23a35a3
to
4c7a3d5
Compare
b3aee64
to
c2be7cb
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1908 +/- ##
===========================================
- Coverage 89.10% 88.89% -0.22%
===========================================
Files 857 857
Lines 71806 71865 +59
===========================================
- Hits 63982 63883 -99
- Misses 7824 7982 +158 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general.
- use submatrix creation
Performance concern: - z1 z2 can be one vector after we have conjugate spmv
- some access will be strided access
core/solver/pipe_cg.cpp
Outdated
|
||
LocalVector* rw = this->template create_workspace_op<LocalVector>( | ||
GKO_SOLVER_TRAITS::rw, conjoined_size); | ||
auto r_unique = LocalVector::create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use create_submatrix for right setup.
second one create array view include the non-existed memory
core/solver/pipe_cg.cpp
Outdated
rw->get_values()), | ||
b_stride * 2); | ||
auto* r = r_unique.get(); | ||
auto w_unique = LocalVector::create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here and following
beta, gko::detail::get_local(p), gko::detail::get_local(q), | ||
gko::detail::get_local(f), gko::detail::get_local(g), | ||
gko::detail::get_local(z), gko::detail::get_local(w), | ||
gko::detail::get_local(z1), gko::detail::get_local(w), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note z1 is strided access which might lower your performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think there's a way to avoid this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently no under this storage. You need change z storage and it will also require change the other to fit the need
ab40d16
to
995969d
Compare
6a73634
to
66f4ebe
Compare
66f4ebe
to
1da0064
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! LGTM!
|
7c24aea
to
781631d
Compare
@pratikvn thanks and also thanks for helping me! I rebased the PR |
Is it ready to review? and do you have performance comparison? |
This PR continues on the work done in PIPECG by making it fulfill its initial purpose of reducing the number of dot products in the algorithm by merging them into a singe dot operation.