Add native PauliRot implementation in LightningKokkos [sc-71642] #855
Co-authored-by: Luis Alfredo Nuñez Meneses <alfredo.nunez@xanadu.ai>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #855      +/-   ##
==========================================
+ Coverage   96.24%   97.29%   +1.04%
==========================================
  Files         212      168      -44
  Lines       28109    21118    -6991
==========================================
- Hits        27054    20547    -6507
+ Misses       1055      571     -484
Great job! I'm ready to approve. But first, could you please double-check the Codecov coverage complaints?
Thank you for checking!
Nice job. Thank you @vincentmr. Only a couple of comments
pennylane_lightning/core/src/simulators/lightning_kokkos/gates/BasicGateFunctors.hpp
Good performance achievement. Thank you @vincentmr 🚀
Before submitting
Please complete the following checklist when submitting a PR:
- All new features must include a unit test. If you've fixed a bug or added code that should be tested, add a test to the `tests` directory!
- All new functions and code must be clearly commented and documented. If you do make documentation changes, make sure that the docs build and render correctly by running `make docs`.
- Ensure that the test suite passes, by running `make test`.
- Add a new entry to the `.github/CHANGELOG.md` file, summarizing the change, and including a link back to the PR.
- Ensure that code is properly formatted by running `make format`.
When all the above are checked, delete everything above the dashed line and fill in the pull request template.
Context:
Pauli rotations come up in many places, and importantly in the time evolution of qchem Hamiltonians. It is therefore worth considering ways to accelerate their execution.
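As a rough illustration of what a native Pauli-rotation kernel has to compute, the sketch below applies exp(-i θ/2 P) = cos(θ/2) I - i sin(θ/2) P to a state vector in a single pass, using bit masks derived from the Pauli word. It is not the LightningKokkos implementation: the names `pauli_masks` and `apply_pauli_rot` are hypothetical, and the convention that wire 0 maps to the most significant bit is an assumption.

```python
import math


def pauli_masks(word):
    """Return (x_mask, z_mask, n_y) for a Pauli word such as "XYZ".

    Assumption: character k acts on wire k, and wire 0 is the most
    significant bit of the basis-state index.
    """
    n = len(word)
    x_mask = z_mask = n_y = 0
    for k, p in enumerate(word):
        bit = 1 << (n - 1 - k)
        if p in ("X", "Y"):  # X and Y flip the bit
            x_mask |= bit
        if p in ("Y", "Z"):  # Y and Z contribute a sign that depends on the bit
            z_mask |= bit
        if p == "Y":         # each Y contributes an extra factor of i
            n_y += 1
    return x_mask, z_mask, n_y


def apply_pauli_rot(state, theta, word):
    """Return exp(-i * theta / 2 * P) applied to `state` (a list of complex)."""
    x_mask, z_mask, n_y = pauli_masks(word)
    c = math.cos(theta / 2)
    s = math.sin(theta / 2)
    y_phase = (1, 1j, -1, -1j)[n_y % 4]  # i ** n_y, computed exactly
    out = [0j] * len(state)
    for j in range(len(state)):
        b = j ^ x_mask  # the only basis state that P maps onto j
        sign = -1 if bin(b & z_mask).count("1") % 2 else 1
        out[j] = c * state[j] - 1j * s * y_phase * sign * state[b]
    return out


# Usage: rotate |000> by 0.3 rad about the Pauli word "XYZ".
psi = [1 + 0j] + [0j] * 7
rotated = apply_pauli_rot(psi, 0.3, "XYZ")
```

Each output amplitude reads exactly two input amplitudes, so the whole rotation costs one sweep over the state vector regardless of how many qubits the Pauli word touches.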
Description of the Change:
Implement `applyPauliRot`. Invoke `applyPauliRot` directly from the SV class and add bindings to the Python layer.
Benefits:
Faster Pauli rotations. I performed a benchmark on random `PauliRotation`s (runtime > 1.0 sec and at least 5 of them) through the Python layer. The data remains noisy with 5 samples because the performance varies depending on the specific "XYZ" sequence (which translates into more or less predictable memory access patterns). Overall, we see an advantage for 3 or more qubits.
I performed the same benchmark on an A100 card with the Kokkos-CUDA backend, but using at least 500 samples since the absolute timings are quite small, and got the following speed-ups.
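To make the dependence on the specific "XYZ" sequence more concrete, the sketch below lists which amplitude indices a Pauli word couples together. This is an illustration under assumptions, not a description of the actual Kokkos functor: `x_mask` and `amplitude_pairs` are hypothetical helpers, wire 0 is assumed to be the most significant bit, and the real kernel may iterate over memory in a different order.

```python
def x_mask(word):
    """Bit mask of the wires on which the Pauli word flips the bit (X or Y)."""
    n = len(word)
    return sum(1 << (n - 1 - k) for k, p in enumerate(word) if p in ("X", "Y"))


def amplitude_pairs(word):
    """Return the (j, partner) index pairs coupled by the Pauli word.

    Each pair is listed once; a purely diagonal word (only I and Z)
    pairs every index with itself.
    """
    mask = x_mask(word)
    return [(j, j ^ mask) for j in range(2 ** len(word)) if j <= (j ^ mask)]


for word in ("IIX", "XII", "XYZ", "ZZZ"):
    print(word, amplitude_pairs(word))
# IIX [(0, 1), (2, 3), (4, 5), (6, 7)]
# XII [(0, 4), (1, 5), (2, 6), (3, 7)]
# XYZ [(0, 6), (1, 7), (2, 4), (3, 5)]
# ZZZ [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)]
```

Words that flip only low-order bits pair neighbouring amplitudes, words that flip high-order bits pair amplitudes far apart in memory, and diagonal words need no pairing at all, which is one plausible source of the run-to-run variation between random Pauli words.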
Using a full workflow such as
to benchmark, we obtain timings as follows
For large enough molecules (>= 20 qubits, >= 1000 terms), the new PauliRot kernels have a clear advantage, which only grows with molecular size. It is worth noting that with L-Kokkos-CUDA, even at the (24/10k) scale, evaluating the circuit is not the main bottleneck, which is why it takes about the same time simulating HCN (2.64 sec. `apply_lightning` vs 32.5 sec. `QNode`) and N2N2 (7.51 sec. `apply_lightning` vs 36.4 sec. `QNode`).
Possible Drawbacks:
Related GitHub Issues:
[sc-69801]