`<complex>`: avoid unnecessary conversion of real arguments to complex #2855
…unnecessary conversion of the real argument to complex.
I also compared the assembler output to the `operator+(const complex<_Ty>&, const double&)` implementation above (on line 1402) - the optimized assembler is exactly the same for both, and for debug the new version is slightly better (one function call instead of two). Would it be possible for you to modify those versions to use the new style of code? (Not something that's necessary, hence the approval 😄)
Will do. I'll also apply it to
This transformation doesn't assume commutativity; operand order is preserved. Like the changes for `complex + T` and `complex * T`, this replaces 2 and 4 function calls with 1.
I've pushed a commit to

Before:

    movups xmm0, XMMWORD PTR [rdx]
    mov rax, rcx
    movups XMMWORD PTR [rcx], xmm0
    divsd xmm0, xmm2
    movsd QWORD PTR [rcx], xmm0
    movsd xmm0, QWORD PTR [rcx+8]
    divsd xmm0, xmm2
    movsd QWORD PTR [rcx+8], xmm0
    ret 0

After:

    movups xmm1, XMMWORD PTR [rdx]
    mov rax, rcx
    movaps xmm0, xmm2
    unpcklpd xmm0, xmm0
    divpd xmm1, xmm0
    movups XMMWORD PTR [rcx], xmm1
    ret 0

Test program:

    #include <complex>
    using namespace std;
    complex<double> meow_sub(const complex<double>& l, double r) {
        return l - r;
    }
    complex<double> meow_div(const complex<double>& l, double r) {
        return l / r;
    }
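For readers who want to see the shape of the `complex - T` and `complex / T` transformation in source form, here is a minimal sketch. The helper names `sub_componentwise` and `div_componentwise` are hypothetical and are not the STL's actual code; they only illustrate building the result in one step from the components, with the operand order preserved.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper (not the STL implementation): complex - real only
// touches the real part; the imaginary part is carried over unchanged.
std::complex<double> sub_componentwise(const std::complex<double>& l, double r) {
    return std::complex<double>(l.real() - r, l.imag());
}

// Hypothetical helper (not the STL implementation): complex / real divides
// both components by the same real divisor, in the same operand order.
std::complex<double> div_componentwise(const std::complex<double>& l, double r) {
    return std::complex<double>(l.real() / r, l.imag() / r);
}

// Small self-check: for ordinary finite inputs the sketches agree with the
// standard operators.
void check_sub_div_sketch() {
    const std::complex<double> z(6.0, 8.0);
    assert(sub_componentwise(z, 2.0) == z - 2.0);
    assert(div_componentwise(z, 2.0) == z / 2.0);
}
```

Dividing both components by one scalar matches what the "After" listing does with a single packed `divpd`.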
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
Thanks for noticing and improving this codegen! 🚀 ✅ 🎉
microsoft#2855) Co-authored-by: Stephan T. Lavavej <stl@nuwen.net>
Mixed real/complex multiplication and addition are commutative (and I don't think there are any observable side effects), so it's unnecessary to strictly follow the Returns clauses in [complex.ops]/2 and [complex.ops]/5. Converting a real parameter on the LHS to `complex` results in a full complex/complex operation that the optimizer is unable to clean up. https://godbolt.org/z/KW3sWE6YY
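The difference between the two approaches for `T * complex` can be sketched as follows. Both helper names are hypothetical, and neither is claimed to be the STL's actual code. The first follows the Returns clause literally by promoting the real operand; the second scales each component directly.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: promote the real LHS to complex, then perform a full
// complex * complex multiplication (four multiplies plus an add and subtract).
std::complex<double> mul_via_conversion(double l, const std::complex<double>& r) {
    return std::complex<double>(l) * r;
}

// Hypothetical helper: scale each component by the real LHS (two multiplies).
std::complex<double> mul_componentwise(double l, const std::complex<double>& r) {
    return std::complex<double>(l * r.real(), l * r.imag());
}

// Small self-check: for ordinary finite inputs both forms give the same value.
void check_mul_sketch() {
    const std::complex<double> z(3.0, 4.0);
    assert(mul_via_conversion(2.0, z) == mul_componentwise(2.0, z));
}
```

The check only covers ordinary finite values. This sketch makes no claim about how the two forms compare for infinities, NaNs, or signed zeros.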
Multiplication codegen
Addition codegen