Ref 314 #454
9d52380 to c7980e9
@@ -176,6 +176,45 @@ namespace xsimd
            return _mm512_sub_epi32(lhs, rhs);
        }

        static batch_type sadd(const batch_type& lhs, const batch_type& rhs)
I'll benchmark that approach compared to the one based on a comparison + a blend.
Actually, found a nice and efficient solution based on min/max for the unsigned version \o/
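For readers following along, here is a minimal scalar sketch of the two approaches being compared for an unsigned saturated add. The helper names `sadd_compare_select` and `sadd_min` are hypothetical and are not xsimd's API; the batch versions in the PR would apply the same idea lane by lane with the corresponding intrinsics, which is an assumption about the implementation rather than a copy of it.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: comparison + select (the scalar analogue of compare + blend).
template <class T>
T sadd_compare_select(T lhs, T rhs)
{
    static_assert(std::is_unsigned<T>::value, "unsigned types only");
    // The cast makes the sum wrap modulo 2^N even for types narrower than int.
    const T sum = static_cast<T>(lhs + rhs);
    // A wrapped sum is always smaller than either operand.
    return sum < lhs ? std::numeric_limits<T>::max() : sum;
}

// Hypothetical helper: min-based variant, no explicit comparison.
template <class T>
T sadd_min(T lhs, T rhs)
{
    static_assert(std::is_unsigned<T>::value, "unsigned types only");
    // Largest value that can still be added to lhs without overflowing.
    const T room = static_cast<T>(std::numeric_limits<T>::max() - lhs);
    // Clamping rhs to that room means the addition can never wrap.
    return static_cast<T>(lhs + std::min(room, rhs));
}
```

Both helpers should agree on every input; for instance `sadd_min<std::uint8_t>(200, 100)` clamps to 255 while `sadd_min<std::uint8_t>(100, 100)` gives 200.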
7df8a5e to b02d901
@JohanMabille ready for review. I did some benchmarking of the generic adds method, applying it to |
4ff5cb2 to e613094
It's good to have xsimd_scalar as standalone as possible.
* int8, uint8, int16, uint16, int32, uint32, int64, uint64, float, double
* sse2/sse4
* avx/avx2
* avx512
* fallback
* neon
e613094 to c57cede
@JohanMabille ready for another round of review :-)
c57cede to 44c0366
Faster and simpler saturated add / sub for unsigned types when the builtin doesn't exist. It uses a min/max instead of an explicit comparison, since these instructions have a latency of 1 on SSE, AVX2 and AVX512. Also add a doc entry.
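The subtraction side of this commit message can be illustrated with a small scalar sketch. The names `ssub_min` and `ssub_max` are hypothetical, and which of the two forms the PR actually uses is not stated here, so treat this as an illustration of the min/max idea under that assumption rather than the committed code.

```cpp
#include <algorithm>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: subtract at most lhs, so the result bottoms out at 0.
template <class T>
T ssub_min(T lhs, T rhs)
{
    static_assert(std::is_unsigned<T>::value, "unsigned types only");
    return static_cast<T>(lhs - std::min(lhs, rhs));
}

// Hypothetical helper: equivalent max-based form.
// If rhs >= lhs, max(lhs, rhs) is rhs and the difference is 0.
template <class T>
T ssub_max(T lhs, T rhs)
{
    static_assert(std::is_unsigned<T>::value, "unsigned types only");
    return static_cast<T>(std::max(lhs, rhs) - rhs);
}
```

Neither form needs a comparison mask or a blend, which is the point made above: a single min (or max) followed by a plain subtraction is enough for unsigned lanes.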
44c0366 to 30ff9c4
@JohanMabille green and cleaned-up o/
* Distributed under the terms of the BSD 3-Clause License.                 *
*                                                                          *
* The full license is in the file LICENSE, distributed with this software. *
****************************************************************************/
I think this file should live in the types subfolder instead of math, to guarantee the non-cyclic dependency math -> types.
This file is only included from the math folder, so I think it's the right place.
@@ -16,6 +16,8 @@
#include <cmath>
#include <utility>

#include "xsimd/math/xsimd_scalar.hpp"
@serge-sans-paille I got mixed up with this include, sorry.
Awesome!
This is a recommit of #314 rebased with some code cleanup and commit split, and hopefully a few bug fixes to come