[Web] Add flag to enable SIMD instructions in WASM #1376
Conversation
(With my skeptical hat on.) Do you have some numbers to back this up? I'm interested in how much this helps in a codebase of this size.
@mosra Probably not on Magnum, but without this flag WASM doesn't even enable SSE2 instructions for Bullet, so it should give a speedup there. Regardless, point taken; I'm retracting my claim slightly.
It's not as simple as "enabling SSE2" since WASM has to work on ARM as well -- and that's why I'm skeptical, because different platforms have different instructions, and what could map directly to an SSE instruction might have to be emulated on NEON and vice versa. But in any case I really want to know how this helps, seriously :) Did you try it out? I know from certain projects that hand-coded WASM SIMD can be four to six times faster than scalar code, but I have no idea about autovectorization, especially when combined with everything else we're running here. Is it 1%? 10%? 2x faster?
LGTM, thanks for doing this! I'm also curious to evaluate the speedup. @ldcWV It would be cool to try this with your upcoming JS physics benchmark.
@@ -4,12 +4,14 @@
set -e

BULLET=false
USE_SIMD=false
Some documentation would be nice; even just linking back to this PR.
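As a rough illustration of what a documented flag could look like, here is a minimal sketch, not the script from this PR. The `parse_flags` and `simd_cflags` helpers and the `--bullet` option name are hypothetical; `-msimd128` is the Clang/Emscripten option that enables WebAssembly SIMD code generation, and whether the actual build forwards it this way is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of flag handling; helper names are illustrative only.

BULLET=false
# --simd: build with WebAssembly SIMD enabled. The resulting module only
# loads in browsers that support WASM SIMD.
USE_SIMD=false

# Parse command-line options into the variables above.
parse_flags() {
  for arg in "$@"; do
    case "$arg" in
      --bullet) BULLET=true ;;   # assumed option name
      --simd)   USE_SIMD=true ;;
      *) echo "unknown option: $arg" >&2; return 1 ;;
    esac
  done
}

# Print the extra compiler flag needed for SIMD, or nothing if disabled.
simd_cflags() {
  if [ "$USE_SIMD" = true ]; then
    echo "-msimd128"
  fi
}
```

A caller would run `parse_flags "$@"` and then append `$(simd_cflags)` to whatever compiler flags the build already passes along.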
I tried this on my WebXR hand demo benchmark, which drops a lot of objects in a big pile and tries to step physics 60 times per second (one stepWorld() call every 16.67 ms). I repeated the benchmark 3 times with and without the --simd flag, and here were the results: without --simd: with --simd: These numbers are the average ms between stepWorld() calls. Note that the benchmark is trying to achieve 16.67 ms per stepWorld() call but cannot keep up. So it doesn't seem like this SIMD optimization has helped much in this case.
Motivation and Context
quite significantly. The browser must support SIMD in WASM, like modern versions of Chrome, though.
How Has This Been Tested
Types of changes
Checklist