x86(64) runtime performance irregularities #31503

Closed
@MagaTailor

The mandel-rust benchmark produces the following results:

https://gist.github.com/petevine/b70b6e5a434f23b40ab5

TL;DR
32-bit code performance looks like this:
P2(3) > Core2 > P4 (x86_64 too)

The P2(3) machines are the only ones that scale on 2 cores in all benchmarks.

Either this is a sign of an LLVM bug, or I was more right than I'd ever suspected about P4 codegen producing suboptimal code. (x86_64 is affected too, so the cause could be something else.)

Naturally, the common theme could be the use of SSE2, which is absent from the fastest code:

Configuration: re1: -2.00, re2: 1.00, img1: -1.50, img2: 1.50, max_iter: 2048, img_size: 1024, num_threads: 2
Time taken for this run (serial): 2469.21302 ms
Time taken for this run (scoped_thread_pool): 1248.45883 ms
Time taken for this run (simple_parallel): 1284.73761 ms
Time taken for this run (rayon_join): 1246.36625 ms
Time taken for this run (rayon_par_iter): 1337.93075 ms
Time taken for this run (rust_scoped_pool): 1240.33273 ms
Time taken for this run (job_steal): 1241.20777 ms
Time taken for this run (job_steal_join): 1246.34885 ms
Time taken for this run (kirk_crossbeam): 1244.10723 ms
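
To put a number on "scales on 2 cores", the timings above can be turned into a speedup (serial time divided by parallel time) and a parallel efficiency (speedup divided by thread count). The following is only a minimal sketch: the helper names `speedup` and `efficiency` are made up for illustration and are not part of mandel-rust, and the timings are copied by hand from the run quoted above.

```rust
/// Speedup of a parallel run relative to the serial run (hypothetical helper).
fn speedup(serial_ms: f64, parallel_ms: f64) -> f64 {
    serial_ms / parallel_ms
}

/// Parallel efficiency: speedup divided by thread count; 1.0 is ideal scaling.
fn efficiency(serial_ms: f64, parallel_ms: f64, threads: u32) -> f64 {
    speedup(serial_ms, parallel_ms) / f64::from(threads)
}

fn main() {
    // Values copied from the benchmark output quoted above (num_threads: 2).
    let serial_ms = 2469.21302;
    let threads = 2;
    let runs = [
        ("scoped_thread_pool", 1248.45883),
        ("simple_parallel", 1284.73761),
        ("rayon_join", 1246.36625),
        ("rayon_par_iter", 1337.93075),
        ("rust_scoped_pool", 1240.33273),
        ("job_steal", 1241.20777),
        ("job_steal_join", 1246.34885),
        ("kirk_crossbeam", 1244.10723),
    ];

    for &(name, ms) in runs.iter() {
        println!(
            "{:<20} speedup {:.3}x, efficiency {:.1}%",
            name,
            speedup(serial_ms, ms),
            efficiency(serial_ms, ms, threads) * 100.0
        );
    }
}
```

For this run every method lands between roughly 1.85x and 1.99x, which is close to the ideal 2x for two threads. Running the same calculation on the Core2 and P4 results in the gist would make the scaling gap directly comparable.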

Labels: I-slow (Issue: Problems and improvements with respect to performance of generated code.)