Collect<Vec<u16>> from range doesn't optimize well. #43124

Closed
@oyvindln

(At least on x86-64 nightly)

Using code like this:
https://is.gd/nkoecB

With u16, the version using collect is significantly slower (roughly 15x) than creating a Vec of zeros and setting the values manually:

test using_collect ... bench:     117,777 ns/iter (+/- 6,424)
test using_manual  ... bench:       7,677 ns/iter (+/- 365)
test using_unsafe  ... bench:       3,866 ns/iter (+/- 394)
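
The linked code is not reproduced in this text, so the following is only a sketch of what the three benchmarked variants might look like. The function bodies, the `Elem` alias, and the `LEN` constant are assumptions made for illustration, not the original benchmark; only the three function names come from the bench output above.

```rust
// Element type under test. Changing this alias to u32 or u64 gives the
// other two cases discussed in this issue.
type Elem = u16;

// Hypothetical length (the original size is not stated); must fit in Elem.
const LEN: usize = 10_000;

// Variant 1: collect a range directly into a Vec.
fn using_collect() -> Vec<Elem> {
    (0..LEN as Elem).collect()
}

// Variant 2: allocate a zero-filled Vec, then overwrite each slot.
fn using_manual() -> Vec<Elem> {
    let mut v = vec![0 as Elem; LEN];
    for (i, slot) in v.iter_mut().enumerate() {
        *slot = i as Elem;
    }
    v
}

// Variant 3: reserve capacity, write through a raw pointer, then set the length.
fn using_unsafe() -> Vec<Elem> {
    let mut v: Vec<Elem> = Vec::with_capacity(LEN);
    let ptr = v.as_mut_ptr();
    for i in 0..LEN {
        // SAFETY: i < LEN <= capacity, so the write stays inside the allocation.
        unsafe { ptr.add(i).write(i as Elem) };
    }
    // SAFETY: all LEN elements were initialized by the loop above.
    unsafe { v.set_len(LEN) };
    v
}

fn main() {
    // All three variants should produce the same contents.
    let a = using_collect();
    assert_eq!(a, using_manual());
    assert_eq!(a, using_unsafe());
    println!("{} elements", a.len());
}
```

To reproduce timings like the ones above, each function would be wrapped in a benchmark harness; that part is left out here because the harness used in the original is not shown.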

On the other hand, with u32 and otherwise identical code, collect performs much better:

test using_collect ... bench:       7,677 ns/iter (+/- 555)
test using_manual  ... bench:      12,487 ns/iter (+/- 836)
test using_unsafe  ... bench:       7,741 ns/iter (+/- 413)

Same with u64:

test using_collect ... bench:      18,675 ns/iter (+/- 1,335)
test using_manual  ... bench:      29,692 ns/iter (+/- 1,864)
test using_unsafe  ... bench:      18,559 ns/iter (+/- 1,065)

I suspect this may be SIMD-related. Will see if there are similar results on stable.

Labels: C-enhancement (Category: An issue proposing an enhancement or a PR with one.), I-slow (Issue: Problems and improvements with respect to performance of generated code.)