Skip to content

Portable SIMD project group #29

Closed
Closed

Description

WARNING

The Major Change Process was proposed in RFC 2936 and is not yet in
full operation. This template is meant to show how it could work.

Proposal

Create a project group for considering what portable SIMD in the standard library should look like.

Motivation, use-cases, and solution sketches

While Rust presently exposes ALU features of the underlying ISAs in a portable way, it doesn't expose SIMD capabilities in a portable way except for autovectorization.

A wide variety of computation tasks can be accomplished faster using SIMD than using the ALU capabilities. Relying on autovectorization to go from ALU-oriented source code to SIMD-using object code is not a proper programming model. It is brittle and depends on the programmer being able to guess correctly what the compiler back end will do. Requiring godbolting for every step is not good for programmer productivity.

Using ISA-specific instructions results in ISA-specific code. For things like "perform lane-wise addition these two vectors of 16 u8 lanes" should be a portable operation for the same reason as "add these two u8 scalars" is a portable operation that does not require the programmer to write ISA-specific code.

Typical use cases for SIMD involve text encoding conversion and graphics operations on bitmaps. Firefox already relies of the Rust packed_simd crate for text encoding conversion.

Compiler back ends in general and LLVM in particular provide a notion of portable SIMD where the types are lane-aware and of particular size and the operations are ISA-independent and lower to ISA-specific instructions later. To avoid a massive task of replicating the capabilities of LLVM's optimizer and back ends, it makes sense to leverage this existing capability.

However, to avoid exposing the potentially subject-to-change LLVM intrinsics, it makes sense expose an API that is conceptually close and maps rather directly to the LLVM concepts while making sense for Rust and being stable for Rust applications. This means introducing lane-aware types of typical vector sizes, such as u8x16, i16x8, f32x4, etc., and providing lane-wise operations that are broadly supported by various ISAs on these types. This means basic lane-wise arithmetic and comparisons.

Additionally, it is essential to provide shuffles where what lane goes where is known at compile time. Also, unlike the LLVM layer, it makes sense to provide distinct boolean/mask vector types for the outputs of lanewise comparisons, because encoding the invariant that all bits of a lane are either one or zero allows operations like "are all lanes true" or "is at least one lane true" to be implemented more efficiently especially on x86/x86_64.

When the target doesn't support SIMD, LLVM provides ALU-based emulation, which might not be a performance win compared to manual ALU code, but at least keeps the code portable.

When the target does support SIMD, the portable types must be zero-cost transmutable to the types that vendor intrinsics accept, so that specific things can be optimized with ISA-specific alternative code paths.

The packed_simd crate provides an implementation that already works across a wide variety of Rust targets and that has already been developed with the intent that it could become std::simd. It makes sense not to start from scratch but to start from there.

The code needs to go in the standard library if it is assumed that rustc won't, on stable Rust, expose the kind of compiler internals that packed_simd depends on.

Please see the FAQ.

Links and related work

The Major Change Process

Once this MCP is filed, a Zulip topic will be opened for discussion. Ultimately, one of the following things can happen:

  • If this is a small change, and the team is in favor, it may be approved to be implemented directly, without the need for an RFC.
  • If this is a larger change, then someone from the team may opt to work with you and form a project group to work on an RFC (and ultimately see the work through to implementation).
  • Alternatively, it may be that the issue gets closed without being accepted. This could happen because:
    • There is no bandwidth available to take on this project right now.
    • The project is not a good fit for the current priorities.
    • The motivation doesn't seem strong enough to justify the change.

You can read [more about the lang-team MCP process on forge].

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions