std.mem.indexOfSentinel() vectorized path violates pointer aliasing rules #23184

Open
@alexrp

Description

zig/lib/std/mem.zig

Lines 1097 to 1142 in 8e0a4ca

    // The below branch assumes that reading past the end of the buffer is valid, as long
    // as we don't read into a new page. This should be the case for most architectures
    // which use paged memory, however should be confirmed before adding a new arch below.
    .aarch64, .x86, .x86_64 => if (std.simd.suggestVectorLength(T)) |block_len| {
        const page_size = std.heap.page_size_min;
        const block_size = @sizeOf(T) * block_len;
        const Block = @Vector(block_len, T);
        const mask: Block = @splat(sentinel);

        comptime assert(std.heap.page_size_min % @sizeOf(Block) == 0);
        assert(page_size % @sizeOf(Block) == 0);

        // First block may be unaligned
        const start_addr = @intFromPtr(&p[i]);
        const offset_in_page = start_addr & (page_size - 1);
        if (offset_in_page <= page_size - @sizeOf(Block)) {
            // Will not read past the end of a page, full block.
            const block: Block = p[i..][0..block_len].*;
            const matches = block == mask;
            if (@reduce(.Or, matches)) {
                return i + std.simd.firstTrue(matches).?;
            }

            i += @divExact(std.mem.alignForward(usize, start_addr, block_size) - start_addr, @sizeOf(T));
        } else {
            @branchHint(.unlikely);
            // Would read over a page boundary. Per-byte at a time until aligned or found.
            // 0.39% chance this branch is taken for 4K pages at 16b block length.
            //
            // An alternate strategy is to do read a full block (the last in the page) and
            // mask the entries before the pointer.
            while ((@intFromPtr(&p[i]) & (block_size - 1)) != 0) : (i += 1) {
                if (p[i] == sentinel) return i;
            }
        }

        assert(std.mem.isAligned(@intFromPtr(&p[i]), block_size));
        while (true) {
            const block: *const Block = @ptrCast(@alignCast(p[i..][0..block_len]));
            const matches = block.* == mask;
            if (@reduce(.Or, matches)) {
                return i + std.simd.firstTrue(matches).?;
            }
            i += block_len;
        }
    },

I think this is pretty clearly a violation of LLVM's pointer aliasing rules. LLVM could break this code whenever and would be fully justified in doing so.
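
For contrast, here is a minimal sketch of a scalar scan that never loads an element beyond the sentinel, so it cannot touch memory outside the object the pointer was derived from. The name `indexOfSentinelScalar` is hypothetical and chosen for illustration. This is not a proposed patch, and it assumes the concern above is the full-block loads that can extend past the end of the buffer, as the quoted comment itself describes.

```zig
const std = @import("std");

/// Hypothetical helper for illustration: returns the index of the first
/// `sentinel` in `p`, reading one element at a time.
fn indexOfSentinelScalar(
    comptime T: type,
    comptime sentinel: T,
    p: [*:sentinel]const T,
) usize {
    var i: usize = 0;
    // Each load is of a single element at or before the sentinel, so no
    // read goes past the end of the sentinel-terminated buffer.
    while (p[i] != sentinel) : (i += 1) {}
    return i;
}
```

The trade-off is throughput: this gives up the block-wide compare that the vectorized branch uses, in exchange for staying within the bounds of the underlying object.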

Labels: bug, standard library