Still another array access performance issue (in debug) #15685

Open
@IntegratedQuantum

Zig Version

0.11.0-dev.3105+e46d7a369

Steps to Reproduce and Observed Behavior

This is a follow-up to #13938, which was only fixed in release mode (so I still have to rely on the workaround to run my code in debug mode). The reproduction code is basically identical:

const std = @import("std");

const len = 32*32*32;

fn getIndex(i: u16) u16 {
	return i;
}

pub const Chunk = struct {
	blocks: [len]u16 = undefined,
};

pub noinline fn regenerateMainMesh(chunk: *Chunk) u32 {
	var sum: u32 = 0;
	var i: u16 = 0;
	while(i < len) : (i += 1) {
		sum += chunk.blocks[getIndex(i)]; // ← workaround: (&chunk.blocks)[...]
	}
	return sum;
}

pub fn main() void {
	var chunk: Chunk = Chunk{};
	for(&chunk.blocks, 0..) |*block, i| {
		block.* = @intCast(i);
	}
	const start = std.time.nanoTimestamp();
	const sum = regenerateMainMesh(&chunk);
	const end = std.time.nanoTimestamp();
	std.log.err("Time: {} Sum: {}", .{end - start, sum});
}

In debug mode, this takes 93 ms to iterate over a 32768-element array:

$ zig run test.zig
error: Time: 93325090 Sum: 536854528

Running this in a profiler, I can see that the bottleneck is still a memcpy that is called once per iteration and appears to copy the entire array:
Screenshot at 2023-05-13 10-14-00
Screenshot at 2023-05-13 10-08-58
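
As a rough sanity check on the memcpy reading of the profile, the arithmetic can be sketched as below. It assumes exactly one full copy of `blocks` per loop iteration, which the profile suggests but does not prove. The helper names `bytesPerCopy` and `totalBytesCopied` are made up for this illustration, and the elapsed time is the debug measurement quoted above.

```zig
const std = @import("std");

// Hypothetical helper: size in bytes of one full copy of the array.
fn bytesPerCopy(elements: u64, element_size: u64) u64 {
    return elements * element_size;
}

// Hypothetical helper: bytes moved if the whole array is copied
// once per loop iteration (one iteration per element).
fn totalBytesCopied(elements: u64, element_size: u64) u64 {
    return bytesPerCopy(elements, element_size) * elements;
}

pub fn main() void {
    const elements: u64 = 32 * 32 * 32; // len from the reproduction
    const element_size: u64 = @sizeOf(u16);
    const elapsed_ns: u64 = 93325090; // debug-mode time reported above

    const per_copy = bytesPerCopy(elements, element_size);
    const total = totalBytesCopied(elements, element_size);

    std.debug.print("bytes per copy: {}\n", .{per_copy}); // 65536
    std.debug.print("total bytes copied: {}\n", .{total}); // 2147483648
    std.debug.print("ns per iteration: {}\n", .{elapsed_ns / elements}); // 2848
    std.debug.print("bytes per ns: {}\n", .{total / elapsed_ns}); // 23
}
```

Under that assumption, the loop moves 2 GiB in total, at roughly 23 bytes per nanosecond. That is a plausible rate for a memcpy of a 64 KiB buffer, so the measured time is consistent with one whole-array copy per iteration.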

Expected Behavior

When applying the workaround

- sum += chunk.blocks[getIndex(i)]; // ← workaround: (&chunk.blocks)[...]
+ sum += (&chunk.blocks)[getIndex(i)]; // ← workaround: (&chunk.blocks)[...]

the performance is significantly better, taking only 0.4 ms instead of 93 ms:

$ zig run test.zig
error: Time: 357398 Sum: 536854528

Labels: bug, optimization, regression