Windows: Faster getenvW and a standalone environment variable test #23272

Open: squeek502 wants to merge 3 commits into master

@squeek502 (Collaborator) commented Mar 17, 2025

Inspired by #23265, I thought I'd try applying the same strategy to the Windows implementation. Also adds a standalone test to make sure the functionality remains the same.


Instead of parsing the full key and value for each environment variable before checking the key for (case-insensitive) equality, we skip to the next environment variable once it's no longer possible for the key to match.

This makes getting environment variables about 2x faster across the board on Windows.

Note: We still have to scan to find the end of each environment variable (even the ones that are skipped, since we only know where each one ends by its NUL terminator), so this strategy doesn't provide the same speedup on Windows as it does on POSIX (#23265).
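
To illustrate the strategy described above, here is a minimal sketch, not the code from this PR. It assumes a simplified environment block made of byte-sized `KEY=VALUE` entries, each NUL-terminated, and uses ASCII-only case folding; the real Windows block is WTF-16 and uses the system's case-insensitive comparison. The name `findInBlock` is hypothetical.

```zig
const std = @import("std");

/// Hypothetical helper: looks up `key` in a block of NUL-terminated
/// `KEY=VALUE` entries, comparing keys ASCII-case-insensitively.
/// Assumption: an empty key is never considered a match.
fn findInBlock(block: []const u8, key: []const u8) ?[]const u8 {
    if (key.len == 0) return null;

    var i: usize = 0;
    while (i < block.len and block[i] != 0) {
        // Compare the key one byte at a time and stop at the first
        // mismatch, instead of parsing the whole key and value first.
        var k: usize = 0;
        while (k < key.len and i + k < block.len and block[i + k] != 0 and
            std.ascii.toLower(block[i + k]) == std.ascii.toLower(key[k])) : (k += 1)
        {}
        const key_matched = k == key.len and i + k < block.len and block[i + k] == '=';

        // Matched or not, the end of the entry is only known by scanning
        // forward to its NUL terminator.
        var end = i + k;
        while (end < block.len and block[end] != 0) : (end += 1) {}

        if (key_matched) return block[i + k + 1 .. end];
        i = end + 1; // skip to the start of the next entry
    }
    return null;
}

pub fn main() void {
    const block = "Path=C:\\Windows\x00TEMP=C:\\Temp\x00\x00";
    const value: []const u8 = findInBlock(block, "temp") orelse "(not found)";
    std.debug.print("temp -> {s}\n", .{value});
}
```

Note that the inner scan for the NUL terminator runs for every entry, including skipped ones, which is why the saving is limited to the key comparison itself.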


Benchmark code
const std = @import("std");

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    const Bench = enum {
        all,
        found,
        short,
        long,
        random,
        empty,
    };

    var args = try std.process.argsWithAllocator(allocator);
    defer args.deinit();

    _ = args.next();
    const bench: Bench = bench: {
        const str = args.next() orelse break :bench .all;
        break :bench std.meta.stringToEnum(Bench, str) orelse {
            std.debug.print("bench not recognized: {s}\n", .{str});
            std.process.exit(1);
        };
    };

    const num_iterations = 1000000;

    var env_map = try std.process.getEnvMap(allocator);
    defer env_map.deinit();

    var names: std.ArrayListUnmanaged([]const u8) = try .initCapacity(allocator, env_map.count());
    defer names.deinit(allocator);
    var longest_name: usize = 0;
    var it = env_map.iterator();
    while (it.next()) |entry| {
        if (entry.key_ptr.*.len > longest_name) longest_name = entry.key_ptr.*.len;
        names.appendAssumeCapacity(entry.key_ptr.*);
    }

    var name_mod_buf: std.ArrayListUnmanaged(u8) = try .initCapacity(allocator, longest_name + 1);
    defer name_mod_buf.deinit(allocator);

    var timer = try std.time.Timer.start();
    var prng = std.Random.DefaultPrng.init(0);
    const rand = prng.random();

    if (bench == .found or bench == .all) {
        const elapsed = elapsed: {
            timer.reset();
            for (0..num_iterations) |_| {
                const name = names.items[rand.uintLessThan(usize, names.items.len)];
                const value = try std.process.getEnvVarOwned(allocator, name);
                std.mem.doNotOptimizeAway(&value);
            }
            break :elapsed timer.read();
        };
        std.debug.print("found: {}/lookup\n", .{std.fmt.fmtDuration(elapsed / num_iterations)});
    }

    if (bench == .long or bench == .all) {
        const elapsed = elapsed: {
            timer.reset();
            for (0..num_iterations) |_| {
                const name = names.items[rand.uintLessThan(usize, names.items.len)];
                name_mod_buf.clearRetainingCapacity();
                name_mod_buf.appendSliceAssumeCapacity(name);
                // Append a random ascii character
                name_mod_buf.appendAssumeCapacity(rand.int(u7));
                const value = std.process.getEnvVarOwned(allocator, name_mod_buf.items) catch |err| switch (err) {
                    error.EnvironmentVariableNotFound => continue,
                    error.InvalidWtf8 => unreachable,
                    error.OutOfMemory => |e| return e,
                };
                std.mem.doNotOptimizeAway(&value);
            }
            break :elapsed timer.read();
        };
        std.debug.print("one char too long: {}/lookup\n", .{std.fmt.fmtDuration(elapsed / num_iterations)});
    }

    if (bench == .short or bench == .all) {
        const elapsed = elapsed: {
            timer.reset();
            for (0..num_iterations) |_| {
                const name = names.items[rand.uintLessThan(usize, names.items.len)];
                const name_trunc = name[0 .. name.len - 1];
                const value = std.process.getEnvVarOwned(allocator, name_trunc) catch |err| switch (err) {
                    error.EnvironmentVariableNotFound => continue,
                    error.InvalidWtf8 => unreachable,
                    error.OutOfMemory => |e| return e,
                };
                std.mem.doNotOptimizeAway(&value);
            }
            break :elapsed timer.read();
        };
        std.debug.print("one char too short: {}/lookup\n", .{std.fmt.fmtDuration(elapsed / num_iterations)});
    }

    if (bench == .random or bench == .all) {
        const elapsed = elapsed: {
            timer.reset();
            for (0..num_iterations) |_| {
                const len = rand.uintAtMost(usize, longest_name);
                name_mod_buf.items.len = len;
                for (name_mod_buf.items) |*c| {
                    c.* = rand.int(u7);
                }
                const value = std.process.getEnvVarOwned(allocator, name_mod_buf.items) catch |err| switch (err) {
                    error.EnvironmentVariableNotFound => continue,
                    error.InvalidWtf8 => unreachable,
                    error.OutOfMemory => |e| return e,
                };
                std.mem.doNotOptimizeAway(&value);
            }
            break :elapsed timer.read();
        };
        std.debug.print("random ascii: {}/lookup\n", .{std.fmt.fmtDuration(elapsed / num_iterations)});
    }

    if (bench == .empty or bench == .all) {
        const elapsed = elapsed: {
            timer.reset();
            for (0..num_iterations) |_| {
                const value = std.process.getEnvVarOwned(allocator, "") catch |err| switch (err) {
                    error.EnvironmentVariableNotFound => continue,
                    error.InvalidWtf8 => unreachable,
                    error.OutOfMemory => |e| return e,
                };
                std.mem.doNotOptimizeAway(&value);
            }
            break :elapsed timer.read();
        };
        std.debug.print("empty name: {}/lookup\n", .{std.fmt.fmtDuration(elapsed / num_iterations)});
    }
}
// all environment variable lookups are found
  benchenv.exe found ran
    1.98 ± 0.57 times faster than benchenv-master.exe found

// all environment variable lookups have their name truncated by one byte
  benchenv.exe short ran
    1.95 ± 0.43 times faster than benchenv-master.exe short

// all environment variable lookups have an extra ASCII character added to the name
  benchenv.exe long ran
    2.03 ± 0.37 times faster than benchenv-master.exe long

// all environment variable lookups are random strings of ASCII characters
  benchenv.exe random ran
    2.05 ± 0.53 times faster than benchenv-master.exe random

// looking up a zero-length string as the name
  benchenv.exe empty ran
    3.24 ± 0.61 times faster than benchenv-master.exe empty

Presumably, the magnitude of the speedup would also increase, on average, with the number of environment variables in the environment.

@squeek502 squeek502 changed the title Windows: Faster getenvW and standalone environment variable test Windows: Faster getenvW and a standalone environment variable test Mar 17, 2025