zig prereq races as of 0.11.0-dev.1782+b52be973d #14815

Closed
uber/hermetic_cc_toolchain
#75
@motiejus

Description

@motiejus

Dear maintainers,

We still observe some very-hard-to-reproduce races when building libc/rt prerequisites.

Some anecdotal evidence:

error: FileNotFound

When executed on a fresh installation, zig build-exe toolchain/launcher.zig (exact command) sometimes fails with:

error: FileNotFound

This happens only with a fresh $ZIG_GLOBAL_CACHE_DIR (which we keep in /tmp/bazel-zig-cc). We have seen it on Darwin x86_64 and Darwin M1. We may have seen it on Linux too, but I no longer have the logs to verify, and I do not trust my memory on this.

I tried to reproduce this on my macOS machine overnight, without success. However, we have consistently received a couple of complaints a week over the last few weeks. Note that the sample size is quite large.

libcompiler_rt.a: No such file or directory

This happened on our CI yesterday:

/tmp/bazel-zig-cc/o/4421bb1adcf01feee7185ccb98640027/libcompiler_rt.a: No such file or directory

Unfortunately, I can no longer access the build host or its global cache dir. This failure may be related to the FileNotFound error above.

Summary

I understand this is very little information to troubleshoot effectively. Here are the steps I plan to take:

  1. Capture the first error (FileNotFound) on any Linux machine and instruct the engineer to re-run the command under strace, to see which file is missing. However, this has not been reported on Linux for the last week or so: either it did not happen, or people learned to remove the cache directory and move on. Since this happens more often on macOS, it would make sense to debug it there. However, our engineers cannot run dtruss for compliance reasons.
  2. I will try to reproduce this on my macOS machine again, but slightly differently.
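
For step 2, one possible shape of a repro harness is sketched below. It is only a sketch under assumptions: that the race needs several invocations sharing one fresh $ZIG_GLOBAL_CACHE_DIR (the report does not establish this), and the helper names `run_once` and `stress` are hypothetical. Each invocation gets its own working directory so that build outputs do not collide, which means the command must use absolute paths.

```python
import os
import shutil
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, cache_dir):
    """Run cmd in a private working directory with ZIG_GLOBAL_CACHE_DIR set.

    Returns (returncode, stderr).
    """
    env = dict(os.environ, ZIG_GLOBAL_CACHE_DIR=cache_dir)
    with tempfile.TemporaryDirectory(prefix="zig-work-") as work_dir:
        proc = subprocess.run(
            cmd, env=env, cwd=work_dir, capture_output=True, text=True
        )
    return proc.returncode, proc.stderr


def stress(cmd, rounds=100, parallel=4):
    """Start `parallel` copies of cmd against one fresh cache dir, `rounds` times.

    Returns (round_index, stderr, cache_dir) for the first failing run and
    keeps that cache dir on disk for inspection. Returns None if every run
    succeeded.
    """
    for i in range(rounds):
        cache_dir = tempfile.mkdtemp(prefix="zig-cache-")
        with ThreadPoolExecutor(max_workers=parallel) as pool:
            results = list(
                pool.map(lambda _: run_once(cmd, cache_dir), range(parallel))
            )
        for code, stderr in results:
            if code != 0:
                return i, stderr, cache_dir
        # Every run succeeded: discard the cache so the next round starts fresh.
        shutil.rmtree(cache_dir, ignore_errors=True)
    return None


# Example call (assumes zig is on PATH and the source path is absolute):
# stress(["zig", "build-exe", os.path.abspath("toolchain/launcher.zig")])
```

On a failure, the preserved cache directory can be inspected to see which artifact (for example libcompiler_rt.a) is missing or incomplete.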

Food for thought: is it time to reconsider how error context is propagated during the build phase, so errors could be augmented with additional context?

Labels: bug (Observed behavior contradicts documented or intended behavior)
