Skip to content

Conversation

@jayconrod
Copy link
Contributor

Previously, it was one job that ran on Linux with CI runners.
We need to sign and notarize macOS binaries though, and we can
only do that on a macOS machine.

So this change splits the workflow into multiple jobs. The first
job checks the version string, next we build binaries (signing
not done yet), last we create the release.

@jayconrod jayconrod marked this pull request as ready for review October 18, 2024 20:21
@jayconrod
Copy link
Contributor Author

I was hoping to get this into shape this week, but Ulf and Yannic are both out next week, and I think only they have access to the real certificate. The dev certificate doesn't work for signing. So this will be blocked for a while.

Let's at least land this refactoring though.

@jayconrod
Copy link
Contributor Author

Copy link
Contributor

@minor-fixes minor-fixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w.r.t. the libc version requirement - do you think we should be testing for that on release? If so, it sounds like adding an sh_test that takes the built binary as data might not be the best path, if it depends on which remote runner it runs on - so would another job in this workflow make sense here?

Comment on lines +61 to +62
# macOS or Windows yet. We use a Debian 11 image because binaries built on
# newer versions are incompatible with older version due to the libc runtime
Copy link
Contributor

@minor-fixes minor-fixes Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(for my own education) what would it take to build a compatible version without relying on the remote runner's userspace? Is this a deficiency in bazel + rules_go, or in the Go tooling, or in our own setup?

EDIT: It seems clear that rules_go does depend on a C++ toolchain, so I guess the first issue is that we don't define hermetic C++ toolchains in this repo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could build a binary in "pure" mode, without a C++ toolchain. That means we wouldn't build any cgo code, Go's linker would not invoke the C linker, and we would not depend on libc.

The main drawback is that some standard library packages work differently without cgo. net uses its own DNS resolver (probably fine), and os/user reads /etc/passwd directly instead of integrating with PAM. We use os/user, so I'm worried that would change behavior on systems that don't use /etc/passwd.

- self-hosted
- os=linux
- arch=x64
- "engflow-container-image=docker://645088952840.dkr.ecr.eu-west-1.amazonaws.com/engflow-ci/debian12-dind-x64@sha256:763903935682de148b4e09fe1d7ef3bbc4ec829d59c3f41cb9519984639eaa06"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debian12, despite the comment above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Glad you caught this.

if [[ "${OS}" != 'macos' ]]; then
return
fi
# Files in $RUNNER_TEMP are automatically removed on completion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(future PR idea) if this is true, then release-create.sh could use $RUNNER_TEMP for the gh binary, and save on some cleanup logic - is that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would work. I just learned about RUNNER_TEMP from another script.

For gh specifically though, I think we should throw it into the CI runner container image. Now that we're using CI runners for Linux, it's much easier to add build-time dependencies like that.

@jayconrod
Copy link
Contributor Author

w.r.t. the libc version requirement - do you think we should be testing for that on release? If so, it sounds like adding an sh_test that takes the built binary as data might not be the best path, if it depends on which remote runner it runs on - so would another job in this workflow make sense here?

I thought about this a bit. It seems valuable to confirm that the binary can run on Debian 11. We could try running it from a script in the same job, but that only proves the binary runs on the same platform where it was built, which doesn't add much value. We could try running the script in a different job, but that also feels tautological: it would be just as easy to typo the container image for that job as the build job.

We could try inspecting the binary itself. The relevant output is:

$ objdump -T engflow_auth
...
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.34) pthread_attr_getstacksize
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.34) pthread_setspecific
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.2.5) freeaddrinfo
0000000000000000      DO *UND*  0000000000000000 (GLIBC_2.2.5) stderr
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.2.5) seteuid

Each glibc symbol has the minimum version needed. 2.34 is the version we don't want. But this seems overly brittle and implementation specific. I also really don't want to parse and compare semver in bash.

@jayconrod
Copy link
Contributor Author

I'll merge this now. Happy to make changes in a follow-up though.

@jayconrod jayconrod merged commit c972047 into main Oct 21, 2024
@jayconrod jayconrod deleted the jay-macos-sign branch October 21, 2024 21:12
jayconrod added a commit that referenced this pull request Oct 21, 2024
Before #53, I fixed a typo spotted by @minor-fixes without
manually testing it: the Linux build now runs on Debian 11.
That means the engflow_auth binary installed by login.sh
does not work because v0.0.7 was built on Debian 12.
We need v0.0.8 instead.
jayconrod added a commit that referenced this pull request Oct 22, 2024
Before #53, I fixed a typo spotted by @minor-fixes without
manually testing it: the Linux build now runs on Debian 11.
That means the engflow_auth binary installed by login.sh
does not work because v0.0.7 was built on Debian 12.
We need v0.0.8 instead.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants