x/build: add LUCI netbsd-arm builder #63698

Open
bsiegert opened this issue Oct 23, 2023 · 22 comments
Labels
Builders x/build issues (builders, bots, dashboards) NeedsFix The path to resolution is known, but the work has not been done. new-builder OS-NetBSD

@bsiegert
Contributor

Hostname: netbsd-arm-bsiegert

netbsd-arm-bsiegert.csr.txt

/cc @golang/release

@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Oct 23, 2023
@gopherbot gopherbot added this to the Unreleased milestone Oct 23, 2023
@cherrymui cherrymui added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 30, 2023
@bsiegert
Contributor Author

Any updates? It's been a month.

@dmitshur
Contributor

Thanks for pinging; it looks like we missed this new-builder issue, sorry.

We'll pick this up at the start of next week since this week is short due to US holidays.

@dmitshur dmitshur self-assigned this Nov 27, 2023
@dmitshur dmitshur moved this to In Progress in Go Release Nov 27, 2023
@gopherbot
Contributor

Change https://go.dev/cl/545536 mentions this issue: main.star: add netbsd-arm, netbsd-arm64, openbsd-riscv64 builders

gopherbot pushed a commit to golang/build that referenced this issue Nov 28, 2023
For golang/go#63698.
For golang/go#63614.
For golang/go#64176.

Change-Id: I2a203cd2a1e2e80ee44cfd5ce11c1ce5bbd60002
Reviewed-on: https://go-review.googlesource.com/c/build/+/545536
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
@dmitshur
Contributor

dmitshur commented Nov 30, 2023

Here's the resulting certificate: netbsd-arm-bsiegert-1701367164.cert.txt.

The builder definitions have been added in CL 545536 so your bot should be able to connect once you follow the rest of the steps on your end.

We have some more work to do to make the dependencies built for the netbsd/arm port and available in CIPD, which will be needed for the builds to complete successfully. We'll update this issue once that's done.

@dmitshur
Contributor

dmitshur commented Dec 8, 2023

more work to do to make the dependencies built for the netbsd/arm port and available in CIPD

I mailed crrev.com/c/5086069 for this.

@dmitshur
Contributor

That CL is submitted and the dependencies are built.

If you try connecting the builder, we can see what the next steps are for this.

@dmitshur dmitshur assigned bsiegert and unassigned dmitshur Dec 15, 2023
@bsiegert
Contributor Author

Thanks Dmitri! The bot is now up and running at https://chromium-swarm.appspot.com/bot?id=netbsd-arm-bsiegert. It shows up as

@github-project-automation github-project-automation bot moved this from In Progress to Done in Go Release Dec 29, 2023
@bsiegert bsiegert reopened this Dec 29, 2023
@bsiegert
Contributor Author

bsiegert commented Dec 29, 2023

Sorry, hit Submit too fast.

It shows up as

cipd_platform=netbsd-armv6l
cpu=evbarm | evbarm-32

@dmitshur
Contributor

I'm not seeing successful builds in https://ci.chromium.org/ui/p/golang/builders/ci/gotip-netbsd-arm?limit=200, and the builder isn't showing up as connected now. Can you take a look at what its current status is? I'll reopen this issue so we can track what's still left to do here.

@dmitshur dmitshur reopened this Jan 23, 2024
@gopherbot
Contributor

Change https://go.dev/cl/558517 mentions this issue: main.star: fix cipd_platform value for GOHOSTARCH=arm

@dmitshur dmitshur moved this from Done to In Progress in Go Release Jan 26, 2024
gopherbot pushed a commit to golang/build that referenced this issue Jan 26, 2024
Handle another way in which cipd_platform is slightly
different from Go's GOHOSTOS and GOHOSTARCH values.

For golang/go#65241.
For golang/go#63698.
For golang/go#63601.

Change-Id: I3caad897b821208939b8b411663ba417c4c21df7
Reviewed-on: https://go-review.googlesource.com/c/build/+/558517
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
@dmitshur
Contributor

dmitshur commented Jan 26, 2024

CL 558517 fixed the cipd_platform value in the builder definition, and triggered some work, e.g., https://chromium-swarm.appspot.com/task?id=675f6b5f66994610. It's failing with an internal failure:

swarming_bot_logs: 2024-01-26 16:28:59.383: Starting run_isolated script
swarming_bot_logs: 2024-01-26 16:28:59.537: Trimming caches. min_ts: 1704472139, free_disk: 35253248000, min_free_space: 62578626346
swarming_bot_logs: 2024-01-26 16:28:59.542: trimming cache with dir /home/swarming/.swarming/cas_cache
swarming_bot_logs: 2024-01-26 16:28:59.546: trimming cache with dir /home/swarming/.swarming/c
swarming_bot_logs: 2024-01-26 16:28:59.549: trim_caches: took 0 seconds
swarming_bot_logs: 2024-01-26 16:29:04.267: Installed CIPD client
10397 2024-01-26 16:29:05.190 E: internal failure: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/run_isolated.py", line 858, in map_and_run
    with data.install_packages_fn(run_dir, cas_client_dir) as cipd_info:
  File "/usr/pkg/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/run_isolated.py", line 1199, in install_client_and_packages
    package_pins = _install_packages(run_dir, cipd_cache_dir, client,
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/run_isolated.py", line 1124, in _install_packages
    pins = client.ensure(
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/cipd.py", line 245, in ensure
    result_json = json.load(jfile)
  File "/usr/pkg/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/pkg/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/pkg/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/pkg/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Expecting value: line 1 column 1 (char 0)

We need to figure out what's causing that and fix it to make progress here.

Edit: I think https://source.chromium.org/chromium/infra/infra/+/main:luci/client/cipd.py;l=245 is the relevant line. The check for exit_code happens slightly later, on line 260, so it's possible that something went wrong during the invocation of cipd ensure and we just don't see what it was in the log above. If you can find a way to reproduce it locally and share its output, that'd be helpful.
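
To illustrate the ordering concern, here is a minimal sketch of a wrapper that checks the exit code before parsing the JSON result file, so that a crashed tool surfaces its own output instead of a JSONDecodeError. This is not the luci-py code; the names run_and_load_json and ToolError are hypothetical and exist only for this example.

```python
import json
import subprocess


class ToolError(Exception):
    """Raised when the wrapped command exits non-zero (hypothetical type)."""


def run_and_load_json(cmd, json_path):
    """Run cmd, then parse json_path only if the command succeeded."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    # Check the exit code first: a failed run may leave the result file empty,
    # and parsing it would hide the real error behind a JSONDecodeError.
    if proc.returncode != 0:
        raise ToolError(
            "command exited with %d: %s" % (proc.returncode, proc.stdout.strip())
        )
    with open(json_path) as jfile:
        return json.load(jfile)
```

With this ordering, a binary that aborts at startup would be reported together with whatever it printed before exiting.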

@dmitshur dmitshur self-assigned this Jan 26, 2024
@dmitshur
Contributor

dmitshur commented Feb 5, 2024

In addition to trying to reproduce this by running cipd ensure manually on the builder, you might be able to check if there's more information in /home/swarming/.swarming/logs/task_runner.log (or run_isolated.log).

@bsiegert
Contributor Author

bsiegert commented Feb 6, 2024

Thanks for the pointers, I will take a look and report back.

@bsiegert
Contributor Author

bsiegert commented Feb 7, 2024

This from run_isolated.log looks related:

22683 2024-01-26 16:03:45.157 U: Installed CIPD client
22683 2024-01-26 16:03:45.159 I: Installing packages {'': [('infra/tools/luci/bbagent/${platform}', 'git_revision:1f801c4894a7ced859ae672642feeeb8960da330')]} into /home/swarming/.swarming/w/ir
22683 2024-01-26 16:03:45.283 D: Running ['/home/swarming/.swarming/cipd_cache/bin/cipd', 'ensure', '-root', '/home/swarming/.swarming/w/ir', '-ensure-file', '/tmp/cipd-ensure-file-y5cmbnt9.txt', '-verbose', '-json-output', '/tmp/cipd-ensure-result-0pjif55u.json', '-cache-dir', '/home/swarming/.swarming/cipd_cache/cache', '-service-url', 'https://chrome-infra-packages.appspot.com/']
22683 2024-01-26 16:03:45.769 D: cipd client: runtime: this system has multiple CPUs and must use
22683 2024-01-26 16:03:45.806 D: cipd client: atomic synchronization instructions. Recompile using GOARM=7.
22683 2024-01-26 16:03:45.906 E: internal failure: Expecting value: line 1 column 1 (char 0)

So the cipd binary needs to be recompiled with GOARM=7 set.
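
For anyone digging through a similar log, a small sketch like the following can pull the "cipd client:" lines out of run_isolated.log and join them back into one message. The line layout (pid, date, time, level, message) is inferred from the excerpt above, and the function name is made up for this example.

```python
import re

# Layout inferred from the excerpt: "<pid> <date> <time> <level>: <message>"
LOG_LINE = re.compile(
    r"^(?P<pid>\d+) (?P<ts>\S+ \S+) (?P<level>[A-Z]): (?P<msg>.*)$"
)
PREFIX = "cipd client: "


def cipd_client_output(log_text):
    """Return the text the cipd client printed, joined into one string."""
    parts = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("msg").startswith(PREFIX):
            parts.append(match.group("msg")[len(PREFIX):])
    return " ".join(parts)
```

Fed the excerpt above, it returns the runtime message as a single sentence ending in "Recompile using GOARM=7."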

@dmitshur
Contributor

dmitshur commented Feb 7, 2024

Downloading the Go binary from here and running go version -m on it prints:

	build	CGO_ENABLED=0
	build	GOARCH=arm
	build	GOOS=netbsd
	build	GOARM=6

So it is currently built with GOARM=6 (even though the cross-compilation default for GOARM has been 7 since Go 1.21).
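
If it helps to script this check, here is a rough sketch that parses the "build" lines of go version -m output into a dictionary. It only handles the key=value build settings shown above; the function name is hypothetical.

```python
def build_settings(version_m_output):
    """Collect the 'build KEY=VALUE' lines from `go version -m` output."""
    settings = {}
    for line in version_m_output.splitlines():
        fields = line.split(None, 1)  # drop leading whitespace, split once
        if len(fields) == 2 and fields[0] == "build":
            key, sep, value = fields[1].partition("=")
            if sep:
                settings[key] = value.strip()
    return settings


SAMPLE = "\tbuild\tCGO_ENABLED=0\n\tbuild\tGOARCH=arm\n\tbuild\tGOOS=netbsd\n\tbuild\tGOARM=6\n"
print(build_settings(SAMPLE)["GOARM"])  # prints 6
```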

Searching finds entries like this, this, and this that all suggest making a change isn't quite straightforward, because the "v6l" suffix of the "netbsd-armv6l" CIPD platform dimension corresponds to GOARM=6.

Maybe it's possible to make it work with GOARM=6 anyway, through changes to the atomic operations, if it turns out there's not much to do? For example, I recently did crrev.com/c/5268803, which was enough to resolve the problem for linux/arm (with GOARM=6). But if it's much more invasive, that path might be harder. Edit: I see this is coming from the Go runtime, i.e., here, and it seems you'd need to not have multiple CPUs to work around it.

If the builder for this port cannot work with GOARM=6 binaries and really needs GOARM=7, we can see how involved that might be.

@bsiegert
Contributor Author

bsiegert commented Feb 8, 2024

I wonder why the Chromium infra thinks the architecture is "armv6l". That's a different sub-architecture. On this machine, uname -p prints earmv7hf, not the older earmv6hf. We do not have an ARMv6 builder at the moment; those machines are kind of old and crufty in general.

FWIW, http://wiki.netbsd.org/ports/evbarm/ shows the different sub-architectures.

This comment says:

  For example, on ARMv7 machines we claim that we are in fact running ARMv6
  (which is subset of ARMv7), since we don't really care about v7 over v6
  difference and want to reduce the variability in supported architectures
  instead.

Which is clearly a wrong assumption.

hubot pushed a commit to luci/luci-py that referenced this issue Oct 22, 2024
For the purposes of the Go netbsd-arm builder, we need to be able
to distinguish ARMv6 from later (v7+) architectures. Only ARMv7 has
the synchronization instructions that allow Go binaries to run on
multi-core machines.

This special-cases the 'netbsd' OS type to return armv7l for 32-bit
ARMv7 machines.

I have not added FreeBSD and OpenBSD support since their values for
machine and processor are a bit different.

Part of golang/go#63698

Change-Id: I798c7ec459b453ebfbc6bf2912f1bd06b8919325
Reviewed-on: https://chromium-review.googlesource.com/c/infra/luci/luci-py/+/5943105
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Benny Siegert <bsiegert@google.com>
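
The special case described in this commit message could look roughly like the sketch below. It is not the luci-py change itself: the function name, its arguments, and the prefix match on "earmv7" are assumptions made for illustration, based on the uname -p value reported earlier in the thread.

```python
def cipd_arch(os_name, processor):
    """Pick an arch suffix for a 32-bit ARM machine (illustrative only).

    os_name   -- lowercase OS name, e.g. "netbsd" (assumed input format)
    processor -- the value printed by `uname -p`, e.g. "earmv7hf"
    """
    # Assumption: NetBSD reports ARMv7 hosts with an "earmv7" prefix.
    if os_name == "netbsd" and processor.startswith("earmv7"):
        return "armv7l"
    # Everything else keeps the behaviour from the quoted comment:
    # ARMv7 is reported as ARMv6 to reduce the number of variants.
    return "armv6l"
```

Here cipd_arch("netbsd", "earmv7hf") returns "armv7l", while any other 32-bit ARM host still maps to "armv6l".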
@gopherbot
Contributor

Change https://go.dev/cl/623015 mentions this issue: main.star: use v7l suffix for netbsd-arm builder

gopherbot pushed a commit to golang/build that referenced this issue Oct 28, 2024
It was updated to target GOARM=7 because it can't work with GOARM=6.

For golang/go#63698.

Change-Id: I164a49d4a2bf6f680e4a980e90f6e2968f1906a7
Reviewed-on: https://go-review.googlesource.com/c/build/+/623015
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@dmitshur
Contributor

dmitshur commented Oct 31, 2024

https://ci.chromium.org/b/8732718088961275521 was the first successful build - congrats on reaching this milestone!

The build took over 2 hrs to complete. Given netbsd-arm64 takes around 25 min and still has its SLOW_HOSTS factor set to 2 (main.star#L343), netbsd-arm likely needs to also set something higher than 1. Perhaps 5 to mirror openbsd-arm, as a starting point? (CL 624075 does this.)
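
As a rough illustration of what such a factor does, assuming it acts as a plain multiplier on a base timeout (the real main.star logic may differ), consider this sketch. The factors for netbsd-arm64 and openbsd-arm are the ones mentioned above, the netbsd-arm entry is the proposed value, and the function name and base timeout are hypothetical.

```python
# Factors taken from the discussion; netbsd-arm is the proposed value.
SLOW_HOSTS = {
    "netbsd-arm64": 2,
    "openbsd-arm": 5,
    "netbsd-arm": 5,
}


def scaled_timeout_minutes(host, base_minutes):
    """Scale a base timeout by the host's slowness factor (default 1)."""
    return base_minutes * SLOW_HOSTS.get(host, 1)
```

With a hypothetical 60-minute base, netbsd-arm would get 300 minutes, which leaves headroom over the 2+ hour build seen above.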

@gopherbot
Contributor

Change https://go.dev/cl/624075 mentions this issue: main.star: add netbsd-arm to SLOW_HOSTS

gopherbot pushed a commit to golang/build that referenced this issue Nov 1, 2024
As mentioned in https://go.dev/issue/63698#issuecomment-2450448780,
this might be a good starting point. Can be adjusted later as needed.

For golang/go#63698.

Change-Id: I851852981527e1dcdae18f535f16a37117791ce6
Reviewed-on: https://go-review.googlesource.com/c/build/+/624075
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Auto-Submit: Benny Siegert <bsiegert@gmail.com>
@bsiegert
Contributor Author

bsiegert commented Nov 2, 2024

The builder is chugging along fine.

One issue though: on build.golang.org, it looks like the LUCI builds are not shown at all.

@gopherbot
Contributor

Change https://go.dev/cl/624875 mentions this issue: dashboard: set netbsd-arm{,64} builders to 0 expected

gopherbot pushed a commit to golang/build that referenced this issue Nov 4, 2024
These are LUCI now. The buildlets won't come back.

Part of golang/go#63698.

Change-Id: I7e8fb2f1892d2f9e81e69a101e26072c4dd08788
Reviewed-on: https://go-review.googlesource.com/c/build/+/624875
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Contributor

Change https://go.dev/cl/624995 mentions this issue: luci-config: remove known issue for netbsd-arm

gopherbot pushed a commit to golang/build that referenced this issue Nov 4, 2024
Part of golang/go#63698.

Change-Id: Ib1c5722aba24095d27c23cb69065a80d4c127dd4
Reviewed-on: https://go-review.googlesource.com/c/build/+/624995
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Projects
Status: In Progress