Can't run custom board build until large Build Boards CI job has run (giving cached submodules) #9301

Open
@tyeth

This is a copy of the ticket I've sent to GitHub, as the submodule fetch/init does not seem to behave as expected.

Firstly, there is a single board CI action, which builds firmware for a single embedded board.
See here: https://github.com/tyeth/circuitpython/blob/main/.github/workflows/build-boards.yml

This action can reuse submodule assets from the cache instead of fetching them afresh. This happens in the subaction for espressif boards (which covers the board I've been targeting) - see here: https://github.com/tyeth/circuitpython/blob/main/.github/actions/deps/ports/espressif/action.yml

That cached submodule asset only exists if a build has successfully run before, either the multi-board (500 boards) CI action or a single-board CI action.

Most of the time I cannot wait the 2 hours for the 500-board build job to finish, so I cancel it at each merge and instead run the single Build Board (Custom) action - see here: https://github.com/tyeth/circuitpython/blob/main/.github/workflows/build-board-custom.yml

If I run this single-board build (the board is lilygo_tdisplay_s3), it always fails when the submodule cache is empty. Having turned on git tracing, I see that it attempts to fetch the submodule, seems to lose git context, and asks the submodule for the parent repo's remote name (the submodule obviously has different remotes).
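For anyone wanting to capture the same kind of trace locally, here is a minimal sketch using git's GIT_TRACE environment variable. The throwaway repository and the trace.log file name are assumptions for illustration only. The repository has no submodules, so the update itself does nothing, but the trace still records which git commands were run.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch repository, only used to have somewhere to run git.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# GIT_TRACE=1 makes git write the commands it executes to stderr;
# redirect stderr to a file so the trace can be searched afterwards.
GIT_TRACE=1 git -C "$repo" submodule update --init --recursive 2>"$repo/trace.log"

# Count the trace lines that were captured.
trace_lines=$(grep -c 'trace:' "$repo/trace.log" || true)
echo "captured $trace_lines trace lines in $repo/trace.log"
```

In a real checkout, searching the log for the word fetch should show which remote name each submodule fetch was given.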

If I compare this with what the 500-board build job does, it appears to call the same file in exactly the same way, yet it works.
Please help me identify the difference and why git is doing such madness in the background of a simple call, breaking my single-board build job.

I've now (last night) let the 500-board build job finish, so I have cached submodule assets, and the single-board build now works (relying on the cache instead of updating the submodule during the job).

Bad output from the single-board build job, showing "user-fork" (the parent repo's remote name) in the git fetch line that fails for the submodule:
https://github.com/tyeth/circuitpython/actions/runs/9358284335/job/25759793899#step:9:4428

Output from the 500-board build job, showing a successful submodule fetch that uses 'origin' instead of user-fork (still the parent repo's remote name):
https://github.com/tyeth/circuitpython/actions/runs/9358335114/job/25760005736#step:4:4430

In my head the actual commands in the build action, and even in the subaction, are near identical and so should behave identically. I also don't see why the remote name of the parent repo is used when fetching submodule information. The other thing to note is that the single-board job [Build Board (Custom)] clones adafruit's copy of circuitpython first, then adds the user-fork as a remote, and finally checks out user-fork/{currentBranch}, which is why the script fails. In fact, if the origin remote didn't exist everywhere by default, the main build script would fail too.
[To recreate, simply rename the remote of a repo that has submodules nested at least 1 folder deep, then run git submodule update --init --recursive,
like this line: https://github.com/tyeth/circuitpython/blob/main/.github/actions/deps/ports/espressif/action.yml#L35C12-L35C71 ]
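As a possible aid to narrowing this down, the sketch below shows how a parent checkout can end up with a default remote that is not origin. It mirrors the order described above (clone upstream, add the fork as a second remote, switch to a branch tracking the fork) using throwaway local repositories. The repository paths and the branch name work are hypothetical, and it is only an assumption that the real workflow creates a tracking branch in this way. The script does not reproduce the submodule failure itself.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
ident="-c user.name=demo -c user.email=demo@example.com"

# Two bare repositories stand in for the upstream project and a personal fork.
git init -q --bare "$work/upstream.git"
git init -q --bare "$work/fork.git"

# Seed both with a single commit on a branch named main.
git init -q "$work/seed"
git -C "$work/seed" $ident commit -q --allow-empty -m "init"
git -C "$work/seed" branch -M main
git -C "$work/seed" push -q "$work/upstream.git" main
git -C "$work/seed" push -q "$work/fork.git" main

# Clone upstream (this remote is named origin), then add the fork as user-fork
# and switch to a local branch that tracks the fork.
git clone -q -b main "$work/upstream.git" "$work/checkout"
cd "$work/checkout"
git remote add user-fork "$work/fork.git"
git fetch -q user-fork
git checkout -q -b work --track user-fork/main

# The remote recorded for the current branch is what git treats as the
# default remote for this checkout.
branch=$(git symbolic-ref --short HEAD)
default_remote=$(git config "branch.$branch.remote")
echo "branch=$branch default_remote=$default_remote"
```

Here the recorded remote for the checked-out branch is user-fork rather than origin. Running the last three lines in the failing checkout, and then git remote inside one of the nested submodules, would show whether the parent's default remote name exists in the submodule at all.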

Secondly, when I ran the full CircuitPython build CI job (500 board targets), it timed out attempting to publish assets (1 error, and a few tasks warned about timeout/retry issues).
Is it expected that a normal user account cannot run such a large suite? I believed I'd run it before on my own account, rather than on Adafruit's organisation account (which obviously has a larger job runner pool).

Labels: bug, third-party (Awaiting action on a third party for a fix or an answer to a request)
