Skip to content

Add support for cleaning up SDK files #485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 27, 2022

Conversation

dennisameling
Copy link
Contributor

@dennisameling dennisameling commented Sep 24, 2022

Closes #398

In another MR, @dscho's suggestion was as follows:

Hmm. Maybe we can separate out the cleanup logic into its own commit, and then figure out how to detect self-hosted runners and only spend the time to clean up on those?

I totally agree with this and have looked into several options to implement this.

Options

Roughly speaking, I think we have the following options:

  1. Provide an explicit cleanup input parameter that folks can use on self-hosted runners (implemented in this PR)
  2. Leverage the RUNNER_TEMP variable for outputDirectory to write temporary files. From the docs:

The path to a temporary directory on the runner. This directory is emptied at the beginning and end of each job. Note that files will not be removed if the runner's user account does not have permission to delete them. For example, D:\a\_temp

  1. Get the labels from the GitHub Action run. If it contains self-hosted, execute the cleanup logic. Looks like we'd need to do an API call to https://api.github.com/repos/OWNER/REPO/actions/jobs/JOB_ID to get those labels, which is a bit involved because it needs a GITHUB_TOKEN as well. https://github.com/octokit/action.js should make things a bit easier, but the requirement for the GITHUB_TOKEN remains.
  2. File a feature request against https://github.com/actions/runner to add something like RUNNER_TYPE (self-hosted or github-hosted) to the runner's environment variables

Feasibility check of options

  1. Easy to implement. Gives the Action consumer full control over this logic. Defaults to false, to not bother new/basic consumers with it
  2. I like this option a lot. GitHub also recommends depending on these variables instead of hardcoding paths. At first, I was a bit concerned that they mention "The directory is emptied at the beginning and end of each job" because a single workflow can contain multiple jobs, potentially slowing things down. However, GitHub-hosted runners are only used for a single job, so they shouldn't experience any slowdowns with this approach. That's assuming that they don't bother cleaning up RUNNER_TEMP but instead destroy the entire VM right away. Self-hosted runners will clean up this folder, so that's exactly the behavior we're looking for, right? What about the path input parameter in your Action? Should that also use RUNNER_TEMP as a base folder in this case, or is that parameter expected to have exact paths (e.g. C:\my-dir)?
  3. While quite easy to implement, I don't like adding a requirement to Action consumers to have them provide a GITHUB_TOKEN.
  4. I'm actually leaning towards option 2 instead of this one. If we go for option 2, we don't have to implement any logic for self-hosted runners, but simply rely on the Actions Runner to take care of things for us. We can just be like "Here's some temporary files, please clean them up after the job" and don't have to think about it anymore.

Curious to hear your thoughts on this!

@dennisameling
Copy link
Contributor Author

Just tested option 2, RUNNER_TEMP, in dennisameling@7f942b4. Tested with a workflow that runs 2 jobs both on self-hosted and github-hosted runners: https://github.com/dennisameling/git/actions/runs/3117749199/jobs/5056605664

The results:

  • The jobs on both runners finish as normal. What's interesting is that even though all steps show the ☑️ check mark, the yellow spinner keeps running on the job. This means that the runner is silently removing the files in RUNNER_TEMP. While that's great, this process also runs on the Hosted Runner, which I didn't expect. I was hoping for the runner to be clever enough to understand "I'm on a Hosted Runner, so no need to clean up this RUNNER_TEMP folder as the VM will be destroyed soon".
  • The cleanup part seems to take ~30 seconds both on the self-hosted and github-hosted runner, which is quite OK I think!
Untitled_.Sep.24.2022.11_23.AM.mp4

I'd suggest the following:

  • Proceed with option 1 (explicit cleanup input parameter) for now, so that users on self-hosted runners can request a cleanup.
  • (optional) File a bug in https://github.com/actions/runner that we don't expect RUNNER_TEMP to be cleaned up on Hosted Runners, as it adds unnecessary additional runtime to jobs on VMs that will be destroyed anyway. If the GitHub team agrees and updates their logic accordingly, we can move to the RUNNER_TEMP solution in the (hopefully near) future.

dennisameling and others added 3 commits September 27, 2022 17:16
In the upcoming commit, we will implement support for removing the SDK
files in a post-action step. To this end, we need access to Git for
Windows' `bash` (if it exists on the agent) and to the default location
of the SDK that is written by this Action. Let's export those to allow
for reusing the functionality in other source files.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On self-hosted runners, we need to do this, otherwise these files will
be left in place after the workflow run has finished.

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Member

dscho commented Sep 27, 2022

Proceed with option 1 (explicit cleanup input parameter) for now, so that users on self-hosted runners can request a cleanup.

I concur! Thank you for working on this, it is highly appreciated.

@dscho dscho merged commit 4eeb49d into git-for-windows:main Sep 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Self-hosted runner: C:/git-sdk-64-full' already exists
2 participants