-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FR] Hash-pin GitHub workflow Actions #4025
Comments
@pnacht this repository inherits a git upstream at https://github.com/jaraco/skeleton. Any changes of such nature should go through that repo. |
Ah, hadn't realized that! I was wondering what the references to jaraco/skeleton were about! I'll send the changes over there, then. |
I'm still not sure if that would work, though as it's an upstream for tens of downstream repos. And I don't know if introducing this intrusiveness wouldn't cause a disproportional amount of maintenance burden. |
In my experience, version pinning adds a great deal of toil to the process without adding a lot of value. This project, and other skeleton-derived projects, attempt to use the "work at head" model, similar to how Google operates. Pinning works against that model and introduces manual steps and changelog noise. I find that I don't have enough time as it is to address all of the issues my projects face, so I've attempted to reduce the friction to as low as possible. The per-repo, per-action, per-week multipliers here sound like quite a bit of work if we're talking about more than one or two repos. But as webknjaz has pointed out, there's an additional complication of conflicts/consistency across projects. The way actions work, they run on the derived repos and not on the parent repos, so they'll not benefit from the project scaling that jaraco/skeleton provides and may even conflict. So I've given quite a bit of background on the cost, I'd like to understand more about the value proposition. Is there a design doc somewhere that captures why this practice is a good one and what threats it protects against? Another question I'd hope the design would answer - what is expected of the user when reviewing updated pins and what's to prevent a user from pinning to a malicious version? Maybe there's a better way to achieve the desired result - maybe by having Github Actions keep a service that only allows pulling trusted, published versions. |
(replying here to keep things going, but let me know if you'd rather migrate this to a jaraco/skeleton issue) Hey @jaraco, thanks for the thorough reply!
I unfortunately don't have a design doc to point you to, but I can share the following references, all of which recommend hash-pinning Actions:
So, the main value obtained from hash-pinning Actions is that we can be certain the Action we're running does what we expect it to do. This is roughly equivalent to Google's habit of vendoring in dependencies: we ensure that the Action's behavior won't change unless we explicitly act to make it so, since we're fixing it to a specific commit hash. This adds resilience to the workflows: if an Action publishes a broken release (or is taken over by a malicious attacker), we won't be affected.
These multipliers can be simplified somewhat to "just" per-repo, per-month. Dependabot can be configured to only scan for new versions on a monthly basis instead of weekly, which reduces the workload by ~4x. And GitHub has (yesterday!) announced that Dependabot can now be configured to send batch PRs. So instead of receiving one PR per Action (which many, many people complained about!), you'd now receive one PR updating all Actions.
The suggestion here is to only accept hash-bumps from dependabot, not from "random" contributors. Since dependabot is an automated tool controlled by GitHub, we can have significantly greater confidence that the hashes are safe than if they came from an external contributor. A reviewer can then decide how skeptical they wish to be of the dependabot PR. They can choose to simply trust the tool or confirm the new hash matches the commit pointed to by the respective Action's tag, which can be a bit of a hassle. But note that simply trusting dependabot is already a step up since it reduces the number of parties you need to trust from roughly "all maintainers of all Actions used in the repository" to 1 (dependabot). The skeleton currently only uses actions from GitHub and
Oh, for sure! GitHub actually had plans to create "immutable releases" for GitHub Actions earlier this year, but after their layoffs that got pushed back to the "Future" milestone (aka 🤷♂️). That would absolutely be the best solution. This is the best we've got in the meantime.
Indeed, this is a serious issue. There's a workaround I'll describe below, but it isn't clean, so I won't put up much of a fight if you think it's not worth the hassle. The (inelegant!) solution I see here is for the skeleton to accept a PR that hash-pins its dependencies and sets up Dependabot to keep them up to date. Only the skeleton never merges those dependabot PRs.* Instead, that responsibility falls to each and all of the dependent projects. Which isn't great! They'd each receive a monthly dependabot PR bumping all their Actions to the latest version. Since the skeleton won't have similar (but still conflicting) diffs in its history when it gets merged back into each child, there shouldn't be any conflicts. The only exception I'd maybe see to this rule would be in case of an actual vulnerability being detected in an Action, in which case I'd suggest bumping the skeleton version as well. But this is rare, so I'd argue the ensuing conflicts would be a reasonable cost. Also, maybe bump up the versions whenever the skeleton's history is collapsed? 🤷♂️ But again, this is undeniably a workaround. * My original idea was for the skeleton to disable Dependabot in the settings, so the yml configuration is available for the dependents, but the skeleton doesn't get "spammed" with PRs. However, apparently only forks can disable dependabot via settings, not "normal" repos. (I sent GH a feature request for this... if you're interested and want to vote it up!) So we're stuck with an active dependabot in the skeleton. |
This still feels like a major burden for a single person to manage. I'd probably suggest exploring reusable workflows, with a separate repository for those. This comes close to another idea I've been thinking of — composable skeletons, instead of a big combined one. |
Thanks for the detailed and thoughtful response.
These references were helpful in showing that it's a broadly recommended practice, which has undoubtedly influenced my perspective. Thanks.
My concern is that neither dependabot nor the reviewer will have special knowledge about the state of the Action repo when the PR is created. Of course, if the repo is compromised while the dependency is pinned, that will delay the adoption of that tag until the next refresh. If the Action is compromised shortly before the PR is sent, however, dependabot will pin the exploited version (and likely extend the duration of exposure). Thinking about it statistically, the average expected exposure will be unimproved over the unpinned behavior unless additional investment is put into validating a trusted pin. Moreover, it introduces an additional attack vector in Dependabot+Github vs. simply trusting Github. Imagine that Dependabot is compromised never to update pins if an exploit has been pinned. Moreover, since an attacker has visibility to dependabot configs, they could easily optimize the timing of their attack to take advantage of the known pinning periods of target projects to amplify their attack. For these reasons, I feel like the recommendations above are misguided (overestimating the value of pinning) and the true value of such a change is negligible and potentially net-negative even if no additional toil is introduced.
Oh. That's good news. Thanks for the reference. That sounds like a reasonable plan. It doesn't prevent malicious actors from publishing an attack in an immutable action, but it would help prevent the attack from going undetected and if audits and scans were added, would also increase the security there.
Indeed - I'm already using composable repos to handle two concerns, general packaging (jaraco/skeleton) and tidelift enrollment (jaraco/tidelift). These composable repos work best when the artifacts they manage themselves support composition (typically through distinct files). Merging jaraco/tidelift is messy because it adds content to files that exist in jaraco/skeleton, so requires explicit manual merging into some location in a file (e.g. README.rst adds a badge (near the top) and a section (at or near the bottom)), institutional knowledge that's basically only stored in my head. It barely works and only because it's such a modest attempt (anything more sophisticated would be untenable). It's also why I bemoan the existence of If Github supported a debian style Trusted Action ForksIn lieu of a robust framework for trusted actions from Github, another possible intermediate workaround that we might consider is for groups of projects (such as those I maintain) to maintain their own fork of actions. It would be a vendoring of sorts, akin to Google's third_party, but would be for skeleton-based projects. The owner (me) would fork each of the actions used in the projects, update the skeleton to point to these forks, and then periodically pull from the canonical upstream actions. This approach would have the same effect as pinning, but would allow the pins to be bumped for all projects in one step. This approach still falls prey to the same expected exposure issue as dependabot pinning, but has an additional measure of protection in that if an exploited version is pinned, the corrective action is applied in one place and any malicious code would have been copied for better auditing (whereas the upstream attacker could have erased their attack upstream). This approach would have the additional benefit of avoiding extraneous flux in the repo, as periodic pin updates would be refactored into the forked action repos. |
In 2249a77, I'm illustrating the use of jaraco/checkout, which I forked and then transferred over the tags from upstream. The approach seems to work, and would be far less toilsome (only requiring pulling and pushing a few repos a month). Ideally, these "trusted" repos could be maintained centrally, by some supply-chain security team and they would be responsible to review the changes between syncs and validate them. Remove Sensitive DataAnother approach to achieve the desired results could be to remove the need for trusted/sensitive data in the actions. For example, If those sensitive values are removed from all environments, does that effectively remove the attack vectors? I don't think there's anything left in the environment that would be valuable to an attacker. |
One aspect I insisted on calling out in the Trusted Publishing docs is the necessity of a separate publish job that is protected with an environment that requires a human to click a button for the job to even start: https://github.com/marketplace/actions/pypi-publish#trusted-publishing. As for the other places, I think that privilege elevation (when a "better" |
AFAIK the organizations can restrict which actions are allowed in the repo settings. Many set it up in a way that only org-local actions are allowed and make forks into those orgs. So I think this approach would work. One corner case would be that if you sync an upstream action update with a bug that only appears in some cases, you won't see the effect up until you merge the update into the downstream repos that would be affected. |
Also, note that the privileges of the built-in You'd use |
So during PyCon, @asottile demoed me an approach where a standardized workflow checks for a file existence and includes it conditionally. Not sure which repo that was tho but I think something similar might work. |
pre-commit/pre-commit for example (via the reusable workflow) |
Hey @jaraco and @webknjaz, once again, thank you for your detailed replies. I'll be sure to share them with the rest of the team. Just a last comment to answer your question:
Well, you can only "remove" the Should that credential be exfiltrated (by a malicious Action or release-time dependency), it could be used to commit directly to any project without branch protection. But absolutely, removing the Just saw @webknjaz's comments:
The skeleton's workflows already do this, actually! In that case the top-level is defined with The skeleton's release job has an additional I'll close this issue now. Thank you everyone once again for this great conversation. |
What's the problem this feature will solve?
When developing with CI workflows, it's common to version-pin dependencies (i.e.
actions/checkout@v3
). However, version tags are mutable, so a malicious attacker could overwrite a version tag to point to a malicious or vulnerable commit instead.Describe the solution you'd like
Pinning workflow dependencies by hash ensures the dependency is immutable and its behavior is guaranteed.
These hashes can be automatically updated by dependabot. Whenever a new version of an Action is released, you'll receive a PR updating both its hash and the version comment (see this repo as an example).
I'll send a PR along with this issue to hash-pin the Actions and configure Dependabot to keep them updated.
Alternative Solutions
No response
Additional context
My name is Pedro, and I work with Google and the OpenSSF to improve the security of open-source projects. My colleague Joyce has made some contributions to this project already (see #3833 and – indirectly – a731056).
Code of Conduct
The text was updated successfully, but these errors were encountered: