Rename "hosted" to "dedicated"? #947

Open
MarkLodato opened this issue Aug 17, 2023 · 23 comments
Labels
build-environment-track (Issues/PRs related to the SLSA BuildEnv track), clarification (Clarification of the spec, without changing meaning)

Comments

@MarkLodato
Member

There is recurring confusion over the word "hosted", with many readers incorrectly interpreting it to mean some sort of external cloud provider. Instead, the intention is really just that the build runs on a dedicated machine rather than an individual's workstation.[1] In fact, the requirements say exactly that:

All build steps ran using a hosted build platform on shared or dedicated infrastructure, not on an individual’s workstation.

For v1.1, any thoughts on replacing "hosted" with "dedicated"? Would that make the intent more clear?

Sample changes:

| Before | After |
| --- | --- |
| L2: Hosted build platform | L2: Dedicated build platform |
| "All build steps ran using a hosted build platform on shared or dedicated infrastructure," | "All build steps ran using a dedicated build platform on shared or dedicated infrastructure," |
| "A build platform is often a hosted, multi-tenant build service" | "A build platform is often a dedicated, multi-tenant build service" |

Footnotes

  1. In the case of reproducible builds, the rebuilder that is trusted would be the dedicated machine; it's fine for the original build to be on a workstation since it's bit-for-bit identical.

@MarkLodato added the "clarification" label (Clarification of the spec, without changing meaning) on Aug 17, 2023
@arewm
Member

arewm commented Aug 17, 2023

I think this clarification is helpful, except for the second change, where "dedicated" is repeated. I was trying to think of an alternative for the first use.

An alternative with a different word:

All build steps ran using a single-purpose build platform on shared or dedicated infrastructure

Or maybe it works sufficiently well with it removed:

All build steps ran using a build platform on shared or dedicated infrastructure,

@MarkLodato
Member Author

Yeah, I didn't love the double "dedicated" but gave up trying to solve it. Your second suggestion (removing it before "build platform") sounds good to me.

@joshuagl
Member

The change to dedicated seems reasonable to me, especially with the removal of dedicated before build platform. 👍

@david-a-wheeler
Member

I think the term "dedicated" is more confusing, not less. The term "dedicated" has other meanings.

Instead, I think the problem is that there's no clear definition of the term "hosted". The earlier text about hosted never says "The term hosted means..." or anything else that indicates it's a definition. E.g.:

A hosted system is ... and is not ...

@david-a-wheeler
Member

Current text:

Hosted: All build steps ran using a hosted build platform on shared or dedicated infrastructure, not on an individual’s workstation.
Examples: GitHub Actions, Google Cloud Build, Travis CI

First cut revisions:

Hosted: All build steps ran using a hosted build platform. A hosted build platform is a shared or dedicated infrastructure used for building and is maintained by a team. An individual’s workstation is, by definition, not a hosted build platform.
Examples: GitHub Actions, Google Cloud Build, Travis CI

@CircuitSwan

Hosted: A system on which all build steps run. A hosted build platform may be external or internal, shared or dedicated infrastructure used for building which is well maintained. An individual’s workstation is, by definition, not a hosted build platform.

Examples: GitHub Actions, Google Cloud Build, Travis CI

@CircuitSwan

Also, we should add this to the "terminology" page and link to it.

@david-a-wheeler
Member

david-a-wheeler commented Aug 21, 2023

Here's another try:

Hosted build platform: A system on which all build steps run (in particular its hardware and operating system). A hosted build platform may be external or internal, shared or dedicated infrastructure used for building. Such a system must be well maintained, including hardening against attack, and not controlled by the individual requesting a build (to provide separation of concerns). An individual’s workstation is, by definition, not a hosted build platform.
Examples: GitHub Actions, Google Cloud Build, CircleCI

I think a key part of being "hosted" is that it emphasizes a "separation of concerns" (the build platform is operated by different people than the person who uses it). Obviously those who write the build scripts can cause the build to do bad things, but then the version control system tracks who did that.

I hope you don't mind, but I switched Travis CI to CircleCI; I think that's a better example.

@jkjell

jkjell commented Aug 21, 2023

Given how we're trying to (re)define this, the examples seem orthogonal to the definition. It sounds more like we're defining a property of the Build Platform, of which most CI systems would be included, right? For instance, can we define a negative example? We defined the negative property (i.e. a developer's laptop).

@arewm
Member

arewm commented Aug 21, 2023

While an individual's laptop is a negative property, I think it fits well as a negative example too. If you want a different specific example then you would likely need to call out some specific package/community directly since these are by definition not distributed systems.

Would it be simpler to use the word "managed" instead? We can also alleviate some confusion around "hosted" by specifically including on-premise and cloud infrastructure as fitting the definition. Here is an example of the change with context expanded to the containing paragraphs:

| Before | After |
| --- | --- |
| L2: Hosted build platform | L2: Managed build platform |
| "A package’s build platform is the infrastructure used to transform the software from source to package. This includes the transitive closure of all hardware, software, persons, and organizations that can influence the build. A build platform is often a hosted, multi-tenant build service, but it could be a system of multiple independent rebuilders, a special-purpose build platform used by a single software project, or even an individual’s workstation." | "A package’s build platform is the infrastructure used to transform the software from source to package. This includes the transitive closure of all hardware, software, persons, and organizations that can influence the build. A build platform is often a managed, multi-tenant build service, but it could be a system of multiple independent rebuilders, a special-purpose build platform used by a single software project, or even an individual’s workstation." |
| "All build steps ran using a hosted build platform on shared or dedicated infrastructure, not on an individual’s workstation. Examples: GitHub Actions, Google Cloud Build, Travis CI." | "All build steps ran using a managed build platform on shared or dedicated infrastructure either owned by the build platform or hosted on public infrastructure. Examples: GitHub Actions, Google Cloud Build, Travis CI. Counter examples: Individual's workstations." |

@MarkLodato
Member Author

Maybe we should focus more on why the requirement exists, before we choose a name or definition. @david-a-wheeler raised separation of concerns, and I also raised https://slsa.dev/spec/v1.0/principles#corollary-minimize-the-number-of-trusted-platforms. But we don't have agreement here, as discussed in yesterday's spec meeting. Once we agree on what the objective of the requirement is, that should help us narrow down what does and does not satisfy that objective.

@david-a-wheeler
Member

@MarkLodato :

Maybe we should focus more on why the requirement exists, before we choose a name or definition.

Fair enough. Since this isn't documented (or, it appears, agreed on), I suggest working backwards to identify a list of reasons people might want this requirement, then try to hone in on the ones we (as a group) agree are important, so that we can clearly state it. BTW, I think it's quite possible to include a requirement for multiple reasons (that's not a problem).

@arewm
Member

arewm commented Aug 22, 2023

Some initial whys that come to mind:

  • Increases isolation between build platform developers/operators and the build platform itself. Specifically, it would require a malicious actor to pivot after a compromise. For example, an exploit from a compromised email read on a build platform wouldn't immediately grant access to the build system.
  • Enable specific operational controls (in the future) to be implemented on systems whose dedicated purpose is the build platform.
  • Actions taken by the developers/operators on the build platform's systems would not be conflated with those on a separate system (e.g. when assessing access logs).

However, none of these really make sense in terms of the build track's levels themselves.

In the discussions of the future Build track's levels, it has been mentioned that each track should effectively have a primary goal which all levels can be measured against to reach some goal (should this be official/semi-official and documented somewhere on the website?). For the build track, the goal is to generate an accurate, complete, and authentic provenance describing the build.

In looking at the L2 for provenance, the hosted requirement seems to fit most with the clause:

Define trust: Identify the build platform and other entities that are necessary to trust in order to trust the artifact they produced. [ref]

To this end, having a dedicated/hosted/[...] is a step in defining the entity that is the build platform. If the platform is run on some shared resource, then we cannot indicate as clearly where the transitive closure ends in order to define the platform:

System that allows tenants to run builds. Technically, it is the transitive closure of software and services that must be trusted to faithfully execute the build. It includes software, hardware, people, and organizations. [ref]

@sudo-bmitch
Contributor

Thinking about the value this gives me, I'd phrase this as wanting a "well maintained and properly secured build server".

We can then list examples of what we consider typically approved (SaaS solutions and on-prem hosted CI) and rejected (a developer's personal machine). Importantly, I wouldn't consider a 5-year-old unpatched Jenkins server exposed to the public internet a well-maintained server, and so it shouldn't be approved just because it's a separate server that's not the developer's laptop.

@arewm
Member

arewm commented Aug 22, 2023

This conversation will likely run up against the Build Platform Operations track, so we will need to ensure that we keep those conversations distinct (while still understanding how they relate). I don't think the running (i.e. patching, firewall rules, etc.) of the build platform would fall within the hosted/dedicated clarification. Any clarification here should assume that the platform is well-intended/operationalized.

@david-a-wheeler
Member

@arewm said:

  • Increases isolation between build platform developers/operators and the build platform itself. Specifically, it would require a malicious actor to pivot after a compromise. For example, an exploit from a compromised email read on a build platform wouldn't immediately grant access to the build system. ...
  • Actions taken by the developers/operators on the build platform's systems would not be conflated with those on a separate system (e.g. when assessing access logs).

In our discussions these were the primary purposes I had in mind. As I said earlier, "I think a key part of being 'hosted' is that it emphasizes a 'separation of concerns' (the build platform is operated by different people than the person who uses it)."

It also provides some resilience if the lead maintainer disappears. I'm personally dealing with this as a side project. The lead, a friend of mine (Norm Megill), died unexpectedly, and he used his personal computer to do all the builds. That computer was soon going to disappear, so I had to move building from his personal system to a build that can be maintained by others.

  • Enable specific operational controls (in the future) to be implemented on systems whose dedicated purpose is the build platform.

That's not a bad reason, but I suspect we want to identify a few specific controls that would make it worth the trip.

@joshuagl
Member

One reason the requirement exists is to reduce the number of systems a consumer must trust. I think of this often in the context of getting packages from a Linux distro vs. an upstream-produced package. If I trust my distro vendor, I can get hundreds of trusted packages as part of that decision. If I want to individually retrieve all of my packages from the upstream locations, I have to decide whether I trust the output of multiple build systems (if I can even determine what the build system is).

@sudo-bmitch
Contributor

It also provides some resilience if the lead maintainer disappears.

I think this is a bigger challenge that should be addressed directly rather than with indirect build server requirements. There are other issues caused by a single point of failure, including access to the repositories to push new releases, signing build results, and the general likelihood of the project continuing without a key maintainer. Given that the issue spans beyond the build process itself, I'm not sure if we want a "no single point of failure" requirement for just the build that may get extended later, or if there's a better way to capture it.

This does raise a general concern of mine that OpenSSF may want a track for lone developer projects that want to improve their security without dealing with projects like SLSA and others that mark them as insecure, removing any incentive to add higher level security features like reproducible builds.

@david-a-wheeler
Member

@sudo-bmitch : The OpenSSF Best Practices badge is specifically designed so the "passing" and "silver" criteria can be met by a single developer. "Gold" can't, because it includes requirements that require multiple developers, but there are many things that can be practically done even by a single-person project. I think that's true for many other things as well.

In the case of hosting, a single-person project can choose to use a hosted system, so that they don't have to do it all themselves. I don't see why a single-person project can't do this. If I'm mistaken, please enlighten me!

@sudo-bmitch
Contributor

In the case of hosting, a single-person project can choose to use a hosted system, so that they don't have to do it all themselves. I don't see why a single-person project can't do this. If I'm mistaken, please enlighten me!

@david-a-wheeler I'm kinda debating both for and against something at the same time, which is confusing. I completely agree that a single developer could implement a hosted build, which means that SLSA 1.0 may not prevent issues encountered by projects when a lead maintainer is no longer available.

If we were to fix that with a requirement to avoid single points of failure, then my suggestion is to ensure the solo developers have some kind of incentive to continue adding security features to their projects and not just stop once they hit that requirement. (I.e. in Best Practices have a Solo-Gold, which I believe would get a lot of attention given the number of single maintainer projects out there.)

Either way, I think the "single point of failure" and the hosted build platform should probably be kept separate, so I'll stop here to avoid derailing the issue.

@MarkLodato
Member Author

For the build track, the goal is to generate an accurate, complete, and authentic provenance describing the build.

I agree with this characterization, with the addition from levels.md:

The primary purpose of the build track is to enable verification that the artifact was built as expected

This is mostly a repeat of what was said above, but in case repeating helps us get toward consensus:

  • Reduces attack surface => less risk of compromise of the accuracy/authenticity of the provenance.
  • Provides a central place to harden, both for the next level (L3) and the operational concerns raised in this thread. This in turn further reduces risk of compromise.
  • Similarly, provides a clear place to investigate when responding to a compromise.
  • Allows us to reduce the number of trusted systems over time (mentioned by @joshuagl). This also works for large companies: if you have 1000 engineers that each build on their workstations, you can be compromised if any of them are compromised. By concentrating trust into orders-of-magnitude fewer builders, you reduce the attack surface and the overall risk of compromise. You can also concentrate your effort.
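
As an illustrative aside on the last bullet, here is a minimal sketch of the arithmetic behind "concentrating trust". It assumes every builder is compromised independently and with the same probability, which is a simplification. The function name `p_any_compromised` and all of the numbers are hypothetical and do not come from the SLSA spec or this thread.

```python
def p_any_compromised(n_builders: int, p_each: float) -> float:
    """Probability that at least one of n builders is compromised.

    Assumes independent compromises with identical probability p_each,
    an illustrative simplification rather than a real threat model.
    """
    return 1.0 - (1.0 - p_each) ** n_builders


# Hypothetical per-machine compromise probability over some fixed period.
P_EACH = 0.001

# 1000 engineers each building on their own workstation.
workstations = p_any_compromised(1000, P_EACH)

# The same builds concentrated onto 10 build machines.
central = p_any_compromised(10, P_EACH)

print(f"1000 workstations: {workstations:.3f}")
print(f"10 builders:       {central:.3f}")
```

Under these assumptions the chance that some trusted builder is compromised falls from roughly 63% to roughly 1%. In practice, centralized builders can also be hardened, which would lower the per-machine probability further; the sketch deliberately holds it fixed.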

By the way, if the reasoning gets long, we could optionally hide it behind a <details> tag.

@steve-work-account

steve-work-account commented Sep 8, 2023

I was confused by the wording of "hosted" and asked in Slack.
After being directed to this issue and understanding what "hosted" means in this context, I think it might also be worth adding an entry to the Terminology page:

https://slsa.dev/spec/v1.0/terminology#build-model

That entry would give a clear definition of what is meant by "hosted", and a link to it could then be inserted at:
https://slsa.dev/spec/v1.0/levels#build-l2-hosted-build-platform

@chizou

chizou commented Dec 13, 2023

Can the examples be updated as well? Every example provided is a hosted service, which I think is adding to the current confusion and might continue to confuse even if you change the verbiage.
