Signing for release artifacts #187

Closed
bgilbert opened this issue May 24, 2019 · 43 comments
Labels
releng Related to Fedora Release Engineering team/input

Comments

@bgilbert
Contributor

bgilbert commented May 24, 2019

Set up signing for release artifacts.

Open questions:

  • Where/how should we integrate signing into our release pipeline?
  • What format should signatures take? They'll need to be verified automatically by coreos-installer (in a streaming fashion), and also some users will probably write their own verification tooling. CHECKSUM files aren't a great fit for this; it'd be nicer to just provide GPG detached signatures.
  • Does releng need to receive the entire blob to sign, or can we just send its hash?
  • Should we use the existing Fedora release signing keys, or our own keys? The former gives us key rotation for free, but we'd probably want to ensure we sign each release with the key corresponding to its Fedora major. That'd mean we'd need to synchronously switch keys as a rebase is promoted through the streams.

Predecessor ticket: #87
COSA discussion: coreos/coreos-assembler#268

@bgilbert bgilbert added the releng Related to Fedora Release Engineering team/input label May 24, 2019
@dustymabe dustymabe added the jira for syncing to jira label May 24, 2019
@dustymabe
Member

I can try to pick this one up if no one else is.

@dustymabe
Member

  • What format should signatures take?

@bgilbert - Let me try to understand what we want. Here are a few options:

  1. sign the artifact itself and deliver a detached signature (probably slow because we need to send the entire artifact over to the signing server):
    • fcos-XYZ.qemu.qcow
    • fcos-XYZ.qemu.qcow.sig
  2. sign a checksum of the artifact and deliver checksum + detached signature:
    • fcos-XYZ.qemu.qcow
    • fcos-XYZ.qemu.qcow.CHECKSUM
    • fcos-XYZ.qemu.qcow.CHECKSUM.sig
  3. deliver checksum file with inline signature of checksum file text:
    • fcos-XYZ.qemu.qcow
    • fcos-XYZ.qemu.qcow.CHECKSUM -> contains inlined ASCII signature

My understanding is that we want this for each artifact rather than for a collection of artifacts. The per-collection alternative would be:

  4. deliver checksum file with inline signature that contains checksums of all artifacts for build:
    • fcos-XYZ.qemu.qcow
    • fcos-XYZ.iso
    • fcos-XYZ.openstack.qcow
    • fcos-XYZ.CHECKSUM -> contains checksums of all artifacts + inlined ASCII signature.

Option 4 is closest to what the rest of Fedora is doing today.

@ajeddeloh
Contributor

+1 for a sig for each artifact. Option 4 is super annoying from a user perspective. If we can get option 1 to work with the fedora signing infra that'd be best in my opinion.

@dustymabe
Member

If we can get option 1 to work with the fedora signing infra that'd be best in my opinion.

But isn't that a bit of a waste on both the server and the client, i.e. computing over hundreds of MiB on both sides vs. a small amount of data?

@ajeddeloh
Contributor

Computers are fast and gpg is hard. GPG verification isn't terribly slow or anything. I'd prefer a simpler approach (for users) over something faster but easier to mess up. I'm assuming the .sig would be a detached signature so it would be a trivial amount to download.

@dustymabe dustymabe added the meeting topics for meetings label May 30, 2019
@bgilbert
Contributor Author

I have a pretty strong preference for option 1. All of the other options are more inconvenient to verify. In addition, the stream and release metadata should include content hashes (#98 (comment)) and the bucket won't allow directory listing, so in order for users to find the artifacts they'll already have the hashes anyway.

@dustymabe
Member

cc @jlebon @cgwalters @lucab @arithx

@cgwalters
Member

All of the other options are more inconvenient to verify.

How's that? If we provide a checksum file it's literally just sha256sum -c --ignore-missing fcos-xyz.checksum plus gpg --verify. (Or replace sha256sum with sha512sum probably)
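For tooling that cannot shell out, the checksum half of that command can be approximated in a few lines. The following is a minimal sketch, assuming coreutils-style lines of the form "<hex digest>  <filename>"; the helper names are hypothetical, and it does not replace the gpg --verify step, which has to confirm the checksum file itself before its contents are trusted.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large image never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_sha256_hex(text):
    """True if text looks like a lowercase hex SHA-256 digest."""
    return len(text) == 64 and all(c in "0123456789abcdef" for c in text)


def check_listed_files(checksum_path):
    """Return {filename: matches} for listed files that exist locally.

    Assumption: lines look like "<hex digest>  <filename>". Lines that do
    not fit (for example PGP armor) are skipped, and so are listed files
    that are absent, which mirrors --ignore-missing.
    """
    base = os.path.dirname(os.path.abspath(checksum_path))
    results = {}
    with open(checksum_path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(None, 1)
            if len(parts) != 2 or not is_sha256_hex(parts[0].lower()):
                continue
            name = parts[1].lstrip("*")  # "*" marks binary mode in coreutils
            target = os.path.join(base, name)
            if not os.path.exists(target):
                continue
            results[name] = sha256_of(target) == parts[0].lower()
    return results
```

A caller would treat any False value as a failed download and, like sha256sum, would probably also want to treat an empty result as an error.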

@cgwalters
Member

The Fedora signing server is definitely slow with large artifacts, it's a constant problem for RPMs, and I doubt it's going to be improved anytime soon as it's ultimately backed by hardware, not amenable to scaling, etc. Why force some slow hardware token to compute RSA over 600MB when you could just give it < 1k?

@cgwalters
Member

@cgwalters
Member

(The checksum file approach would require us of course to generate one from our meta.json but that's pretty easy)
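As a rough illustration of how small that generation step could be, here is a sketch that turns build metadata into checksum-file text. The metadata layout (an "images" mapping whose entries carry "path" and "sha256") is an assumption made for this example, not something confirmed in this thread, and the digests are placeholders.

```python
def checksum_text(meta):
    """Render coreutils-style checksum lines from build metadata.

    Assumption: meta["images"] maps an image name to a dict with
    "path" and "sha256" keys. Lines are sorted by path so the output
    is stable across runs.
    """
    images = sorted(meta.get("images", {}).values(), key=lambda i: i["path"])
    return "".join(f"{i['sha256']}  {i['path']}\n" for i in images)


# Hypothetical metadata with placeholder digests.
example_meta = {
    "images": {
        "qemu": {"path": "fcos-XYZ.qemu.qcow", "sha256": "a" * 64},
        "iso": {"path": "fcos-XYZ.iso", "sha256": "b" * 64},
    }
}
print(checksum_text(example_meta), end="")
```

The resulting text is what would then be handed to the signer, which keeps the signed payload well under a kilobyte regardless of image size.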

@ajeddeloh
Contributor

How's that? If we provide a checksum file it's literally just sha256sum -c --ignore-missing fcos-xyz.checksum plus gpg --verify. (Or replace sha256sum with sha512sum probably)

That's... not great. People may not know about -c --ignore-missing and either verify by looking at hashes manually (garbage) or skip verification. I don't have those args memorized. I'd need to spend time looking through the man pages. If the barrier to entry is high enough, people will just skip verification, which is not good.

Why force some slow hardware token to compute RSA over 600MB when you could just give it < 1k?

Because I'd rather have us hit that pain than our users, especially if it just means taking a little longer on releases.

The signed checksum file approach is nearly a "rough consensus standard"

I'd argue it's not very good. Plain old gpg sigs are a rough standard on other (smaller) things we ship (etcd, ignition-validate, etc.) and it'd be nice to have a consistent way to verify things.

@bgilbert
Contributor Author

100% agree with @ajeddeloh.

Why force some slow hardware token to compute RSA over 600MB when you could just give it < 1k?

It's computing RSA over a hash, in any event, and the hash should be computed off-token. I won't ask too many questions about that particular implementation, though. 😕

Ideally we'd be able to compute the hash on our end and submit that, though I don't expect it. (Some PGP implementations can accept a hash directly; gpg won't.)

@cgwalters
Member

I don't have those args memorized. I'd need to spend time looking through the man pages.

But those 4 distribution links already document how to do this and work in very similar ways.
(Interestingly, QubesOS does both approaches)

And let's continue:

OK, apparently Arch Linux does direct signatures.

The OpenBSD install guide explicitly mentions using -c --ignore-missing BTW too, although they invented their own not-GPG thing.

@cgwalters
Member

Ideally we'd be able to compute the hash on our end and submit that, though I don't expect it. (Some PGP implementations can accept a hash directly; gpg won't.)

Hum, but if the Fedora signing server is using a hardware token, maybe it supports providing the hash - if this is true actually it would also help for signing RPMs.

But I can say definitively that asking the current server to sign several gigabytes (particularly if done at any kind of frequency) is asking for a lot of pain. And the clear "rough consensus" here is to use signed checksum files.

@bgilbert
Contributor Author

But those 4 distribution links already document how to do this and work in very similar ways.

A bad interface is a bad interface, no matter how well documented or widely used. If we're telling people to type --ignore-missing we're doing something wrong. Every extra step adds friction that causes people to not bother verifying their download.

That also doesn't get into other use cases, such as 1) streaming verification, as needed by coreos-installer or user-built equivalents, and 2) verification from Python/Go/Rust without shelling out.

Aside from "we'd have to wait longer for signing", which is a tooling problem we shouldn't inflict on users, what's the actual downside to simplifying?

@cgwalters
Member

what's the actual downside to simplifying?

Inability to distinguish between "item corrupted in transit/on disk" versus "item has a bad signature".

@bgilbert
Contributor Author

There will be hashes in the release and stream metadata.

@cgwalters
Member

There will be hashes in the release and stream metadata.

That's true. On the other hand those are both our own bespoke/unique-to-us formats without a pre-baked command shipped in every operating system (and container base image, etc.) to use to verify.
(On the other other hand, using JSON for this is sane and easy to parse, and more importantly extensible, so the decision to write those makes sense I think)

@bgilbert
Contributor Author

We will intentionally not have directory/bucket listings enabled for our release bucket, nor predictable URLs. If a user has obtained an artifact, they've gotten it one of two ways:

  • Programmatically, via the stream metadata (or release metadata in corner cases), in which case the hash is right there,
  • Interactively, via a download page that parses the stream metadata client-side.

In the latter case, the download page can show the hash, or even synthesize a CHECKSUMS file for download.

@cgwalters
Member

You've put some thought into this and I see your perspective. Having a "smart" website would certainly be nice; not having a "dumb directory listing" is...an interesting call. There's some unique infrastructure here which on one hand is valuable, but on the other hand also unique.

I think I've established that most extant distributions are doing the checksums files, and a reasonable percentage of people who have used any of those other operating systems will find it completely familiar.

But eh, I've no opposition to it.

I think the pivotal thing really becomes the pain of using the Fedora GPG signer - and it's reasonable to consider that a bug to be fixed, but like I said I doubt it will be. I suspect other hardware signers are going to hit similar limitations (isn't fero also flaky?). Kind of the point of hardware signing is that it doesn't trivially scale 😉

@bgilbert
Contributor Author

For context: the primary reason to omit the directory listing is to avoid having users or tooling implicitly grab the latest release (which circumvents our ability to control or stop rollouts) and to discourage them from grabbing stale releases. (A secondary reason is to avoid the Container Linux problem that every new artifact instantly became a compatibility constraint.) We want artifact access to always go through the stream metadata or something derived from the stream metadata. You're right that we're incurring tradeoffs in the process.

I think we should at least have a conversation with Fedora infra to see what's possible. If direct artifact signing is just completely infeasible, clearly we'll have to do something else.

Re fero: it has an unfixed bug that prevents it from being used to sign Go binaries, but it's been completely reliable for Container Linux signing.

@ajeddeloh
Contributor

To clarify: fero's problem is that it can't sign large things. The CL signing process just signs a hash (I'm not actually sure of the details there) and small JSON files, so it hasn't been an issue there.

@cgwalters
Member

Sorry I had only partially followed the design for the metadata; I apologize for needing to reiterate here the rationale. I understand the goals.

However, at the same time we're also trying to get the RHCOS content showing up at https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.1/4.1.0-rc.4/ signed...and that is definitely going to use the same "signed checksum file" approach... And we're going to be training admins/tools to do that. Maybe in some future if this works out for FCOS we could try to adapt it for RHCOS too? I dunno...

To clarify: fero's problem is it can't sign large things.

Which is the same issue that the Fedora signer has...this should be some sort of...sign? 😉

Anyways I withdraw my argument around CHECKSUM files, happy to proceed with the stream stuff. But that still leaves the issue of signing directly.

I also wanted though to follow up to:

That also doesn't get into other use cases, such as 1) streaming verification, as needed by coreos-installer or user-built equivalents, and 2) verification from Python/Go/Rust without shelling out.

RE 1, tooling can easily verify the checksum in a streaming fashion too right? Just pipe it to a checksum at the same time as downloading, this is quite standard. It's what libostree does downloading objects, etc.

As far as 2...I would strongly suspect that a lot of people would shell out to gpg2 --verify. I think there are implementations of OpenPGP for Go (and Rust), but that's a huge dependency to carry.
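The streaming checksum idea from point 1 can be sketched as follows. This is only an illustration under stated assumptions: the function and exception names are hypothetical, the chunks could come from any download loop, and the expected digest is assumed to come from something already trusted (for example signed metadata). It checks a hash, not a GPG signature.

```python
import hashlib
import io


class ChecksumMismatch(Exception):
    """Raised when the streamed data does not match the expected digest."""


def copy_verified(chunks, out, expected_sha256):
    """Write chunks to out while hashing them; verify once the stream ends.

    Returns the number of bytes written. Data reaches `out` before the
    check completes, so on ChecksumMismatch the caller must discard it.
    """
    digest = hashlib.sha256()
    total = 0
    for chunk in chunks:
        digest.update(chunk)
        out.write(chunk)
        total += len(chunk)
    if digest.hexdigest() != expected_sha256.lower():
        raise ChecksumMismatch("downloaded data does not match expected SHA-256")
    return total


# Usage with an in-memory stand-in for a download.
payload = [b"fake ", b"image ", b"bytes"]
wanted = hashlib.sha256(b"".join(payload)).hexdigest()
sink = io.BytesIO()
written = copy_verified(payload, sink, wanted)
```

The caveat in the docstring matters for installers writing straight to a disk: the result is only known at the end of the stream, so a failed check has to be followed by wiping or invalidating what was written.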

@cgwalters
Member

(The gpg dependency of course is why OpenBSD decided to go their own way, see also ostreedev/ostree#1233 )

@dustymabe
Member

I think we should at least have a conversation with Fedora infra to see what's possible. If direct artifact signing is just completely infeasible, clearly we'll have to do something else.

Is the above ^^ where we are at this point? Seems like we've got some voting for 1. and some voting for 4. Should I find out what the limitations are for 1. from Fedora infra and then we'll continue the discussion?

@bgilbert
Contributor Author

bgilbert commented Jun 4, 2019

@dustymabe Yup, I think that's the next step.

@dustymabe
Member

dustymabe commented Jun 5, 2019

I have added this as a topic for the fedora infra meeting tomorrow.

The meeting is on 2019-06-06 from 15:00:00 to 16:00:00 UTC in #fedora-meeting-1: https://apps.fedoraproject.org/calendar/infrastructure/#m22

@bgilbert bgilbert removed the meeting topics for meetings label Jun 5, 2019
@jlebon
Member

jlebon commented Jun 6, 2019

Is there a summary of the discussions from the FCOS community meeting and releng meeting? Or better, a link to the design doc :)

@dustymabe
Member

Is there a summary of the discussions from the FCOS community meeting and releng meeting? Or better, a link to the design doc :)

Not yet. I'm working on something for it.

@dustymabe
Member

In the meeting yesterday the infra team claimed that option 1 was feasible with the Fedora signing server setup. I have created a Project Proposal for the infrastructure team so that we can get this work designed and scheduled.

I have also created a few follow on tickets because discussing the design has identified more things that need to be done:

@dustymabe dustymabe self-assigned this Jun 13, 2019
@dustymabe
Member

marking as blocked while we wait on feedback on the Project Proposal

@dustymabe
Member

I've added the Project Proposal to the meeting agenda for the fedora infra meeting today.

@dustymabe
Member

Discussed in the infra meeting. The proposed project is unlikely to get completed in time for the first preview release. A contingency plan right now is to collaborate with the infra team and get artifacts signed manually.

See https://pagure.io/fedora-infrastructure/issue/7884#comment-579900

@dustymabe dustymabe removed the jira for syncing to jira label Sep 5, 2019
@dustymabe dustymabe assigned jlebon and unassigned dustymabe Sep 27, 2019
@jlebon
Member

jlebon commented Oct 9, 2019

@dustymabe I think #199 is separate from this ticket, right?

@jlebon
Member

jlebon commented Oct 9, 2019

Should we use the existing Fedora release signing keys, or our own keys? The former gives us key rotation for free, but we'd probably want to ensure we sign each release with the key corresponding to its Fedora major. That'd mean we'd need to synchronously switch keys as a rebase is promoted through the streams.

Just to record a decision here about this that was made OOB as well as rediscussed in the latest community meeting: we want FCOS to use the Fedora release signing keys.

We could make bumping needed configs part of the SOP for transitioning across major versions. Note also that the key for the N+1 release is shipped in the N release (for obvious reasons). Additionally, the OSTree remote we ship today accepts signing with any key in /etc/pki/rpm-gpg (something which I'm actually not a fan of, but works in our favour in this case).

Other things to note:

  • We could teach RoboSignatory about the FCOS versioning scheme and make it use the right key from that.
  • The SOP could also include instructions to add an update barrier so that very old boot images can reach the latest release securely.

@jlebon
Member

jlebon commented Oct 9, 2019

Note also BTW that signing will be done in the pipeline itself via fedora-messaging, so we can trivially verify at build time that the right key was used.

@dustymabe
Member

@dustymabe I think #199 is separate from this ticket, right?

At this point I think the answer is yes. While we do still want the commits imported, that work is not required for this ticket to be considered complete.

When I wrote that I think we hadn't worked out all of the details and maybe I thought we needed to import the commit into the infra before we could sign it, thus it would be needed for this ticket.

@dustymabe
Member

dustymabe commented Oct 9, 2019

  • We could teach RoboSignatory about the FCOS versioning scheme and make it use the right key from that.

That seems like a nice simple answer. Considering that we've done some work on integrating our signing into RoboSignatory and we're a bit more familiar with it, I think this makes sense.

Note also BTW that signing will be done in the pipeline itself via fedora-messaging, so we can trivially verify at build time that the right key was used.

+1 for this too. On the robosig side we "detect fedora major release from version and use the corresponding key" and on the pipeline side we verify it was signed with the key we expected it to be signed by. That would give us confidence on both sides.
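The two-sided check described above might look roughly like this. It is a sketch under assumptions: the version string is assumed to begin with the Fedora major (for example "30.20191014.0"), the key table and its identifiers are placeholders rather than real Fedora key fingerprints, and none of these function names come from RoboSignatory or the pipeline.

```python
def fedora_major(version):
    """Extract the Fedora major from an FCOS-style version string.

    Assumption: the first dot-separated component is the Fedora major.
    """
    major, _, rest = version.partition(".")
    if not major.isdigit() or not rest:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(major)


def expected_key(version, keys_by_major):
    """Signer side: pick the key that matches the release's Fedora major."""
    major = fedora_major(version)
    if major not in keys_by_major:
        raise LookupError(f"no signing key configured for Fedora {major}")
    return keys_by_major[major]


def signed_with_expected_key(version, signer_key, keys_by_major):
    """Pipeline side: confirm the reported signer is the key we expected."""
    return signer_key == expected_key(version, keys_by_major)


# Placeholder identifiers, not real key fingerprints.
KEYS = {30: "PLACEHOLDER-KEY-F30", 31: "PLACEHOLDER-KEY-F31"}
```

Keeping the lookup in one function means a major-version transition only needs a new table entry, which fits the idea of making the key bump part of the SOP.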

@jlebon
Member

jlebon commented Oct 16, 2019

I think we can close this now. Will open a follow-up ticket for checking at build time that the OSTree is signed with the correct key, and potentially making RoboSignatory smarter.

@dustymabe
Member

thanks @jlebon
