Cannot upgrade from N-2 releases due to missing GPG key #749

Closed
openstacker opened this issue Feb 18, 2021 · 17 comments

@openstacker

Host system details

Provide the output of rpm-ostree status.

[root@k8s-v1-16-9-xjanjbsicidd-master-0 ~]# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● ostree://fedora:fedora/x86_64/coreos/stable
Version: 31.20200407.3.0 (2020-04-21T19:37:39Z)
Commit: 89e17cc21b6aa3bea8959d1e6957fda157168d57ba6805d8a36142184edc2901
GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4

Expected vs actual behavior

[root@k8s-v1-16-9-xjanjbsicidd-master-0 ~]# rpm-ostree deploy bc7745fe1b5b77385fd39595ed82c142483a904dd4dc855b538e6d18d4f1641f
Validating checksum 'bc7745fe1b5b77385fd39595ed82c142483a904dd4dc855b538e6d18d4f1641f'
error: Commit 20de1953c18bd432a8ed4e19b91c64978100dba7d1c4813f91f8cf4d4d2411b4: Signature made Wed Feb 3 18:16:59 2021 using RSA key ID 49FD77499570FF31
Can't check signature: public key not found

Expected: the deploy succeeds.

Steps to reproduce it

Create a Fedora CoreOS 31 server and try to upgrade to any version later than that.

Based on the error, I understand it's because the RPM GPG key for 33 is missing on the server created based on 31, but I don't think it should fail like this, because I'm not asking to upgrade to 33 directly.
bc7745fe1b5b77385fd39595ed82c142483a904dd4dc855b538e6d18d4f1641f is still in 31.

And BTW, is there any way I can manually add the 33 GPG key to an old server to avoid this kind of issue? Thanks.

@openstacker
Author

@dustymabe Hi Dusty, can you please shed some light on this? Thank you very much.

@openstacker
Author

I just added a new remote and specified the key for Fedora CoreOS 33 there, and the verification error is gone. But I got a new error, shown below:

[root@k8s-v1-16-9-xjanjbsicidd-master-0 remotes.d]# rpm-ostree deploy bc7745fe1b5b77385fd39595ed82c142483a904dd4dc855b538e6d18d4f1641f
Validating checksum 'bc7745fe1b5b77385fd39595ed82c142483a904dd4dc855b538e6d18d4f1641f'
2 metadata, 0 content objects fetched; 17 KiB transferred in 3 seconds
error: Checksum bc7745fe1b5b77385fd39595ed82c142483a904dd4dc855b538e6d18d4f1641f not found in fedora:fedora/x86_64/coreos/stable

@lucab
Contributor

lucab commented Feb 18, 2021

@openstacker thanks for the report, however this seems specific to Fedora CoreOS so I'm transferring the issue there.

In general, the FCOS version you are using is quite ancient and should go through several intermediate updates first, before reaching the current stable version. You shouldn't be performing updates manually, in the default setup the OS already takes care of periodic auto-updates through zincati.service.

For signing keys specifically, https://docs.fedoraproject.org/en-US/fedora-coreos/update-barrier-signing-keys/ describes how the mechanism works.

@lucab lucab transferred this issue from coreos/rpm-ostree Feb 18, 2021
@jlebon
Member

jlebon commented Feb 18, 2021

I just added a new remote and specified the key for Fedora CoreOS 33 there, and the verification error is gone. But I got a new error, shown below:

That error is because that commit has never actually been released on the stable branch. (It corresponds to 32.20200726.3.0, which was nixed because of coreos/fedora-coreos-streams#158 (comment)). Out of curiosity, how did you get that checksum?

For signing keys specifically, docs.fedoraproject.org/en-US/fedora-coreos/update-barrier-signing-keys describes how the mechanism works.

Hmm, actually I think this reveals a flaw in the design. I don't think even with Zincati it would have upgraded. The issue is that rpm-ostree wants to verify that the checksum provided lives on the same branch. To do that, it needs to pull the commit metadata objects starting from the tip and go down the tree until it finds the matching checksum. If the tip has moved far enough that the signing key rotated twice, it won't be able to verify it without the user manually importing it.

For disclosure, I'm responsible for that rpm-ostree behaviour, and while it gets in the way here, I still think it's the right thing to do as a default behaviour. We could maybe have a switch or something to override that which Zincati could use? Something like --skip-checksum-validation.
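The branch validation described above can be modeled with a small sketch. This is a toy model, not rpm-ostree's actual implementation: the `Commit` class, the `find_on_branch` function, and the key and checksum names are all hypothetical. It only illustrates why an unknown key at the tip stops the walk before the target commit is ever reached.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Commit:
    checksum: str
    parent: Optional[str]   # checksum of the parent commit, None at the root
    signing_key: str        # ID of the GPG key that signed this commit


class UnknownKeyError(Exception):
    """Raised when a commit is signed by a key the node does not trust."""


def find_on_branch(history: Dict[str, Commit], tip: str, target: str,
                   trusted_keys: Set[str]) -> bool:
    """Walk from the branch tip towards the root looking for `target`.

    Every commit visited must verify against `trusted_keys`, so an
    unknown key at the tip aborts the walk before `target` is reached.
    """
    current: Optional[str] = tip
    while current is not None:
        commit = history[current]
        if commit.signing_key not in trusted_keys:
            raise UnknownKeyError(
                f"commit {commit.checksum}: public key not found")
        if commit.checksum == target:
            return True
        current = commit.parent
    return False


# Toy history (hypothetical names): the tip is signed with a key that
# an old node does not have yet, while the target uses a known key.
history = {
    "c31": Commit("c31", None, "key-f31"),
    "c32": Commit("c32", "c31", "key-f32"),
    "c33": Commit("c33", "c32", "key-f33"),
}
```

With `trusted_keys={"key-f31", "key-f32"}`, asking for `"c32"` starting from tip `"c33"` raises `UnknownKeyError`, even though `"c32"` itself is signed with a trusted key. Adding `"key-f33"` to the set lets the walk succeed.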

@jlebon
Member

jlebon commented Feb 18, 2021

I don't think even with Zincati it would have upgraded.

Yeah indeed, booting an f31 FCOS, here are the Zincati logs:

Feb 18 16:01:18 ibm-p8-kvm-03-guest-02 zincati[899]: [INFO ] new release '32.20200615.3.0' selected, proceeding to stage it
Feb 18 16:01:21 ibm-p8-kvm-03-guest-02 zincati[899]: [ERROR] failed to stage deployment: rpm-ostree deploy failed:
Feb 18 16:01:21 ibm-p8-kvm-03-guest-02 zincati[899]:     error: Commit 20de1953c18bd432a8ed4e19b91c64978100dba7d1c4813f91f8cf4d4d2411b4: Signature made Wed Feb  3 18:16:59 2021 using RSA key ID 49FD77499570FF31
Feb 18 16:01:21 ibm-p8-kvm-03-guest-02 zincati[899]:     Can't check signature: public key not found
Feb 18 16:01:21 ibm-p8-kvm-03-guest-02 zincati[899]:

@jlebon
Member

jlebon commented Feb 18, 2021

We could maybe have a switch or something to override that which Zincati could use? Something like --skip-checksum-validation.

Hmm actually, nowadays we could just use OSTree ref bindings for this. It's not as strong as verifying that the commit is on the same branch (because ref bindings can be for multiple branches), but it's probably enough so we could drop that rpm-ostree behaviour. (And actually in the FCOS case, we would only have a single ref in the list anyway so it effectively is as strong.)
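A ref-binding check along these lines could look like the sketch below. It assumes the commit metadata has already been read into a plain Python mapping (in OSTree it is a GVariant dictionary) and that the bindings live under the `ostree.ref-binding` key. The function name `commit_bound_to_ref` is hypothetical.

```python
from typing import Any, Mapping

# Metadata key under which OSTree records the refs a commit was made for.
REF_BINDING_KEY = "ostree.ref-binding"


def commit_bound_to_ref(commit_metadata: Mapping[str, Any],
                        expected_ref: str) -> bool:
    """Return True if `expected_ref` is listed in the commit's ref bindings.

    A commit with no binding list, or with a malformed one, is rejected.
    """
    bindings = commit_metadata.get(REF_BINDING_KEY)
    if not isinstance(bindings, (list, tuple)):
        return False
    return expected_ref in bindings
```

For example, metadata of `{"ostree.ref-binding": ["fedora/x86_64/coreos/stable"]}` passes for `"fedora/x86_64/coreos/stable"` and fails for any other ref. As noted above, with a single ref in the list this is effectively as strong as the branch walk.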

@jlebon jlebon changed the title from "rpm-ostree deploy failed" to "Cannot upgrade from N-2 releases due to missing GPG key" Feb 18, 2021
jlebon added a commit to jlebon/fedora-coreos-tracker that referenced this issue Apr 29, 2021
This was well-intentioned but sadly it doesn't actually work in
practice:

coreos#749

Let's just skip doing that for now until we address it.
@dustymabe dustymabe added the meeting topics for meetings label Apr 29, 2021
@jlebon
Member

jlebon commented May 5, 2021

We discussed this in the community meeting today.

A few alternatives that came out were:

  1. add a flag to rpm-ostree deploy to have it skip branch validation and have Zincati use that
  2. same as 1, but have Zincati use it only in some detected situation (e.g. some flag in the update metadata, or based on the age of the node)
  3. same as 1, but sanity-check that the fetched commit is on the same stream as the one we expect by looking at the commit metadata
  4. keep the rpm-ostree validation, and just sign the OSTree commit with multiple GPG keys (this also helps with rpm-ostree startup delays because of GPG key loading #761); this inherently means there's a cutoff after which we don't support upgrading nodes

Personally, I'm leaning now more towards 3. Then, arbitrarily old nodes will still be able to update to the latest (or at least, we're not consciously breaking that path), but we still get some validation that the commit belongs to the same stream (which is what branch protection was trying to help with).

@jlebon jlebon removed the meeting topics for meetings label May 5, 2021
@dustymabe
Member

3. seems reasonable to me.

jlebon added a commit to jlebon/rpm-ostree that referenced this issue May 7, 2021
In Fedora CoreOS, updates are driven by Zincati and we thus completely
trust the information it gives us. The branch validation rpm-ostree does
is thus not necessary. It's also harmful in the case where the node is
extremely out of date because it may not be able to GPG verify the
commit at the tip of the branch (e.g. because the GPG key isn't yet in
the tree).

See: coreos/fedora-coreos-tracker#749
jlebon added a commit to jlebon/rpm-ostree that referenced this issue May 7, 2021
In Fedora CoreOS, updates are driven by Zincati and we thus completely
trust the information it gives us. The branch validation rpm-ostree does
is thus not necessary. It's also harmful in the case where the node is
extremely out of date because it may not be able to GPG verify the
commit at the tip of the branch (because the GPG key isn't yet in the
tree).

See: coreos/fedora-coreos-tracker#749
@jlebon
Member

jlebon commented May 7, 2021

rpm-ostree patch: coreos/rpm-ostree#2819

For the commit metadata validation, I think that logic makes the most sense in Zincati? I.e. Zincati would first fetch the commit metadata of the target commit (using the API or just shelling out to ostree pull --commit-metadata-only), check fedora-coreos.stream, and only if it matches carry on to rpm-ostree deploy revision=... --skip-branch-check.

@jlebon jlebon added the meeting topics for meetings label May 12, 2021
@jlebon
Member

jlebon commented May 12, 2021

I.e. Zincati would first fetch the commit metadata of the target commit (using the API or just shelling out to ostree pull --commit-metadata-only),

Discussed this with @lucab. We can just stage it as a locked deployment as usual and then we can inspect the metadata via rpm-ostree status --json, which Zincati already calls.
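As a rough illustration of that check, the sketch below parses the output of `rpm-ostree status --json` and compares the stream recorded on the staged deployment against the expected one. It is written in Python for brevity and is not Zincati's code. The JSON field names `deployments`, `staged`, and `base-commit-meta` are assumptions about the status layout, and both function names are hypothetical. Only the `fedora-coreos.stream` metadata key comes from the discussion above.

```python
import json
from typing import Optional


def staged_stream(status_json: str) -> Optional[str]:
    """Return the stream recorded in the staged deployment's commit metadata.

    Returns None if there is no staged deployment or no stream key.
    """
    status = json.loads(status_json)
    for deployment in status.get("deployments", []):
        if deployment.get("staged"):
            meta = deployment.get("base-commit-meta", {})
            return meta.get("fedora-coreos.stream")
    return None


def staged_matches_stream(status_json: str, expected_stream: str) -> bool:
    """True only if a staged deployment exists and belongs to `expected_stream`."""
    return staged_stream(status_json) == expected_stream
```

A caller would pass in the captured output of `rpm-ostree status --json` and refuse to finalize the update when `staged_matches_stream` returns `False`.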

@travier
Member

travier commented May 12, 2021

Another option is to import the GPG keys into the node via a trusted channel to allow rpm-ostree to verify commit signatures and update the node.

@dustymabe
Member

dustymabe commented May 18, 2021

WORKAROUND

If you find yourself in this position, a quick workaround is to grab the keys from releases you may be missing. Something like:

curl -L https://src.fedoraproject.org/rpms/fedora-repos/raw/rawhide/f/RPM-GPG-KEY-fedora-33-primary | sudo tee /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-33-primary
curl -L https://src.fedoraproject.org/rpms/fedora-repos/raw/rawhide/f/RPM-GPG-KEY-fedora-34-primary | sudo tee /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-34-primary

After this Zincati/rpm-ostree can continue to process updates.
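To see which key files a node is still missing before fetching them, a small helper like the following could be used. It is only a sketch: the function name `missing_key_files` is hypothetical, and it assumes the keys follow the `RPM-GPG-KEY-fedora-<release>-primary` naming and live in `/etc/pki/rpm-gpg`, as in the commands above. It checks for file presence only and does not validate the key contents.

```python
from pathlib import Path
from typing import Iterable, List


def missing_key_files(releases: Iterable[int],
                      key_dir: Path = Path("/etc/pki/rpm-gpg")) -> List[str]:
    """Return the expected key file names that are absent from `key_dir`."""
    missing = []
    for release in releases:
        name = f"RPM-GPG-KEY-fedora-{release}-primary"
        if not (key_dir / name).is_file():
            missing.append(name)
    return missing
```

For example, `missing_key_files([33, 34])` returns the names that still need to be downloaded on the current host.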

@jlebon
Member

jlebon commented May 19, 2021

Zincati part filed here: coreos/zincati#558

@dustymabe dustymabe removed the meeting topics for meetings label May 19, 2021
dustymabe added a commit to dustymabe/fedora-coreos-streams that referenced this issue May 25, 2021
This update barrier is pretty much useless because of
coreos/fedora-coreos-tracker#749, but we
decided to put them in place anyway just to keep following the process.

As far as the process goes. See past discussion on this topic:

- coreos/fedora-coreos-tracker#480 (comment)
- https://docs.fedoraproject.org/en-US/fedora-coreos/update-barrier-signing-keys/
@dustymabe
Member

@jlebon @bgilbert: Considering the direction we're going here should we change the rebase template back to telling people to put the update barriers in place? We changed it in #810

@jlebon
Member

jlebon commented May 25, 2021

@jlebon @bgilbert: Considering the direction we're going here should we change the rebase template back to telling people to put the update barriers in place? We changed it in #810

SGTM!

dustymabe added a commit to coreos/fedora-coreos-streams that referenced this issue May 26, 2021
This update barrier is pretty much useless because of
coreos/fedora-coreos-tracker#749, but we
decided to put them in place anyway just to keep following the process.

As far as the process goes. See past discussion on this topic:

- coreos/fedora-coreos-tracker#480 (comment)
- https://docs.fedoraproject.org/en-US/fedora-coreos/update-barrier-signing-keys/
dustymabe added a commit to dustymabe/fedora-coreos-tracker that referenced this issue May 26, 2021
We decided to continue to do this even though it's broken right now.
We have a plan to fix it in the future so let's leave the process in
place.

xref: coreos#749 (comment)
@dustymabe
Member

@jlebon @bgilbert: Considering the direction we're going here should we change the rebase template back to telling people to put the update barriers in place? We changed it in #810

SGTM!

#844

jlebon pushed a commit that referenced this issue May 26, 2021
We decided to continue to do this even though it's broken right now.
We have a plan to fix it in the future so let's leave the process in
place.

xref: #749 (comment)
@dustymabe
Member

I think we should probably close this since coreos/rpm-ostree#2819 and coreos/zincati#622 were implemented.
