Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support using comments to select parts to encrypt #974

Closed
wants to merge 1 commit into from

Conversation

mitar
Copy link
Contributor

@mitar mitar commented Dec 19, 2021

This PR adds support to annotate comments with a string (e.g., sops:enc) which can then be matched with a regex. If it matches, the corresponding value (the one which follows the comment) is encrypted while other values are not. (There is also the opposite regex available, to select those values which should not be encrypted.)

This enables the YAML file to have the same structure encrypted and decrypted, without having to add suffixes or manage complex regexes to match keys. See #543 for more discussion.

cmd/sops/main.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
config/config_test.go Outdated Show resolved Hide resolved
@mitar
Copy link
Contributor Author

mitar commented Mar 4, 2022

Rebased this as well.

@mitar
Copy link
Contributor Author

mitar commented Sep 30, 2022

Rebased to latest develop.

@mitar mitar force-pushed the encrypted-comment-regex branch 2 times, most recently from da0ec2e to 969b8df Compare July 17, 2023 17:21
@mitar
Copy link
Contributor Author

mitar commented Jul 17, 2023

I rebased to latest main.

@Gui13
Copy link

Gui13 commented Aug 16, 2023

Hey guys, we are interested in the merging of this PR, it solves a long-standing issue of the --encrypted-suffix which requires us to perform sed -i 's/encrypted_suffix//g' on all our files after sops decryption.

The linked issue, #543, is 4 years old, and this change elegantly solves the issue.

Is there something I can do to help on this?

@hiddeco hiddeco added this to the v3.9.0 milestone Aug 16, 2023
@hiddeco
Copy link
Member

hiddeco commented Aug 16, 2023

This did not make the cut for v3.8.0, as this version will contain many changes already (especially in the area of rewriting all key source implementations to their latest API clients). Making it risky to add much more on top of it. It is however scheduled to be looked at for v3.9.0.

sops.go Outdated Show resolved Hide resolved
sops.go Outdated Show resolved Hide resolved
sops.go Show resolved Hide resolved
@mitar
Copy link
Contributor Author

mitar commented Sep 22, 2023

This PR now depends on #1300.

@mitar
Copy link
Contributor Author

mitar commented Sep 22, 2023

@felixfontein: I addressed all review comments. I moved unrelated changes to a separate commit which is also part of #1300, so once that is merged it will not be in this PR anymore.

I also rebased to the current main branch.

@mitar
Copy link
Contributor Author

mitar commented Sep 25, 2023

I rebased after #1300 was merged.

Signed-off-by: Mitar <mitar.git@tnode.com>
Copy link
Contributor

@felixfontein felixfontein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a problem: when I store this as x.yml

# x
x: x

and run

sops --encrypt --unencrypted-comment-regex ENC x.yml | sops --decrypt --input-type yaml --output-type yaml /dev/stdin

I get

MAC mismatch. File has A4ABD4448C49562D828115D13A1FCCEA927F52B4D5459297F8B43E42DA89238BC13626E43DCB38DDB082488927EC904FB42057443983E88585179D50551AFE62, computed 9D5B4CB3A99652A79C6149B4C1B9F17CE15E5BBB5E13D2054610BDE0C7BE57B58CAEE9B503F432AF0F27F857D26D549494D2D86CF3E68F3DA78554089C86CD8F

This is caused because ENC matches the encrypted comment.

I'm not really sure what to do here. Potentially shouldBeEncrypted should know whether it is decrypting, and if it is, and the regex for the comment matches, it should first check whether the value looks like an encrypted string (https://github.com/getsops/sops/blob/main/aes/cipher.go#L186) before accepting the regex match. And at the same time, when it encounters such comments (that match the regex) during encryption, it should reject the file. Does that sound reasonable? Do you have another idea?


Conversely, you can opt in to only left certain keys without encrypting by using the
``--unencrypted-comment-regex`` option, which will leave the values and comments
unencrypted when they have a preeceding comment that matches the supplied regular expression.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment can also be on the same line, like foo: bar # ENC. (Which is the case because that comment is moved before x: y.)

@mitar
Copy link
Contributor Author

mitar commented Sep 25, 2023

Thank you for the review.

This is caused because ENC matches the encrypted comment.

Nice catch. I didn't really expect for people to use such simple regexes. Personally I use sops:enc which does not seem to have the issue you are describing (it cannot appear in encoded string).

Maybe we should just document this issue and suggest a regex (e.g., sops:enc)? So documentation could be something like: "do not pick a regex which can match encrypted values" (with link to the format of encrypted values)?

But it is not really nice for the security tool like this one to be able to be misconfigured.

Potentially shouldBeEncrypted should know whether it is decrypting, and if it is, and the regex for the comment matches, it should first check whether the value looks like an encrypted string.

I think we should do a check like:

  • Try to match the comment with encryptedValueRegexp (matching the structure of encrypted values) and with given regexp.
  • If only one of them match, good, we know what to do.
  • If both match, then we transform the comment by removing content matched by the encryptedValueRegexp and try to match given regexp again.
  • If it still matches, good, we know what to do.
  • If it does not match, we abort and complain that the regexp is too broad.

I think this is slightly better in sense that it informs the user about misconfiguration and that they should fix the regexp.

And at the same time, when it encounters such comments (that match the regex) during encryption, it should reject the file.

So during encryption, every time we encrypt any comment, we try to match the result with the given regexp, and if it does match the regexp, we abort and complain that the regexp is too broad. If I understand it correctly, I think this is fine.

So we should just make sure to help the user pick a good regexp by detecting too broad (or should we way "too simple"?) regexp and complain in such case.

@mitar
Copy link
Contributor Author

mitar commented Oct 11, 2023

@felixfontein Do you agree with my proposal above?

@felixfontein
Copy link
Contributor

Hmm, reading all this again after some time mainly shows me that this is something we have to be very, very careful with.

One other solution that came to my mind when looking at this again: during encryption, once a comment is encrypted, check whether the encrypted comment matches UnencryptedCommentRegex or EncryptedCommentRegex (when provided). If any of them matches, reject the whole file.

@mitar
Copy link
Contributor Author

mitar commented Nov 7, 2023

One other solution that came to my mind when looking at this again: during encryption, once a comment is encrypted, check whether the encrypted comment matches UnencryptedCommentRegex or EncryptedCommentRegex (when provided). If any of them matches, reject the whole file.

Hm, isn't this the same as we have been discussing already? I wrote:

So during encryption, every time we encrypt any comment, we try to match the result with the given regexp, and if it does match the regexp, we abort and complain that the regexp is too broad. If I understand it correctly, I think this is fine.

So I think this is great. We seems to be in agreement what to do when encrypting. Just check if the result matches the regexp and abort if it does.

But I think the question is what to do when decrypting. So I can imagine a scenario where you encrypted only few lines with a comment like sops:enc. Great. But then in the encrypted file you decide to change sops:enc to enc comment. And then you try to decrypt that with an updated CLI argument. What should happen? This is why I proposed:

I think we should do a check like:

  • Try to match the comment with encryptedValueRegexp (matching the structure of encrypted values) and with given regexp.
  • If only one of them match, good, we know what to do.
  • If both match, then we transform the comment by removing content matched by the encryptedValueRegexp and try to match given regexp again.
  • If it still matches, good, we know what to do.
  • If it does not match, we abort and complain that the regexp is too broad.

I think this is slightly better in sense that it informs the user about misconfiguration and that they should fix the regexp.

Maybe a simpler way would be to store (and MAC) used regexp to encrypt into config section of the file. And when decrypting only use the regexp from the config section. So if you want to change the regexp/comment, you have to decrypt and re-encrypt? In general it might be nice user experience to store all those CLI config switches into config section so that you do not have to specify them again when decrypting (and match them exactly, and know what was used when encrypting).

What do you think?

@felixfontein
Copy link
Contributor

One other solution that came to my mind when looking at this again: during encryption, once a comment is encrypted, check whether the encrypted comment matches UnencryptedCommentRegex or EncryptedCommentRegex (when provided). If any of them matches, reject the whole file.

Hm, isn't this the same as we have been discussing already? I wrote:

So during encryption, every time we encrypt any comment, we try to match the result with the given regexp, and if it does match the regexp, we abort and complain that the regexp is too broad. If I understand it correctly, I think this is fine.

So I think this is great. We seems to be in agreement what to do when encrypting. Just check if the result matches the regexp and abort if it does.

Yes, this part we agree upon.

But I think the question is what to do when decrypting. So I can imagine a scenario where you encrypted only few lines with a comment like sops:enc. Great. But then in the encrypted file you decide to change sops:enc to enc comment. And then you try to decrypt that with an updated CLI argument. What should happen?

IMO: if you modify the encrypted file so it won't decrypt anymore, it's your own fault if it fails.

This is why I proposed:

I think we should do a check like:

  • Try to match the comment with encryptedValueRegexp (matching the structure of encrypted values) and with given regexp.
  • If only one of them match, good, we know what to do.
  • If both match,

In case both match, I would simply error out. The user did something wrong (assuming we don't have a bug) and they need to fix it manually. TBH I would simply check the encryptedValueRegexp regular expression, and if it matches, assume that the comment is encrypted (and not even check the other regexp).

then we transform the comment by removing content matched by the encryptedValueRegexp and try to match given regexp again.

  • If it still matches, good, we know what to do.
  • If it does not match, we abort and complain that the regexp is too broad.

I think this is slightly better in sense that it informs the user about misconfiguration and that they should fix the regexp.

I would avoid that. It makes decryption less efficient (two regular expressions to test instead of potentially only one per comment), and increases complexity for a situation that should not arise if users do not modify sops encrypted files by hand.

Maybe a simpler way would be to store (and MAC) used regexp to encrypt into config section of the file.

Right now the MAC only checks the message, and not the metadata. I would avoid extending it. We could add another MAC which only covers metadata, but also that is something that should not happen in this PR.

And when decrypting only use the regexp from the config section.

That's what should happen anyway.

So if you want to change the regexp/comment, you have to decrypt and re-encrypt? In general it might be nice user experience to store all those CLI config switches into config section so that you do not have to specify them again when decrypting (and match them exactly, and know what was used when encrypting).

If you have to specify any switch again when decrypting, it is a bug that needs to be fixed ASAP. The only options that should affect decryption are general things like --output, --output-type, --ignore-mac.

@felixfontein
Copy link
Contributor

I rebased this PR to do some (unrelated) work on top of it; the resulting commit is contained in #1387.

@mitar
Copy link
Contributor Author

mitar commented Dec 26, 2023

Thanks. I will look into this again after holidays.

@mitar
Copy link
Contributor Author

mitar commented Dec 26, 2023

BTW, if you are currently working on this and you are mentally in this logic and willing, feel free to wrap up with this PR yourself. If it is easier for you.

@felixfontein
Copy link
Contributor

Not right now, but I might try to invest more time in this during the next days.

@felixfontein
Copy link
Contributor

I'll work on it in #1392.

@mitar
Copy link
Contributor Author

mitar commented Jan 4, 2024

This is continued in #1392.

@mitar mitar closed this Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants