Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't shell out to GPG on import #437

Merged
merged 5 commits into from
Oct 19, 2022

Conversation

znewman01
Copy link
Contributor

Fixes: #428

Description of the changes being introduced by the pull request:

I recommend that you read this PR commit-by-commit.

This PR does the following:

  1. Adds functions to securesystemslib.gpg.constants to replace the constants that depended on shelling out. (Now the file contains more than constants, oops. They can't go in util, though, because that would create a circular dependency.)

  2. Replaces the constants with lazy calls to the underlying functions. This is a huge hack.

  3. Use the function calls instead of the constants internally.

    If you don't want to make breaking changes, we should stop here.

  4. Remove the GPG constants. This is technically a breaking change (but see the commit message for justification about why it's probably okay).

  5. Catch timeouts, as originally proposed in Checking for gpg availability sometimes times out #428.

Please verify and check that the pull request fulfils the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

This way, we (eventually) won't shell out while importing this library.

Signed-off-by: Zachary Newman <zjn@chainguard.dev>
See secure-systems-lab#428.

This commit is over-complicated, but (mostly) not breaking (you can continue to
use `HAVE_GPG` and `GPG_COMMAND`).

In this commit:

- Add cacheing (`functools.lru_cache`) to `is_available_gnupg`.
- Add `gpg_command()` to replace `GPG_COMMAND`.
- Add `have_gpg()` to replace `HAVE_GPG`.
- Replace `GPG_COMMAND` and `HAVE_GPG` with a `lazy.wrap_thunk()` wrapper for
  their corresponding functions.
  - This wrapper lazily runs the underlying thunk and passes
    through *everything* to the result of calling it.
  - This is a terrible hack and I'm sorry.
  - Some things still break, like `StringIO`.
- Add a test that imports `securesystemslib.constants.gpg` with a busted
  `$GNUPG` to make sure that importing the library doesn't try to shell
  out. This failed before this change.

Signed-off-by: Zachary Newman <zjn@chainguard.dev>
Signed-off-by: Zachary Newman <zjn@chainguard.dev>
After the previous commit, they're not used *inside* securesystemslib.

Outside, we can check [sourcegraph]. This will break a few folks, notably
[in-toto] (just tests) and [DataDog/integrations-core]. But it's an easy enough
fix.

[sourcegraph]: https://sourcegraph.com/search?q=context:global+lang:python+securesystemslib+AND+%28HAVE_GPG+OR+GPG_COMMAND+OR+GPG_VERSION_COMMAND+OR+GPG_SIGN_COMMAND+OR+GPG_EXPORT_PUBKEY_COMMAND%29&patternType=standard
[in-toto]: https://github.com/in-toto/in-toto/blob/c1a5e6b7468ccc74d7e79bc8bec2caf96ddb31f1/tests/test_verifylib.py#L58
[DataDog/integrations-core]: https://github.com/DataDog/integrations-core/blob/0e20752603f0d43db44090c8777d4ea69ca7111a/datadog_checks_dev/datadog_checks/dev/tooling/signing.py#L13

Signed-off-by: Zachary Newman <zjn@chainguard.dev>
Fixes secure-systems-lab#428. Okay to do this because we will fail tests if GPG is unexpectedly
unavailable (secure-systems-lab#434).

Signed-off-by: Zachary Newman <zjn@chainguard.dev>
@jku
Copy link
Collaborator

jku commented Oct 18, 2022

You really went the extra mile to keep the constants backwards compatible 😮 (until the 4th commit) ... I'm still reporting you to the Python police, but I admit that's impressive.

I'll have another look before clicking approve but I think this will work (and hopefully the constants can be removed a bit later).

@jku
Copy link
Collaborator

jku commented Oct 18, 2022

I'll have another look before clicking approve but I think this will work (and hopefully the constants can be removed a bit later).

oh, forgot to say: I am not making a statement about when the API breaking 4th commit should be included: I'll leave this to some Real Maintainers. Just saying this seems good to me with and without the last commits.

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much appreciated fix and impressive commit story, @znewman01!

I am for merging the full PR. Fixing up relevant in-toto tests should be no problem and I can do a coordinated release. cc @SantiagoTorres, @adityasaky

DataDog should also be fine, because AFAICS client code is not affected. But let's check:
@trishankatdatadog, do you see any issues with a breaking securesystemslib release that replaces GPG_COMMAND used in datadog_checks_dev/.../signing.py with gpg_command()?

import logging
import os

from securesystemslib import process

log = logging.getLogger(__name__)


def is_available_gnupg(gnupg):
@functools.lru_cache(maxsize=3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask why maxsize=3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to put a maxsize for py3.7, which is in the tox matrix. Otherwise would use the default.

I picked 3 because we should only call it three times under normal circumstances (gpg, gpg2, $GNUPG). That's perhaps a little bit too clever but I had a hard time justifying other numbers.

def __bool__(self):
return bool(lazy())

return Wrapper()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very smart! Took me a while to understand what this is doing. I was a bit confused by the name of the nested function. Do I understand correctly that directly calling thunk in the Wrapper dunder-methods would be enough for lazy evaluation, but calling the nested lazy makes so that thunk is called only once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that directly calling thunk in the Wrapper dunder-methods would be enough for lazy evaluation, but calling the nested lazy makes so that thunk is called only once?

Yes, exactly.

setenv =
GNUPG = false
commands =
python -c "import securesystemslib.gpg.constants"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test that imports securesystemslib.constants.gpg with a busted
$GNUPG to make sure that importing the library doesn't try to shell
out. This failed before this change.

This failed before occasionally (e.g. on timeout), right? I don't think we need this in addition to [testenv:py38-no-gpg].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@znewman01, please remove unless necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It failed every time before, not occasionally.

Note that I set GNUPG = false in this environment, which will give an error. That's different from unset GNUPG, which will find no GPG.

I don't believe that py38-no-gpg tests the same thing. This test checks "if we shell out when importing gpg.constants we fail"; py38-no-gpg tests "if we have no GPG available, tests still pass."

It might be clearer to replace false with a script like touch /tmp/oops_i_ran and check that that file wasn't created after import but that felt a little too clever.

As to the question—is this necessary? I don't know; are any tests necessary? Without it, we're not really checking that the issue is fixed—I could revert the rest of the PR and the existing tests would still pass. Up to the maintainers whether you want a test for the fix or not.

@trishankatdatadog
Copy link
Contributor

DataDog should also be fine, because AFAICS client code is not affected. But let's check:
@trishankatdatadog, do you see any issues with a breaking securesystemslib release that replaces GPG_COMMAND used in datadog_checks_dev/.../signing.py with gpg_command()?

@ofek

@ofek
Copy link
Contributor

ofek commented Oct 19, 2022

Nope, we'll update when it happens 👍

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We got a 👍 from DataDog. Please address inline comment, and we are good to go.

setenv =
GNUPG = false
commands =
python -c "import securesystemslib.gpg.constants"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@znewman01, please remove unless necessary.

@lukpueh lukpueh merged commit fb65dfe into secure-systems-lab:master Oct 19, 2022
lukpueh added a commit to lukpueh/securesystemslib that referenced this pull request Nov 21, 2022
Only tox envs that are in the format `py<MAJOR><MINOR>` are run
automatically in CI, all others need to be enabled explicitly.

In secure-systems-lab#437 a new tox env 'py311-test-gpg-fails' was added but not
enabled in CI, this is changed in this patch.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@woodruffw
Copy link
Contributor

Would it be possible to get a release cut with these changes?

It looks like the last release was cut the day before, and it'd be nice to be able to bump securesystemslib as a subdependency for things like sigstore-python to avoid these kinds of import side effects 🙂.

lukpueh added a commit to lukpueh/in-toto that referenced this pull request Jan 12, 2023
secure-systems-lab/securesystemslib#437 replaced the HAVE_GPG
bool constant with a function with similar effect, to allow lazy
evaluation of whether GPG is present.

That is, we only shell out to check if gpg is present, when we
call have_gpg() and not when we import the module, where the name
is defined, which was the case before.

This commit adopts the rename in in-toto tests. No other code in
in-toto is affected by the rename.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
alanssitis pushed a commit to alanssitis/in-toto-py that referenced this pull request Feb 8, 2023
secure-systems-lab/securesystemslib#437 replaced the HAVE_GPG
bool constant with a function with similar effect, to allow lazy
evaluation of whether GPG is present.

That is, we only shell out to check if gpg is present, when we
call have_gpg() and not when we import the module, where the name
is defined, which was the case before.

This commit adopts the rename in in-toto tests. No other code in
in-toto is affected by the rename.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
alanssitis pushed a commit to alanssitis/in-toto-py that referenced this pull request Feb 8, 2023
secure-systems-lab/securesystemslib#437 replaced the HAVE_GPG
bool constant with a function with similar effect, to allow lazy
evaluation of whether GPG is present.

That is, we only shell out to check if gpg is present, when we
call have_gpg() and not when we import the module, where the name
is defined, which was the case before.

This commit adopts the rename in in-toto tests. No other code in
in-toto is affected by the rename.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Checking for gpg availability sometimes times out
6 participants