Skip to content

authz: add OpContext::for_saga_action #672

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 8, 2022
Merged

authz: add OpContext::for_saga_action #672

merged 1 commit into from
Feb 8, 2022

Conversation

davepacheco
Copy link
Collaborator

@davepacheco davepacheco commented Feb 7, 2022

This change takes a first cut at adding the ability to make authz checks from saga actions.

It works like this:

  • When an API call creates a saga, it includes in the saga's parameters a new struct called authn::saga::Serialized. This represents a serialized authn::Context. There's a authn::saga::Serialized::for_opctx that serializes the authn::Context of the currently-authenticated user. (In other words: we take the user that's authenticated for the current HTTP request and store that in the saga's parameters.)
  • Within a saga action, you use OpContext::for_saga_action to reconstruct an OpContext from the authn::saga::Serialized that's stored in the saga's parameters.

I implemented this end-to-end for one of the datastore functions that was being used by a saga action (disk_update_runtime).

The motivation for this is that I want to start doing authz checks in datastore methods that are invoked from saga actions. With this change, those authz checks will be done against whatever user made the HTTP request that created the saga. This seemed like a reasonable first cut.

There's an additional benefit which is that OpContext::for_saga_action crates a slog::Logger with metadata from steno about the name of the current saga action. Thus, all saga actions now get a logger that includes that metadata. We should update Steno to make more information available here (like the current saga template name and saga id, or even a logger that we can start from).

Some design thoughts:

  • It's worth noting that serializing just the user's identity isn't necessarily the long-term plan.
    • We may want something more cryptographically verifiable -- e.g., some token that proves the user actually created the saga in question, or that a trusted Nexus instance did, or whatever. I'm not sure. But right now we trust whatever's in the database record.
    • A different approach would be to pre-execute any authorization checks that would be needed in the saga and store some token into the saga parameters that proves we did those authz checks up front.
    • Yet a different approach would be to store the current user's roles into the saga parameters and do authz checks against those roles.
  • Critically, we can change the contents of authn::saga::Serialized to store whatever we want and change how this works later.
  • authn::saga::Serialized could just be an authn::Context right now. I'm reluctant to make authn::Context impl Serialize because I don't want us to accidentally try to serialize it some place without thinking through the implications (like the ones I mentioned above).

In summary: I know this isn't fully baked but I think it's a step in the right direction, and lacking even a crude facility here is blocking the work to protect more endpoints with authz.

Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, happy to see that this is possible. I have some naming nits / questions below, but at a high level...

  • Serializing the actor ID definitely seems like a reasonable place to start. Although getting a "verifiable token" would add defense in depth, relying on the trustworthiness of the DB seems like a fair start.
  • In the PR message, you mentioned possibly storing roles in the saga parameters - this seems possible, but it also seems like it might make role modification somewhat of a pain, no?

Comment on lines +15 to +21
/// Serialized form of an `OpContext`
// NOTE: This structure must be versioned carefully. (That's true of all saga
// structures, but this one has a particularly large blast radius.)
#[derive(Debug, Deserialize, Serialize)]
pub struct Serialized {
kind: authn::Kind,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the rationale behind creating a new struct instead of:

  • Allowing a caller to get authn::Kind from an authn::Context
  • Implementing From<authn::Kind> for authn::Context?

(I know the comment says OpContext, but this seems much more specifically targeting authn::Context)

That's basically what this is right? Something which should stay 1:1 with the "current authentication information", or some scheme to re-acquire it? Since Kind includes "all the possible ways the user may be authenticated", that seems like a reasonable proxy.

Either way - could we change some of the names being used here?

  • I know this is intended to be used by sagas, but this data structure had nothing directly to do with sagas. If a caller wanted to record "who authenticated an operation at a point-in-time", they could use this code, but that seems like it would be a divergence from the name.
  • If we stick with a separate struct - Serialized seems a bit generic, no? Tons of structures which are serializable exist in the authn module.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your feeling that there's something off about the structure and name here. Let me address your specific points first:

What was the rationale behind creating a new struct instead of:

* Allowing a caller to get `authn::Kind` from an `authn::Context`

* Implementing `From<authn::Kind>` for `authn::Context`?

...

(I know the comment says OpContext, but this seems much more specifically targeting authn::Context)

That's basically what this is right? Something which should stay 1:1 with the "current authentication information", or some scheme to re-acquire it? Since Kind includes "all the possible ways the user may be authenticated", that seems like a reasonable proxy.

I've been trying to keep the internal details of authn::Context as private as possible because I expect them to change as we flesh out authn. What you said is currently true. I don't know if that will be true going forward.

Either way - could we change some of the names being used here?

* I know this is intended to be used _by_ sagas, but this data structure had nothing directly to do with sagas. If a caller wanted to record "who authenticated an operation at a point-in-time", they _could_ use this code, but that seems like it would be a divergence from the name.

There are different reasons that someone might want a serialized idea of "who authenticated an operation". These use cases might have very different security properties. You might want this information for a debug log, a formal audit log, or to do an authz check in the context of the request, or to propagate the authentication to some other context, possibly at a later time (as in the case of a saga action, but could also be true if we passed this information in a request to some other control plane service). Some of these representations could involve actual secrets (e.g., a bearer token), while others absolutely should never include a secret (the debug log entry). This change creates one representation that's (barely) suitable for one of these purposes. As you said it's simple enough to be used for many of them today. But it seems unlikely that will be true in the future.

Now, we could go either way on this. If we expect to have multiple representations in the future, we could label this structure as general-purpose today and trust that our future selves will be careful enough to recognize when security demands a different representation and build new representations as needed. Or we could label it today as "suitable for use in this specific case only" and trust our future selves, with a second use case in hand, to decide whether it's suitable and refactor as needed. I generally tend towards the latter -- I don't like to describe something as generic when I only really know about one use case.

* If we stick with a separate struct - `Serialized` seems a bit generic, no? Tons of structures which are serializable exist in the `authn` module.

Yeah, as we've previously discussed, I'd been trying to let the module name describe it. It's authn::saga::Serialized -- a serialized authn [context] for use in sagas. authn::saga::SerializedIdentity? authn::SerializedIdentityForSaga? authn::SagaIdentity?


So like I said, I agree with you that there's something funny about the structure or naming here. But I'm not sure how better to phrase it. To restate my goals:

  • have a type that can be easily incorporated directly into a saga's parameters
  • have it be very easy to get that type from an OpContext and to reconstruct an OpContext from it [along with other information that's also part of the OpContext]
  • so far, that type could be authn::Context directly or as you said authn::Kind. That sort of ties the in-memory form (which is mainly intended for the use case of authorizing an operation in the context of this request) to the serialized form (which is intended for storing to disk and reusing at a later time in some other context).
  • I also want to make it hard for someone to (accidentally) serialize the internal implementation details of an authn::Context to somewhere else (e.g., the public API or some other database record or a log file) without considering the consequences -- i.e., if somebody wants to do that, I want them to ask "hmm, maybe authn::Context [or authn::Kind] doesn't impl Serialize for a reason?"

@davepacheco
Copy link
Collaborator Author

davepacheco commented Feb 8, 2022

In the PR message, you mentioned possibly storing roles in the saga parameters - this seems possible, but it also seems like it might make role modification somewhat of a pain, no?

The idea in this case would be to isolate role modification from a running saga. Without sagas: if you make a request that requires some permission, and someone revokes the role that granted that permission, the request might still succeed if the roles were loaded before the role was revoked.

What if that request kicked off a saga instead? Suppose you try to create an Instance, you have permission, and we create the saga for doing so, but you lose the required permission halfway through. What should happen? I see a couple of options:

  1. Your permissions (roles) are locked in when the saga is created. You have whatever permissions you had when you created the saga. In this case, the Instance creation should proceed normally.
  2. Your permissions (roles) are not locked in -- we re-load them whenever we do an authz check. In this case, any step in the saga might fail, even in ways that might seem non-sensical. Maybe you had some permission in the previous action, but not this one; or, you had some permission A in step 2, and that permission implies permission B, which you didn't have in step 3. At the end, presumably the Instance creation should fail.

The motivation for this suggestion is that I think both choices are okay from a product perspective but the second approach seems really hard to maintain and test.

@davepacheco
Copy link
Collaborator Author

I'm going to land this as-is for now to unblock other work but I'm still on board to evolve the Serialized struct and naming.

@davepacheco davepacheco merged commit 97fe9d1 into main Feb 8, 2022
@davepacheco davepacheco deleted the authz-work branch February 8, 2022 21:57
@andrewjstone
Copy link
Contributor

In the PR message, you mentioned possibly storing roles in the saga parameters - this seems possible, but it also seems like it might make role modification somewhat of a pain, no?

The idea in this case would be to isolate role modification from a running saga. Without sagas: if you make a request that requires some permission, and someone revokes the role that granted that permission, the request might still succeed if the roles were loaded before the role was revoked.

What if that request kicked off a saga instead? Suppose you try to create an Instance, you have permission, and we create the saga for doing so, but you lose the required permission halfway through. What should happen? I see a couple of options:

  1. Your permissions (roles) are locked in when the saga is created. You have whatever permissions you had when you created the saga. In this case, the Instance creation should proceed normally.
  2. Your permissions (roles) are not locked in -- we re-load them whenever we do an authz check. In this case, any step in the saga might fail, even in ways that might seem non-sensical. Maybe you had some permission in the previous action, but not this one; or, you had some permission A in step 2, and that permission implies permission B, which you didn't have in step 3. At the end, presumably the Instance creation should fail.

The motivation for this suggestion is that I think both choices are okay from a product perspective but the second approach seems really hard to maintain and test.

I really don't think there is a good reason to do (2). It is much harder to reason about. IIf you view a saga logically as an atomic action that is made up of different actions, then (1 )makes sense. You don't want to worry about how long the action takes to run and if the state of the system changes. (1) also has the benefit that you can drop privileges after kicking off a saga. This likely isn't a v1 thing, but there could be some sort of auth system that gives a user permission to create and delete vms for 5 minutes. They purposefully drop those privileges to not break things accidentally. If for some reason the privilege went away, the user would still expect the VM to be created. For scenarios where short lived privileges are not routine we could always tag the VM in the UI as being created by a user that doesn't have permission anymore and let an admin clean it up with another saga.

The drop privileges method is how we managed Github repos at VMware. We'd get 30 min to say change some configuration If for some reason that kicked off a command that took longer than the permission, I wouldn't have expected it to fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants