Skip to content

authz: update disk lookup and protect disk get/attach/detach #662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Feb 3, 2022

Conversation

davepacheco
Copy link
Collaborator

@davepacheco davepacheco commented Feb 1, 2022

In this change, I set out to add authz protection for GET disk using the same pattern described in #592 and extended in several other PRs. To recap, you'd expect:

  • disk_fetch(authz::Project, Name): the main way to fetch a whole disk record
  • disk_lookup_path(organization_name, project_name, disk_name): the main way to find the disk id for a path in the API

This change grew a lot larger because there are tendrils in various places. So now there's also:

  • disk_lookup_by_id(Uuid): this does a "lookup" (the way disk_lookup_by_path does, returning an authz::Disk and not the full database record) using the disk's id
  • disk_refetch(authz::Disk): this is desirable for complicated reasons. Basically: if you have an authz::Disk and you want the database record, this function gives it to you (after doing an authz check).

More on disk_refetch:

`disk_refetch` looks at first glance like a wart. Maybe it is. Why do we need it?

With the conventions we have today, the only way to get a full database record would be using disk_fetch(authz::Project, disk_name). That doesn't help you if you have the disk's id -- you don't have its parent Project (and no easy way to get it) and you don't have the disk's name.

Why not make this disk_fetch_by_id? That's essentially what this is: disk_refetch takes an authz::Disk, which, from a user's perspective, is just a disk_id. The difference is that authz::Disk also includes information about how you looked up the Disk in the first place. If we accepted only a disk id (Uuid) here, and the disk didn't exist, we'd generate a 404 that said "disk with id 123 doesn't exist". But the user may not have come into this function looking for disk id 123. For example, the disk attach code path does an initial mapping from path to id using disk_lookup_path. Then it does a call to the Sled Agent and then a database operation. Then it uses disk_refetch to get the latest version of the record. If the disk was gone at that point, the user would get a 404 "disk with id 123 doesn't exist", which makes no sense because they didn't ask for the disk with id 123 -- they asked for disk "d1" in project "p1" in organization "o1".

That might sound contrived or pedantic (and maybe it is?). But the point here is to preserve information about how a resource was looked up. I think that'll be valuable both for end users and our own debugging. It also forces us to be careful with our own lookups: each operation should do one lookup, probably early on, and hang onto the authz::Disk (or whatever) for the duration of the operation. That also ensures that we don't accidentally switch objects because we do another lookup later.

Other changes here:

  • To support disk_lookup_by_id, we also need project_lookup_by_id and organization_lookup_by_id.
  • Renamed *_lookup_path to *_lookup_by_path (that's for you, @david-crespo -- I think it's a good call)
  • Added a new "internal-api" internal user, which is used for handling some requests to the internal API. (Before this, there was no OpContext for requests to the internal API. Now, those APIs can have OpContexts, and they'll have this user. This is still basically going to be a super-user, but it's a step in the right direction.)
  • Added type aliases for authz::ProjectChild for authz::Disk and authz::Instance, mostly for clarity for readers. authz::ProjectChild is no longer exposed from authz.
  • Added protection to the disk attach/detach endpoints because it was easier than not doing that, with all the changes I have here
  • Fixes a bug where a POST to detach a disk from an Instance returned the previous state ("attached", usually) instead of the new one ("detaching"). I updated the disks integration test to check for this.

@davepacheco davepacheco changed the title authz: protect disk GET (plus a lot of refactoring) authz: update disk lookup and protect disk get/attach/detach Feb 1, 2022
@davepacheco davepacheco mentioned this pull request Feb 1, 2022
71 tasks
@davepacheco
Copy link
Collaborator Author

This is ready for review but depends on #661.

Comment on lines +1274 to +1278
// TODO-security For Containers in the hierarchy (like Organizations and
// Projects), we don't do an authz check in the "lookup_by_path" functions
// because we don't know if the caller has access to do the lookup. For
// leaf resources (like Instances and Disks), though, we do. We could do
// the authz check here, and in disk_lookup_by_id() too. Should we?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that having a separation between what's auth'd and what's not auth'd being differentiated by logical hierarchy might potentially lead to bugs. Is there some way we could be more explicit in describing that a particular reference has been auth'd? I mean, we could continue with naming schemes like _noauthz or _no_auth which is helpful at the callsite... I wonder more if there's something we can do with the return type of these lookups though. One could mistakenly assume that authz::Disk meant authorized disk at first glance. Would it be possible to have a type, be it a newtype pattern or otherwise, that wrapped the authz::* structs to indicate that they had indeed been authorized. The authorize function could be the only thing to return that type and then we could lean on some level of type safety to save us from ourselves.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a compelling idea. One line of thinking I've had: authorize could return some newtype around () that essentially just says "someone has successfully called authorize() before". Obviously that's only a little useful -- they might have authorized a different object for a different thing. So you could store something about the object that was authorized in that return value. But that's only useful if somebody checks it, right? How would they check it? They'd ask if somebody had authorized some action on some resource -- it would look just like the authorize() call itself. They may as well just call authorize().

I think you're suggesting that we'd encode what was authorized in the type. That'd be neat. I think that would need to include both the resource and the action. Something like AuthorizationResult<Resource=authz::Disk,Action=Read>? It's tricky, though -- a function that accepts that for Action=Read should also accept it for Action=Write. Do you have more thoughts on how this might work?


I think the root problem we're trying to solve is that it's easy for developers to make mistakes with authz, particularly forgetting to do an authz check. I think there are other opportunities to improve this too. I'd love to get to a place where there's a sort of datastore kernel (for lack of a better word) where internally various functions may do authz checks or not, but the public interface would always do the appropriate authz check. We'd need something similar protecting access to Sled Agent, Crucible, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's very likely we'd want more more than an empty new type. My intuition is that we'd want it to be the authz resource reference as well as the actions as you mention. As for what it would look like, I think it'd take some more detailed exploration. If there's a small set of actions with increasing permissions then something using a bit flag and some logic like anything above a certain value can take an action. I don't exactly know, but having a single field that can represent granted permissions (be it bit flags or otherwise) seems like it'd be something we'd want to include.

I think my biggest preference for how this would work is that we have a macro that's used at the datastore level that dictates what permissions it takes. That macro would take care of all the auth/validation stuff to ensure the action could be performed. We'd then pair that with a linter to fail anything without explicit macro usage. Something like that anyway.

I'm not 100% sure, but I definitely think that it's worth exploring. I wouldn't stop the presses to get to that now, it seems more like a pre-launch/post-demo priority.

Copy link
Contributor

@zephraph zephraph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a solid step in the right direction. I still want to talk more about how to make our auth system safer (especially if we can leverage types to do so), but that's orthogonal to the work here.

Also, I know it's understood but we should remember to make time for performance testing our db performance to ensure the approach of many smaller queries is better than the utilization of joins.

Base automatically changed from authz-work to main February 3, 2022 18:00
@davepacheco
Copy link
Collaborator Author

Also, I know it's understood but we should remember to make time for performance testing our db performance to ensure the approach of many smaller queries is better than the utilization of joins.

Yes, I'd say even stronger: I don't expect many small queries to be better than JOINs here and I do hope we get a chance to improve that.

@davepacheco
Copy link
Collaborator Author

Thanks for taking a look!

@davepacheco davepacheco marked this pull request as ready for review February 3, 2022 20:59
@davepacheco davepacheco enabled auto-merge (squash) February 3, 2022 21:06
@davepacheco davepacheco merged commit ed28c2a into main Feb 3, 2022
@davepacheco davepacheco deleted the authz-instances-disks branch February 3, 2022 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants