Propose Updated Protos for the SPIRE Server API #1057
Comments
It would be super convenient to have APIs that force updating federated bundles instead of waiting for the refresh hint. |
I put some thought into this last week. I took the "if we started over again, what would we like to see" approach. While futzing around with some different options, I noted that some proto types are shared between the registration api and the node api... Since we also know we want to refactor the node api, the scope of my exploration ended up widening significantly. What I ended up with was a suite of APIs which are somewhat "resource" oriented. For instance, one API for managing agents, another for managing bundles, and so on.

Below I'll describe the structure of the APIs, related protos, types, etc. Note: I stopped short of defining the actual protos. This is simply an organizational proposal, and I'm looking for as much feedback as possible. Once we decide on the organization, we can begin modeling the protos themselves.

The Top Level

The current location stays as is (
|
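For illustration only, the resource-oriented split described above might be sketched as one proto service per resource, roughly like the stub below. Every service, RPC, and message name here is a placeholder added for this sketch, not part of the proposal itself.

```proto
syntax = "proto3";

package spire.api.server;

// Placeholder request/response types so the sketch compiles.
message ListAgentsRequest {}
message ListAgentsResponse {}
message GetBundleRequest {}
message GetBundleResponse {}
message ListEntriesRequest {}
message ListEntriesResponse {}

// Agent management (attestation status, eviction, join tokens, ...).
service Agent {
  rpc ListAgents(ListAgentsRequest) returns (ListAgentsResponse);
}

// Bundle management (the local trust bundle and federated bundles).
service Bundle {
  rpc GetBundle(GetBundleRequest) returns (GetBundleResponse);
}

// Registration entry management.
service Entry {
  rpc ListEntries(ListEntriesRequest) returns (ListEntriesResponse);
}
```

The point of the sketch is the grouping: each resource gets its own service, so access control and versioning can be applied per resource rather than across one monolithic registration API.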
Have we ever considered representing configuration with protos? E.g., Envoy accepts static JSON config (or dynamic config via gRPC) which is defined by a proto. It's nice since it makes configuration easy to validate, document, and work with. |
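As a purely hypothetical example of the configuration-as-proto idea (none of these message or field names exist in SPIRE), server configuration could be described along these lines:

```proto
syntax = "proto3";

package spire.config;

// Hypothetical sketch of expressing server configuration as a proto,
// in the spirit of Envoy's config protos. All names are illustrative.
message ServerConfig {
  // Trust domain this server is authoritative for.
  string trust_domain = 1;
  // Address and port the server APIs listen on.
  string bind_address = 2;
  // Default TTL, in seconds, for issued SVIDs.
  int32 default_svid_ttl_seconds = 3;
  // Plugin configurations, keyed by plugin type (e.g. "DataStore").
  map<string, PluginConfig> plugins = 4;
}

message PluginConfig {
  // Name of the plugin implementation (e.g. "sql").
  string plugin_name = 1;
  // Opaque, plugin-specific configuration blob.
  string plugin_data = 2;
}
```

A schema like this would give the validation and documentation benefits mentioned above, at the cost of maintaining the proto alongside the existing HCL configuration.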
+1 to removing

Some leading questions to poke at the API breakdown: Should node entries be a part of the entry API, or should "agent" registrations be separated from "workload" registrations (this distinction doesn't explicitly exist in the datastore API, but there have been proposals to add it)?

Do you want to separate the APIs into categories? For example, the issuer API is (IMO) internal, while the agent API is fairly operator focused / introspection focused (token generation and agent eviction are exceptions). The entry API and the svid API are the best examples IMO of "many 3rd party integrations will be built and it needs to be very stable". The issuer API probably doesn't need to maintain that guarantee since it's internal. So long as there is a compatibility story between the server and agent, anything should be fair game. The entry and svid APIs probably deserve more detailed review as they are a bit more set in stone once released.

I agree there should be subdirectories within

One thing worth thinking about is putting together a proposal for the protos locally to see if there is anything that comes out to challenge this proposal. I agree it's too early to publish, but it would suck to come to an agreement and then find issues only once it gets translated to proto. |
Is it worth considering an For example, |
I definitely think that this would benefit from a lightweight RBAC system. Being able to say "okay this admin can evict agents" or "this admin can federate bundles" or "this admin can register workloads" will be really great to have. ...of course I personally would like even more fine-grained ACLs such as "only the admin that created a workload registration can modify it", but that may be asking too much at this stage. Even having coarse-grained ACLs on the |
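To make the coarse-grained version of this concrete, a policy at the granularity of "which admin can call which API group" could be modeled roughly as below. This is a hypothetical sketch for discussion, not a proposed proto or an existing SPIRE feature.

```proto
syntax = "proto3";

package spire.authz;

// Hypothetical coarse-grained ACL: map an admin identity to the API
// groups it may call. Illustrative only.
enum ApiGroup {
  API_GROUP_UNSPECIFIED = 0;
  API_GROUP_AGENT = 1;   // e.g. evict agents, generate join tokens
  API_GROUP_BUNDLE = 2;  // e.g. manage federated bundles
  API_GROUP_ENTRY = 3;   // e.g. register workloads
}

message AdminPolicy {
  // SPIFFE ID of the admin this policy applies to.
  string admin_spiffe_id = 1;
  // API groups the admin is allowed to use.
  repeated ApiGroup allowed_apis = 2;
}
```

The finer-grained "only the creator can modify a registration" style of ACL mentioned above would need per-resource ownership metadata and is not captured by a shape like this.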
There has been a feature request for an API to manage federated domains which I think makes sense (be able to kick off federation w/o restarting a server). The difficulty there will be deciding which SPIRE server is in charge of updating the federated bundle. We haven't really solved that for HA anyway so it isn't a new problem. |
My original intention for |
I've done a little design work on what the Issuer API might look like. It is based on the SPIFFE Identity Classification proposal (#1338). I've got a commit (4b79ec1) in the api-refactor-design branch if you want to check it out in its entirety.

Here is the basic service description:

service Issuer {
rpc AttestForX509SVID(stream AttestForX509SVIDRequest) returns (stream AttestForX509SVIDResponse);
rpc RenewX509SVID(stream RenewX509SVIDRequest) returns (stream RenewX509SVIDResponse);
rpc NewX509SVID(NewX509SVIDRequest) returns (NewX509SVIDResponse);
rpc NewJWTSVID(NewJWTSVIDRequest) returns (NewJWTSVIDResponse);
rpc NewDownstreamX509CA(NewDownstreamX509CARequest) returns (NewDownstreamX509CAResponse);
}

AttestForX509SVID

rpc AttestForX509SVID(stream AttestForX509SVIDRequest) returns (stream AttestForX509SVIDResponse);

AttestForX509SVID attests the caller using node attestation and results in one or more X509-SVIDs being returned to the caller. The RPC uses a bidirectional stream to accommodate challenge/response-based node attestation methods.

Authorization

None

Request

The request initially contains the attestation parameters: namely the attestation data and the parameters the issuer needs to mint the X509-SVID (i.e. public key). Subsequent requests sent over the stream will convey responses to challenges issued by the issuer (if any).

message AttestForX509SVIDRequest {
message Params {
spire.type.AttestationData data = 1;
X509SVIDParams params = 2;
}
oneof step {
// Attestation parameters. These are only sent in the initial request.
Params params = 1;
// The response to a challenge issued by the attestor. Only sent in
// response to a challenge received by the issuer.
bytes challenge_response = 2;
}
}

Response

The response will either have a challenge that needs to be answered, or an attestation result. The result is either:

message AttestForX509SVIDResponse {
message Result {
spire.type.svid.X509SVID canonical_identity = 1;
repeated spire.type.svid.X509SVID attributed_identities = 2;
}
oneof step {
// Attestation results. If set, attestation has completed.
Result result = 1;
// A challenge issued by the attestor. If set, the caller is expected
// to send another request on the stream with the challenge response.
bytes challenge = 2;
}
}

RenewX509SVID

rpc RenewX509SVID(stream RenewX509SVIDRequest) returns (stream RenewX509SVIDResponse);

RenewX509SVID is used to renew a canonical identity previously obtained through AttestForX509SVID. It cannot be used to renew any other type of identity. Canonical identities are by definition unique. SPIRE closely tracks SVIDs issued for canonical identities to ensure that only a single SVID is active for a given canonical identity. RenewX509SVID is the process by which a new SVID is made active (making the existing one stale). A bidirectional stream facilitates two-phase commit of this action for increased robustness, i.e. the caller needs to acknowledge that it has received the new SVID and done whatever is required to switch over to using it (e.g. persist it to disk). Only after receiving this acknowledgement does the server consider the new SVID active.

Authorization

Callers must present an active canonical identity.

Request

The first request on the stream contains the parameters the issuer needs to mint the X509-SVID (i.e. public key). The second request contains the acknowledgement that the renewed SVID has been received and that the caller will use it for subsequent RPCs.

message RenewX509SVIDRequest {
message Ack {};
oneof step {
// Parameters for the X509-SVID
X509SVIDParams params = 1;
// An acknowledgement by the receiving end that it has received the
// renewed canonical identity SVID and will be using it in future
// requests.
Ack ack = 2;
}
}

Response

The response contains the renewed SVID.

message RenewX509SVIDResponse {
// The renewed X509-SVID
spire.type.svid.X509SVID svid = 1;
}

NewX509SVID

rpc NewX509SVID(NewX509SVIDRequest) returns (NewX509SVIDResponse);

NewX509SVID is used by delegatees (those to whom identities have been delegated) to obtain an X509-SVID for the delegated identity.

Authorization

Callers must present an SVID for an identity that is directly or indirectly delegated the identity being requested. For canonical identities, the SVID must be the active SVID.

Request

The request contains the UUID for the identity to issue an X509-SVID for. This UUID is used to look up the registration entry for the identity, which contains the SPIFFE ID and additional properties for the SVID (i.e. TTL, DNS names, etc.). The request also contains parameters the issuer needs to mint the X509-SVID (e.g. public key).

message NewX509SVIDRequest {
// The ID of the registration entry describing the identity/SVID
string entry_id = 1;
// Parameters for the X509-SVID
X509SVIDParams params = 2;
}

Response

The response contains the newly signed X509-SVID for the delegated identity.

message NewX509SVIDResponse {
// The newly issued X509-SVID
spire.type.svid.X509SVID svid = 1;
}
NewJWTSVID

rpc NewJWTSVID(NewJWTSVIDRequest) returns (NewJWTSVIDResponse);

NewJWTSVID is used by delegatees (those to whom identities have been delegated) to obtain a JWT-SVID for the delegated identity.

Authorization

Callers must present an SVID for an identity that is directly or indirectly delegated the identity being requested. For canonical identities, the SVID must be the active SVID.

Request

The request contains the UUID for the identity to issue a JWT-SVID for. This UUID is used to look up the registration entry for the identity, which contains the SPIFFE ID and additional properties for the SVID (i.e. TTL, etc.). The request also contains parameters the issuer needs to mint the JWT-SVID (e.g. audience).

message NewJWTSVIDRequest {
// The ID of the registration entry describing the identity/SVID
string entry_id = 1;
// Parameters for the JWT-SVID
JWTSVIDParams params = 2;
}

Response

The response contains the newly signed JWT-SVID for the delegated identity.

message NewJWTSVIDResponse {
// The newly issued JWT-SVID
spire.type.svid.JWTSVID svid = 1;
}

NewDownstreamX509CA

rpc NewDownstreamX509CA(NewDownstreamX509CARequest) returns (NewDownstreamX509CAResponse);

NewDownstreamX509CA mints a new X509 CA intermediate appropriate for use by a downstream X509 CA for minting X509-SVIDs.

Authorization

Callers must present an SVID for a downstream identity (i.e. one that has been marked downstream at time of registration).

Request

The request contains the public key for the downstream X509 CA. The issuer will determine other properties for the CA certificate.

message X509CAParams {
// The public key of the downstream X509 CA to use in the new CA certificate.
bytes public_key = 1;
}
message NewDownstreamX509CARequest {
X509CAParams params = 1;
}

Response

The response contains the CA certificate chain and the root CAs. The certificate chain will include the CA certificate and any intermediates necessary to chain up to a certificate present in the roots.

message X509CA {
// CA certificate and any intermediates part of the chain back to the root
// (DER encoded). The CA certificate is the first.
repeated bytes ca_cert_chain = 1;
// Root CA certificates (DER encoded).
repeated bytes root_cas = 2;
}
message NewDownstreamX509CAResponse {
X509CA x509_ca = 1;
} |
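X509SVIDParams and JWTSVIDParams are referenced by the RPCs above but not defined in this excerpt. A minimal sketch of what they might contain, assuming the only caller-supplied inputs are the ones the prose mentions (public key for X.509, audience for JWT):

```proto
syntax = "proto3";

package spire.api.server.issuer;

// Sketch of the referenced-but-undefined params messages. Field names
// and encodings are assumptions based on the descriptions above.
message X509SVIDParams {
  // Public key to bind into the issued X509-SVID (DER encoded).
  bytes public_key = 1;
}

message JWTSVIDParams {
  // Audience claim values for the issued JWT-SVID.
  repeated string audience = 1;
}
```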
I think the implication here is that attributed identities will share the same key. I don't know that it's actually a problem, but I do think it's different from what we do today. If we had an agent that used a node attestor plugin that produced only attributed identities, would the agent need to maintain one connection per attributed identity in order to be authorized to fetch its full set of delegated entries? The result message feels a little smelly too, in that consumers may not know what to expect and need to check both fields... It makes me wonder if canonical identity should be mandatory even if the underlying attestation type doesn't support it well... just thinking out loud
Since it will be common for consumers to need more than one identity at a time (agent boot comes to mind, as does bulk registration), and the authorization lookup is expensive, I think it would make sense to allow agents to request more than one X509-SVID at a time?
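For illustration, a batched form of SVID issuance could look like the sketch below, so an agent can fetch several delegated SVIDs (e.g. at boot) behind a single authorization pass. The message names and fields are hypothetical; the public key is inlined here purely to keep the sketch self-contained.

```proto
syntax = "proto3";

package spire.api.server.issuer;

// Hypothetical batched form of NewX509SVID.
message BatchNewX509SVIDRequest {
  message Item {
    // Registration entry describing the identity to issue for.
    string entry_id = 1;
    // Caller-supplied public key for the new SVID (DER encoded).
    bytes public_key = 2;
  }
  repeated Item items = 1;
}

message BatchNewX509SVIDResponse {
  message Result {
    // Entry this result corresponds to, in request order.
    string entry_id = 1;
    // Issued certificate chain (DER encoded); empty on failure.
    repeated bytes cert_chain = 2;
  }
  repeated Result results = 1;
}
```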
small nit: this is really the entry id and not the identity id, since a single identity can be represented by multiple entries. Is the intention to move away from entry-related language in this API?
Seems prudent to follow the X509SVIDParams pattern by either re-using the params message or introducing a new (similar) one?
This makes me wonder about #1190. Should we have a NewDownstreamJWKS RPC here? Or does that really belong in a Bundle API? Or, should we have an entirely separate API for "downstream" related needs? Overall, I think this looks great and makes a lot of sense. The canonical/attributed bits that poke through on the Attest call feel like they could use some attention... I do like that this proposal doesn't include too much of the "new language"; however, the fact that the only bit where it does poke through is the part that feels a little iffy gives me pause. |
Yeah, that is a little rough. Your musings about returning a canonical identity even for attestors that don't support it might solve that. Attestors that don't support it could just generate a canonical identity with a UUID to provide a single identity to use to authenticate with the server.
Yes. Good idea.
Whoops! This was left over from some other exploratory work around registering "identities" instead of the overly generic term "entry". I'll edit above.
SGTM.

Regarding the Downstream stuff, I think I'd favor a separate API. That kind of an API does feel more user-centric than service-centric... I wonder how the following would feel:
|
Another advantage of splitting out NewX509SVID is that it feels more natural to add a |
I guess # 1 could be done independently of # 2; attributed identities are backed by an entry and therefore the shape of each RPC would be the same. Having them separate could, however, lead to opportunities to optimize the "authorization" crawl that ensures the caller is authorized for the entry. |
After some discussion with @evan2645, I've removed the language from the Issuer API regarding canonical/attributed/delegated, etc (7c2c420). This allows us to move forward without locking into the new nomenclature. At the same time this new proposal should allow us to adopt the new nomenclature later through the addition of new RPCs without muddying the RPCs we want to use now for agents. |
Ok, the api-refactor-design branch on SPIRE has been updated with the latest design. The proto/spire/next directory contains the new API descriptions: https://github.com/spiffe/spire/tree/api-refactor-design/proto/spire/next

The new APIs are: |
I've gone through the most recent set of protos (and not prior iterations). Some of these are notes about the APIs, but at least in two cases there are some API nits based on underlying concerns around SPIRE.
|
If I call
will this delete it?
Only mildly concerned because there's an extra entry created by the CreateJoinToken RPC. Would be nice if DeleteAgent returned a bit more info (how many entries were deleted, maybe?). Edit: The comment on DeleteAgentRequest says it's a SPIFFE ID, but should that be an entry ID instead? This is what I mean by the API being too stringly-typed; easy to misuse this way. |
It would be nice if you could Delete with a filter, similar to how you List. |
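For illustration, a filtered delete could mirror whatever filter shape the list RPC uses; a hypothetical version (not part of the proposed protos) might look like:

```proto
syntax = "proto3";

package spire.api.server.entry;

// Hypothetical filtered delete mirroring a list-style filter.
message BatchDeleteEntryByFilterRequest {
  message Filter {
    // If set, only delete entries with this parent ID.
    string by_parent_id = 1;
    // If set, only delete entries with this SPIFFE ID.
    string by_spiffe_id = 2;
  }
  Filter filter = 1;
}

message BatchDeleteEntryByFilterResponse {
  // IDs of the entries that were deleted.
  repeated string deleted_entry_ids = 1;
}
```

A delete-by-filter is also easy to misuse (an empty filter could mean "delete everything"), so it would likely want the same guardrails as any bulk operation.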
That's a nice idea. Unless we somehow encoded the ID into the token itself (which may have some interesting metadata leak implications), it would require datastore schema and plugin changes, which means we wouldn't likely be able to implement that until 0.12.0. With the API as structured, that is something we could eventually introduce (and maybe provide a new bool field to opt into the new behavior so we stay compatible with older code).
Good catch. I'll fix this.
I think this is a potentially useful use case, but introducing it also means adding a new RPC to UpstreamAuthority to support it, as the UpstreamAuthority interface is opinionated about the type of X509-SVID it signs (CA intermediates only). Adding a new RPC to serve this purpose could be done in a backwards-compatible manner since the RPC would be optional. This feature in general is also something we could introduce to the MintX509SVID RPC at a later point without impacting compatibility.
I think I'd prefer them separate. We've gotten into trouble in the past with RPCs that had many "modes", even just in terms of code complexity. I have lots of other little nitpicky reasons why I'd personally prefer to see them separate. Like you said, arguments either way :) In general I think we're better off with multiple RPCs over an overloaded RPC.
This is a valid point. I don't have a strong opinion here. For the Entry API, since BatchCreateEntry naturally covers both the CreateEntry and CreateEntryIfNotExists, it could lead to some simplification for sure. If nobody else disagrees, I'd be for it.
Yes.....I can see the difficulty of managing the per-entry opt-in for federation. There are other federation-related things that we've considered as well, including configuring federation relationships via the API instead of the config file. We ultimately punted because that brought up hard questions like 1) which server(s) are polling the federation bundle endpoint of the foreign trust domain, 2) if more than one server, how do they prevent racing on updates, 3) if one server, how do we choose. These questions are currently sidestepped, but not explicitly handled, by the current experimental configuration section that configures federation.
That might be ok. It doesn't reduce the validation that has to happen, but maybe simplifies it a bit. A SPIFFEID type could line up quite nicely with the strong types we've added for SPIFFE IDs in the go-spiffe v2 library. Unless others disagree, I'd be for it. |
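For reference, a structured SPIFFE ID type along those lines would likely split the ID into its components rather than carrying a single string; a sketch (not necessarily the final shape):

```proto
syntax = "proto3";

package spire.api.types;

// Sketch of a structured SPIFFE ID type. Splitting the ID into trust
// domain and path makes validation explicit instead of relying on
// string parsing at every call site.
message SPIFFEID {
  // Trust domain name, without the "spiffe://" scheme (e.g. "example.org").
  string trust_domain = 1;
  // Path component, with a leading slash (e.g. "/workload/foo").
  string path = 2;
}
```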
Regarding DeleteAgentRequest, it is synonymous with the current

Maybe this points out an omission in the API. If we had a DeleteJoinToken request, would that clear things up? Maybe DeleteJoinToken could identify the novelty entry, if any, and delete it as well. |
Here are some thoughts after looking at this proposal:
|
Hmm, that's a fair point and your suggested names mirror what is currently in the registration API. I'm somewhat torn. On one hand, like you said, I think it is a little confusing that most of those operations don't make sense when talking about the bundle for the trust domain of the SPIRE server. On the other hand, does it ever make sense to just want to list all of the bundles the server knows about, independent of the trust domain of the server? I'm curious what others thoughts are here...
Can you give an example of an API where you'd like to see the status? In particular, using the status message to convey the results of an operation only works if the RPC returns success. This comes down to a discussion around whether or not the API should rely on transport mechanisms to convey application level status.
That might be useful. The detail message type that any given RPC returns would probably need to become part of the API. Considering that, I'd probably advocate for punting on adding the details field until a clear use case arises. |
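To ground the Status discussion, a per-result status inside a batch response might look roughly like the sketch below. The shapes are illustrative; whether to add a free-form details field is exactly what is being debated above, so it is deliberately left out.

```proto
syntax = "proto3";

package spire.api.server.entry;

// Illustrative per-item status for a batch RPC.
message Status {
  // gRPC-style status code; 0 (OK) means this item succeeded.
  int32 code = 1;
  // Human-readable description of the failure, if any.
  string message = 2;
}

message BatchCreateEntryResponse {
  message Result {
    // Outcome for this item, even when the RPC as a whole returns OK.
    Status status = 1;
    // ID assigned to the created entry; set only on success.
    string entry_id = 2;
  }
  // One result per entry in the request, in the same order.
  repeated Result results = 1;
}
```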
Regarding |
Considering how expensive the entry authorization check is, and is likely to remain for some time, I think it's also probably worth having "batch" versions for the SVID API's |
I see that the Status is included in all Batch*Response messages. I was wondering whether there could be some other response messages that could benefit from including additional details in a successful response. After going over all the RPCs, it seems that all the responses that have a list of results (where each one has its own status) already have the Status included. My comment was intended to check whether it had been thoroughly considered for all relevant places, which it looks like it was. |
Here are the changes I plan on taking into the api-refactor-design branch:
Here are things I don't think we should put in at this time:
Here are some things I'd still love opinions on:
|
What is the use case for deleting a token? Token TTL is mandatory IIRC, so they could/should be self-cleaning. Maybe you'd want to somehow revoke a token once it has been created, but it's hard for me to see how that would be useful in the sorts of workflows I'm familiar with (e.g. passing into/pulled from a provisioning system, or generated by a human). I would imagine that in most cases tokens are used shortly after they're created - and if they're not, they expire. Join token is weird in more than one way. It is a secret of sorts... and we store it in the db. It's the only "secret" that we really manage like that. Letting them leak back out feels a little weird to me - web services in which API keys can be generated but not retrieved afterwards come to mind. |
This sounds like a great idea. As for naming, I suppose I prefer the pluralized version as it feels more natural, but not at the cost of breaking consistency with other batch RPCs |
@evan2645, regarding join tokens, that's a good point. We'd actually discussed this earlier IRL when we added CreateJoinToken. I think relying on pruning to remove the tokens is probably a good idea. I do think pruning could get a little smarter about also removing related novelty entries though. Right now, those dangle. |
Re: |
Do we really need to keep the Batch prefix in those RPCs? E.g. I personally think that |
The "Batch" prefix came from RPC naming conventions in the Google API Style Guideline, which provided some good guidance and prior art for some of the v2 design. From what I can tell in their guidance for naming in a resource-oriented API, the resource is the Noun and MOST of the RPCs use a verb+noun naming style. So Bundle would be the Noun and the verbs would be We've departed from them in some other minor conventions. Also, they have a different problem than us.... some of their decisions come from using a protobuf as a single source of truth for both a gRPC and REST-ful API. I could go either way in regards to the |
In regards to using the Google API Style Guidelines, it was a helpful place to look for established patterns, but in no way did I intend for it to be law in terms of what we do. When we feel justified in going in a different direction, I'm all about that as long as we are internally consistent. |
If we do end up calling it |
Branch is updated with |
One consequence of using a new |
I think I'm more worried about the external APIs, so having to do an internal conversion for the datastore plugin seems OK. We could provide a function to convert between the types, and validate that using a test that does reflection on fields. |
The branch has been updated with all of the suggestions. Thanks for the feedback everyone. I think this is looking pretty good. I'll be cutting issues for the rest of the API soon so we can start on the implementation. Since it will be some time before we "release" the new API, we still have time to make some changes. Hopefully any changes in the future are superficial. |
Closing this out, as consensus on the updated protos was reached and the new APIs have been implemented as of version 0.11.0.
The SPIRE registration API was originally written to allow manipulation of SPIRE registration entries. It has since become the de-facto API for all things administrative. This includes decidedly un-registration-y things such as generating join tokens, minting arbitrary SVIDs, agent management, and bundle management.
One problem is simple naming - the "registration api" should not include things that are not related to registration. Another problem is the eventual access control that we want to provide... we currently have the "admin" bit you can set to give carte blanche access; however, the scope is too wide. Finally, the registration API has grown organically over time to add features such as ListBy*, etc... these features were added in a compatible manner, but we now have a chance to take a new approach to supporting this functionality in a more idiomatic way.

This issue is done when we have loose consensus on the proto(s) for the next iteration of SPIRE management APIs.