-
Notifications
You must be signed in to change notification settings - Fork 22
Conversation
Implements the suggestion from #300 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very useful, especially when the pvc
is not directly available, e.g. default EKS on AWS without block storage.
LGTM
Is #300 still needed with this change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #300 (comment)
I think it is better to do a pod template in a macro and reuse that in the Deployment vs Statefulset based on if Persistence is enabled. That is the way I have seen it in many Helm charts and keeps the api much cleaner and leverage more kubernetes native concepts as opposed to reinventing a new approach. I really dislike bloating the public API for test case performance reasons.
Agree with the above. |
Been thinking over this one. At the last meeting someone mentioned that alot this can be done at the storage class level. This seems to me a better place to decide your persistence type, rather than here at the SPIRE level. We should just plug into whatever persistence is provided to us. |
That works really well for just about any csi driver and the recommended path forward that direction. emtpydir and hostpath are kind of the exception to that rule. |
Hostpath and emptydir are not exceptions to the design precept that storage should be configurable, even though k8s has inconsistencies in how they are configured, which makes them us a different mechanisim that the CSI driver flag. In any case, the PR is not for "configuring CSI drivers" it is for "configuring persistence type". While the implementation for hostpath and emptydir will yield different YAML than that of CSI driver configuration, setting the persistence type to any of these values will still require the "persistence type flag". With this alignment, can we please close #300 as not-to-be merged? We have enough "there's two implementations preventing movement on both" issues in review already. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A number of changes were requested in the per-file comments. Please address them individually, I will be closely monitoring this and will clear this message's "request for changes" afterwards.
You can create a host path persistent volume. |
hostpath pv's could be an interesting feature to add to the chart so we can create it for the user. The implementation would be a little different from whats proposed here, but the need a type field would still be the same in that case. |
Im not suggesting we create a hostpath pv. If you want to use hostpath you can do that right now without this PR. If you just want to test without persistent storage then the flag to disable seems to be sufficient and in line with what other helm charts do. So i just don't understand what this pr is accomplishing. |
That may be the source of the difference of opinions. I think being able to easily run spire on the edge with single node or 3 node configurations running on bare metal is a use case that may increase in frequency as time goes on. We do have arm support thanks to Marco, so a 1 or 3 node raspberry pi cluster makes a lot of sense to me. If it ever is to be supported, and if we want to minimize the number of configurables, then I think Edwin is right, it might be best to do it as a enum rather then a boolean. |
Id rather wait till we have a solid use case in hand rather than guess now what may or may not be needed for the future. |
A middle ground might be to drop the hostpath part of this pr until edge spire use cases are fleshed out. but leaves the option open to support it in the future without adding an option? |
@kfox1111 The fact that you submitted two PRs for the same issue is adding to the inability to get this across the line. It makes all forward effort subject to being distracted with "well there is the other PR". I don't see any blockers on this, beyond Marco's "well there is the other PR" and my "I'd like to default to be different, and maybe the keys to be reconfigured differently." The first "blocker" shouldn't exist, because there shouldn't be 2 PRs for 1 issue, and the second could be settled more easily if the team wasn't confused by having two PRs. Please take up the work of getting the items closed. This is close enough to being accepted that it could be in this week or sooner, as long as we keep the commentary on this PR. Also, I've asked for the other PR to be closed once. This is me asking for it twice. |
<snip>
It was put in at your request. You've asked twice, and twice I've told you, I don't own the other PR. Faisal submitted it, and I'm not comfortable closing stuff I did not submit. |
I'm ok pulling the hostpath stuff for now if we want to design that more intentionally. Would that help? |
@edwbuck based on the answer to the question above, I'll either remove hostpath or fix the if's as we previously discussed. |
We have two approvals on this one, and one item blocking it. @marcofranssen please clarify / update your request, now that the whole #300 item has been decided against. |
@edwbuck @marcofranssen not sure how to reconcile that recommendation with this pr. The suggestion, as a far as I understand it, is not to use emptydir at all, but switch to deployment if persistence is to be disabled. In this case, that would be when type=="emptydir" you wouldn't get an emptydir at all, but a deployment? That would look mostly like this pr: #271 with a different conditional. If thats the way to go, should we resurrect that pr so that @marcofranssen gets the credit for his work? |
A StatefulSet can do the same thing. A StatefulSet need not have a volumeClaimTemplate. If it was a primary use case I'd say, yeah, supporting a Deployment would be good. In this case, I'm uncertain how much it really benefits us. The exception might be, if we wanted to support a DaemonSet too. I could see wanting to do that for the Edge use case. If we do that, then we have to template out the PodSpec anyway, and doing it also for Deployment is not such an additional ongoing burden. What do you think to adding both Deployment and DaemonSet? |
This is about persistence of the items, where no persistence can be handled with EmptyDir. We are not expanding the scope of this issue to change StatefulSets to Deployments. It may be a good idea, but it needs an independent issue and should not be riding the coat tails of this PR. #372 will cover that effort. Please unblock the issue which is about persistence and re-review with a tighter scope of only addressing the changes are sufficient to deal with persistence. |
#271 was also all about the persistence. This is the way most charts out there are implementing this feature. Have you also ready this part? #315 (comment) This also explains why you want to use a deployment if you don't require persistence (emptyDir). So no I feel we should not just simply merge this. |
@marcofranssen what do you think of #315 (comment)? |
we could add indeed in the future a PR to also add DaemonSet, but is that really usable for prod as a daemonset won't be able to do the persistent volumes right? For now I would just use the types where they are intended for. Deployment no persistence, Statefulset => persistence. |
Thinking something like, having a 3 node rpi cluster where each node is a control plane node. Running the spire-server as a daemonset with type set to hostPath |
DaemonSets (typically used for the agents) can use persistent volumes, and generally do use them when a person doesn't want to put the KeyStore in memory. It is rare, but occasionally an installation benefits from on-disk keystores, especially in QA/evaulation scenarios where direct inspection of the cached certificates in the agents are used to validate (internal) SPIRE functionality. With DaemonSets (which contain a Pod declaration) one puts the persistent volume in the contained Pod template, and in there it too can be set to hostPath, emptyDir, or some CSI backed PVC. In short, we need to have the setting be 2 settings, one for the storage type of the Server, and one for the Storage type of the Agent, as the storage types between those different processes are not likely to be the same, but are likely to be the same among the Servers or the same among the Agents. To clarify in Kevin's 3 node cluster, we might have the persistence type for the servers be "hostPath" to permit the Servers faster recovery by allowing them to re-find their previously existing signing certs, while we would indicated that Agents use "emptyDir" persistence types because they effectively would need to reload everything anyway. For clusters that have only one server among the three nodes, one could use affiinity / anti-affinity to ensure that the same node is used for all server launches. If this 3 node cluster had access to a NFS server, then one could use a PVC claim instead of "hostPath" and bind the storage to a NFS path. Likewise, if more sophisticated CSI drivers are available, one could bind to Ceph, MapRFS, or whatever. If the restart performance doesn't matter, one could also configure all of the systems to use "emptyDir" storage and have the intermediate CA's used for signing acquired fresh with each restart. In this case, the trade off would be longer restarts and more signing, which might be acceptable should the improved security of never seeing the KeyStores stored in a persistent way. So, selecting the persistence type provides the most flexibility, but we need to have two settings: one to control the persistence type of the servers, and one to control the persistence type of the agents. If we want to expand this pattern to Tornjak or other items, I think that those tasks should be handled in a separate PR. |
You can specify HostPath through a Local Path Provisioner. Thats what K3s does. K3s runs on the rpi and probably what id use to get a cluster going on one. I havent tested on rpi, but I think it should work. So why does HostPath need special handling? |
The local path provisioner works, but when you use that one, care still needs to be taken in some configurations to anit-affinity or daemonset the server to ensure you don't get all instances pile up on the same physical node, reducing the ability to HA the deployment. |
How does this PR avoid that scenario? |
It doesnt. Just stating it doesn't add anything I think that this PR doesn't already do? At this point I've kind of lost all track of what is being discussed for what reasons. Just trying to answer questions |
This patch adds a type flag to the persistence settings to enable specifying the backing volume's type. Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
3cf51bb
to
a6bdb4d
Compare
Folks, While I am writing this personally and not on behalf of the SPIFFE Steering Committee, I want to express a concern that the overall health of this project has been drawing SSC’s consistent attention. As a reminder, the SPIFFE Steering Committee (or SSC) is a group of volunteers that oversee the strategic direction and growth of all projects falling under the SPIFFE umbrella. It is the only group that presides over the full spectrum of SPIFFE-related work. You can read more about the mission, vision, and charter of SSC here. As we strive for progress and improvements, I wanted to bring to your attention not only this PR but also other outstanding PRs that have been stagnating for a long while. Specifically, this PR has been open since May 25th. I understand the complexity of the matter and appreciate the in-depth discussions and voting that took place to ensure the best possible outcome. It is evident from the conversations that a substantial amount of thought and effort has gone into resolving the disputes around this PR. As it currently stands, all outstanding discussions appear to be addressed satisfactorily, and all checks are passing, indicating the readiness of this PR for integration. In light of these, and to keep up with our momentum, it would be beneficial for us to move forward with the merging process. I suggest we proceed, as doing so will allow us to close this chapter and shift our focus toward new objectives and endeavors. I would like to remind us once more that it is equally important for us to remain cognizant of the broader project cadence and our commitment to deadlines: While detailed discussion and extensive scrutiny are crucial, they should not lead to a loop of constant deliberation at the cost of progression. All conversations around this PR have been duly resolved, and all checks are currently passing, making it ready for integration. I want to note that our work on this project is more than a hobby or side project. It's a commitment under the guidance of the SPIFFE Steering Committee, and the project remains under the spiffe GitHub organization. Stagnation in PRs such as this can inadvertently impact our ability to manage releases effectively and adhere to set timelines. We must balance our attention between resolving current issues and advancing toward our future objectives. For this reason, I propose we aim to increase our project cadence by setting definitive goals and timelines for decision-making processes. By doing so, we can ensure productive discussions while still maintaining momentum and meeting our obligations. Thank you for your understanding and consideration. I eagerly look forward to further collaborative efforts and progression. With respect, Volkan. |
Issues raised in Marcos review were moved into #408
* d2e1606 issuer naming should respect issuer_name override (#378) * a2e5c36 Bump test chart dependencies (#416) * a09e054 support annotations so oidc can be annotated (#391) * 7d94b10 Update spire to 1.7.1 (#412) * 9f4d4ac Add aws_pca to the spire-server (#404) * af13f1f Bump test chart dependencies (#401) * 9a6768b Add support for disabling container selectors (#399) * 4687e20 Merge pull request #315 from spiffe/persistence-type * e16210c Merge branch 'main' into persistence-type * 624ca9c Remove misadded lockfile (#400) * 7ce67c6 Bump actions/checkout from 3.5.2 to 3.5.3 (#395) * b85ba64 Bump helm/kind-action from 1.7.0 to 1.8.0 (#396) * a6bdb4d Add persistence type flag Signed-off-by: Faisal Memon <fymemon@yahoo.com>
* d2e1606 issuer naming should respect issuer_name override (#378) * a2e5c36 Bump test chart dependencies (#416) * a09e054 support annotations so oidc can be annotated (#391) * 7d94b10 Update spire to 1.7.1 (#412) * 9f4d4ac Add aws_pca to the spire-server (#404) * af13f1f Bump test chart dependencies (#401) * 9a6768b Add support for disabling container selectors (#399) * 4687e20 Merge pull request #315 from spiffe/persistence-type * e16210c Merge branch 'main' into persistence-type * 624ca9c Remove misadded lockfile (#400) * 7ce67c6 Bump actions/checkout from 3.5.2 to 3.5.3 (#395) * b85ba64 Bump helm/kind-action from 1.7.0 to 1.8.0 (#396) * a6bdb4d Add persistence type flag Signed-off-by: Faisal Memon <fymemon@yahoo.com>
* d2e1606 issuer naming should respect issuer_name override (#378) * a2e5c36 Bump test chart dependencies (#416) * a09e054 support annotations so oidc can be annotated (#391) * 7d94b10 Update spire to 1.7.1 (#412) * 9f4d4ac Add aws_pca to the spire-server (#404) * af13f1f Bump test chart dependencies (#401) * 9a6768b Add support for disabling container selectors (#399) * 4687e20 Merge pull request #315 from spiffe/persistence-type * e16210c Merge branch 'main' into persistence-type * 624ca9c Remove misadded lockfile (#400) * 7ce67c6 Bump actions/checkout from 3.5.2 to 3.5.3 (#395) * b85ba64 Bump helm/kind-action from 1.7.0 to 1.8.0 (#396) * a6bdb4d Add persistence type flag Signed-off-by: Faisal Memon <fymemon@yahoo.com>
* 5e2e8a9 Adds AWS KMS KeyManager support (#435) * 77fe43f Cron job to check for and update images (#249) * b7e1525 Allow job hooks to be disabled (#434) * 5e4cf6f Clarify project issues identified with nesting document (#450) * 7289351 Update spire bits to 1.7.2 (#452) * dc8a454 Array spacing in values is incorrect in a file. (#451) * 94326d9 Fixup Helm docs * ae8941c Support Nested Spire with External Agent (#117) * f40743d Improve Tornjak documentation (#439) * 0124f63 Bypass example-test for docs only changes (#449) * 48a2898 Fix chainguard image references as per issue 442 (#443) * bd393e9 Bump test chart dependencies (#445) * a52818a Add a FAQ and switch rare issue from README to it (#437) * e60f528 option to set KeyManager memory in spire server (#444) * a167ce6 Bump actions/setup-go from 4.0.1 to 4.1.0 * e774584 Bump test chart dependencies (#426) * bfec27e Fix jwtIssuer to allow for Uris including scheme (#425) * 7a6e4f8 Change Tornjak backend default port (#436) * 1e3039c Bump spire Helm Chart version from 0.11.0 to 0.11.1 (#419) * d2e1606 issuer naming should respect issuer_name override (#378) * a2e5c36 Bump test chart dependencies (#416) * a09e054 support annotations so oidc can be annotated (#391) * 7d94b10 Update spire to 1.7.1 (#412) * 9f4d4ac Add aws_pca to the spire-server (#404) * af13f1f Bump test chart dependencies (#401) * 9a6768b Add support for disabling container selectors (#399) * 4687e20 Merge pull request #315 from spiffe/persistence-type * e16210c Merge branch 'main' into persistence-type * 624ca9c Remove misadded lockfile (#400) * 7ce67c6 Bump actions/checkout from 3.5.2 to 3.5.3 (#395) * b85ba64 Bump helm/kind-action from 1.7.0 to 1.8.0 (#396) * a6bdb4d Add persistence type flag Signed-off-by: Marco Franssen <marco.franssen@gmail.com>
Please review the below changelog to ensure this matches up with the semantic version being applied. > **Note**: **Maintainers** ensure to run following after merging this PR to trigger the release workflow: > > ```shell > git checkout main > git pull > git checkout release > git pull > git merge main > git push > ``` **Changes in this release** * 5e2e8a9 Adds AWS KMS KeyManager support (#435) * 77fe43f Cron job to check for and update images (#249) * b7e1525 Allow job hooks to be disabled (#434) * 5e4cf6f Clarify project issues identified with nesting document (#450) * 7289351 Update spire bits to 1.7.2 (#452) * dc8a454 Array spacing in values is incorrect in a file. (#451) * 94326d9 Fixup Helm docs * ae8941c Support Nested Spire with External Agent (#117) * f40743d Improve Tornjak documentation (#439) * 0124f63 Bypass example-test for docs only changes (#449) * 48a2898 Fix chainguard image references as per issue 442 (#443) * bd393e9 Bump test chart dependencies (#445) * a52818a Add a FAQ and switch rare issue from README to it (#437) * e60f528 option to set KeyManager memory in spire server (#444) * a167ce6 Bump actions/setup-go from 4.0.1 to 4.1.0 * e774584 Bump test chart dependencies (#426) * bfec27e Fix jwtIssuer to allow for Uris including scheme (#425) * 7a6e4f8 Change Tornjak backend default port (#436) * 1e3039c Bump spire Helm Chart version from 0.11.0 to 0.11.1 (#419) * d2e1606 issuer naming should respect issuer_name override (#378) * a2e5c36 Bump test chart dependencies (#416) * a09e054 support annotations so oidc can be annotated (#391) * 7d94b10 Update spire to 1.7.1 (#412) * 9f4d4ac Add aws_pca to the spire-server (#404) * af13f1f Bump test chart dependencies (#401) * 9a6768b Add support for disabling container selectors (#399) * 4687e20 Merge pull request #315 from spiffe/persistence-type * e16210c Merge branch 'main' into persistence-type * 624ca9c Remove misadded lockfile (#400) * 7ce67c6 Bump actions/checkout from 3.5.2 to 3.5.3 (#395) * b85ba64 Bump helm/kind-action from 1.7.0 to 1.8.0 (#396) * a6bdb4d Add persistence type flag Signed-off-by: Marco Franssen <marco.franssen@gmail.com>
This patch adds a type flag to the persistence settings to enable specifying the backing volume's type.