-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: Change application identifier to allow app names longer than 63 chars #6425
Changes from 4 commits
e1b3c99
02372f9
9a7ae62
1959766
99076af
1fa7b42
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,397 @@ | ||
--- | ||
title: Change the way application resources are identified | ||
authors: | ||
- "@jannfis" | ||
sponsors: | ||
- TBD | ||
reviewers: | ||
- TBD | ||
approvers: | ||
- TBD | ||
|
||
creation-date: 2021-06-07 | ||
last-updated: 2021-06-07 | ||
--- | ||
|
||
# Change the way application resources are identified | ||
|
||
This is a proposal to introduce using a hash value as application identifier | ||
in the application instance label. This will allow application names longer | ||
than 63 characters. As an additional goal, we propose to introduce a GUID as | ||
installation ID that will allow multiple Argo CD instances manage resources | ||
on the same cluster, without the need for manual reconfiguration. | ||
|
||
## Open Questions [optional] | ||
|
||
The major open questions are: | ||
|
||
* Which hashing algorithm to use? The authors think that `SHA-1` and `FNV-1a` | ||
are the best options on the table, with `FNV-1a` being the more performant | ||
one and `SHA-1` being the one most resilient against collisions with a | ||
compromise for speed. There is a pretty good comparison of non cryptographic | ||
hash algorithms on Stack Exchange that suggests that `FNV-1a` might be more | ||
than sufficient for our use-cases: | ||
https://softwareengineering.stackexchange.com/a/145633 | ||
|
||
* The proposal defines a stretch goal for allowing multiple Argo CD instances | ||
to manage applications on the same cluster, without the need for changing | ||
the _application instance label key_ of each installation. This change may | ||
sound intrusive, but the authors think the benefits would outweigh the | ||
disadvantages. | ||
|
||
## Summary | ||
|
||
Argo CD identifies resources it manages by setting the _application instance | ||
label_ to the name of the managing `Application` on all resources that are | ||
managed (i.e. reconciled from Git). The default label used is the well-known | ||
label `app.kubernetes.io/instance`. | ||
|
||
This proposal suggests to change the _value_ of this label to not use the | ||
literal application name, but instead use a value from a stable and reasonable | ||
collision free hash algorithm. | ||
|
||
jgwest marked this conversation as resolved.
Show resolved
Hide resolved
|
||
## Motivation | ||
|
||
The main motivation behind this change is the Kubernetes restriction for the | ||
maximum allowed length of label values, which no more than 63 characters. | ||
|
||
In large scale installations, in order to build up an easy to understand and | ||
well-formed naming schemes for applications managed by Argo CD, people often | ||
hit the 63 character limit and need to define the naming scheme around this | ||
unnecessary limit. | ||
|
||
Furthermore, proposed changes such as described in | ||
https://github.com/argoproj/argo-cd/pull/6409 | ||
would require the application names to include more implicit information | ||
(such as the application's source namespace), which will even add more | ||
characters to the application names. | ||
|
||
An additional motivation - while we're at touching at application instance | ||
label - is to improve the way how multiple Argo CD instances could manage | ||
applications on the same cluster, without requiring the user to actually | ||
perform instance specific configuration. | ||
|
||
### Goals | ||
|
||
* Allow application names of more than 63 characters | ||
|
||
* Keep using a label to properly _select_ resources managed by an Application | ||
using Kubernetes label selectors | ||
|
||
* Keep having a human-readable way to identify resources that belong to a | ||
given Argo CD application | ||
|
||
* As a stretch-goal, allow multiple Argo CD instances to manage resources on | ||
the same cluster without the need for configuring application instance label | ||
key (usually `app.kubernetes.io/instance`) | ||
|
||
### Non-Goals | ||
|
||
* Change the default name of the application instance label | ||
|
||
## Proposal | ||
|
||
We propose to move from a _human-readable_ value to identify the resources of | ||
an application to a _machine-readable_ value of fixed length. For this purpose | ||
we propose to use a one-way hashing algorithm. The chosen algorithm should be | ||
_collision-free_ to a certain extent, but does not require to be _secure_ in a | ||
cryptographic evaluation. | ||
|
||
We further propose to add an _annotation_ to the managed resources, which will | ||
contain the plain text value of the application's name so that humans and other | ||
consumers of the resources will still be able to identify the application that | ||
is managing the resource. Annotations in Kubernetes don't suffer from the same | ||
length restrictions as labels, and can easily store about every possible | ||
application name that users might come up with. | ||
|
||
Furthermore, we may want to consider encoding a unique, persistant identifier | ||
of the Argo CD installation into the final hash value, e.g. | ||
|
||
``` | ||
value = hashFunc(installationIdentifier + "." + applicationName) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My suggestion would be that hashValue can be applicationName+"."+resourceKey. So that we know what the name of original resourceName in case when some kubernetes operator clone the resource. By that we can also aim to solve another usecase of ignoreExtraneous resources outside target namespace There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I really like the idea to bring in Maybe it would make more sense to add it as an additional annotation on the resource? E.g. But at the end, this will be a design decision: Do we want to support above use-case, e.g. using the label as a selector for resources managed by a given application, or not. I clearly see the benefits of using the label as a selector, but I don't have any strong opinions on it. We should just be aware of the consequences. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that it is very convenient to be compatible with
So I think it would be great if keep using @jessesuen had a good point about trying to keep app mark human readable. How about we add new app-level field This would allow us to upgrade smoothly and takes care about both edge cases. WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The |
||
``` | ||
|
||
This way, we could support multiple Argo CD instances managing applications on | ||
the same cluster _without_ the need for those installations to have their | ||
instance label name changed to a unique value in order to prevent a "split | ||
brain" scenario where multiple Argo CD instances claim ownership to given | ||
resources, which could potentially result in severe data loss (unintentinoal | ||
resource pruning). | ||
|
||
### Use cases | ||
|
||
Add a list of detailed use cases this enhancement intends to take care of. | ||
|
||
#### Use case 1: Allow for more than 63 characters in application name | ||
|
||
As a user, I would like to be able to give my applications names with arbitrary | ||
length, because I want to include identifiers like target regions and possibly | ||
availability zones, the environment and possibly other identifiers (e.g. a team | ||
name) in the application names. The current restriction of 63 characters is not | ||
sufficient for my naming requirements. | ||
|
||
#### Use case 2: Allow for retrieving all resources using Kubernetes | ||
|
||
As an administrator, I want to enable my users to use more than 63 characters | ||
in their application names, but I still want to be able to retrieve all of the | ||
resources managed by that particular application using Kubernetes mechanisms, | ||
e.g. a label selector as in the following example: | ||
|
||
``` | ||
kubectl get deployments -l app.kubernetes.io/instance=<application> --all-namespaces | ||
``` | ||
|
||
#### Use case 3: Multiple Argo CD instances managing apps on same cluster | ||
|
||
I also want to be able to see which application and Argo CD instance is the | ||
one in charge of a given resource. | ||
|
||
### Implementation Details/Notes/Constraints [optional] | ||
|
||
#### Use a hash value for the application name | ||
|
||
* The application controller needs to be adapted to use the computed hash value | ||
for determining whether the inspected resource is being managed by the | ||
current application. | ||
|
||
* The used hash algorithm is _not required_ to be cryptographically secure, but | ||
should be _reasonable resilient_ against collisions. It should also be | ||
_reasonable fast_ in computing the hash. | ||
|
||
* The _application instance label_ on a managed resource should be set to the | ||
hexa-decimal representation of the computed hash value for the application | ||
name. What becomes the source for the hash's computation has yet to be | ||
decided upon (see above). | ||
|
||
* Additionally, a new annotation should be introduced to managed resources. | ||
This annotation will hold the _human readable_ representation of the name | ||
of the application the resource belongs to. This annotation only serves an | ||
informational purpose for humans, or other tools that need to determine | ||
the name of the application that manages a given resource. This may seem | ||
redundant at first, but we believe is a requirement for many users. | ||
|
||
For example, a managed resource could look like the following after this | ||
change has been implemented using `SHA-1` algorithm when the application's | ||
name is `some-application` and noted `installationIdentifier` isn't taken | ||
into account yet): | ||
|
||
```yaml | ||
apiVersion: v1 | ||
Kind: Secret | ||
metadata: | ||
name: some-secret | ||
namespace: some-namespace | ||
labels: | ||
app.kubernetes.io/instance: 3fbe5782d494cc956140bc58386c971bf0e96fad | ||
annotations: | ||
argo-cd.argoproj.io/application-name: some-application | ||
``` | ||
|
||
The hash-value in the above `app.kubernetes.io/instance` label is simply the | ||
hexa-decimal representation of the SHA-1 value for `some-application`: | ||
|
||
```shell | ||
$ echo "some-application" | sha1sum | ||
3fbe5782d494cc956140bc58386c971bf0e96fad - | ||
``` | ||
|
||
For humans or consumers that need the application name, the application name | ||
is stored in clear text in an annotation. Suggestion for this annotation name | ||
would be `argo-cd.argoproj.io/application-name`. | ||
|
||
#### Allow multiple Argo CD instances manage applications on same cluster | ||
|
||
As of today, to allow two or more Argo CD instances with a similar set of | ||
permissions (e.g. cluster-wide read access to resources) manage applications | ||
on the same cluster, users would have to configure the _application instance | ||
label key_ in the Argo CD configuration to a unique value. Otherwise, if an | ||
application with the same name exists in two different Argo CD installations, | ||
both would claim ownership of the resources of that application. | ||
|
||
We do see the need for preventing such scenarios out-of-the-box in Argo CD. | ||
For this, we do suggest the introduction of an _installation ID_ in the | ||
form of a standard _GUID_. | ||
|
||
This GUID would be generated once by Argo CD upon startup, and is persisted in | ||
the Argo CD configuration, e.g. by storing it as `installationID` in the | ||
`argocd-cm` ConfigMap. The GUID of the installation would need to be encoded | ||
in some way in the resources managed by that Argo CD instance, and we do see | ||
two main possibilities to do so: | ||
|
||
* Encode the GUID in the application name's hash value, e.g. (as noted above) | ||
by concatenating the GUID with the application name before hashing it: | ||
|
||
```yaml | ||
value = hashFunction(GUID + "." + applicationName) | ||
``` | ||
|
||
So, given a GUID of `61199294-412c-4e78-a237-3ebba6784fcd`, the previous | ||
example for `some-application` would become: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
Kind: Secret | ||
metadata: | ||
name: some-secret | ||
namespace: some-namespace | ||
labels: | ||
app.kubernetes.io/instance: 1e2927872cdb73ec3c4010b8afeba2c4be1a92cb | ||
annotations: | ||
argo-cd.argoproj.io/application-name: some-application | ||
``` | ||
|
||
To generate the hash manually (e.g. for selection of resources by label), | ||
the user would have to take the GUID into account: | ||
|
||
```shell | ||
$ echo "61199294-412c-4e78-a237-3ebba6784fcd.some-application" | sha1sum | ||
1e2927872cdb73ec3c4010b8afeba2c4be1a92cb - | ||
``` | ||
|
||
* Use a dedicated label to store the GUID instead of encoding it in the | ||
application name, and modify Argo CD so that it matches _both_, the app | ||
instance key and the GUID to determine whether a resource is managed by | ||
this Argo CD instance. Given above mentioned GUID, this may look like the | ||
following on a resource: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
Kind: Secret | ||
metadata: | ||
name: some-secret | ||
namespace: some-namespace | ||
labels: | ||
app.kubernetes.io/instance: 3fbe5782d494cc956140bc58386c971bf0e96fad | ||
argo-cd.argoproj.io/installation-id: 61199294-412c-4e78-a237-3ebba6784fcd | ||
annotations: | ||
argo-cd.argoproj.io/application-name: some-application | ||
``` | ||
|
||
Both methods have their advantages and disadvantages. We would prefer to use | ||
the first approach and encode the GUID into the application name's hash, as | ||
this would not introduce a new label. However, when manually selecting the | ||
resources of any given application, the user would have to take this GUID | ||
into account when constructing the hash to use for selecting resources. | ||
|
||
### Detailed examples | ||
|
||
### Security Considerations | ||
|
||
We think this change will not have a direct impact on the security of Argo CD | ||
or the applications it manages. | ||
|
||
One concern however is that when a user intentionally produces an application | ||
name whose hash will collide with another, existing application's hash. That | ||
may lead to the undesired deletion (pruning) of resources that do in fact | ||
belong to another application outside the adversaries control. However, this | ||
risk could be properly mitigated by permissions in the `AppProject`, since | ||
Argo CD will never prune resources that are not allowed for any specific | ||
Application. | ||
|
||
### Risks and Mitigations | ||
|
||
The main risks from the authors' points of view are collisions in the hashes | ||
of application names. For example, if the strings `some-app` and `other-app` | ||
produce the same hash, this could result in undesired behavior, especially | ||
when unintentional. The mitigation for this is to chose a hashing algorithm | ||
with a high resilience against collisions. | ||
|
||
Another risk is that third party tools that need to map a managed resource | ||
in the cluster back to the managing application rely on the clear text value | ||
of the application instance label. These tools would have to be adapted to | ||
read the proposed annotation instead of the application instance label. | ||
|
||
We think that this may be a breaking change, however, the advantages outweigh | ||
the risks. | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
Upgrading to a version that implements this proposal should be seamless, as | ||
previously injected labels will not be removed and additional annotations will | ||
be applied to the resource. E.g. consider following resource in Git, that will | ||
be synced as part of an application named `some-application`. In Git, the | ||
resource looks like follows: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
Kind: Secret | ||
metadata: | ||
name: some-secret | ||
namespace: some-namespace | ||
``` | ||
|
||
When synced with the current incarnation of Argo CD, Argo CD would inject the | ||
application instance label and once the resource is applied in the cluster, it | ||
would look like follows: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
Kind: Secret | ||
metadata: | ||
name: some-secret | ||
namespace: some-namespace | ||
labels: | ||
app.kubernetes.io/instance: some-application | ||
``` | ||
|
||
Once Argo CD is updated to a version implementing this proposal, the resource | ||
would be rewritten to look like the following: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
Kind: Secret | ||
metadata: | ||
name: some-secret | ||
namespace: some-namespace | ||
labels: | ||
app.kubernetes.io/instance: 3fbe5782d494cc956140bc58386c971bf0e96fad | ||
annotations: | ||
argo-cd.argoproj.io/application-name: some-application | ||
``` | ||
|
||
This would be very similar to _renaming_ an application, as in the managed | ||
resources metadata would be modified in the cluster. | ||
|
||
On a rollback to a previous Argo CD version, this change would be reverted | ||
and the resource would look like the first shown example above. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How would Argo CD know whether a resource's For example:
I think we need the ability to look at the One option we could use is look at the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good point you raise there. I haven't yet verified, but I think it would be similar to the behavior if you rename an application, e.g. by deleting an Argo CD would just go and overwrite what's in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think if an Argo CD instance encounters a resource with a So, presuming that Argo CD encounters a resource it thinks is owned by someone else, and it has the same name/namespace as a resouce managed by an
Probably 2), and report an error ('error: Argo CD Application would overwrite a resource that may be owned by another Argo CD instance'). It might be worthwhile to document the algorithm that Argo CD will use to move existing resources from the old style to the new style, in the upgrade scenario. I'm guessing it would be:
|
||
## Drawbacks | ||
|
||
We do see some drawbacks to this implementation: | ||
|
||
* The already mentioned possible backwards incompatibility with existing tools | ||
that rely on reading the `app.kubernetes.io/instance` label to map back a | ||
resource to its managing application. | ||
|
||
* This change would trigger a re-sync of each and every managed resource, which | ||
may result in unexpected heavy load on Argo CD and the cluster at upgrade | ||
time. | ||
|
||
* People manually selecting resources from an application must create the | ||
value for the label selector instead of just using the application name, e.g. | ||
instead of | ||
|
||
```shell | ||
kubectl get secrets -l app.kubernetes.io/instance=some-application --all-namespaces | ||
``` | ||
|
||
Something like the following needs to be constructed: | ||
|
||
```shell | ||
kubectl get secrets -l app.kubernetes.io/instance=$(echo "some-application" | sha1sum | awk '{ print $1; }') --all-namespaces | ||
``` | ||
|
||
* If we chose to also implement the GUID as application identifier, the GUID | ||
token becomes a viable part of the _state_ and needs to be backed up as | ||
part of any recovery procedures. Without this GUID, resources not anymore | ||
existing in Git will not get pruned upon removal, thus becoming stale in | ||
the cluster. A resource synced with a given GUID will only be ever removed | ||
again if the GUID of the pruning instance is the same as at sync time. | ||
|
||
## Alternatives | ||
|
||
* Enabling application names longer than 63 characters could also be done | ||
by having the _application instance label_ to become an annotation instead. | ||
This is a much simpler solution, however, this would disable the possibility | ||
to use label selectors for retrieving all resources managed by a given | ||
application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: hash choice -
If we're going to go with SHA-based algorithm, I recommend SHA-256, as SHA-1 is considered broken. It doesn't matter for this use case (we're not using the cryptographic properties of a hash, as you note later), but we will constantly get people telling us it's broken and asking us when we will move to another algorithm 😄 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, correct me if I'm wrong, but I don't think hash performance is an issue: for each Application, we only need to calculate this value once, and then store it in memory (eg memoize the result of the hash function for each application).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a good argument there. I favored the SHA-1 variant for producing the shorter hash values, but I think you are right in that we may want to consider usage of SHA-256 instead. Although we do not need to be secure in cryptographic terms here, probably SHA-1 will become utterly deprecated soon enough (maybe up to the point of being removed from any standard libraries?), and even Git will move away from it. So yeah, it leaves a "bad aftertaste" kind of thing :)
Thanks for bringing it up, I will change it in the proposal accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we've got two similar discussion threads in this PR. Copying comments from another thread here.
As I understand there are trying to solve three issues:
app.kubernetes.io/instance
labelAlso, I think additional requirements are: try to keep it human-friendly and try avoid breaking changes.
To solve I want to propose:
argo-cd.argoproj.io/application
,argo-cd.argoproj.io/resourceKey
, ``argo-cd.argoproj.io/installation` ) that includes app name, resource key and argocd installation idapp.spec.tracking
field. (this comment tries to justify why new way of tracking should be optional)So main changes in my proposal:
@jannfis , @jgwest what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not provide two different methods for tracking applications, as it will complicate things greatly and probably will cause some confusion among users.
What speaks against using the label for tracking and the annotations for human-readable identifiers together? So the annotations will only serve informational purposes, and the label is used to select and identify resources.
To solve
I would rather propose to change the default
app.kubernetes.io/instance
label to something more sensible, Argo CD specific. E.g.argocd.argoproj.io/instance
- I think the Argo CD Helm chart already sets this by default.IMHO, I think we shouldn't sacrifice the label in order to enable longer-than-63-chars application names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to avoid painful migration for end-users: if Argo CD changes the tracking method then all managed resources suddenly became out-of-sync.
However, I agree that two ways if tracking is not great. In order to make it easier to migrate, we should support using "legacy" way of tracking for 1~2 releases.