Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Change application identifier to allow app names longer than 63 chars #6425

Merged
merged 6 commits into from
Sep 13, 2021

Conversation

jannfis
Copy link
Member

@jannfis jannfis commented Jun 8, 2021

This is a proposal to change the way application resources are identified within Argo CD to allow application names longer than 63 characters.

Signed-off-by: jannfis jann@mistrust.net

Note on DCO:

If the DCO action in the integration test fails, one or more of your commits are not signed off. Please click on the Details link next to the DCO action for instructions on how to resolve this.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • Optional. My organization is added to USERS.md.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).

jannfis added 2 commits June 8, 2021 08:20
Signed-off-by: jannfis <jann@mistrust.net>
Signed-off-by: jannfis <jann@mistrust.net>
@jannfis jannfis changed the title proposal: Change application identifier proposal: Change application identifier to allow app names longer than 63 chars Jun 8, 2021
@codecov
Copy link

codecov bot commented Jun 8, 2021

Codecov Report

Merging #6425 (1fa7b42) into master (12c95d6) will increase coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6425      +/-   ##
==========================================
+ Coverage   41.01%   41.03%   +0.02%     
==========================================
  Files         152      158       +6     
  Lines       20088    21299    +1211     
==========================================
+ Hits         8239     8740     +501     
- Misses      10716    11306     +590     
- Partials     1133     1253     +120     
Impacted Files Coverage Δ
util/db/repository.go 40.00% <0.00%> (-22.74%) ⬇️
util/cache/cache.go 56.96% <0.00%> (-13.26%) ⬇️
util/git/workaround.go 46.87% <0.00%> (-6.07%) ⬇️
util/webhook/webhook.go 62.36% <0.00%> (-3.73%) ⬇️
server/account/account.go 65.89% <0.00%> (-3.28%) ⬇️
reposerver/cache/cache.go 64.51% <0.00%> (-2.93%) ⬇️
server/project/project.go 52.69% <0.00%> (-2.78%) ⬇️
util/git/git.go 75.75% <0.00%> (-2.37%) ⬇️
util/db/cluster.go 60.00% <0.00%> (-2.31%) ⬇️
util/jwt/jwt.go 60.00% <0.00%> (-1.91%) ⬇️
... and 71 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12c95d6...1fa7b42. Read the comment docs.

Signed-off-by: jannfis <jann@mistrust.net>

On a rollback to a previous Argo CD version, this change would be reverted
and the resource would look like the first shown example above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would Argo CD know whether a resource's app.kubernetes.io/instance was applied using the old system or the new system (the one described in this proposal)?

For example:

  1. A cluster is running with the old version of Argo CD
  2. All the application resources are labelled using the old system
  3. Argo CD is upgraded
  4. The new Argo CD looks at all the existing app.kubernetes.io/instance fields of the existing resources, discovers that they don't match any expected hash (and thus don't belong to any Application), and ignores them (recreating new Application resources).

I think we need the ability to look at the app.kubernetes.io/instance field of a resource and know which system it was created with.

One option we could use is look at the app.kubernetes.io/instance value and determine if it is a 32 character hexadecimal value, but this is not foolproof (users may be using Applications with names that are 32-byte hexadecimal values).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point you raise there.

I haven't yet verified, but I think it would be similar to the behavior if you rename an application, e.g. by deleting an Application and re-creating it with a different name, but same parameters (source + destination).

Argo CD would just go and overwrite what's in app.kubernetes.io/instance - at least as long as another application with the previous name does not exist, in which case Argo CD would issue a Shared Resource Warning and prevent the overwrite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if an Argo CD instance encounters a resource with a app.kubernetes.io/instance value that it doesn't recognize (eg it does not correspond to a hash of a known application name+install id value), then Argo CD must assume that that resource is owned by another Argo CD instance (the goal of 'allow multiple Argo CD instances to manage resources on the same cluster').

So, presuming that Argo CD encounters a resource it thinks is owned by someone else, and it has the same name/namespace as a resouce managed by an Application managed by the current Argo CD instance, do we:

  1. Take ownership of it with the current Argo CD instance
  2. Leave it alone

Probably 2), and report an error ('error: Argo CD Application would overwrite a resource that may be owned by another Argo CD instance').

It might be worthwhile to document the algorithm that Argo CD will use to move existing resources from the old style to the new style, in the upgrade scenario.

I'm guessing it would be:

  1. If we encounter a resource that contains a app.kubernetes.io/instance that is not a 32/64 character hexadecimal value, AND it matches the name field of an existing Argo CD Application, then convert it to the new system (ensure the app.kubernetes.io/instance field is updated)
    • As you said, this might already be handled by the rename-style behaviour.
    • Another option would be to delete the resource, and hope that Argo CD recreates it.
  2. If we encounter a resource that contains a app.kubernetes.io/instance that is a 32/64 character hexadecimal value, leave it alone. (Assume it's a valid instance id from another Argo CD instance).
    • Unless it has the same Kubernetes resource name as a resource that this Argo CD instance is trying to create, in which case report the above error ('error: would overwrite a resource that may be owned...')
  3. Else, either delete or leave it alone (orphan it).
    (in very rare cases this might orphan or delete old resources, but it's probably fine.)

hash algorithms on Stack Exchange that suggests that `FNV-1a` might be more
than sufficient for our use-cases:
https://softwareengineering.stackexchange.com/a/145633

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: hash choice -

If we're going to go with SHA-based algorithm, I recommend SHA-256, as SHA-1 is considered broken. It doesn't matter for this use case (we're not using the cryptographic properties of a hash, as you note later), but we will constantly get people telling us it's broken and asking us when we will move to another algorithm 😄 .

Copy link
Member

@jgwest jgwest Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, correct me if I'm wrong, but I don't think hash performance is an issue: for each Application, we only need to calculate this value once, and then store it in memory (eg memoize the result of the hash function for each application).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a good argument there. I favored the SHA-1 variant for producing the shorter hash values, but I think you are right in that we may want to consider usage of SHA-256 instead. Although we do not need to be secure in cryptographic terms here, probably SHA-1 will become utterly deprecated soon enough (maybe up to the point of being removed from any standard libraries?), and even Git will move away from it. So yeah, it leaves a "bad aftertaste" kind of thing :)

Thanks for bringing it up, I will change it in the proposal accordingly.

Copy link
Collaborator

@alexmt alexmt Jun 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we've got two similar discussion threads in this PR. Copying comments from another thread here.

As I understand there are trying to solve three issues:

  • allow app with long names
  • support two argocd managing apps in the same cluster
  • avoid argocd confusion that happened when something else adds app.kubernetes.io/instance label

Also, I think additional requirements are: try to keep it human-friendly and try avoid breaking changes.

To solve I want to propose:

  • use annotations instead of labels for tracking to solve long names issue
  • use three annotations (e.g. argo-cd.argoproj.io/application, argo-cd.argoproj.io/resourceKey, ``argo-cd.argoproj.io/installation` ) that includes app name, resource key and argocd installation id
  • allow the user to switch to annotation-based tracking when necessary using new app.spec.tracking field. (this comment tries to justify why new way of tracking should be optional)

So main changes in my proposal:

  • use annotations + human-readable strings instead of hash + labels.
  • new way of tracking should be explicitly enabled per app

@jannfis , @jgwest what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not provide two different methods for tracking applications, as it will complicate things greatly and probably will cause some confusion among users.

What speaks against using the label for tracking and the annotations for human-readable identifiers together? So the annotations will only serve informational purposes, and the label is used to select and identify resources.

To solve

avoid argocd confusion that happened when something else adds app.kubernetes.io/instance label

I would rather propose to change the default app.kubernetes.io/instance label to something more sensible, Argo CD specific. E.g. argocd.argoproj.io/instance - I think the Argo CD Helm chart already sets this by default.

IMHO, I think we shouldn't sacrifice the label in order to enable longer-than-63-chars application names.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to avoid painful migration for end-users: if Argo CD changes the tracking method then all managed resources suddenly became out-of-sync.

However, I agree that two ways if tracking is not great. In order to make it easier to migrate, we should support using "legacy" way of tracking for 1~2 releases.

Signed-off-by: jannfis <jann@mistrust.net>
of the Argo CD installation into the final hash value, e.g.

```
value = hashFunc(installationIdentifier + "." + applicationName)
Copy link
Contributor

@kshamajain99 kshamajain99 Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion would be that hashValue can be applicationName+"."+resourceKey. So that we know what the name of original resourceName in case when some kubernetes operator clone the resource.

By that we can also aim to solve another usecase of ignoreExtraneous resources outside target namespace

#6324

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea to bring in resourceKey somewhere. But I think if we put it into the application instance label, we will end up with a different value on each resource. This makes it utterly hard to use the label as a selector, e.g. to select all resources belonging to a given application by running kubectl get somekind -l app.kubernetes.io/instance=<somevalue>

Maybe it would make more sense to add it as an additional annotation on the resource? E.g. argo-cd.argoproj.io/resourceKey instead of trying to condense it.

But at the end, this will be a design decision: Do we want to support above use-case, e.g. using the label as a selector for resources managed by a given application, or not. I clearly see the benefits of using the label as a selector, but I don't have any strong opinions on it. We should just be aware of the consequences.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is very convenient to be compatible with kubectl get -l app.kubernetes.io/instance=<somevalue> . This works most of the time except few edge cases:

  • app name is too long
  • something else add app.kubernetes.io/instance label and confuses argo cd. This is the most annoying issue and we don't have a very good resolution.

So I think it would be great if keep using app.kubernetes.io/instance label by default and allow to switch to new way when necessary.

@jessesuen had a good point about trying to keep app mark human readable. How about we add new app-level field resourceTracking that might be label or annotation. If user select annotation that Argo CD should apply app.kubernetes.io/instance annotation where annotation value is <appName>-<resouceName> ?

This would allow us to upgrade smoothly and takes care about both edge cases. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argo-cd.argoproj.io/resourceKey annotation instead of embedding resource key into app.kubernetes.io/instance works too

Signed-off-by: Alexander Matyushentsev <amatyushentsev@gmail.com>
@alexmt
Copy link
Collaborator

alexmt commented Sep 7, 2021

We brainstormed about this change with @jannfis and @jessesuen offline. Created the PR to @jannfis 's fork to capture the proposed changes: jannfis#17

docs: update proposal to use annotations with human friendly values
Copy link
Collaborator

@alexmt alexmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alexmt alexmt merged commit d823efe into argoproj:master Sep 13, 2021
@pasha-codefresh pasha-codefresh mentioned this pull request Sep 17, 2021
3 tasks
plakyda-codefresh pushed a commit to plakyda-codefresh/argo-cd that referenced this pull request Sep 28, 2021
…n 63 chars (argoproj#6425)

proposal: Change application identifier to allow app names longer than 63 chars (argoproj#6425)

Signed-off-by: jannfis <jann@mistrust.net>
Signed-off-by: viktorplakida <plakyda1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants