-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(registration): registration checks and re-registration flow #86
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the intended result if the user deletes the jvmId==hash credential within Cryostat? After a non-200 GET response from the check, the agent will repeatedly try to POST another registration request many many times in quick succession until something happens in the end that I can't understand.
Here is the attached logs, but it seems both agent worker threads POST to the discovery endpoint during the same time and the retries stop. Then we must sigkill that agent.
Hmm, I'll try to figure out what's happening there. If the stored credentials get deleted then that shouldn't actually do anything to the agent right away - its registration information and token are still valid, so it should still be able to check its registration status periodically. If it needed to it should still be able to publish updates about its discovery subtree (just itself) as well, but the timing there is very unlikely. However, once the credentials are deleted, then when Cryostat tries to ping the agent instance after some time, the ping should fail because Cryostat is unable to look up the stored credentials that the agent expects, resulting in the agent rejecting Cryostat's ping. This should lead to Cryostat deregistering the agent. The next time the agent tries to check its registration state it will see this and should re-enter its initial registration flow, which should start with attempting to re-submit new stored credentials. |
I think I've solved it with the last commit. @maxcao13 give it another test run please |
I tried the pr and the issue with the looping registration is fixed now, but I notice something else weird now. I tried lowering the ping period of the discovery storage pinging the plugins |
Ah yes, I think that makes sense with the way the agent's internal state machinery works. The discovery ping request tells the agent to re-register itself, which really means de-register and start the registration flow over. But, the harvester also intentionally stops itself when the agent has become deregistered, and starts on successful registration. So if the discovery ping period is shorter than the harvester period you'll end up with the agent probably never even attempting to push harvested files, or if it does then perhaps it gets interrupted part way through sometimes. There's probably some approach I could take like adding an internal state for "re-registering" to allow this to smoothly transition over, but I'm not sure how worthwhile it is. This does point out an interesting side-effect of the discovery ping though, because it means that the harvester period will result in files being pushed with that periodicity on one interval, but the interval will reset at the next re-registration time. If the discovery ping period is relatively long and the harvester period is relatively short then this won't be very noticeable, but as these two periods become closer in value then the skew could become noticeable, up to the point you've identified where the skew actually can prevent any harvesting from happening at all. I think that's a separate issue to work on, maybe for next development cycle. |
02293bb
to
81ed5ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Signed-off-by: Andrew Azores <aazores@redhat.com>
Signed-off-by: Andrew Azores <aazores@redhat.com>
Signed-off-by: Andrew Azores <aazores@redhat.com>
remote credentials deleted externally Signed-off-by: Andrew Azores <aazores@redhat.com>
81ed5ed
to
e0dcabe
Compare
Rebased + signed |
Fixes #85
Depends on https://github.com/cryostatio/cryostat/pull/1425