-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
feat(aci): Ensure DetectorGroup for new events on existing error Groups #103419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #103419 +/- ##
===========================================
- Coverage 80.67% 80.67% -0.01%
===========================================
Files 9243 9243
Lines 394826 394936 +110
Branches 25152 25152
===========================================
+ Hits 318538 318615 +77
- Misses 75841 75874 +33
Partials 447 447 |
mifu67
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're only touching error issues here—ensure_association_with_detector is never called with a detector ID. Did you want to handle other grouptypes in a separate PR?
In any case, I think that save_issue_occurrence is a good place to handle healing the relationship for metric issues. We can get the detector ID using occurrence_data.evidence_data. Let me know too if you want me to handle that.
| flags=FLAG_AUTOMATOR_MODIFIABLE, | ||
| ) | ||
|
|
||
| register( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just confirming that you know what this is doing (I have never seen flags like this before but I trust you 😅)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, I've used option values like this for infrastructural kill switches a few times in the past. We usually use flags that let us roll things out, but if the real goal is just be able to turn something off very quickly and it's not necessarily and org or project specific thing, this seems the easiest way.
I was initially just targeting errors here, since for metric issues there was already a migration PR out, but I intentionally left the helper method more general to allow us to extend it to other issue types. So, happy to do that here. |
We want DetectorGroups for all error Groups.
This is a somewhat tricky thing to backfill, but we can do it incrementally as new events come in for the cost of a single query in the common case. This should result in full coverage in the lifespan of a Group (30 days?) and ensures that any recurring older group will have coverage, which is nice.
A real backfill would certainly be faster, but this is easy, safe. and largely makes a real backfill unnecessary.