Support UTF-8 label matchers: Use compat package in Alertmanager server #3567
silence/silence.go (outdated)
}

// InitFromFlags initializes the validation function from the flagger.
func InitFromFlags(l log.Logger, f featurecontrol.Flagger) {
I found an interesting issue here! If you create a silence with UTF-8 matchers and then restart Alertmanager with the "classic-matchers-parsing" feature flag enabled, you can no longer edit or expire the silence because the validation function returns an error. However, the silence will still expire once its expiration time has elapsed.
I've attempted to fix that with this commit here.
Great find! I'll take a look once we have the other changes in.
@gotjosh let me know when you have some time and we can review this together.
Great job so far, please take a look at my comments.
@@ -82,7 +83,7 @@ func (a *alertQueryCmd) queryAlerts(ctx context.Context, _ *kingpin.ParseContext
 		m := a.matcherGroups[0]
 		_, err := compat.Matcher(m)
 		if err != nil {
-			a.matcherGroups[0] = fmt.Sprintf("alertname=%s", m)
+			a.matcherGroups[0] = fmt.Sprintf("alertname=%s", strconv.Quote(m))
This is not related - just something we missed from before, right? We had silence_add, silence_query and alert_add before, so I guess it makes sense.
Yes, correct! I noticed it as the changes I made in this PR caused a test to fail.
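To illustrate why the quoting in the diff above matters, here is a minimal sketch. It is not the amtool code itself, and the helper name alertnameMatcher is hypothetical. It shows how strconv.Quote wraps the bare argument in double quotes and escapes embedded quotes, so the resulting alertname=... string stays one well-formed matcher even when the input contains spaces or quote characters.

```go
package main

import (
	"fmt"
	"strconv"
)

// alertnameMatcher is a hypothetical helper that mirrors the one-line change
// in the diff: it turns a bare argument into an alertname matcher string and
// quotes the value so that special characters survive parsing.
func alertnameMatcher(arg string) string {
	return fmt.Sprintf("alertname=%s", strconv.Quote(arg))
}

func main() {
	// Print the matcher string produced for a few representative inputs.
	for _, arg := range []string{"foo", "foo bar", `say "hi"`} {
		fmt.Println(alertnameMatcher(arg))
	}
}
```

Without the quoting, an input such as foo bar would be spliced into the matcher string verbatim, and whether it parses would depend on the parser in use.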
	_, err := am.Client().Silence.PostSilences(silenceParams)
	require.NoError(t, err)
}
Can we add a test for the case that you found - where you create a UTF-8 silence, restart the server with the classic flag and try to operate on it?
I'll take a look. I might need to make some larger changes to acceptance.go to support restarting clusters, as it seems that at present stopping a cluster deletes all its data.
at := NewAcceptanceTest(t, &AcceptanceOpts{
	Tolerance: 150 * time.Millisecond,
})
I'm confused - and running this test in isolation seems to agree with me. Where exactly do we make this test accept Unicode characters?
Remember the default mode of Alertmanager (unless you use classic-mode) is UTF-8, with fallback to classic mode. Makes me think I should rename utf8-mode to utf8-strict-mode.
Thanks! I completely forgot about this haha.
Great job! I only have 2 files left to review - silence and silence_test.
// Can get same alert from the API.
resp, err := am.Client().Alert.GetAlerts(nil)
require.NoError(t, err)
require.Len(t, resp.Payload, 1)
require.Equal(t, labels, resp.Payload[0].Labels)

// Can filter alerts on UTF-8 labels.
getAlertParams := alert.NewGetAlertsParams()
getAlertParams = getAlertParams.WithFilter([]string{"00=b", "Σ=c", "\"\\xf0\\x9f\\x99\\x82\"=dΘ"})
resp, err = am.Client().Alert.GetAlerts(getAlertParams)
require.NoError(t, err)
require.Len(t, resp.Payload, 1)
require.Equal(t, labels, resp.Payload[0].Labels)

// Can get same alert in alert group from the API.
alertGroupResp, err := am.Client().Alertgroup.GetAlertGroups(nil)
require.NoError(t, err)
require.Len(t, alertGroupResp.Payload, 1)
require.Len(t, alertGroupResp.Payload[0].Alerts, 1)
require.Equal(t, labels, alertGroupResp.Payload[0].Alerts[0].Labels)

// Can filter alertGroups on UTF-8 labels.
getAlertGroupsParams := alertgroup.NewGetAlertGroupsParams()
getAlertGroupsParams.Filter = []string{"00=b", "Σ=c", "\"\\xf0\\x9f\\x99\\x82\"=dΘ"}
alertGroupResp, err = am.Client().Alertgroup.GetAlertGroups(getAlertGroupsParams)
require.NoError(t, err)
require.Len(t, alertGroupResp.Payload, 1)
require.Len(t, alertGroupResp.Payload[0].Alerts, 1)
require.Equal(t, labels, alertGroupResp.Payload[0].Alerts[0].Labels)
Great job! ❤️
LGTM but please take a look at my comment wrt the silences validation.
I think we're missing three things at this point:
- Make the CI pass
- The test to catch the removal of validation in expire
- The avoidance of copy-pasted code
func validateUTF8Matcher(m *pb.Matcher) error {
	if !utf8.ValidString(m.Name) {
		return fmt.Errorf("invalid label name %q", m.Name)
	}
	switch m.Type {
	case pb.Matcher_EQUAL, pb.Matcher_NOT_EQUAL:
		if !utf8.ValidString(m.Pattern) {
			return fmt.Errorf("invalid label value %q", m.Pattern)
		}
	case pb.Matcher_REGEXP, pb.Matcher_NOT_REGEXP:
		if !utf8.ValidString(m.Pattern) {
			return fmt.Errorf("invalid regular expression %q", m.Pattern)
		}
		if _, err := regexp.Compile(m.Pattern); err != nil {
			return fmt.Errorf("invalid regular expression %q: %s", m.Pattern, err)
		}
	default:
		return fmt.Errorf("unknown matcher type %q", m.Type)
	}
	return nil
}
I have my reservations when it comes to this code - to me, this feels wrong. We're accepting matchers as user input from the silences API in a structured format:
"matchers": [
  {
    "name": "string",
    "value": "string",
    "isRegex": true,
    "isEqual": true
  }
],
In my opinion, this validation should happen at the API level and ideally make the input matchers go through the parser, by taking each matcher and constructing a string that, when parsed, would give you the struct directly.
My main worry is that right now validation for entities is spread out across multiple layers, which makes it difficult to track who validates what.
Not something that we need to solve right now, but it's worth thinking about.
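One way to picture the suggestion above is the following minimal sketch. The apiMatcher type and both helpers are hypothetical and only mirror the JSON shape quoted in the comment. The sketch also assumes the parser accepts double-quoted names and values, as the quoted UTF-8 filter in the acceptance test suggests. The escaping produced by strconv.Quote may not match the parser's own unquoting rules in every case, so treat this as an outline of the idea and not as a drop-in implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// apiMatcher mirrors the JSON shape quoted in the review comment. The Go type
// is hypothetical and exists only for this sketch.
type apiMatcher struct {
	Name    string
	Value   string
	IsRegex bool
	IsEqual bool
}

// operator maps the two booleans onto the four matcher operators.
func operator(m apiMatcher) string {
	switch {
	case m.IsEqual && !m.IsRegex:
		return "="
	case !m.IsEqual && !m.IsRegex:
		return "!="
	case m.IsEqual && m.IsRegex:
		return "=~"
	default:
		return "!~"
	}
}

// matcherString renders the structured matcher as text. The result could then
// be handed to whichever parser the server uses, so that the parser becomes
// the single place where matchers are validated.
func matcherString(m apiMatcher) string {
	return strconv.Quote(m.Name) + operator(m) + strconv.Quote(m.Value)
}

func main() {
	fmt.Println(matcherString(apiMatcher{Name: "foo", Value: "bar", IsEqual: true}))
	fmt.Println(matcherString(apiMatcher{Name: "Σ", Value: "c.*", IsRegex: true}))
}
```

With this shape, a matcher that the parser rejects would be refused at the API boundary, before it reaches the silences store.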
LGTM - please take a look at my comments but overall I have no further blocking questions.
Great work and thank you very much for your contribution.
I've merged #2469, can you please rebase so that I can also merge this one?
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds isDelete to setSilence to skip validation when expiring silences. This prevents an issue where a silence that is no longer valid (i.e. the rules have changed in a newer version of Alertmanager) cannot be expired via the API or UI.
Signed-off-by: George Robinson <george.robinson@grafana.com>
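The behaviour described in this commit message can be sketched roughly as follows. Only the names setSilence and isDelete come from the commit message. The silence and store types, the expire helper and the asciiOnly validator are hypothetical stand-ins, not the code in silence/silence.go.

```go
package main

import (
	"errors"
	"fmt"
)

// silence is a stripped-down, hypothetical stand-in for the real type.
type silence struct {
	ID       string
	Matchers []string
	Expired  bool
}

// store holds silences and a pluggable validation function, so the rules can
// change between "restarts" in this sketch.
type store struct {
	validate func(silence) error
	data     map[string]silence
}

// setSilence validates the silence unless it is being deleted (expired).
func (s *store) setSilence(sil silence, isDelete bool) error {
	if !isDelete {
		if err := s.validate(sil); err != nil {
			return fmt.Errorf("silence invalid: %w", err)
		}
	}
	s.data[sil.ID] = sil
	return nil
}

// expire marks a stored silence as expired and skips validation.
func (s *store) expire(id string) error {
	sil, ok := s.data[id]
	if !ok {
		return errors.New("silence not found")
	}
	sil.Expired = true
	return s.setSilence(sil, true)
}

// asciiOnly is a toy validator that rejects any non-ASCII byte in a matcher.
func asciiOnly(sil silence) error {
	for _, m := range sil.Matchers {
		for i := 0; i < len(m); i++ {
			if m[i] >= 0x80 {
				return fmt.Errorf("non-ASCII matcher %q", m)
			}
		}
	}
	return nil
}

func main() {
	s := &store{validate: func(silence) error { return nil }, data: map[string]silence{}}
	sil := silence{ID: "1", Matchers: []string{"Σ=c"}}
	fmt.Println("create:", s.setSilence(sil, false))

	// Simulate a restart with stricter rules.
	s.validate = asciiOnly
	fmt.Println("update:", s.setSilence(sil, false))
	fmt.Println("expire:", s.expire("1"))
}
```

In this sketch the update is rejected after the rules change, while the expiry still goes through, which is the situation the commit is meant to allow.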
Done! 🙂
LGTM
* [CHANGE] Deprecate and remove api/v1/ #2970
* [CHANGE] Remove unused feature flags #3676
* [CHANGE] Newlines in smtp password file are now ignored #3681
* [CHANGE] Change compat metrics to counters #3686
* [CHANGE] Do not register compat metrics in amtool #3713
* [CHANGE] Remove metrics from compat package #3714
* [CHANGE] Mark muted alerts #3793
* [FEATURE] Add metric for inhibit rules #3681
* [FEATURE] Support UTF-8 label matchers #3453, #3507, #3523, #3483, #3567, #3568, #3569, #3571, #3595, #3604, #3619, #3658, #3659, #3662, #3668, 3572
* [FEATURE] Add counter to track alerts dropped outside of time_intervals #3565
* [FEATURE] Add date and tz functions to templates #3812
* [FEATURE] Add limits for silences #3852
* [FEATURE] Add time helpers for templates #3863
* [FEATURE] Add auto GOMAXPROCS #3837
* [FEATURE] Add auto GOMEMLIMIT #3895
* [FEATURE] Add Jira receiver integration #3590
* [ENHANCEMENT] Add the receiver name to notification metrics #3045
* [ENHANCEMENT] Add the route ID to uuid #3372
* [ENHANCEMENT] Add duration to the notify success message #3559
* [ENHANCEMENT] Implement webhook_url_file for discord and msteams #3555
* [ENHANCEMENT] Add debug logs for muted alerts #3558
* [ENHANCEMENT] API: Allow the Silences API to use their own 400 response #3610
* [ENHANCEMENT] Add summary to msteams notification #3616
* [ENHANCEMENT] Add context reasons to notifications failed counter #3631
* [ENHANCEMENT] Add optional native histogram support to latency metrics #3737
* [ENHANCEMENT] Enable setting ThreadId for Telegram notifications #3638
* [ENHANCEMENT] Allow webex roomID from template #3801
* [BUGFIX] Add missing integrations to notify metrics #3480
* [BUGFIX] Add missing ttl in pushhover #3474
* [BUGFIX] Fix scheme required for webhook url in amtool #3409
* [BUGFIX] Remove duplicate integration from metrics #3516
* [BUGFIX] Reflect Discord's max length message limits #3597
* [BUGFIX] Fix nil error in warn logs about incompatible matchers #3683
* [BUGFIX] Fix a small number of inconsistencies in compat package logging #3718
* [BUGFIX] Fix log line in featurecontrol #3719
* [BUGFIX] Fix panic in acceptance tests #3592
* [BUGFIX] Fix flaky test TestClusterJoinAndReconnect/TestTLSConnection #3722
* [BUGFIX] Fix crash on errors when url_file is used #3800
* [BUGFIX] Fix race condition in dispatch.go #3826
* [BUGFIX] Fix race conditions in the memory alerts store #3648
* [BUGFIX] Hide config.SecretURL when the URL is incorrect. #3887
* [BUGFIX] Fix invalid silence causes incomplete updates #3898
* [BUGFIX] Fix leaking of Silences matcherCache entries #3930
* [BUGFIX] Close SMTP submission correctly to handle errors #4006
Signed-off-by: SuperQ <superq@gmail.com>
This pull request makes the Alertmanager server use the compat package, which lets users switch between the new matchers/parse parser and the old pkg/labels parser. The new matchers/parse parser uses a fallback mechanism: if the input cannot be parsed by the new parser, it then tries the old parser. If an input parses in the old parser but not in the new one, a warning log is emitted.
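The fallback described here can be sketched as below. This is an illustration under stated assumptions, not the compat package: withFallback, parseFn, toyNewParse and toyOldParse are hypothetical names, and the two toy parsers are simple stand-ins that differ only in whether they tolerate spaces.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strings"
)

// parseFn is a hypothetical parser signature used only in this sketch.
type parseFn func(input string) (string, error)

// withFallback tries newParse first. If that fails it tries oldParse, and it
// calls warn when only the old parser accepts the input.
func withFallback(newParse, oldParse parseFn, warn func(msg string)) parseFn {
	return func(input string) (string, error) {
		out, newErr := newParse(input)
		if newErr == nil {
			return out, nil
		}
		out, oldErr := oldParse(input)
		if oldErr != nil {
			return "", fmt.Errorf("new parser: %v; old parser: %v", newErr, oldErr)
		}
		warn(fmt.Sprintf("input %q is only valid in the old parser: %v", input, newErr))
		return out, nil
	}
}

// toyNewParse is a stand-in for a stricter parser: it rejects spaces.
func toyNewParse(s string) (string, error) {
	if strings.ContainsRune(s, ' ') {
		return "", errors.New("unexpected space")
	}
	if !strings.Contains(s, "=") {
		return "", errors.New("missing operator")
	}
	return s, nil
}

// toyOldParse is a stand-in for a more lenient parser.
func toyOldParse(s string) (string, error) {
	if !strings.Contains(s, "=") {
		return "", errors.New("missing operator")
	}
	return strings.TrimSpace(s), nil
}

func main() {
	parse := withFallback(toyNewParse, toyOldParse, func(msg string) { log.Printf("warning: %s", msg) })
	for _, in := range []string{"foo=bar", "foo=bar baz", "nonsense"} {
		out, err := parse(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

Passing the warning sink in as a function keeps the sketch independent of any particular logger.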