Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support UTF-8 label matchers: Use compat package in Alertmanager server #3567

Conversation

grobinson-grafana
Copy link
Contributor

@grobinson-grafana grobinson-grafana commented Oct 20, 2023

This pull request adds use of the compat package in Alertmanager server that will allow users to switch between the new matchers/parse parser and the old pkg/labels parser. The new matchers/parse parser uses a fallback mechanism where if the input cannot be parsed in the new parser it then attempts to use the old parser. If an input is parsed in the old parser but not the new parser then a warning log is emitted.

@grobinson-grafana grobinson-grafana changed the title Use compat package in Alertmanager server Support UTF-8 label matchers: Use compat package in Alertmanager server Oct 20, 2023
@grobinson-grafana grobinson-grafana marked this pull request as draft October 20, 2023 16:04
@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch from cd3904e to 67c0042 Compare November 1, 2023 10:28
api/v2/api.go Outdated Show resolved Hide resolved
@grobinson-grafana grobinson-grafana marked this pull request as ready for review November 2, 2023 17:13
@grobinson-grafana grobinson-grafana marked this pull request as draft November 3, 2023 20:02
@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch 3 times, most recently from df1e680 to 3f30b33 Compare November 14, 2023 17:16
}

// InitFromFlags initializes the validation function from the flagger.
func InitFromFlags(l log.Logger, f featurecontrol.Flagger) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found an interesting issue here! If you create a silence with UTF-8 matchers and then restart Alertmanager with the "classic-matchers-parsing" feature flag, you can no longer edit or expire the silence because the validation function returns an error. However, the silence will still expire once it's expiration time has elapsed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've attempted to fix that with this commit here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great Find! I'll take a look once we have the other changes in.

@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch from 9ec6b89 to 7261a7f Compare November 15, 2023 13:14
@grobinson-grafana grobinson-grafana marked this pull request as ready for review November 15, 2023 13:17
@grobinson-grafana
Copy link
Contributor Author

@gotjosh let me know when you have some time and we can review this together?

@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch from 64ac0a0 to bfb4b78 Compare November 21, 2023 11:27
Copy link
Member

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job so far, please take a look at my comments.

api/v2/api.go Outdated Show resolved Hide resolved
@@ -82,7 +83,7 @@ func (a *alertQueryCmd) queryAlerts(ctx context.Context, _ *kingpin.ParseContext
m := a.matcherGroups[0]
_, err := compat.Matcher(m)
if err != nil {
a.matcherGroups[0] = fmt.Sprintf("alertname=%s", m)
a.matcherGroups[0] = fmt.Sprintf("alertname=%s", strconv.Quote(m))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related - just something we missed from before, right? We had before silence_add, silence_query and alert_add so I guess it makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct! I noticed it as the changes I made in this PR caused a test to fail.


_, err := am.Client().Silence.PostSilences(silenceParams)
require.NoError(t, err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test for the case that you found - where you create an utf-8 silence, restart the server with the classic flag and try to operate on it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a look. I might need to make some larger changes to acceptance.go to support restarting clusters, as it seems at present stopping a cluster deletes all it's data.

test/with_api_v2/acceptance/api_test.go Outdated Show resolved Hide resolved
Comment on lines 124 to 126
at := NewAcceptanceTest(t, &AcceptanceOpts{
Tolerance: 150 * time.Millisecond,
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - and running this test in isolation seems to agree with me. Where exactly do we make this test accept Unicode characters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember the default mode of Alertmanager (unless you use classic-mode) is UTF-8, with fallback to classic mode. Makes me think I should rename utf8-mode to utf8-strict-mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I completely forgot about this haha.

types/types.go Outdated Show resolved Hide resolved
}

// InitFromFlags initializes the validation function from the flagger.
func InitFromFlags(l log.Logger, f featurecontrol.Flagger) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great Find! I'll take a look once we have the other changes in.

silence/silence.go Outdated Show resolved Hide resolved
cmd/alertmanager/main.go Outdated Show resolved Hide resolved
@gotjosh gotjosh mentioned this pull request Nov 21, 2023
3 tasks
@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch 2 times, most recently from 2c60bec to ee2e8da Compare November 22, 2023 14:42
Copy link
Member

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! I only have 2 files left to review - silence and silce_test.

api/api.go Outdated Show resolved Hide resolved
api/v2/api.go Outdated Show resolved Hide resolved
Comment on lines +77 to +105
// Can get same alert from the API.
resp, err := am.Client().Alert.GetAlerts(nil)
require.NoError(t, err)
require.Len(t, resp.Payload, 1)
require.Equal(t, labels, resp.Payload[0].Labels)

// Can filter alerts on UTF-8 labels.
getAlertParams := alert.NewGetAlertsParams()
getAlertParams = getAlertParams.WithFilter([]string{"00=b", "Σ=c", "\"\\xf0\\x9f\\x99\\x82\"=dΘ"})
resp, err = am.Client().Alert.GetAlerts(getAlertParams)
require.NoError(t, err)
require.Len(t, resp.Payload, 1)
require.Equal(t, labels, resp.Payload[0].Labels)

// Can get same alert in alert group from the API.
alertGroupResp, err := am.Client().Alertgroup.GetAlertGroups(nil)
require.NoError(t, err)
require.Len(t, alertGroupResp.Payload, 1)
require.Len(t, alertGroupResp.Payload[0].Alerts, 1)
require.Equal(t, labels, alertGroupResp.Payload[0].Alerts[0].Labels)

// Can filter alertGroups on UTF-8 labels.
getAlertGroupsParams := alertgroup.NewGetAlertGroupsParams()
getAlertGroupsParams.Filter = []string{"00=b", "Σ=c", "\"\\xf0\\x9f\\x99\\x82\"=dΘ"}
alertGroupResp, err = am.Client().Alertgroup.GetAlertGroups(getAlertGroupsParams)
require.NoError(t, err)
require.Len(t, alertGroupResp.Payload, 1)
require.Len(t, alertGroupResp.Payload[0].Alerts, 1)
require.Equal(t, labels, alertGroupResp.Payload[0].Alerts[0].Labels)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! ❤️

Copy link
Member

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but please take a look at my comment wrt the silences validation.

I think we're missing three things at this point:

  1. Make the CI pass
  2. The test to catch the removal of validation in expire
  3. The avoidance of copy paste code.

silence/silence.go Outdated Show resolved Hide resolved
Comment on lines +501 to +522
func validateUTF8Matcher(m *pb.Matcher) error {
if !utf8.ValidString(m.Name) {
return fmt.Errorf("invalid label name %q", m.Name)
}
switch m.Type {
case pb.Matcher_EQUAL, pb.Matcher_NOT_EQUAL:
if !utf8.ValidString(m.Pattern) {
return fmt.Errorf("invalid label value %q", m.Pattern)
}
case pb.Matcher_REGEXP, pb.Matcher_NOT_REGEXP:
if !utf8.ValidString(m.Pattern) {
return fmt.Errorf("invalid regular expression %q", m.Pattern)
}
if _, err := regexp.Compile(m.Pattern); err != nil {
return fmt.Errorf("invalid regular expression %q: %s", m.Pattern, err)
}
default:
return fmt.Errorf("unknown matcher type %q", m.Type)
}
return nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have my reservations when it comes to this code - to me, this feels wrong. We're accepting matches as user input from the silences API in a structured format:

 "matchers": [
    {
      "name": "string",
      "value": "string",
      "isRegex": true,
      "isEqual": true
    }
  ],

In my opinion, this validation should happen at an API level and ideally make the input matchers go through the parser, by taking each parser and constructing a string that would you give you the struct directly.

My main worry is that right now validation for entities is spreadout in multiple layers which makes it difficult to track who validates what.

Not something that we need to solve right now, but it's worth thinking about.

@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch from 423ddc7 to 422e80f Compare November 23, 2023 10:34
@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch from f2529d0 to a33c000 Compare November 23, 2023 16:35
Copy link
Member

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - please take a look at my comments but overall I have no further blocking questions.

Great work and thank you very much for your contribution.

silence/silence.go Show resolved Hide resolved
@gotjosh
Copy link
Member

gotjosh commented Nov 24, 2023

I've merged #2469, can you please rebase so that I can also merge this one?

Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds isDelete to setSilence to skip validation when
expiring silences. This prevents an issue where a silence that
is no longer valid (i.e. the rules have changed in a newer version
of Alertmanager) cannot be expired via the API or UI.

Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
@grobinson-grafana grobinson-grafana force-pushed the grobinson/use-adapter-alertmanager-server branch from 62b5ade to 318b822 Compare November 24, 2023 09:29
@grobinson-grafana
Copy link
Contributor Author

I've merged #2469, can you please rebase so that I can also merge this one?

Done! 🙂

Copy link
Member

@gotjosh gotjosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gotjosh gotjosh merged commit 70bd5da into prometheus:main Nov 24, 2023
11 checks passed
@grobinson-grafana grobinson-grafana deleted the grobinson/use-adapter-alertmanager-server branch April 16, 2024 14:44
SuperQ added a commit that referenced this pull request Oct 16, 2024
* [CHANGE] Deprecate and remove api/v1/ #2970
* [CHANGE] Remove unused feature flags #3676
* [CHANGE] Newlines in smtp password file are now ignored #3681
* [CHANGE] Change compat metrics to counters #3686
* [CHANGE] Do not register compat metrics in amtool #3713
* [CHANGE] Remove metrics from compat package #3714
* [CHANGE] Mark muted alerts #3793
* [FEATURE] Add metric for inhibit rules #3681
* [FEATURE] Support UTF-8 label matchers #3453, #3507, #3523, #3483, #3567, #3568, #3569, #3571, #3595, #3604, #3619, #3658, #3659, #3662, #3668, 3572
* [FEATURE] Add counter to track alerts dropped outside of time_intervals #3565
* [FEATURE] Add date and tz functions to templates #3812
* [FEATURE] Add limits for silences #3852
* [FEATURE] Add time helpers for templates #3863
* [FEATURE] Add auto GOMAXPROCS #3837
* [FEATURE] Add auto GOMEMLIMIT #3895
* [FEATURE] Add Jira receiver integration #3590
* [ENHANCEMENT] Add the receiver name to notification metrics #3045
* [ENHANCEMENT] Add the route ID to uuid #3372
* [ENHANCEMENT] Add duration to the notify success message #3559
* [ENHANCEMENT] Implement webhook_url_file for discord and msteams #3555
* [ENHANCEMENT] Add debug logs for muted alerts #3558
* [ENHANCEMENT] API: Allow the Silences API to use their own 400 response #3610
* [ENHANCEMENT] Add summary to msteams notification #3616
* [ENHANCEMENT] Add context reasons to notifications failed counter #3631
* [ENHANCEMENT] Add optional native histogram support to latency metrics #3737
* [ENHANCEMENT] Enable setting ThreadId for Telegram notifications #3638
* [ENHANCEMENT] Allow webex roomID from template #3801
* [BUGFIX] Add missing integrations to notify metrics #3480
* [BUGFIX] Add missing ttl in pushhover #3474
* [BUGFIX] Fix scheme required for webhook url in amtool #3409
* [BUGFIX] Remove duplicate integration from metrics #3516
* [BUGFIX] Reflect Discord's max length message limits #3597
* [BUGFIX] Fix nil error in warn logs about incompatible matchers #3683
* [BUGFIX] Fix a small number of inconsistencies in compat package logging #3718
* [BUGFIX] Fix log line in featurecontrol #3719
* [BUGFIX] Fix panic in acceptance tests #3592
* [BUGFIX] Fix flaky test TestClusterJoinAndReconnect/TestTLSConnection #3722
* [BUGFIX] Fix crash on errors when url_file is used #3800
* [BUGFIX] Fix race condition in dispatch.go #3826
* [BUGFIX] Fix race conditions in the memory alerts store #3648
* [BUGFIX] Hide config.SecretURL when the URL is incorrect. #3887
* [BUGFIX] Fix invalid silence causes incomplete updates #3898
* [BUGFIX] Fix leaking of Silences matcherCache entries #3930
* [BUGFIX] Close SMTP submission correctly to handle errors #4006

Signed-off-by: SuperQ <superq@gmail.com>
@SuperQ SuperQ mentioned this pull request Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants