Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

operator: Add support for Loki rules reconciliation #5986

Merged
merged 1 commit into from
May 24, 2022

Conversation

periklis
Copy link
Collaborator

@periklis periklis commented Apr 21, 2022

What this PR does / why we need it:
This PR is first-shot implementation for the proposed enhancement: #5985

Which issue(s) this PR fixes:
Implements #5985, based on/replacing #5188

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated
  • Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@periklis periklis added type/feature Something new we should do sig/operator labels Apr 21, 2022
@periklis periklis self-assigned this Apr 21, 2022
@periklis periklis requested a review from a team as a code owner April 21, 2022 13:01
@periklis periklis force-pushed the ruler-support branch 7 times, most recently from 50d749b to 04f4a2c Compare April 26, 2022 09:29
@periklis periklis changed the title operator: Add basic support for Loki ruler and rules reconciliation operator: Add basic support for Loki rules reconciliation Apr 26, 2022
@periklis periklis force-pushed the ruler-support branch 2 times, most recently from da96af8 to a31504b Compare April 26, 2022 13:17
@periklis periklis changed the title operator: Add basic support for Loki rules reconciliation operator: Add support for Loki rules reconciliation Apr 26, 2022
@periklis periklis force-pushed the ruler-support branch 5 times, most recently from f05bcd1 to 206d4a9 Compare May 3, 2022 11:59
@github-actions github-actions bot removed the type/feature Something new we should do label May 4, 2022
@cyriltovena
Copy link
Contributor

Went through it all where is the rules API reconciliation with the ruler ?

@periklis
Copy link
Collaborator Author

periklis commented May 5, 2022

Went through it all where is the rules API reconciliation with the ruler ?

You mean the RulerConfig reconciliation for AlertManager and RemoteWrite? If yes this will be a separate PR. This one is focusing on AlertingRule and RecordingRule.

Copy link
Collaborator

@xperimental xperimental left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm about half-way through the content of the PR, but here's the comments so far.

operator/internal/status/components.go Outdated Show resolved Hide resolved
operator/internal/status/status.go Outdated Show resolved Hide resolved
operator/PROJECT Show resolved Hide resolved
operator/api/v1beta1/alertingrule_types.go Outdated Show resolved Hide resolved
operator/api/v1beta1/alertingrule_webhook.go Outdated Show resolved Hide resolved
operator/config/default/manager_webhook_patch.yaml Outdated Show resolved Hide resolved
@@ -103,7 +103,7 @@ help: ## Display this help.

.PHONY: deps
deps: go.mod go.sum
go mod tidy
go mod tidy -compat=1.17
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a really bad feeling about the error message that go mod tidy is outputting when used without the compat option. Maybe we need to discuss splitting out the API dependency as a separate module from Loki?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is our new dependency against Loki itself, in particular we reference:

github.com/grafana/loki/operator/internal/manifests/internal/rules imports
        gopkg.in/yaml.v2 tested by
        gopkg.in/yaml.v2.test imports
        gopkg.in/check.v1 loaded from gopkg.in/check.v1@v1.0.0-20200902074654-038fdea0a05b,
        but go 1.16 would select v1.0.0-20201130134442-10cb98267c6c
github.com/grafana/loki/operator/api/v1beta1 imports
        github.com/grafana/loki/pkg/logql/syntax imports
        github.com/grafana/loki/pkg/util imports
        github.com/grafana/dskit/kv imports
        github.com/grafana/dskit/kv/consul imports
        github.com/hashicorp/consul/api loaded from github.com/hashicorp/consul/api@v1.12.0,
        but go 1.16 would fail to locate it:
        ambiguous import: found package github.com/hashicorp/consul/api in multiple modules:
        github.com/hashicorp/consul v1.2.1 (/home/periklis/.gvm/pkgsets/go1.17.8/global/pkg/mod/github.com/hashicorp/consul@v1.2.1/api)
        github.com/hashicorp/consul/api v1.12.0 (/home/periklis/.gvm/pkgsets/go1.17.8/global/pkg/mod/github.com/hashicorp/consul/api@v1.12.0)

To upgrade to the versions selected by go 1.16, leaving some packages unresolved:
        go mod tidy -e -go=1.16 && go mod tidy -e -go=1.17
If reproducibility with go 1.16 is not needed:
        go mod tidy -compat=1.17
For other options, see:
        https://golang.org/doc/modules/pruning
make: *** [Makefile:106: deps] Error 1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires a patch into loki go.mod to resolve this issue.

operator/hack/lokistack_gateway_ocp.yaml Outdated Show resolved Hide resolved
operator/hack/lokistack_gateway_ocp.yaml Show resolved Hide resolved
operator/internal/manifests/mutate.go Show resolved Hide resolved
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
-        distributor	-0.3%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Collaborator

@xperimental xperimental left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And another round, still not fully through the files (61/93). The good news is that I combined it with actually running the new version of the operator on a testing cluster.

Generally I have the feeling that there is some "de-duplication" of the alertingrule/recordingrule code possible, maybe by introducing some interfaces?

I'll continue tomorrow.

operator/api/v1beta1/recordingrule_types.go Outdated Show resolved Hide resolved
}

// AlertingRuleGroup defines a group of Loki alerting rules.
type AlertingRuleGroup struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result I envisioned was having the checks that are now duplicated in both the webhook for alerting and recording rules only in one place (duplicate group names, durations, etc.). Don't know if the embedded struct is the choice for reaching that goal, there are a few alternatives... in the end it's also the question if the effort is worth it.

operator/hack/lokirules_ocp.yaml Outdated Show resolved Hide resolved
operator/hack/lokistack_gateway_ocp.yaml Outdated Show resolved Hide resolved
operator/internal/handlers/internal/rules/rules.go Outdated Show resolved Hide resolved
operator/internal/manifests/config.go Outdated Show resolved Hide resolved
operator/internal/manifests/distributor.go Show resolved Hide resolved
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@periklis periklis mentioned this pull request May 19, 2022
4 tasks
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0.3%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Collaborator

@xperimental xperimental left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally through all the files. 🎉

operator/internal/manifests/internal/rules/marshal.go Outdated Show resolved Hide resolved
operator/internal/manifests/internal/rules/marshal.go Outdated Show resolved Hide resolved
operator/internal/manifests/ruler.go Show resolved Hide resolved
operator/internal/manifests/rules_config.go Show resolved Hide resolved
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
-               loki	-0.7%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants