-
Notifications
You must be signed in to change notification settings - Fork 820
RuleStore: implementation using thanos objstore bucket client #3805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RuleStore: implementation using thanos objstore bucket client #3805
Conversation
5433e9b
to
98998e0
Compare
76ec1c6
to
0b73060
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @jtlisi for working on this! I left few high level comments, before jumping on a deeper 2nd pass review. 🙏
pkg/storage/rulestore/config.go
Outdated
return ErrUnsupportedStorageBackend | ||
} | ||
|
||
if cfg.Backend == bucket.S3 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't duplicate the bucket config Validate()
but use it. I understanding you did it because the Backend
validation wouldn't pass if it's configured configdb
.
An option:
- Create
bucket.Config.ValidateWithExtraBackends(extras []string)
- Pass
[]string{configdb}
toValidateWithExtraBackends()
- This function just calls
cfg.Bucket.Validate()
pkg/storage/rulestore/config.go
Outdated
func (cfg *Config) RegisterFlags(f *flag.FlagSet) { | ||
cfg.ConfigDB.RegisterFlagsWithPrefix(prefix, f) | ||
|
||
cfg.S3.RegisterFlagsWithPrefix(prefix, f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should call cfg.Bucket.RegisterFlagsWithPrefix()
instead of duplicating here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I didn't call this function is because it wouldn't print configdb
as an option in the help text.
Between this issue and #3805 (comment) I think I found a solution that will accommodate the use case case.
- Add
ExtraBackends []string
field to the bucket.Config struct. - In
rulestore.Config.RegisterFlags
we set thebucket.Config.ExtraBackends
field to[]string{ConfigDB}
- The
bucket.Config.RegisterFlagsWithPrefix
andbucket.Config.Validate
functions are adjusted to merge thesupportedBackends
with the embeddedExtraBackends
before validating/printing-help-text
Let me know what you think? I think this approach will avoid any additional functions at the cost of an additional hidden field in the config that must be set before the bucket.Config.Validate
or bucket.Config.RegisterFlags
functions are called.
rulePrefix = "rules" + delim | ||
) | ||
|
||
type BucketClient struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a bucket client but an implementation of the rule store. I think we should name it accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does BucketRuleStore
work?
} | ||
} | ||
|
||
func (b *BucketClient) listNamespacesForUser(ctx context.Context, user string) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry. In all functions you will have to use UserBucketClient
for every operation involving a specific user. We're leveraging on it to add extra functionalities and it's now required that we use it everywhere we leverage on the Thanos objstore client.
We can discuss offline if you need more info about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I transitioned all user operations to use a UserBucketClient. However, I had to make some additional changes to the UserBucketClient to allow for a global prefix, since all rules are currently stores on the path rules/<user-id>
.
6216ec7
to
f975f93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments from first quick look.
3b0e489
to
aa58b75
Compare
@pstibrany Thanks for the feedback, this should be ready for another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! I've left some comments to take a look at.
|
||
// NewRuleStore creates a new bucket client based on the configured backend | ||
func NewRuleStore(ctx context.Context, cfg Config, logger log.Logger, reg prometheus.Registerer) (rules.RuleStore, error) { | ||
if cfg.Backend == ConfigDB { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need another way of configuring obsolete ConfigDB rule store? It's already possible to use it without having this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on this discussion we would eventually deprecate the config options currently configured using the -ruler.storage.*
flag prefix in favor of the the newly added config field. However, the config service is still in use and needs to be carried over. I'm in favor of deprecating the config service but I think it would be best to scope that separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't mean to deprecate configdb (although perhaps we should), just that we don't need a separate way of configuring it.
The way I was thinking about the transition was to add config options to existing RuleStoreConfig
, and adding new types like "bucket-s3". But I see that it could create more confusion.
Ideally we will do this without breaking changes for ConfigDB users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I was thinking about the transition was to add config options to existing RuleStoreConfig, and adding new types like "bucket-s3". But I see that it could create more confusion.
I also felt it would be more confusing to try and fit this change into the existing config struct instead of creating a new top level struct. As for the configdb, the upgrade path would be identical to the bucket upgrade path, except only the prefix of the flags would change.
However, another option would be to:
- Not add the configdb as an option for the new RuleStore config struct
- When deprecating the existing objectclient flags/fields, maintain the config options for the
configdb
. However, give a clear warning that this client is not recommended for new deployments. - Figure out the future of the configdb service separately
1d082c2
to
392d959
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good job!
|
||
// Used to inject additional backends into the config. Allows for this config to | ||
// be embedded in multiple contexts and support non-object storage based backends. | ||
ExtraBackends []string `yaml:"-"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite smart way of extending bucket.Config with more options.
rulePrefix = "rules" | ||
) | ||
|
||
type BucketRuleStore struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: comment this?
pkg/ruler/rulestore/bucket_client.go
Outdated
if err := ctx.Err(); err != nil { | ||
return err | ||
} | ||
userBucket := bucket.NewUserBucketClient(userID, b.bucket) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only create userBucket
once, not for every rule group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! 👏 ! 👏 Please mention it in the list of experimental features at docs/configuration/v1-guarantees.md
.
CHANGELOG.md
Outdated
@@ -7,6 +7,7 @@ | |||
* [CHANGE] Query-frontend: removed `-querier.split-queries-by-day` (deprecated in Cortex 0.4.0). You should use `-querier.split-queries-by-interval` instead. #3813 | |||
* [CHANGE] Store-gateway: the chunks pool controlled by `-blocks-storage.bucket-store.max-chunk-pool-bytes` is now shared across all tenants. #3830 | |||
* [CHANGE] Ingester: return error code 400 instead of 429 when per-user/per-tenant series/metadata limits are reached. #3833 | |||
* [FEATURE] Experimental Ruler Storage: Add a separate set of configuration options to configure the ruler storage backend under the `-ruler-storage.` flag prefix. All TSDB bucket clients and the config service are currently supported. Clients using this implementation will only be enabled if the existing `-ruler.storage` flags are left unset. #3805 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* [FEATURE] Experimental Ruler Storage: Add a separate set of configuration options to configure the ruler storage backend under the `-ruler-storage.` flag prefix. All TSDB bucket clients and the config service are currently supported. Clients using this implementation will only be enabled if the existing `-ruler.storage` flags are left unset. #3805 | |
* [FEATURE] Experimental Ruler Storage: Add a separate set of configuration options to configure the ruler storage backend under the `-ruler-storage.` flag prefix. All blocks storage bucket clients and the config service are currently supported. Clients using this implementation will only be enabled if the existing `-ruler.storage` flags are left unset. #3805 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pkg/cortex/cortex.go
Outdated
@@ -115,6 +116,7 @@ type Config struct { | |||
TenantFederation tenantfederation.Config `yaml:"tenant_federation"` | |||
|
|||
Ruler ruler.Config `yaml:"ruler"` | |||
RulerStorage rulestore.Config `yaml:"ruler_storage" doc:"hidden"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we wanna keep it hidden for now? I don't have strong opinions but, without having in it in the doc, will be difficult for our users to try it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, should be good to expose since we explicitly declare it experimental in the guarantees doc
pkg/ruler/rulestore/config.go
Outdated
const ( | ||
ConfigDB = "configdb" | ||
|
||
name = "rule-store" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name = "rule-store" | |
name = "ruler-storage" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pkg/ruler/rulestore/config.go
Outdated
return nil, err | ||
} | ||
|
||
store := NewBucketRuleStore(bucketClient, 10, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of putting a 10
here for loadConcurrency
, why not having a const
defined in pkg/ruler/rulestore/bucket_client.go
and not letting to configure it at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I think we should avoid exposing more config options for this client unless there is a compelling need to do so.
pkg/ruler/rulestore/bucket_client.go
Outdated
|
||
gr, err := b.getRuleGroup(gCtx, user, namespace, group, gr) // reuse group pointer from the map. | ||
if err != nil { | ||
level.Error(b.logger).Log("msg", "failed to get rule group", "namespace", namespace, "name", group, "user", user, "err", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We log in this case, but not other return
error cases. What if we don't log any error here, but wrap the error with info about the namespace/group/user and let the caller log it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> fixup Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> fix flag Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> fix integration tests Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> pr-fix: rename rule-store flag/field to ruler-storage Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> pr-refactor: reuse bucket client Config validate and pass extra backend Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> pr-refactor: reuse bucket.Config function with extra backends and rename rulestore.BucketClient Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> update changelog Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> remove unused variables Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> make prefixed userBucket and utilize it in ruler storage Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> refactor: create PrefixedBucketClient and use it instead of altering UserBucketClient Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> refactor: mv rulestore to pkg/ruler Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> refactor: fix per PR suggestions, except embedding buckets Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> use prefixed bucket by default Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> add SupportModifications to implementation Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> add tests for DeleteNamespace Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> remove utility function and use embedded prefixed bucket Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> fix lint Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> add todo Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com> add experimental comment to guarantees document Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
4426200
to
5f1dd3f
Compare
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Fantastic job @jtlisi. LGTM! 🚀 |
What this PR does:
Which issue(s) this PR fixes:
Related #2262
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]