fix(loki): allow global and per tenant sigv4 config (grafana#6358)
Signed-off-by: Trevor Wood <Trevor.G.Wood@gmail.com>
taharah authored Jun 13, 2022
1 parent 87d04d5 commit aed11c2
Showing 8 changed files with 100 additions and 18 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -78,6 +78,7 @@
* [6317](https://github.com/grafana/loki/pull/6317/files) **dannykoping**: General: add cache usage statistics

##### Fixes
* [6358](https://github.com/grafana/loki/pull/6358) **taharah**: Fixes sigv4 authentication for the Ruler's remote write configuration by allowing both a global and per tenant configuration.
* [6152](https://github.com/grafana/loki/pull/6152) **slim-bean**: Fixes unbounded ingester memory growth when live tailing under specific circumstances.
* [5685](https://github.com/grafana/loki/pull/5685) **chaudum**: Assert that push values tuples consist of string values
##### Changes
42 changes: 27 additions & 15 deletions docs/sources/configuration/_index.md
@@ -562,21 +562,7 @@ remote_write:
# Optionally configures AWS's Signature Version 4 signing process to
# sign requests. Cannot be set at the same time as basic_auth, authorization, or oauth2.
# To use the default credentials from the AWS SDK, use `sigv4: {}`.
sigv4:
# The AWS region. If blank, the region from the default credentials chain
# is used.
[region: <string>]

# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[access_key: <string>]
[secret_key: <secret>]

# Named AWS profile used to authenticate.
[profile: <string>]

# AWS Role ARN, an alternative to using AWS API keys.
[role_arn: <string>]
[sigv4: <sigv4_config>]

# Configures the remote write request's TLS settings.
tls_config:
@@ -2366,6 +2352,10 @@ The `limits_config` block configures global and per-tenant limits in Loki.
# This is experimental and might change in the future.
[ruler_remote_write_queue_retry_on_ratelimit: <boolean>]

# Configures AWS's Signature Version 4 signing process to
# sign every remote write request.
[ruler_remote_write_sigv4_config: <sigv4_config>]

# Limit queries that can be sharded.
# Queries within the time range of now and now minus this sharding lookback
# are not sharded. The default value of 0s disables the lookback, causing
@@ -2379,6 +2369,28 @@ The `limits_config` block configures global and per-tenant limits in Loki.
[split_queries_by_interval: <duration> | default = 30m]
```
## sigv4_config
The `sigv4_config` block configures AWS's Signature Version 4 signing process to
sign every remote write request.

```yaml
# The AWS region. If blank, the region from the default credentials chain
# is used.
[region: <string>]
# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[access_key: <string>]
[secret_key: <secret>]

# Named AWS profile used to authenticate.
[profile: <string>]

# AWS Role ARN, an alternative to using AWS API keys.
[role_arn: <string>]
```
### grpc_client_config
The `grpc_client_config` block configures a client connection to a gRPC service.
6 changes: 4 additions & 2 deletions go.mod
@@ -112,7 +112,10 @@ require (
k8s.io/klog v1.0.0
)

require github.com/willf/bloom v2.0.3+incompatible
require (
github.com/prometheus/common/sigv4 v0.1.0
github.com/willf/bloom v2.0.3+incompatible
)

require (
cloud.google.com/go v0.100.2 // indirect
@@ -228,7 +231,6 @@ require (
github.com/pierrec/lz4 v2.6.1+incompatible // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/prometheus/alertmanager v0.23.1-0.20210914172521-e35efbddb66a // indirect
github.com/prometheus/common/sigv4 v0.1.0 // indirect
github.com/prometheus/node_exporter v1.0.0-rc.0.0.20200428091818-01054558c289 // indirect
github.com/prometheus/procfs v0.7.3 // indirect
github.com/rcrowley/go-metrics v0.0.0-20201227073835-cf1acfcdf475 // indirect
2 changes: 2 additions & 0 deletions pkg/ruler/compat.go
@@ -12,6 +12,7 @@ import (
"github.com/pkg/errors"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/common/model"
"github.com/prometheus/common/sigv4"
"github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/model/rulefmt"
"github.com/prometheus/prometheus/model/timestamp"
@@ -49,6 +50,7 @@ type RulesLimits interface {
RulerRemoteWriteQueueMinBackoff(userID string) time.Duration
RulerRemoteWriteQueueMaxBackoff(userID string) time.Duration
RulerRemoteWriteQueueRetryOnRateLimit(userID string) bool
RulerRemoteWriteSigV4Config(userID string) *sigv4.SigV4Config
}

// engineQueryFunc returns a new query function using the rules.EngineQueryFunc function
5 changes: 4 additions & 1 deletion pkg/ruler/registry.go
@@ -229,7 +229,6 @@ func (r *walRegistry) getTenantRemoteWriteConfig(tenant string, base RemoteWrite
// TODO(dannyk): configure HTTP client overrides
// metadata is only used by prometheus scrape configs
overrides.Client.MetadataConfig = config.MetadataConfig{Send: false}
overrides.Client.SigV4Config = nil

if r.overrides.RulerRemoteWriteDisabled(tenant) {
overrides.Enabled = false
@@ -296,6 +295,10 @@ func (r *walRegistry) getTenantRemoteWriteConfig(tenant string, base RemoteWrite
		overrides.Client.QueueConfig.RetryOnRateLimit = v
	}

	if v := r.overrides.RulerRemoteWriteSigV4Config(tenant); v != nil {
		overrides.Client.SigV4Config = v
	}

	return overrides, nil
}
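The precedence introduced by this hunk can be sketched in isolation: the globally configured SigV4 settings are no longer cleared, and a non-nil per-tenant value replaces them. The following is a minimal, self-contained sketch rather than the Loki code. `SigV4Config` is a simplified stand-in for `sigv4.SigV4Config`, and `sigV4Limits`, `fakeLimits`, and `effectiveSigV4` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// SigV4Config is a simplified stand-in for sigv4.SigV4Config from
// github.com/prometheus/common/sigv4; only Region is modelled here.
type SigV4Config struct {
	Region string
}

// sigV4Limits is a hypothetical, cut-down version of the RulesLimits
// interface that keeps only the accessor added by this commit.
type sigV4Limits interface {
	RulerRemoteWriteSigV4Config(userID string) *SigV4Config
}

// fakeLimits maps tenant IDs to per-tenant SigV4 settings.
// Tenants missing from the map yield a nil config.
type fakeLimits map[string]*SigV4Config

func (f fakeLimits) RulerRemoteWriteSigV4Config(userID string) *SigV4Config {
	return f[userID]
}

// effectiveSigV4 mirrors the nil check in the hunk above: a non-nil
// per-tenant config wins, otherwise the global config is kept.
func effectiveSigV4(global *SigV4Config, limits sigV4Limits, tenant string) *SigV4Config {
	if v := limits.RulerRemoteWriteSigV4Config(tenant); v != nil {
		return v
	}
	return global
}

func main() {
	global := &SigV4Config{Region: "us-east-1"}
	limits := fakeLimits{"sigv4": {Region: "us-east-2"}}

	for _, tenant := range []string{"sigv4", "unknown"} {
		cfg := effectiveSigV4(global, limits, tenant)
		fmt.Printf("%s -> %s\n", tenant, cfg.Region)
	}
}
```

With these inputs the tenant `sigv4` resolves to `us-east-2` and any other tenant falls back to `us-east-1`, which is the behaviour the two new tests in `registry_test.go` assert.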

51 changes: 51 additions & 0 deletions pkg/ruler/registry_test.go
@@ -11,6 +11,7 @@ import (
"github.com/go-kit/log"
promConfig "github.com/prometheus/common/config"
"github.com/prometheus/common/model"
"github.com/prometheus/common/sigv4"
"github.com/prometheus/prometheus/config"
"github.com/prometheus/prometheus/model/relabel"
"github.com/stretchr/testify/assert"
@@ -31,6 +32,9 @@ const customRelabelsTenant = "custom-relabels"
const badRelabelsTenant = "bad-relabels"
const nilRelabelsTenant = "nil-relabels"
const emptySliceRelabelsTenant = "empty-slice-relabels"
const sigV4ConfigTenant = "sigv4"
const sigV4GlobalRegion = "us-east-1"
const sigV4TenantRegion = "us-east-2"

const defaultCapacity = 1000

@@ -80,6 +84,11 @@ func newFakeLimits() fakeLimits {
},
},
},
sigV4ConfigTenant: {
RulerRemoteWriteSigV4Config: &sigv4.SigV4Config{
Region: sigV4TenantRegion,
},
},
},
}
}
@@ -134,6 +143,19 @@ func setupRegistry(t *testing.T) *walRegistry {
return reg.(*walRegistry)
}

func setupSigV4Registry(t *testing.T) *walRegistry {
	// Get the global config and override it
	reg := setupRegistry(t)

	// Remove the basic auth config and replace with sigv4
	reg.config.RemoteWrite.Client.HTTPClientConfig.BasicAuth = nil
	reg.config.RemoteWrite.Client.SigV4Config = &sigv4.SigV4Config{
		Region: sigV4GlobalRegion,
	}

	return reg
}

func TestTenantRemoteWriteConfigWithOverride(t *testing.T) {
reg := setupRegistry(t)

@@ -159,6 +181,35 @@ func TestTenantRemoteWriteConfigWithoutOverride(t *testing.T) {
assert.Equal(t, tenantCfg.RemoteWrite[0].QueueConfig.Capacity, defaultCapacity)
}

func TestRulerRemoteWriteSigV4ConfigWithOverrides(t *testing.T) {
	reg := setupSigV4Registry(t)

	tenantCfg, err := reg.getTenantConfig(sigV4ConfigTenant)
	require.NoError(t, err)

	// tenant has not disabled remote-write, so it will inherit the global one
	assert.Len(t, tenantCfg.RemoteWrite, 1)
	// ensure the sigv4 config is not nil and has been overridden by the tenant value
	if assert.NotNil(t, tenantCfg.RemoteWrite[0].SigV4Config) {
		assert.Equal(t, tenantCfg.RemoteWrite[0].SigV4Config.Region, sigV4TenantRegion)
	}
}

func TestRulerRemoteWriteSigV4ConfigWithoutOverrides(t *testing.T) {
	reg := setupSigV4Registry(t)

	// this tenant has no overrides, so it will get the defaults
	tenantCfg, err := reg.getTenantConfig("unknown")
	require.NoError(t, err)

	// tenant has not disabled remote-write, so it will inherit the global one
	assert.Len(t, tenantCfg.RemoteWrite, 1)
	// ensure the sigv4 config is not nil and holds the global value
	if assert.NotNil(t, tenantCfg.RemoteWrite[0].SigV4Config) {
		assert.Equal(t, tenantCfg.RemoteWrite[0].SigV4Config.Region, sigV4GlobalRegion)
	}
}

func TestTenantRemoteWriteConfigDisabled(t *testing.T) {
reg := setupRegistry(t)

6 changes: 6 additions & 0 deletions pkg/validation/limits.go
@@ -9,6 +9,7 @@ import (

"github.com/pkg/errors"
"github.com/prometheus/common/model"
"github.com/prometheus/common/sigv4"
"github.com/prometheus/prometheus/model/labels"
"golang.org/x/time/rate"
"gopkg.in/yaml.v2"
@@ -108,6 +109,7 @@ type Limits struct {
RulerRemoteWriteQueueMinBackoff time.Duration `yaml:"ruler_remote_write_queue_min_backoff" json:"ruler_remote_write_queue_min_backoff"`
RulerRemoteWriteQueueMaxBackoff time.Duration `yaml:"ruler_remote_write_queue_max_backoff" json:"ruler_remote_write_queue_max_backoff"`
RulerRemoteWriteQueueRetryOnRateLimit bool `yaml:"ruler_remote_write_queue_retry_on_ratelimit" json:"ruler_remote_write_queue_retry_on_ratelimit"`
RulerRemoteWriteSigV4Config *sigv4.SigV4Config `yaml:"ruler_remote_write_sigv4_config" json:"ruler_remote_write_sigv4_config"`

// Global and per tenant retention
RetentionPeriod model.Duration `yaml:"retention_period" json:"retention_period"`
@@ -512,6 +514,10 @@ func (o *Overrides) RulerRemoteWriteQueueRetryOnRateLimit(userID string) bool {
return o.getOverridesForUser(userID).RulerRemoteWriteQueueRetryOnRateLimit
}

func (o *Overrides) RulerRemoteWriteSigV4Config(userID string) *sigv4.SigV4Config {
	return o.getOverridesForUser(userID).RulerRemoteWriteSigV4Config
}

// RetentionPeriod returns the retention period for a given user.
func (o *Overrides) RetentionPeriod(userID string) time.Duration {
return time.Duration(o.getOverridesForUser(userID).RetentionPeriod)
5 changes: 5 additions & 0 deletions pkg/validation/limits_test.go
@@ -63,6 +63,8 @@ split_queries_by_interval: 190s
ruler_evaluation_delay_duration: 200s
ruler_max_rules_per_rule_group: 210
ruler_max_rule_groups_per_tenant: 220
ruler_remote_write_sigv4_config:
  region: us-east-1
per_tenant_override_config: ""
per_tenant_override_period: 230s
`
@@ -96,6 +98,9 @@ per_tenant_override_period: 230s
"ruler_evaluation_delay_duration": "200s",
"ruler_max_rules_per_rule_group": 210,
"ruler_max_rule_groups_per_tenant":220,
"ruler_remote_write_sigv4_config": {
"region": "us-east-1"
},
"per_tenant_override_config": "",
"per_tenant_override_period": "230s"
}
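The new limit is declared as a pointer (`*sigv4.SigV4Config`), so a limits document that omits `ruler_remote_write_sigv4_config` decodes to `nil`, which is what lets the registry's nil check tell "no override" apart from "empty override". The sketch below illustrates that behaviour with the standard library's JSON decoder only. `SigV4Config` and `Limits` are cut-down stand-ins, the `json:"region"` tag is an assumption made for this example, and `parseLimits` is a hypothetical helper, not part of Loki.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SigV4Config is a simplified stand-in for sigv4.SigV4Config.
// The JSON tag is an assumption made for this illustration.
type SigV4Config struct {
	Region string `json:"region"`
}

// Limits is a hypothetical, cut-down version of validation.Limits that
// keeps only the field added by this commit.
type Limits struct {
	RulerRemoteWriteSigV4Config *SigV4Config `json:"ruler_remote_write_sigv4_config"`
}

// parseLimits decodes a JSON limits document into the stand-in struct.
func parseLimits(doc string) (Limits, error) {
	var l Limits
	err := json.Unmarshal([]byte(doc), &l)
	return l, err
}

func main() {
	withCfg, err := parseLimits(`{"ruler_remote_write_sigv4_config": {"region": "us-east-1"}}`)
	if err != nil {
		panic(err)
	}
	without, err := parseLimits(`{}`)
	if err != nil {
		panic(err)
	}

	// The key is present, so the pointer is set and Region is populated.
	fmt.Println(withCfg.RulerRemoteWriteSigV4Config.Region)
	// The key is absent, so the pointer stays nil.
	fmt.Println(without.RulerRemoteWriteSigV4Config == nil)
}
```

A value type instead of a pointer would make an omitted block indistinguishable from an explicitly empty one, so the pointer is the natural choice when "unset" has to fall through to the global remote write client settings.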