Commit f0d126c

Add firewall support to http based alertmanager receiver integrations (#4085)
* Introduced firewall in the Alertmanager to block specific addresses in receiver integrations
  Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Adapted implementation based on new design in prometheus/common
  Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Updated doc
  Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Fixed doc
  Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Improved doc
  Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Improved config description
  Signed-off-by: Marco Pracucci <marco@pracucci.com>
1 parent 6a68e9e commit f0d126c

13 files changed, +628 -26 lines changed
CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -54,6 +54,7 @@
 * [ENHANCEMENT] Ruler: Added `-ruler.enabled-tenants` and `-ruler.disabled-tenants` to explicitly enable or disable rules processing for specific tenants. #4074
 * [ENHANCEMENT] Block Storage Ingester: `/flush` now accepts two new parameters: `tenant` to specify tenant to flush and `wait=true` to make call synchronous. Multiple tenants can be specified by repeating `tenant` parameter. If no `tenant` is specified, all tenants are flushed, as before. #4073
 * [ENHANCEMENT] Alertmanager: validate configured `-alertmanager.web.external-url` and fail if ends with `/`. #4081
+* [ENHANCEMENT] Alertmanager: added `-alertmanager.receivers-firewall.block.cidr-networks` and `-alertmanager.receivers-firewall.block.private-addresses` to block specific network addresses in HTTP-based Alertmanager receiver integrations. #4085
 * [ENHANCEMENT] Allow configuration of Cassandra's host selection policy. #4069
 * [ENHANCEMENT] Store-gateway: retry synching blocks if a per-tenant sync fails. #3975 #4088
 * [ENHANCEMENT] Add metric `cortex_tcp_connections` exposing the current number of accepted TCP connections. #4099

docs/blocks-storage/production-tips.md

Lines changed: 11 additions & 0 deletions
@@ -105,3 +105,14 @@ You can see that the initial migration is done by looking for the following mess
 The rule of thumb to ensure memcached is properly scaled is to make sure evictions happen infrequently. When that's not the case and they affect query performances, the suggestion is to scale out the memcached cluster adding more nodes or increasing the memory limit of existing ones.

 We also recommend to run a different memcached cluster for each cache type (metadata, index, chunks). It's not required, but suggested to not worry about the effect of memory pressure on a cache type against others.
+
+## Alertmanager
+
+### Ensure Alertmanager networking is hardened
+
+If the Alertmanager API is enabled, users with access to Cortex can autonomously configure the Alertmanager, including receiver integrations that issue network requests to a configured URL (e.g. webhook). If the Alertmanager network is not hardened, Cortex users may be able to issue network requests to any network endpoint, including services running in the local network that are accessible by the Alertmanager itself.
+
+Although hardening the system is out of scope for Cortex, Cortex provides a basic built-in firewall to block connections created by Alertmanager receiver integrations:
+
+- `-alertmanager.receivers-firewall.block.cidr-networks`
+- `-alertmanager.receivers-firewall.block.private-addresses`

docs/configuration/config-file-reference.md

Lines changed: 14 additions & 0 deletions
@@ -1849,6 +1849,20 @@ The `alertmanager_config` configures the Cortex alertmanager.
 # CLI flag: -alertmanager.max-recv-msg-size
 [max_recv_msg_size: <int> | default = 16777216]

+receivers_firewall:
+  block:
+    # Comma-separated list of network CIDRs to block in Alertmanager receiver
+    # integrations.
+    # CLI flag: -alertmanager.receivers-firewall.block.cidr-networks
+    [cidr_networks: <string> | default = ""]
+
+    # True to block private and local addresses in Alertmanager receiver
+    # integrations. It blocks private addresses defined by RFC 1918 (IPv4
+    # addresses) and RFC 4193 (IPv6 addresses), as well as loopback, local
+    # unicast and local multicast addresses.
+    # CLI flag: -alertmanager.receivers-firewall.block.private-addresses
+    [private_addresses: <boolean> | default = false]
+
 # Shard tenants across multiple alertmanager instances.
 # CLI flag: -alertmanager.sharding-enabled
 [sharding_enabled: <boolean> | default = false]
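
To make the two `receivers_firewall.block` options more concrete, the following is a minimal sketch of how an address could be classified against them using only the Go standard library. It is not the code from this commit: `isBlocked` and `mustParseCIDRs` are hypothetical names, and the mapping of "local unicast and local multicast" to the link-local checks is an assumption. `net.IP.IsPrivate` requires Go 1.17 or later.

```go
package main

import (
	"fmt"
	"net"
)

// isBlocked reports whether addr should be rejected. It mirrors the two
// documented options: an explicit CIDR block list and an optional
// "private and local addresses" switch. This is an illustrative sketch,
// not the Cortex implementation.
func isBlocked(addr string, blockCIDRs []*net.IPNet, blockPrivate bool) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		// Fail closed: an address we cannot parse is treated as blocked.
		return true
	}
	for _, n := range blockCIDRs {
		if n.Contains(ip) {
			return true
		}
	}
	if !blockPrivate {
		return false
	}
	// IsPrivate covers RFC 1918 (IPv4) and RFC 4193 (IPv6) ranges.
	// The link-local checks are an assumed reading of "local unicast and
	// local multicast" in the option description.
	return ip.IsPrivate() || ip.IsLoopback() ||
		ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast()
}

// mustParseCIDRs is a hypothetical helper that parses CIDR strings and
// panics on invalid input, which keeps the example short.
func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	out := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		out = append(out, n)
	}
	return out
}

func main() {
	blocked := mustParseCIDRs("203.0.113.0/24")
	addrs := []string{"10.1.2.3", "127.0.0.1", "fd00::1", "169.254.169.254", "203.0.113.7", "8.8.8.8"}
	for _, a := range addrs {
		fmt.Printf("%-16s blocked=%v\n", a, isBlocked(a, blocked, true))
	}
}
```

The sketch fails closed on unparseable input, which seems the safer default for a filter whose purpose is to stop requests to internal services.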

docs/configuration/v1-guarantees.md

Lines changed: 4 additions & 2 deletions
@@ -41,7 +41,10 @@ Currently experimental features are:
 - Azure blob storage.
 - Zone awareness based replication.
 - Ruler API (to PUT rules).
-- Alertmanager API
+- Alertmanager:
+  - API (enabled via `-experimental.alertmanager.enable-api`)
+  - Sharding of tenants across multiple instances (enabled via `-alertmanager.sharding-enabled`)
+  - Receiver integrations firewall (configured via `-alertmanager.receivers-firewall.*`)
 - Memcached client DNS-based service discovery.
 - Delete series APIs.
 - In-memory (FIFO) and Redis cache.
@@ -61,7 +64,6 @@ Currently experimental features are:
 - The bucket index support in the querier and store-gateway (enabled via `-blocks-storage.bucket-store.bucket-index.enabled=true`) is experimental
 - The block deletion marks migration support in the compactor (`-compactor.block-deletion-marks-migration-enabled`) is temporarily and will be removed in future versions
 - Querier: tenant federation
-- Alertmanager: Sharding of tenants across multiple instances
 - The thanosconvert tool for converting Thanos block metadata to Cortex
 - HA Tracker: cleanup of old replicas from KV Store.
 - Flags for configuring whether blocks-ingester streams samples or chunks are temporary, and will be removed when feature is tested:

pkg/alertmanager/alertmanager.go

Lines changed: 30 additions & 17 deletions
@@ -40,10 +40,12 @@ import (
 	"github.com/prometheus/alertmanager/ui"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"
+	commoncfg "github.com/prometheus/common/config"
 	"github.com/prometheus/common/model"
 	"github.com/prometheus/common/route"

 	"github.com/cortexproject/cortex/pkg/alertmanager/alertstore"
+	util_net "github.com/cortexproject/cortex/pkg/util/net"
 	"github.com/cortexproject/cortex/pkg/util/services"
 )

@@ -59,12 +61,13 @@ const (

 // Config configures an Alertmanager.
 type Config struct {
-	UserID      string
-	Logger      log.Logger
-	Peer        *cluster.Peer
-	PeerTimeout time.Duration
-	Retention   time.Duration
-	ExternalURL *url.URL
+	UserID            string
+	Logger            log.Logger
+	Peer              *cluster.Peer
+	PeerTimeout       time.Duration
+	Retention         time.Duration
+	ExternalURL       *url.URL
+	ReceiversFirewall FirewallConfig

 	// Tenant-specific local directory where AM can store its state (notifications, silences, templates). When AM is stopped, entire dir is removed.
 	TenantDataDir string
@@ -94,6 +97,7 @@ type Alertmanager struct {
 	wg             sync.WaitGroup
 	mux            *http.ServeMux
 	registry       *prometheus.Registry
+	firewallDialer *util_net.FirewallDialer

 	// The Dispatcher is the only component we need to recreate when we call ApplyConfig.
 	// Given its metrics don't have any variable labels we need to re-use the same metrics.
@@ -147,6 +151,10 @@ func New(cfg *Config, reg *prometheus.Registry) (*Alertmanager, error) {
 		cfg:    cfg,
 		logger: log.With(cfg.Logger, "user", cfg.UserID),
 		stop:   make(chan struct{}),
+		firewallDialer: util_net.NewFirewallDialer(util_net.FirewallDialerConfig{
+			BlockCIDRNetworks:     cfg.ReceiversFirewall.Block.CIDRNetworks,
+			BlockPrivateAddresses: cfg.ReceiversFirewall.Block.PrivateAddresses,
+		}),
 		configHashMetric: promauto.With(reg).NewGauge(prometheus.GaugeOpts{
 			Name: "alertmanager_config_hash",
 			Help: "Hash of the currently loaded alertmanager configuration.",
@@ -315,7 +323,7 @@ func (am *Alertmanager) ApplyConfig(userID string, conf *config.Config, rawCfg s
 		return d + waitFunc()
 	}

-	integrationsMap, err := buildIntegrationsMap(conf.Receivers, tmpl, am.logger)
+	integrationsMap, err := buildIntegrationsMap(conf.Receivers, tmpl, am.firewallDialer, am.logger)
 	if err != nil {
 		return nil
 	}
@@ -407,10 +415,10 @@ func (am *Alertmanager) getFullState() (*clusterpb.FullState, error) {

 // buildIntegrationsMap builds a map of name to the list of integration notifiers off of a
 // list of receiver config.
-func buildIntegrationsMap(nc []*config.Receiver, tmpl *template.Template, logger log.Logger) (map[string][]notify.Integration, error) {
+func buildIntegrationsMap(nc []*config.Receiver, tmpl *template.Template, firewallDialer *util_net.FirewallDialer, logger log.Logger) (map[string][]notify.Integration, error) {
 	integrationsMap := make(map[string][]notify.Integration, len(nc))
 	for _, rcv := range nc {
-		integrations, err := buildReceiverIntegrations(rcv, tmpl, logger)
+		integrations, err := buildReceiverIntegrations(rcv, tmpl, firewallDialer, logger)
 		if err != nil {
 			return nil, err
 		}
@@ -422,7 +430,7 @@ func buildIntegrationsMap(nc []*config.Receiver, tmpl *template.Template, logger
 // buildReceiverIntegrations builds a list of integration notifiers off of a
 // receiver config.
 // Taken from https://github.com/prometheus/alertmanager/blob/94d875f1227b29abece661db1a68c001122d1da5/cmd/alertmanager/main.go#L112-L159.
-func buildReceiverIntegrations(nc *config.Receiver, tmpl *template.Template, logger log.Logger) ([]notify.Integration, error) {
+func buildReceiverIntegrations(nc *config.Receiver, tmpl *template.Template, firewallDialer *util_net.FirewallDialer, logger log.Logger) ([]notify.Integration, error) {
 	var (
 		errs         types.MultiError
 		integrations []notify.Integration
@@ -436,29 +444,34 @@ func buildReceiverIntegrations(nc *config.Receiver, tmpl *template.Template, log
 		}
 	)

+	// Inject the firewall to any receiver integration supporting it.
+	httpOps := []commoncfg.HTTPClientOption{
+		commoncfg.WithDialContextFunc(firewallDialer.DialContext),
+	}
+
 	for i, c := range nc.WebhookConfigs {
-		add("webhook", i, c, func(l log.Logger) (notify.Notifier, error) { return webhook.New(c, tmpl, l) })
+		add("webhook", i, c, func(l log.Logger) (notify.Notifier, error) { return webhook.New(c, tmpl, l, httpOps...) })
 	}
 	for i, c := range nc.EmailConfigs {
 		add("email", i, c, func(l log.Logger) (notify.Notifier, error) { return email.New(c, tmpl, l), nil })
 	}
 	for i, c := range nc.PagerdutyConfigs {
-		add("pagerduty", i, c, func(l log.Logger) (notify.Notifier, error) { return pagerduty.New(c, tmpl, l) })
+		add("pagerduty", i, c, func(l log.Logger) (notify.Notifier, error) { return pagerduty.New(c, tmpl, l, httpOps...) })
 	}
 	for i, c := range nc.OpsGenieConfigs {
-		add("opsgenie", i, c, func(l log.Logger) (notify.Notifier, error) { return opsgenie.New(c, tmpl, l) })
+		add("opsgenie", i, c, func(l log.Logger) (notify.Notifier, error) { return opsgenie.New(c, tmpl, l, httpOps...) })
 	}
 	for i, c := range nc.WechatConfigs {
-		add("wechat", i, c, func(l log.Logger) (notify.Notifier, error) { return wechat.New(c, tmpl, l) })
+		add("wechat", i, c, func(l log.Logger) (notify.Notifier, error) { return wechat.New(c, tmpl, l, httpOps...) })
 	}
 	for i, c := range nc.SlackConfigs {
-		add("slack", i, c, func(l log.Logger) (notify.Notifier, error) { return slack.New(c, tmpl, l) })
+		add("slack", i, c, func(l log.Logger) (notify.Notifier, error) { return slack.New(c, tmpl, l, httpOps...) })
 	}
 	for i, c := range nc.VictorOpsConfigs {
-		add("victorops", i, c, func(l log.Logger) (notify.Notifier, error) { return victorops.New(c, tmpl, l) })
+		add("victorops", i, c, func(l log.Logger) (notify.Notifier, error) { return victorops.New(c, tmpl, l, httpOps...) })
 	}
 	for i, c := range nc.PushoverConfigs {
-		add("pushover", i, c, func(l log.Logger) (notify.Notifier, error) { return pushover.New(c, tmpl, l) })
+		add("pushover", i, c, func(l log.Logger) (notify.Notifier, error) { return pushover.New(c, tmpl, l, httpOps...) })
 	}
 	if errs.Len() > 0 {
 		return nil, &errs
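
The hunks above pass `firewallDialer.DialContext` into each HTTP-based receiver, but the dialer itself lives in `pkg/util/net`, which is not shown here. As a rough illustration of the general technique, the sketch below filters outgoing connections with the standard library's `net.Dialer.Control` hook and plugs the dialer into an `http.Transport`. It is an assumption about how such a dialer could be built, not the Cortex code: `checkAddress`, `newFirewalledClient`, and `errBlockedAddress` are hypothetical names, and the timeouts are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"syscall"
	"time"
)

// errBlockedAddress is a hypothetical sentinel error for rejected dials.
var errBlockedAddress = errors.New("blocked address")

// checkAddress validates a resolved "ip:port" pair. The Control hook runs
// after DNS resolution, so the check applies to the IP that will actually be
// dialed rather than to the hostname found in the receiver URL.
func checkAddress(address string) error {
	host, _, err := net.SplitHostPort(address)
	if err != nil {
		return err
	}
	ip := net.ParseIP(host)
	if ip == nil {
		// Fail closed on anything that is not a plain IP literal.
		return fmt.Errorf("%w: cannot parse %q", errBlockedAddress, host)
	}
	if ip.IsPrivate() || ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
		return fmt.Errorf("%w: %s", errBlockedAddress, ip)
	}
	return nil
}

// newFirewalledClient returns an HTTP client whose connections all go
// through a dialer that rejects private and local destinations.
func newFirewalledClient() *http.Client {
	dialer := &net.Dialer{
		Timeout: 5 * time.Second,
		// Control is called just before the socket connects.
		Control: func(network, address string, _ syscall.RawConn) error {
			return checkAddress(address)
		},
	}
	return &http.Client{
		Timeout:   10 * time.Second,
		Transport: &http.Transport{DialContext: dialer.DialContext},
	}
}

func main() {
	client := newFirewalledClient()
	// The loopback destination is rejected before any connection is made.
	_, err := client.Get("http://127.0.0.1:9093/")
	fmt.Println("request error:", err)
}
```

Checking at dial time rather than when the receiver URL is configured means a hostname that resolves to an internal IP is still caught, since the hook sees the address after resolution.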

pkg/alertmanager/firewall.go

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+package alertmanager
+
+import (
+	"flag"
+	"fmt"
+
+	"github.com/cortexproject/cortex/pkg/util/flagext"
+)
+
+type FirewallConfig struct {
+	Block FirewallHostsSpec `yaml:"block"`
+}
+
+func (cfg *FirewallConfig) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
+	cfg.Block.RegisterFlagsWithPrefix(prefix+".block", "block", f)
+}
+
+type FirewallHostsSpec struct {
+	CIDRNetworks     flagext.CIDRSliceCSV `yaml:"cidr_networks"`
+	PrivateAddresses bool                 `yaml:"private_addresses"`
+}
+
+func (cfg *FirewallHostsSpec) RegisterFlagsWithPrefix(prefix, action string, f *flag.FlagSet) {
+	f.Var(&cfg.CIDRNetworks, prefix+".cidr-networks", fmt.Sprintf("Comma-separated list of network CIDRs to %s in Alertmanager receiver integrations.", action))
+	f.BoolVar(&cfg.PrivateAddresses, prefix+".private-addresses", false, fmt.Sprintf("True to %s private and local addresses in Alertmanager receiver integrations. It blocks private addresses defined by RFC 1918 (IPv4 addresses) and RFC 4193 (IPv6 addresses), as well as loopback, local unicast and local multicast addresses.", action))
+}
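
`FirewallHostsSpec` relies on `flagext.CIDRSliceCSV`, whose definition is not part of the files shown here. The sketch below shows how a comma-separated CIDR flag of this kind could be implemented with the standard `flag.Value` interface. Everything in it is an assumed stand-in: `cidrListCSV`, `firewallConfig`, and `parseArgs` are hypothetical names, and the real type may behave differently, for example in how it handles whitespace or empty values.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"net"
	"strings"
)

// cidrListCSV is a hypothetical stand-in for a CSV-of-CIDRs flag type.
// It implements flag.Value so it can be registered with FlagSet.Var.
type cidrListCSV []*net.IPNet

// String renders the list back to its comma-separated form.
func (c *cidrListCSV) String() string {
	if c == nil {
		return ""
	}
	parts := make([]string, 0, len(*c))
	for _, n := range *c {
		parts = append(parts, n.String())
	}
	return strings.Join(parts, ",")
}

// Set parses a comma-separated list of CIDRs, replacing any previous value.
func (c *cidrListCSV) Set(s string) error {
	if strings.TrimSpace(s) == "" {
		*c = nil
		return nil
	}
	var parsed []*net.IPNet
	for _, part := range strings.Split(s, ",") {
		_, n, err := net.ParseCIDR(strings.TrimSpace(part))
		if err != nil {
			return fmt.Errorf("invalid CIDR %q: %w", part, err)
		}
		parsed = append(parsed, n)
	}
	*c = parsed
	return nil
}

// firewallConfig flattens the two options from the commit into one
// illustrative struct.
type firewallConfig struct {
	BlockCIDRNetworks     cidrListCSV
	BlockPrivateAddresses bool
}

// registerFlagsWithPrefix registers both options under the given prefix.
func (cfg *firewallConfig) registerFlagsWithPrefix(prefix string, f *flag.FlagSet) {
	f.Var(&cfg.BlockCIDRNetworks, prefix+".block.cidr-networks", "Comma-separated list of network CIDRs to block.")
	f.BoolVar(&cfg.BlockPrivateAddresses, prefix+".block.private-addresses", false, "True to block private and local addresses.")
}

// parseArgs parses command-line style arguments into a firewallConfig.
func parseArgs(args []string) (firewallConfig, error) {
	var cfg firewallConfig
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	cfg.registerFlagsWithPrefix("alertmanager.receivers-firewall", fs)
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseArgs([]string{
		"-alertmanager.receivers-firewall.block.cidr-networks=10.0.0.0/8,192.168.0.0/16",
		"-alertmanager.receivers-firewall.block.private-addresses=true",
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(cfg.BlockCIDRNetworks.String(), cfg.BlockPrivateAddresses)
}
```

Parsing the CIDRs at flag time means an invalid network fails at startup instead of surfacing later when a notification is sent.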

pkg/alertmanager/multitenant.go

Lines changed: 8 additions & 7 deletions
@@ -102,11 +102,12 @@ func init() {

 // MultitenantAlertmanagerConfig is the configuration for a multitenant Alertmanager.
 type MultitenantAlertmanagerConfig struct {
-	DataDir        string           `yaml:"data_dir"`
-	Retention      time.Duration    `yaml:"retention"`
-	ExternalURL    flagext.URLValue `yaml:"external_url"`
-	PollInterval   time.Duration    `yaml:"poll_interval"`
-	MaxRecvMsgSize int64            `yaml:"max_recv_msg_size"`
+	DataDir           string           `yaml:"data_dir"`
+	Retention         time.Duration    `yaml:"retention"`
+	ExternalURL       flagext.URLValue `yaml:"external_url"`
+	PollInterval      time.Duration    `yaml:"poll_interval"`
+	MaxRecvMsgSize    int64            `yaml:"max_recv_msg_size"`
+	ReceiversFirewall FirewallConfig   `yaml:"receivers_firewall"`

 	// Enable sharding for the Alertmanager
 	ShardingEnabled bool `yaml:"sharding_enabled"`
@@ -158,9 +159,8 @@ func (cfg *MultitenantAlertmanagerConfig) RegisterFlags(f *flag.FlagSet) {
 	f.BoolVar(&cfg.ShardingEnabled, "alertmanager.sharding-enabled", false, "Shard tenants across multiple alertmanager instances.")

 	cfg.AlertmanagerClient.RegisterFlagsWithPrefix("alertmanager.alertmanager-client", f)
-
 	cfg.Persister.RegisterFlagsWithPrefix("alertmanager", f)
-
+	cfg.ReceiversFirewall.RegisterFlagsWithPrefix("alertmanager.receivers-firewall", f)
 	cfg.ShardingRing.RegisterFlags(f)
 	cfg.Store.RegisterFlags(f)
 	cfg.Cluster.RegisterFlags(f)
@@ -873,6 +873,7 @@ func (am *MultitenantAlertmanager) newAlertmanager(userID string, amConfig *amco
 		ReplicationFactor: am.cfg.ShardingRing.ReplicationFactor,
 		Store:             am.store,
 		PersisterConfig:   am.cfg.Persister,
+		ReceiversFirewall: am.cfg.ReceiversFirewall,
 	}, reg)
 	if err != nil {
 		return nil, fmt.Errorf("unable to start Alertmanager for user %v: %v", userID, err)
