Agent Check: Envoy

Overview

This check collects distributed system observability metrics from Envoy.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Envoy check is included in the Datadog Agent package, so you don't need to install anything else on your server.

Via Istio

If you use Envoy as part of Istio, set Istio's proxyAdminPort to access Envoy's admin endpoint.

Standard

There are two ways to set up the /stats endpoint:

Unsecured stats endpoint

Here's an example Envoy admin configuration:

admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8001

Secured stats endpoint

Create a listener/vhost that routes to the admin endpoint (Envoy connecting to itself) but only has a route for /stats; all other routes get a static/error response. This setup also integrates well with L3 filters, for example for auth.

Here's an example config (from this gist):

admin:
  access_log_path: /dev/null
  address:
    socket_address:
      protocol: TCP
      address: 127.0.0.1
      port_value: 8081
static_resources:
  listeners:
    - address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 80
      filter_chains:
        - filters:
            - name: envoy.http_connection_manager
              config:
                codec_type: AUTO
                stat_prefix: ingress_http
                route_config:
                  virtual_hosts:
                    - name: backend
                      domains:
                        - "*"
                      routes:
                        - match:
                            prefix: /stats
                          route:
                            cluster: service_stats
                http_filters:
                  - name: envoy.router
                    config:
  clusters:
    - name: service_stats
      connect_timeout: 0.250s
      type: LOGICAL_DNS
      lb_policy: ROUND_ROBIN
      hosts:
        - socket_address:
            protocol: TCP
            address: 127.0.0.1
            port_value: 8081  # must match the admin port_value above

Configuration

  1. To start collecting your Envoy performance data, edit the envoy.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample envoy.d/conf.yaml for all available configuration options.

  2. Check that the Datadog Agent can access Envoy's admin endpoint.

  3. Restart the Agent.
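
One way to carry out the access check in step 2 is a short script run from the Agent host. The following is a minimal sketch, not part of the integration: the helper names `parse_stats` and `fetch_stats` are hypothetical, and it assumes the stats endpoint returns plain-text `name: value` lines.

```python
from urllib.request import urlopen


def parse_stats(text):
    """Parse 'name: value' lines from a plain-text stats response."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # skip blank or unexpected lines
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # skip values that are not plain integers
    return stats


def fetch_stats(stats_url, timeout=20):
    """Fetch the stats endpoint; raises if Envoy cannot be reached."""
    with urlopen(stats_url, timeout=timeout) as response:
        return parse_stats(response.read().decode("utf-8"))
```

Calling `fetch_stats("http://localhost:80/stats")` with your own `stats_url` should return a non-empty dictionary if the endpoint is reachable. A connection error at this point suggests the Agent would not be able to connect either.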

| Setting | Description |
| --- | --- |
| stats_url | (REQUIRED) The admin stats endpoint, e.g. http://localhost:80/stats. Append ?usedonly to ignore unused metrics instead of reporting them as 0. |
| tags | A list of custom tags to apply to this instance. |
| metric_whitelist | A list of regular expressions matching the metrics to include. |
| metric_blacklist | A list of regular expressions matching the metrics to exclude. |
| cache_metrics | Cache results of whitelist/blacklist to decrease CPU utilization, at the expense of some memory (default is true). |
| username | The username to authenticate with if behind basic auth. |
| password | The password to authenticate with if behind basic auth. |
| tls_verify | Whether the check validates TLS certificates when connecting to Envoy. Defaults to true; set to false to disable TLS certificate validation. |
| skip_proxy | If true, the check bypasses any enabled proxy settings and attempts to reach Envoy directly. |
| timeout | A custom timeout for network requests in seconds (default is 20). |

Metric filtering

Metrics can be filtered with lists of regular expressions, metric_whitelist and metric_blacklist. If both are used, the whitelist is applied first, and the blacklist is then applied to the resulting set.

Filtering occurs before tag extraction, so a pattern can match on tag values embedded in the raw metric name to decide whether a metric is kept or ignored. An exhaustive list of all metrics and tags can be found in metrics.py. Let's walk through an example of Envoy metric tagging!
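
The ordering can be illustrated with a small sketch. This is not the check's actual implementation: `filter_metrics` is a hypothetical helper, it assumes patterns are matched with `re.search`, and it assumes an empty whitelist keeps every metric.

```python
import re


def filter_metrics(names, whitelist=(), blacklist=()):
    """Apply the whitelist first, then the blacklist on what remains."""
    include = [re.compile(p) for p in whitelist]
    exclude = [re.compile(p) for p in blacklist]
    kept = []
    for name in names:
        # Assumption: with no whitelist, every metric passes this stage.
        if include and not any(p.search(name) for p in include):
            continue
        if any(p.search(name) for p in exclude):
            continue
        kept.append(name)
    return kept


# Hypothetical raw metric names, for illustration only.
names = [
    "cluster.web.upstream_cx_total",
    "cluster.web.upstream_rq_time",
    "http.admin.downstream_rq_total",
]
kept = filter_metrics(names, whitelist=[r"^cluster\."], blacklist=[r"_rq_time$"])
```

Here the whitelist narrows the set to the two `cluster.` metrics, and the blacklist then removes the one ending in `_rq_time`.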

...
'cluster.grpc.success': {
    'tags': (
        ('<CLUSTER_NAME>', ),
        ('<GRPC_SERVICE>', '<GRPC_METHOD>', ),
        (),
    ),
    ...
},
...

Here there are three tag sequences: ('<CLUSTER_NAME>', ), ('<GRPC_SERVICE>', '<GRPC_METHOD>', ), and the empty (). There is exactly one sequence per metric part, and this metric has three parts: cluster, grpc, and success. Envoy separates every part and tag value with a ., so the final metric name is:

cluster.<CLUSTER_NAME>.grpc.<GRPC_SERVICE>.<GRPC_METHOD>.success
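
The interleaving of metric parts and tag sequences can be sketched in a few lines. The helper `raw_metric_template` is hypothetical and only mirrors the rule described above; it is not taken from metrics.py.

```python
def raw_metric_template(metric, tag_sequences):
    """Follow each metric part with the tag placeholders from its sequence."""
    parts = metric.split(".")
    if len(parts) != len(tag_sequences):
        raise ValueError("need exactly one tag sequence per metric part")
    pieces = []
    for part, tags in zip(parts, tag_sequences):
        pieces.append(part)
        pieces.extend(tags)
    return ".".join(pieces)


template = raw_metric_template(
    "cluster.grpc.success",
    (("<CLUSTER_NAME>",), ("<GRPC_SERVICE>", "<GRPC_METHOD>"), ()),
)
```

With the tag sequences from the example, `template` is the raw name pattern shown above.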

If you care only about a specific cluster name and gRPC service, add this to your whitelist, replacing the placeholders with your own values:

^cluster\.<CLUSTER_NAME>\.grpc\.<GRPC_SERVICE>\.
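
To see how such a pattern behaves once the placeholders are filled in, here is a hedged sketch. The helper `whitelist_pattern` and the names `web`, `UserService`, `OrderService`, and `GetUser` are made up for illustration, and matching with `re.search` is an assumption about how the pattern is applied.

```python
import re


def whitelist_pattern(cluster_name, grpc_service):
    """Fill in the placeholders, escaping regex metacharacters in the values."""
    return r"^cluster\.{}\.grpc\.{}\.".format(
        re.escape(cluster_name), re.escape(grpc_service)
    )


pattern = whitelist_pattern("web", "UserService")

# Same cluster and service, any method: kept.
matches_user = re.search(pattern, "cluster.web.grpc.UserService.GetUser.success")
# Same cluster, different service: not kept.
matches_order = re.search(pattern, "cluster.web.grpc.OrderService.List.success")
```

Because the pattern ends after the service name, it matches every method of that service.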

Log collection

Available for Agent versions >6.0

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

      logs_enabled: true
  2. Next, edit envoy.d/conf.yaml by uncommenting the logs lines at the bottom. Update the logs path with the correct path to your Envoy log files.

      logs:
        - type: file
          path: /var/log/envoy.log
          source: envoy
          service: envoy
  3. Restart the Agent.

Validation

Run the Agent's status subcommand and look for envoy under the Checks section.

Data Collected

Metrics

See metadata.csv for a list of metrics provided by this integration.

See metrics.py for a list of tags sent by each metric.

Events

The Envoy check does not include any events.

Service Checks

envoy.can_connect:
Returns CRITICAL if the Agent cannot connect to Envoy to collect metrics, otherwise returns OK.

Troubleshooting

Need help? Contact Datadog support.