Add CoreDNS integration #2091

Merged
merged 21 commits into from
Sep 10, 2018
2 changes: 2 additions & 0 deletions coredns/CHANGELOG.md
@@ -0,0 +1,2 @@
# CHANGELOG - CoreDNS

11 changes: 11 additions & 0 deletions coredns/MANIFEST.in
@@ -0,0 +1,11 @@
graft datadog_checks
graft tests

include MANIFEST.in
include README.md
include requirements.in
include requirements.txt
include requirements-dev.txt
include manifest.json

global-exclude *.py[cod] __pycache__
66 changes: 66 additions & 0 deletions coredns/README.md
@@ -0,0 +1,66 @@
# CoreDNS Integration

## Overview
Get metrics from CoreDNS in real time to visualize and monitor DNS failures and cache hits/misses.

## Setup
### Installation

The CoreDNS check is included in the [Datadog Agent][1] package, so you don't need to install anything else on your servers.

### Configuration

Edit the `coredns.d/conf.yaml` file, in the `conf.d/` folder at the root of your [Agent's configuration directory][6], to point to the host and port of your CoreDNS metrics endpoint. See the [sample coredns.d/conf.yaml][2] for all available configuration options.

#### Using with service discovery

If you are running one dd-agent pod (DaemonSet) per Kubernetes worker node, use the following annotations on your kube-dns pod to retrieve the data automatically.

```yaml
metadata:
  annotations:
    ad.datadoghq.com/coredns.check_names: '["coredns"]'
    ad.datadoghq.com/coredns.init_configs: '[{}]'
    ad.datadoghq.com/coredns.instances: '[[{"prometheus_url":"http://%%host%%:9153/metrics", "tags":["dns-pod:%%host%%"]}]]'
```

**Note:**

* The `dns-pod` tag records the IP of the DNS pod being scraped. The other tags describe the dd-agent that polls it through service discovery.
* The service discovery annotations must be set on the pod. For a deployment, add them to the metadata of the pod template's specification, not to the metadata at the outer (deployment) level.
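
As a rough illustration of how the three annotation values fit together, the sketch below builds them in Python and serializes each one with `json.dumps`. The helper name `coredns_annotations` and its parameters are hypothetical and exist only for this example; the annotation keys, the check name, the port, and the nested instance list are copied from the YAML above.

```python
import json


def coredns_annotations(container_name="coredns", port=9153):
    """Build the Autodiscovery annotations shown above (illustrative helper, not part of the check)."""
    prefix = "ad.datadoghq.com/{}".format(container_name)
    instance = {
        "prometheus_url": "http://%%host%%:{}/metrics".format(port),
        "tags": ["dns-pod:%%host%%"],
    }
    return {
        prefix + ".check_names": json.dumps(["coredns"]),
        prefix + ".init_configs": json.dumps([{}]),
        # Mirrors the nested list used in the README annotation.
        prefix + ".instances": json.dumps([[instance]]),
    }


annotations = coredns_annotations()
for key, value in sorted(annotations.items()):
    print("{}: '{}'".format(key, value))
```

Generating the values this way avoids hand-escaping JSON inside YAML strings, which is where quoting mistakes in these annotations tend to creep in.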


### Validation

[Run the Agent's `status` subcommand][4] and look for `coredns` under the Checks section.

## Data Collected

### Metrics

See [metadata.csv][5] for a list of metrics provided by this integration.

### Events

The CoreDNS check does not include any events.

### Service Checks

The CoreDNS check does not include any service checks.

## Troubleshooting

Need help? Contact [Datadog Support][7].

## Development

See the [main documentation][6]
for more details about how to test and develop Agent-based integrations.

[1]: https://app.datadoghq.com/account/settings#agent
[2]: https://github.com/DataDog/integrations-core/blob/master/coredns/datadog_checks/coredns/data/conf.yaml.example
[3]: https://docs.datadoghq.com/agent/faq/agent-commands/#start-stop-restart-the-agent
[4]: https://docs.datadoghq.com/agent/faq/agent-commands/#agent-status-and-information
[5]: https://github.com/DataDog/integrations-core/blob/master/coredns/metadata.csv
[6]: https://docs.datadoghq.com/developers/
[7]: http://docs.datadoghq.com/help/
5 changes: 5 additions & 0 deletions coredns/datadog_checks/__init__.py
@@ -0,0 +1,5 @@
# (C) Datadog, Inc. 2018
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

__path__ = __import__('pkgutil').extend_path(__path__, __name__)
5 changes: 5 additions & 0 deletions coredns/datadog_checks/coredns/__about__.py
@@ -0,0 +1,5 @@
# (C) Datadog, Inc. 2018
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

__version__ = '0.1.0'
11 changes: 11 additions & 0 deletions coredns/datadog_checks/coredns/__init__.py
@@ -0,0 +1,11 @@
# (C) Datadog, Inc. 2018
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

from .coredns import CoreDNSCheck
from .__about__ import __version__

__all__ = [
    '__version__',
    'CoreDNSCheck'
]
103 changes: 103 additions & 0 deletions coredns/datadog_checks/coredns/coredns.py
@@ -0,0 +1,103 @@
# (C) Datadog, Inc. 2018
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

from datadog_checks.checks.openmetrics import OpenMetricsBaseCheck
from datadog_checks.errors import CheckException


DEFAULT_METRICS = {
    'coredns_dns_response_size_bytes': 'response_size.bytes',
    'coredns_cache_hits_total': 'cache_hits_count',
    'coredns_cache_misses_total': 'cache_misses_count',
    'coredns_dns_request_count_total': 'request_count',
    'coredns_dns_request_duration_seconds': 'request_duration.seconds',
    'coredns_dns_request_size_bytes': 'request_size.bytes',
    'coredns_dns_request_type_count_total': 'request_type_count',
    'coredns_dns_response_rcode_count_total': 'response_code_count',
    'coredns_proxy_request_count_total': 'proxy_request_count',
    'coredns_proxy_request_duration_seconds': 'proxy_request_duration.seconds',
    'coredns_cache_size': 'cache_size.count',
}


GO_METRICS = {
    'go_gc_duration_seconds': 'go.gc_duration_seconds',
    'go_goroutines': 'go.goroutines',
    'go_info': 'go.info',
    'go_memstats_alloc_bytes': 'go.memstats.alloc_bytes',
    'go_memstats_alloc_bytes_total': 'go.memstats.alloc_bytes_total',
    'go_memstats_buck_hash_sys_bytes': 'go.memstats.buck_hash_sys_bytes',
    'go_memstats_frees_total': 'go.memstats.frees_total',
    'go_memstats_gc_cpu_fraction': 'go.memstats.gc_cpu_fraction',
    'go_memstats_gc_sys_bytes': 'go.memstats.gc_sys_bytes',
    'go_memstats_heap_alloc_bytes': 'go.memstats.heap_alloc_bytes',
    'go_memstats_heap_idle_bytes': 'go.memstats.heap_idle_bytes',
    'go_memstats_heap_inuse_bytes': 'go.memstats.heap_inuse_bytes',
    'go_memstats_heap_objects': 'go.memstats.heap_objects',
    'go_memstats_heap_released_bytes': 'go.memstats.heap_released_bytes',
    'go_memstats_heap_sys_bytes': 'go.memstats.heap_sys_bytes',
    'go_memstats_last_gc_time_seconds': 'go.memstats.last_gc_time_seconds',
    'go_memstats_lookups_total': 'go.memstats.lookups_total',
    'go_memstats_mallocs_total': 'go.memstats.mallocs_total',
    'go_memstats_mcache_inuse_bytes': 'go.memstats.mcache_inuse_bytes',
    'go_memstats_mcache_sys_bytes': 'go.memstats.mcache_sys_bytes',
    'go_memstats_mspan_inuse_bytes': 'go.memstats.mspan_inuse_bytes',
    'go_memstats_mspan_sys_bytes': 'go.memstats.mspan_sys_bytes',
    'go_memstats_next_gc_bytes': 'go.memstats.next_gc_bytes',
    'go_memstats_other_sys_bytes': 'go.memstats.other_sys_bytes',
    'go_memstats_stack_inuse_bytes': 'go.memstats.stack_inuse_bytes',
    'go_memstats_stack_sys_bytes': 'go.memstats.stack_sys_bytes',
    'go_memstats_sys_bytes': 'go.memstats.sys_bytes',
    'process_cpu_seconds_total': 'process.cpu_seconds_total',
    'process_max_fds': 'process.max_fds',
    'process_open_fds': 'process.open_fds',
    'process_resident_memory_bytes': 'process.resident_memory_bytes',
    'process_start_time_seconds': 'process.start_time_seconds',
    'process_virtual_memory_bytes': 'process.virtual_memory_bytes',
}


class CoreDNSCheck(OpenMetricsBaseCheck):
    """
    Collect CoreDNS metrics from its Prometheus endpoint
    """

    def __init__(self, name, init_config, agentConfig, instances=None):

        # Create instances we can use in OpenMetricsBaseCheck
        generic_instances = None
        if instances is not None:
            generic_instances = self.create_generic_instances(instances)

        super(CoreDNSCheck, self).__init__(name, init_config, agentConfig, instances=generic_instances)

    def create_generic_instances(self, instances):
        """
        Transform each CoreDNS instance into an OpenMetricsBaseCheck instance
        """
        generic_instances = []
        for instance in instances:
            transformed_instance = self._create_core_dns_instance(instance)
            generic_instances.append(transformed_instance)

        return generic_instances

    def _create_core_dns_instance(self, instance):
        """
        Set up coredns instance so it can be used in OpenMetricsBaseCheck
        """
        endpoint = instance.get('prometheus_url')
        if endpoint is None:
            raise CheckException("Unable to find prometheus endpoint in config file.")

        # Built-in metric maps first, then any user-supplied entries
        metrics = [DEFAULT_METRICS, GO_METRICS]
        metrics.extend(instance.get('metrics', []))

        # Note: this updates the caller's instance dict in place
        instance.update({
            'prometheus_url': endpoint,
            'namespace': 'coredns',
            'metrics': metrics,
        })

        return instance
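
The instance transformation above can be tried without the Agent installed. The following is a minimal sketch under stated assumptions: `build_openmetrics_instance` is a hypothetical stand-in for `_create_core_dns_instance`, the two metric maps are truncated to one entry each, `ValueError` replaces `CheckException`, and the IP address is made up. Unlike the method in the diff, it copies the instance instead of updating it in place.

```python
# Truncated stand-ins for the DEFAULT_METRICS and GO_METRICS maps in coredns.py.
DEFAULT_METRICS = {"coredns_cache_hits_total": "cache_hits_count"}
GO_METRICS = {"go_goroutines": "go.goroutines"}


def build_openmetrics_instance(instance):
    """Hypothetical stand-in for CoreDNSCheck._create_core_dns_instance."""
    endpoint = instance.get("prometheus_url")
    if endpoint is None:
        # The real check raises CheckException; ValueError keeps this sketch dependency-free.
        raise ValueError("Unable to find prometheus endpoint in config file.")

    # Built-in maps first, then any user-supplied entries from the instance.
    metrics = [DEFAULT_METRICS, GO_METRICS]
    metrics.extend(instance.get("metrics", []))

    # Copy rather than mutate, so the caller's dict is left untouched.
    transformed = dict(instance)
    transformed.update({
        "prometheus_url": endpoint,
        "namespace": "coredns",
        "metrics": metrics,
    })
    return transformed


raw = {
    "prometheus_url": "http://10.0.0.5:9153/metrics",  # made-up address
    "metrics": [{"coredns_template_matches_total": "template_matches_count"}],
}
result = build_openmetrics_instance(raw)
print(result["namespace"], len(result["metrics"]))
```

The ordering matters only in that user-supplied maps are appended after the built-in ones; whether a later map overrides an earlier one for the same Prometheus name depends on the base check and is not modeled here.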
37 changes: 37 additions & 0 deletions coredns/datadog_checks/coredns/data/auto_conf.yaml
@@ -0,0 +1,37 @@
ad_identifiers:
  - coredns

init_config:

instances:

    ## @param prometheus_url - string - required
    ## To enable CoreDNS metrics you must specify the prometheus url
    ## and enable the plugin within coredns.
    ## See: https://coredns.io/plugins/metrics/
    #
  - prometheus_url: "http://%%host%%:9153/metrics"
    tags:
      - "dns-pod:%%host%%"

    ## @param send_histograms_buckets - boolean - optional - default: true
    ## Set send_histograms_buckets to true to send the histogram buckets.
    #
    # send_histograms_buckets: true

    ## @param send_monotonic_counter - boolean - optional - default: true
    ## Set send_monotonic_counter to true to send counters as monotonic counters.
    ## See: https://github.com/DataDog/integrations-core/issues/1303
    #
    # send_monotonic_counter: true

    ## @param metrics - list of key:value elements - optional
    ## Metrics from the CoreDNS plugins for 'metrics', 'proxy' and 'cache'
    ## are enabled by default. To scrape metrics for optional plugins,
    ## enable the plugin in the CoreDNS Corefile and then add the metric below.
    ## As an example, the 'template' plugin's metrics are below.
    #
    # metrics:
    #   - coredns_template_matches_total: template_matches_count
    #   - coredns_template_template_failures_total: template_templating_failures_count
    #   - coredns_template_rr_failures_total: template_resource_record_failures_count
34 changes: 34 additions & 0 deletions coredns/datadog_checks/coredns/data/conf.yaml.example
@@ -0,0 +1,34 @@
init_config:

instances:

    ## @param prometheus_url - string - required
    ## To enable CoreDNS metrics you must specify the prometheus url
    ## and enable the plugin within coredns.
    ## See: https://coredns.io/plugins/metrics/
    #
  - prometheus_url: "http://%%host%%:9153/metrics"
    tags:
      - "dns-pod:%%host%%"

    ## @param send_histograms_buckets - boolean - optional - default: true
    ## Set send_histograms_buckets to true to send the histogram buckets.
    #
    # send_histograms_buckets: true

    ## @param send_monotonic_counter - boolean - optional - default: true
    ## Set send_monotonic_counter to true to send counters as monotonic counters.
    ## See: https://github.com/DataDog/integrations-core/issues/1303
    #
    # send_monotonic_counter: true

    ## @param metrics - list of key:value elements - optional
    ## Metrics from the CoreDNS plugins for 'metrics', 'proxy' and 'cache'
    ## are enabled by default. To scrape metrics for optional plugins,
    ## enable the plugin in the CoreDNS Corefile and then add the metric below.
    ## As an example, the 'template' plugin's metrics are below.
    #
    # metrics:
    #   - coredns_template_matches_total: template_matches_count
    #   - coredns_template_template_failures_total: template_templating_failures_count
    #   - coredns_template_rr_failures_total: template_resource_record_failures_count
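
To see which Datadog metric names the optional `metrics` entries would likely produce, the sketch below resolves each mapping against the `coredns` namespace set in `coredns.py`. It assumes the base check joins the namespace and the mapped name with a dot, which is consistent with names such as `coredns.cache_hits_count` in `metadata.csv`; the helper `datadog_metric_names` is hypothetical, and histogram suffixes such as `.sum` and `.count` are not modeled.

```python
NAMESPACE = "coredns"  # the namespace the check sets on every instance


def datadog_metric_names(metric_maps, namespace=NAMESPACE):
    """Map each Prometheus metric name to its assumed namespaced Datadog name."""
    names = {}
    for mapping in metric_maps:
        for prometheus_name, short_name in mapping.items():
            names[prometheus_name] = "{}.{}".format(namespace, short_name)
    return names


# The 'template' plugin example from the commented-out config above.
custom_metrics = [
    {"coredns_template_matches_total": "template_matches_count"},
    {"coredns_template_template_failures_total": "template_templating_failures_count"},
    {"coredns_template_rr_failures_total": "template_resource_record_failures_count"},
]

resolved = datadog_metric_names(custom_metrics)
for prometheus_name, datadog_name in sorted(resolved.items()):
    print("{} -> {}".format(prometheus_name, datadog_name))
```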
21 changes: 21 additions & 0 deletions coredns/manifest.json
@@ -0,0 +1,21 @@
{
  "maintainer": "help@datadoghq.com",
  "manifest_version": "1.0.0",
  "name": "coredns",
  "display_name": "CoreDNS",
  "short_description": "CoreDNS collects DNS metrics in Kubernetes.",
  "support": "core",
  "supported_os": [
    "linux"
  ],
  "guid": "9b316155-fc8e-4cb0-8bd5-8af270759cfb",
  "public_title": "Datadog-CoreDNS Integration",
  "categories": [
    "containers",
    "network"
  ],
  "type": "check",
  "aliases": [],
  "is_public": true,
  "creates_events": false
}
50 changes: 50 additions & 0 deletions coredns/metadata.csv
@@ -0,0 +1,50 @@
metric_name,metric_type,interval,unit_name,per_unit_name,description,orientation,integration,short_name
coredns.response_code_count,gauge,,response code,,number of responses per zone and rcode,1,coredns,response code
coredns.proxy_request_count,gauge,,request,,query count per upstream.,1,coredns,proxy request count
coredns.cache_hits_count,gauge,,hit,,Counter of cache hits by cache type,1,coredns,cache hit
coredns.cache_misses_count,gauge,,miss,,Counter of cache misses.,1,coredns,cache miss
coredns.request_count,gauge,,request,,total query count.,1,coredns,request count
coredns.request_type_count,gauge,,request type,,counter of queries per zone and type,1,coredns,request type
coredns.request_duration.seconds.sum,gauge,,second,,duration to process each query,-1,coredns,request duration
coredns.request_duration.seconds.count,count,,second,,duration to process each query,-1,coredns,request duration
coredns.proxy_request_duration.seconds.sum,gauge,,second,,duration per upstream interaction,-1,coredns,proxy request duration
coredns.proxy_request_duration.seconds.count,count,,second,,duration per upstream interaction,-1,coredns,proxy request duration
coredns.request_size.bytes.sum,gauge,,byte,,size of the request in bytes,0,coredns,request size
coredns.request_size.bytes.count,count,,byte,,size of the request in bytes,0,coredns,request size
coredns.response_size.bytes.sum,gauge,,byte,,size of the response in bytes,0,coredns,response size
coredns.response_size.bytes.count,count,,byte,,size of the response in bytes,0,coredns,response size
coredns.cache_size.count,count,,entry,,Number of entries in the cache,1,coredns,cache size
coredns.go.gc_duration_seconds.count,gauge,,second,,Count of the GC invocation durations.,0,coredns,
coredns.go.gc_duration_seconds.sum,gauge,,second,,Sum of the GC invocation durations.,0,coredns,
coredns.go.goroutines,gauge,,thread,,Number of goroutines that currently exist.,0,coredns,
coredns.go.info,gauge,,,,Information about the Go environment.,0,coredns,
coredns.go.memstats.alloc_bytes,gauge,,byte,,Number of bytes allocated and still in use.,0,coredns,
coredns.go.memstats.alloc_bytes_total,gauge,,byte,,Total number of bytes allocated even if freed.,0,coredns,
coredns.go.memstats.buck_hash_sys_bytes,gauge,,byte,,Number of bytes used by the profiling bucket hash table.,0,coredns,bytes used by profiling
coredns.go.memstats.frees_total,gauge,,,,Total number of frees.,0,coredns,
coredns.go.memstats.gc_cpu_fraction,gauge,,percent,,CPU taken up by GC,0,coredns,CPU taken up by gc
coredns.go.memstats.gc_sys_bytes,gauge,,byte,,Number of bytes used for garbage collection system metadata.,0,coredns,
coredns.go.memstats.heap_alloc_bytes,gauge,,byte,,Bytes allocated to the heap,0,coredns,
coredns.go.memstats.heap_idle_bytes,gauge,,byte,,Number of idle bytes in the heap,0,coredns,
coredns.go.memstats.heap_inuse_bytes,gauge,,byte,,Number of heap bytes that are in use,0,coredns,
coredns.go.memstats.heap_objects,gauge,,object,,Number of objects in the heap,0,coredns,
coredns.go.memstats.heap_released_bytes,gauge,,byte,,Number of bytes released to the system in the last gc,0,coredns,
coredns.go.memstats.heap_sys_bytes,gauge,,byte,,Number of bytes used by the heap,0,coredns,
coredns.go.memstats.last_gc_time_seconds,gauge,,second,,Number of seconds since 1970 of the last garbage collection,0,coredns,gc time
coredns.go.memstats.lookups_total,gauge,,operation,,Number of lookups,0,coredns,lookups total
coredns.go.memstats.mallocs_total,gauge,,operation,,Number of mallocs,0,coredns,mallocs total
coredns.go.memstats.mcache_inuse_bytes,gauge,,byte,,Number of bytes in use by mcache structures.,0,coredns,
coredns.go.memstats.mcache_sys_bytes,gauge,,byte,,Number of bytes used for mcache structures obtained from system.,0,coredns,
coredns.go.memstats.mspan_inuse_bytes,gauge,,byte,,Number of bytes in use by mspan structures.,0,coredns,
coredns.go.memstats.mspan_sys_bytes,gauge,,byte,,Number of bytes used for mspan structures obtained from system.,0,coredns,
coredns.go.memstats.next_gc_bytes,gauge,,byte,,Number of heap bytes when next garbage collection will take place,0,coredns,
coredns.go.memstats.other_sys_bytes,gauge,,byte,,Number of bytes used for other system allocations,0,coredns,
coredns.go.memstats.stack_inuse_bytes,gauge,,byte,,Number of bytes in use by the stack allocator,0,coredns,
coredns.go.memstats.stack_sys_bytes,gauge,,byte,,Number of bytes obtained from system for stack allocator,0,coredns,
coredns.go.memstats.sys_bytes,gauge,,byte,,Number of bytes obtained from system,0,coredns,
coredns.go.threads,gauge,,thread,,Number of OS threads created.,0,coredns,
coredns.process.max_fds,gauge,,file,,Maximum number of open file descriptors.,0,coredns,
coredns.process.open_fds,gauge,,file,,Number of open file descriptors.,0,coredns,
coredns.process.resident_memory_bytes,gauge,,byte,,Resident memory size in bytes.,0,coredns,
coredns.process.start_time_seconds,gauge,,second,,Start time of the process since unix epoch in seconds.,0,coredns,
coredns.process.virtual_memory_bytes,gauge,,byte,,Virtual memory size in bytes.,0,coredns,
1 change: 1 addition & 0 deletions coredns/requirements-dev.txt
@@ -0,0 +1 @@
datadog-checks-dev
6 changes: 6 additions & 0 deletions coredns/requirements.txt
@@ -0,0 +1,6 @@
#
# This file is autogenerated by pip-compile
# To update, run:
#
# pip-compile --generate-hashes --output-file requirements.txt requirements.in
#