Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CoreDNS integration #2091

Merged
merged 21 commits into from
Sep 10, 2018
Merged

Add CoreDNS integration #2091

merged 21 commits into from
Sep 10, 2018

Conversation

shraykay
Copy link
Contributor

@shraykay shraykay commented Aug 22, 2018

What does this PR do?

Adding an integration for CoreDNS -- CoreDNS emits metrics via Prometheus.
See sample events from plugins: here, here, and here.

Motivation

Kubernetes is moving from KubeDNS -> CoreDNS starting v1.11

Review checklist

  • PR has a meaningful title or PR has the no-changelog label attached
  • Feature or bugfix has tests
  • Git history is clean
  • If PR impacts documentation, docs team has been notified or an issue has been opened on the documentation repo

Additional Notes

First time making a contribution! Please let me know what I may have missed, I borrowed heavily from KubeDNS's integration.

@shraykay shraykay requested review from a team as code owners August 22, 2018 04:37
@shraykay
Copy link
Contributor Author

@gmmeyer should I make changes to support OpenMetricsBaseCheck in this PR or in a separate one?

Copy link
Contributor

@l0k0ms l0k0ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, Many thanks for your contribution 🙇 I left some comment regarding the documentation of your new integration.

coredns.response_code_count.count,count,,response code,,instant number of responses per zone and rcode,1,coredns,response code
coredns.proxy_request_count,gauge,,request,,query count per upstream.,1,coredns,proxy request count
coredns.proxy_request_count.count,count,,request,,query count per upstream at a moment,1,coredns,proxy request count
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit name should be hit not hits

https://docs.datadoghq.com/developers/metrics/#units

coredns.proxy_request_count,gauge,,request,,query count per upstream.,1,coredns,proxy request count
coredns.proxy_request_count.count,count,,request,,query count per upstream at a moment,1,coredns,proxy request count
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit
coredns.cache_hits_count.count,count,,hits,,Counter of cache hits by cache type at a moment,1,coredns,cache hit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit_name should be hit not hits

https://docs.datadoghq.com/developers/metrics/#units

coredns.proxy_request_count.count,count,,request,,query count per upstream at a moment,1,coredns,proxy request count
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit
coredns.cache_hits_count.count,count,,hits,,Counter of cache hits by cache type at a moment,1,coredns,cache hit
coredns.cache_misses_count,gauge,,misses,,Counter of cache misses.,1,coredns,cache miss
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r/misses/miss

coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit
coredns.cache_hits_count.count,count,,hits,,Counter of cache hits by cache type at a moment,1,coredns,cache hit
coredns.cache_misses_count,gauge,,misses,,Counter of cache misses.,1,coredns,cache miss
coredns.cache_misses_count.count,count,,misses,,Counter of cache misses at a moment.,1,coredns,cache miss
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r/misses/miss

coredns.request_count.count,count,,request,,total query count at a moment,1,coredns,request count
coredns.request_type_count,gauge,,request type,,counter of queries per zone and type,1,coredns,request type
coredns.request_type_count.count,count,,request type,,counter of queries per zone and type,1,coredns,request type
coredns.request_duration.seconds.sum,gauge,,seconds,,duration to process each query,-1,coredns,proxy request duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r/seconds/second

and for the next 3 metrics too

Get metrics from CoreDNS in real time to:

* Visualize and monitor DNS failures and cache hits/misses
## Setup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this integration is in integration-core.
The CoreDNS check is packaged with the Agent? if so, no need to manually download check.py and check.conf files.

5. [Restart the Agent][3] to begin sending CoreDNS metrics to Datadog.

### Validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation section should follow the same layout as all other integrations in integrations core:
https://github.com/DataDog/integrations-core/tree/master/apache#validation

We don't display the result since it differs between agent v5 and v6.

[...]
```

## Compatibility
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add the compatibility section in the readme. The information is already in the manifest.json file and has a specific display on the public documentation:

https://docs.datadoghq.com/integrations/apache/

See [metadata.csv][5] for a list of metrics provided by this integration.

### Events
The CoreDNS check does not include any event at this time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to say "at this time" here. The documentation represents now, not a hypothetical future.

The CoreDNS check does not include any events.

The CoreDNS check does not include any event at this time.

### Service Checks
The CoreDNS check does not include any service checks at this time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as for events subsections.

@gmmeyer
Copy link
Contributor

gmmeyer commented Aug 22, 2018

@shraykay can you do it in this one? That PR should be merged soon!

Copy link
Contributor

@gmmeyer gmmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this! This is great! 😄

I have a few small changes to request, I'll take a second look once you migrate this to the new stuff.

One thing I wanna ask if if you could create an integration test for this -- as in, actually spin coredns up in a docker container and grab metrics from it. There's a bunch of examples of this in the repo. Nginx is a fairly simple one that should be easy to follow: https://github.com/DataDog/integrations-core/tree/master/nginx/tests

coredns/tox.ini Outdated
[testenv]
platform = linux2|darwin
deps =
datadog-checks-base
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change this to -e../datadog_checks_base[deps]? That will, I believe, make the test work.

super(CoreDNSCheck, self).__init__(name, init_config, agentConfig, instances)
self.NAMESPACE = 'coredns'

self.metrics_mapper = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this configurable in the yaml? making this the default and then merging a yaml metrics_mapper map with this one would be a good idea I think

# https://coredns.io/plugins/cache/
'coredns_cache_size': 'cache_size.count',
}
self.ignore_metrics = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we wanna ignore all of these? (people can find lots of metrics useful for a lot of reasons)

Copy link
Contributor

@gzussa gzussa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this contribution 💯 I made some additional comments. I think this is on good tracks 😄.

@@ -0,0 +1,2 @@
mock==2.0.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just import datadog-checks-dev instead of those two

coredns/setup.py Outdated
packages=['datadog_checks.coredns'],

# Run-time dependencies
install_requires=get_requirements('requirements.in')+[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the get_requirements('requirements.in') since it has nothing in it.



@pytest.fixture
def aggregator():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this function since it is part of datadog_check_base.
You can look at the Vault integrations for an example.

@BrianChristie
Copy link

Thanks for this work!

@ datadog folks, it would be great to merge this soon because as you can see in the Kubernetes V1.11 release notes https://kubernetes.io/docs/setup/release/notes/#sig-cluster-lifecycle , CoreDNS has replaced kube-dns as the default DNS provider so all new clusters have only CoreDNS running.

@shraykay
Copy link
Contributor Author

i will work on this today/tomorrow! will have an updated version for you soon.

@shraykay
Copy link
Contributor Author

I pushed a newer version, however I'm having trouble running an integration test using docker-compose. can anyone provide some pointers?

coredns/tox.ini Outdated

[common]
deps =
../datadog_checks_base
Copy link
Contributor

@gzussa gzussa Aug 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to replace both of those lines with -e../datadog_checks_base[deps]. See this example

}

@pytest.fixture
def aggregator():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregator is globally available with datadog_checks_dev there so you don't need to add it here :)


@pytest.fixture
def coredns():
env = os.environ
Copy link
Contributor

@gzussa gzussa Aug 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calling subprocess to start your container, you can leverage utilities in this file. You can find good examples here or there. Feel free to take a look at what the datadog_checks_dev package contains since it holds a lot of useful methods :)

@shraykay
Copy link
Contributor Author

shraykay commented Aug 29, 2018

@gzussa thanks! was able to get docker-compose going with your help, could you help me figure out why flake8 isn't running? the error seems a bit vague

edit: disregard, was whitespaces

@codecov-io
Copy link

codecov-io commented Aug 29, 2018

Codecov Report

Merging #2091 into master will increase coverage by 3.49%.
The diff coverage is 89.28%.

Impacted file tree graph

l0k0ms
l0k0ms previously approved these changes Aug 29, 2018
Copy link
Contributor

@l0k0ms l0k0ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👌 ping @cswatt for a final doc review

Copy link
Contributor

@ofek ofek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work!


return instance

def submit_as_gauge_and_monotonic_count(self, metric_suffix, message, scraper_config):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should only be submitted as monotonic_count. See #1303

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @mfpierre to verify

Copy link
Contributor

@mfpierre mfpierre Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should use the send_monotonic_counter to True here I double-checked all the affected metrics are of the counter type.
As a result we can remove all the custom functions and METRIC_TRANSFORMERS + drop the extra .count gauges in the metadata.csv file

gmmeyer
gmmeyer previously approved these changes Aug 29, 2018
Copy link
Contributor

@gmmeyer gmmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this! it looks great! Thanks a lot.

coredns/MANIFEST.in Show resolved Hide resolved

return instance

def submit_as_gauge_and_monotonic_count(self, metric_suffix, message, scraper_config):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @mfpierre to verify

coredns/setup.py Show resolved Hide resolved
coredns/setup.py Outdated
keywords='datadog agent check',
url='https://github.com/DataDog/integrations-core',
author='CoreDNS',
author_email='shray@namely.com',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

author='Datadog',
author_email='packages@datadoghq.com',

coredns/setup.py Outdated

# Extra files to ship with the wheel package
package_data={'datadog_checks.coredns': ['conf.yaml.example']},
include_package_data=True,
Copy link
Contributor

@ofek ofek Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    # The package we're going to ship
    packages=['datadog_checks', 'datadog_checks.coredns'],

    # Run-time dependencies
    install_requires=[CHECKS_BASE_REQ],

    # Extra files to ship with the wheel package
    include_package_data=True,



@pytest.fixture(scope="session")
def spin_up_coredns():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please put all these fixtures in a conftest.py? At that point they will be globally available.

res = requests.get(URL)
# create some metrics by using dig
subprocess.check_call(DIG_ARGS, env=env)
res.raise_for_status
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

res.raise_for_status()


compose_file = os.path.join(HERE, 'docker', 'docker-compose.yml')
env = os.environ
env['COREDNS_CONFIG_FOLDER'] = CONFIG_FOLDER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

env = {'COREDNS_CONFIG_FOLDER': CONFIG_FOLDER}

with open(mesh_file_path, 'rb') as f:
text_data = f.read()

p = mock.patch('requests.get', return_value=MockResponse(text_data, 'text/plain; version=0.0.4'), __name__='get')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with mock.patch(...):
    yield


instances:
# url of the metrics endpoint of prometheus
- prometheus_endpoint: http://localhost:9153/metrics
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please use prometheus_url instead of prometheus_endpoint?

"""
Set up coredns instance so it can be used in OpenMetricsBaseCheck
"""
endpoint = instance.get('prometheus_endpoint')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be prometheus_url.

"""
endpoint = instance.get('prometheus_endpoint')
if endpoint is None:
raise CheckException("Unable to find prometheus endpoint in config file.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we can just change this to prometheus_url.

super(CoreDNSCheck, self).__init__(name, init_config, agentConfig, instances=generic_instances)

def check(self, instance):
endpoint = instance.get('prometheus_endpoint')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here as well prometheus_url.

# Primarily, metrics are emitted by the prometheus plugin: https://coredns.io/plugins/metrics/
# Note: the count metrics were moved to specific functions
# below to be submitted as both gauges and monotonic_counts
# 'coredns_dns_response_rcode_count_total': self.coredns_dns_response_rcode_count_total,
Copy link

@rishabh rishabh Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're not using the transformer functions, we can just get rid of this dictionary.
As per #2091 (comment).

def check(self, instance):
endpoint = instance.get('prometheus_endpoint')
scraper_config = self.config_map[endpoint]
self.process(scraper_config, metric_transformers=self.METRIC_TRANSFORMERS)
Copy link

@rishabh rishabh Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then we can also get rid of the metric_transformers named argument.

instances:
# To enable CoreDNS metrics you must specify the prometheus url and enable the plugin within coredns
# See: https://coredns.io/plugins/metrics/
- prometheus_endpoint: "http://%%host%%:9153/metrics"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this should also be prometheus_url.

annotations:
ad.datadoghq.com/coredns.check_names: '["coredns"]'
ad.datadoghq.com/coredns.init_configs: '[{}]'
ad.datadoghq.com/coredns.instances: '[[{"prometheus_endpoint":"http://%%host%%:9153/metrics", "tags":["dns-pod:%%host%%"]}]]'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one, too :)

from datadog_checks.utils.common import get_docker_hostname

instance = {
'prometheus_endpoint': 'http://localhost:9153/metrics',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix the tests as well to use prometheus_url.

@pytest.fixture
def dockerinstance():
return {
'prometheus_endpoint': URL,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And let's also fix the docker instance config to use prometheus_url.

NAMESPACE + '.proxy_request_duration.seconds.sum',
NAMESPACE + '.cache_size.count',
]
COUNT_METRICS = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should remove this empty list

@shraykay shraykay dismissed stale reviews from gmmeyer and l0k0ms via f236fe4 August 30, 2018 00:56
@shraykay shraykay force-pushed the coredns branch 2 times, most recently from bdbca94 to ac8517f Compare August 30, 2018 01:09
ofek
ofek previously approved these changes Sep 4, 2018
gmmeyer
gmmeyer previously approved these changes Sep 4, 2018
Copy link
Contributor

@gmmeyer gmmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks amazing! Thanks a lot!

@l0k0ms l0k0ms dismissed stale reviews from gmmeyer and ofek via 5118561 September 5, 2018 06:42
@shraykay
Copy link
Contributor Author

greetings! just following up on this, is there anything else required or can we get this merged in? :)

@gmmeyer gmmeyer merged commit ac9cf85 into DataDog:master Sep 10, 2018
@ofek ofek changed the title Adding CoreDNS integration to Datadog Add CoreDNS integration Oct 13, 2018
nmuesch pushed a commit that referenced this pull request Nov 1, 2018
@fanny-jiang fanny-jiang mentioned this pull request Jan 24, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.