-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CoreDNS integration #2091
Add CoreDNS integration #2091
Conversation
@gmmeyer should I make changes to support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, Many thanks for your contribution 🙇 I left some comment regarding the documentation of your new integration.
coredns/metadata.csv
Outdated
coredns.response_code_count.count,count,,response code,,instant number of responses per zone and rcode,1,coredns,response code | ||
coredns.proxy_request_count,gauge,,request,,query count per upstream.,1,coredns,proxy request count | ||
coredns.proxy_request_count.count,count,,request,,query count per upstream at a moment,1,coredns,proxy request count | ||
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit name should be hit
not hits
coredns/metadata.csv
Outdated
coredns.proxy_request_count,gauge,,request,,query count per upstream.,1,coredns,proxy request count | ||
coredns.proxy_request_count.count,count,,request,,query count per upstream at a moment,1,coredns,proxy request count | ||
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit | ||
coredns.cache_hits_count.count,count,,hits,,Counter of cache hits by cache type at a moment,1,coredns,cache hit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit_name should be hit
not hits
coredns/metadata.csv
Outdated
coredns.proxy_request_count.count,count,,request,,query count per upstream at a moment,1,coredns,proxy request count | ||
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit | ||
coredns.cache_hits_count.count,count,,hits,,Counter of cache hits by cache type at a moment,1,coredns,cache hit | ||
coredns.cache_misses_count,gauge,,misses,,Counter of cache misses.,1,coredns,cache miss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r/misses
/miss
coredns/metadata.csv
Outdated
coredns.cache_hits_count,gauge,,hits,,Counter of cache hits by cache type,1,coredns,cache hit | ||
coredns.cache_hits_count.count,count,,hits,,Counter of cache hits by cache type at a moment,1,coredns,cache hit | ||
coredns.cache_misses_count,gauge,,misses,,Counter of cache misses.,1,coredns,cache miss | ||
coredns.cache_misses_count.count,count,,misses,,Counter of cache misses at a moment.,1,coredns,cache miss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r/misses
/miss
coredns/metadata.csv
Outdated
coredns.request_count.count,count,,request,,total query count at a moment,1,coredns,request count | ||
coredns.request_type_count,gauge,,request type,,counter of queries per zone and type,1,coredns,request type | ||
coredns.request_type_count.count,count,,request type,,counter of queries per zone and type,1,coredns,request type | ||
coredns.request_duration.seconds.sum,gauge,,seconds,,duration to process each query,-1,coredns,proxy request duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r/seconds
/second
and for the next 3 metrics too
coredns/README.md
Outdated
Get metrics from CoreDNS in real time to: | ||
|
||
* Visualize and monitor DNS failures and cache hits/misses | ||
## Setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this integration is in integration-core.
The CoreDNS check is packaged with the Agent? if so, no need to manually download check.py
and check.conf
files.
coredns/README.md
Outdated
5. [Restart the Agent][3] to begin sending CoreDNS metrics to Datadog. | ||
|
||
### Validation | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation section should follow the same layout as all other integrations in integrations core:
https://github.com/DataDog/integrations-core/tree/master/apache#validation
We don't display the result since it differs between agent v5 and v6.
coredns/README.md
Outdated
[...] | ||
``` | ||
|
||
## Compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to add the compatibility section in the readme. The information is already in the manifest.json file and has a specific display on the public documentation:
coredns/README.md
Outdated
See [metadata.csv][5] for a list of metrics provided by this integration. | ||
|
||
### Events | ||
The CoreDNS check does not include any event at this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to say "at this time" here. The documentation represents now, not a hypothetical future.
The CoreDNS check does not include any events.
coredns/README.md
Outdated
The CoreDNS check does not include any event at this time. | ||
|
||
### Service Checks | ||
The CoreDNS check does not include any service checks at this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as for events subsections.
@shraykay can you do it in this one? That PR should be merged soon! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for this! This is great! 😄
I have a few small changes to request, I'll take a second look once you migrate this to the new stuff.
One thing I wanna ask if if you could create an integration test for this -- as in, actually spin coredns up in a docker container and grab metrics from it. There's a bunch of examples of this in the repo. Nginx is a fairly simple one that should be easy to follow: https://github.com/DataDog/integrations-core/tree/master/nginx/tests
coredns/tox.ini
Outdated
[testenv] | ||
platform = linux2|darwin | ||
deps = | ||
datadog-checks-base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change this to -e../datadog_checks_base[deps]
? That will, I believe, make the test work.
super(CoreDNSCheck, self).__init__(name, init_config, agentConfig, instances) | ||
self.NAMESPACE = 'coredns' | ||
|
||
self.metrics_mapper = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this configurable in the yaml? making this the default and then merging a yaml metrics_mapper map with this one would be a good idea I think
# https://coredns.io/plugins/cache/ | ||
'coredns_cache_size': 'cache_size.count', | ||
} | ||
self.ignore_metrics = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we wanna ignore all of these? (people can find lots of metrics useful for a lot of reasons)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this contribution 💯 I made some additional comments. I think this is on good tracks 😄.
coredns/requirements-dev.txt
Outdated
@@ -0,0 +1,2 @@ | |||
mock==2.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just import datadog-checks-dev
instead of those two
coredns/setup.py
Outdated
packages=['datadog_checks.coredns'], | ||
|
||
# Run-time dependencies | ||
install_requires=get_requirements('requirements.in')+[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the get_requirements('requirements.in')
since it has nothing in it.
coredns/tests/test_check.py
Outdated
|
||
|
||
@pytest.fixture | ||
def aggregator(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove this function since it is part of datadog_check_base
.
You can look at the Vault
integrations for an example.
Thanks for this work! @ datadog folks, it would be great to merge this soon because as you can see in the Kubernetes V1.11 release notes https://kubernetes.io/docs/setup/release/notes/#sig-cluster-lifecycle , CoreDNS has replaced kube-dns as the default DNS provider so all new clusters have only CoreDNS running. |
i will work on this today/tomorrow! will have an updated version for you soon. |
I pushed a newer version, however I'm having trouble running an integration test using docker-compose. can anyone provide some pointers? |
coredns/tox.ini
Outdated
|
||
[common] | ||
deps = | ||
../datadog_checks_base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to replace both of those lines with -e../datadog_checks_base[deps]
. See this example
coredns/tests/test_check.py
Outdated
} | ||
|
||
@pytest.fixture | ||
def aggregator(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aggregator is globally available with datadog_checks_dev
there so you don't need to add it here :)
coredns/tests/test_check.py
Outdated
|
||
@pytest.fixture | ||
def coredns(): | ||
env = os.environ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gzussa thanks! was able to get docker-compose going with your help, could you help me figure out why flake8 isn't running? the error seems a bit vague edit: disregard, was whitespaces |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👌 ping @cswatt for a final doc review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work!
|
||
return instance | ||
|
||
def submit_as_gauge_and_monotonic_count(self, metric_suffix, message, scraper_config): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these should only be submitted as monotonic_count
. See #1303
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @mfpierre to verify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should use the send_monotonic_counter
to True
here I double-checked all the affected metrics are of the counter
type.
As a result we can remove all the custom functions and METRIC_TRANSFORMERS
+ drop the extra .count
gauges in the metadata.csv
file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this! it looks great! Thanks a lot.
|
||
return instance | ||
|
||
def submit_as_gauge_and_monotonic_count(self, metric_suffix, message, scraper_config): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @mfpierre to verify
coredns/setup.py
Outdated
keywords='datadog agent check', | ||
url='https://github.com/DataDog/integrations-core', | ||
author='CoreDNS', | ||
author_email='shray@namely.com', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
author='Datadog',
author_email='packages@datadoghq.com',
coredns/setup.py
Outdated
|
||
# Extra files to ship with the wheel package | ||
package_data={'datadog_checks.coredns': ['conf.yaml.example']}, | ||
include_package_data=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# The package we're going to ship
packages=['datadog_checks', 'datadog_checks.coredns'],
# Run-time dependencies
install_requires=[CHECKS_BASE_REQ],
# Extra files to ship with the wheel package
include_package_data=True,
coredns/tests/test_check.py
Outdated
|
||
|
||
@pytest.fixture(scope="session") | ||
def spin_up_coredns(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please put all these fixtures in a conftest.py
? At that point they will be globally available.
coredns/tests/test_check.py
Outdated
res = requests.get(URL) | ||
# create some metrics by using dig | ||
subprocess.check_call(DIG_ARGS, env=env) | ||
res.raise_for_status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
res.raise_for_status()
coredns/tests/test_check.py
Outdated
|
||
compose_file = os.path.join(HERE, 'docker', 'docker-compose.yml') | ||
env = os.environ | ||
env['COREDNS_CONFIG_FOLDER'] = CONFIG_FOLDER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
env = {'COREDNS_CONFIG_FOLDER': CONFIG_FOLDER}
coredns/tests/test_check.py
Outdated
with open(mesh_file_path, 'rb') as f: | ||
text_data = f.read() | ||
|
||
p = mock.patch('requests.get', return_value=MockResponse(text_data, 'text/plain; version=0.0.4'), __name__='get') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with mock.patch(...):
yield
|
||
instances: | ||
# url of the metrics endpoint of prometheus | ||
- prometheus_endpoint: http://localhost:9153/metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please use prometheus_url
instead of prometheus_endpoint
?
""" | ||
Set up coredns instance so it can be used in OpenMetricsBaseCheck | ||
""" | ||
endpoint = instance.get('prometheus_endpoint') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also be prometheus_url
.
""" | ||
endpoint = instance.get('prometheus_endpoint') | ||
if endpoint is None: | ||
raise CheckException("Unable to find prometheus endpoint in config file.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we can just change this to prometheus_url
.
super(CoreDNSCheck, self).__init__(name, init_config, agentConfig, instances=generic_instances) | ||
|
||
def check(self, instance): | ||
endpoint = instance.get('prometheus_endpoint') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here as well prometheus_url
.
# Primarily, metrics are emitted by the prometheus plugin: https://coredns.io/plugins/metrics/ | ||
# Note: the count metrics were moved to specific functions | ||
# below to be submitted as both gauges and monotonic_counts | ||
# 'coredns_dns_response_rcode_count_total': self.coredns_dns_response_rcode_count_total, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're not using the transformer functions, we can just get rid of this dictionary.
As per #2091 (comment).
def check(self, instance): | ||
endpoint = instance.get('prometheus_endpoint') | ||
scraper_config = self.config_map[endpoint] | ||
self.process(scraper_config, metric_transformers=self.METRIC_TRANSFORMERS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then we can also get rid of the metric_transformers
named argument.
instances: | ||
# To enable CoreDNS metrics you must specify the prometheus url and enable the plugin within coredns | ||
# See: https://coredns.io/plugins/metrics/ | ||
- prometheus_endpoint: "http://%%host%%:9153/metrics" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this should also be prometheus_url
.
coredns/README.md
Outdated
annotations: | ||
ad.datadoghq.com/coredns.check_names: '["coredns"]' | ||
ad.datadoghq.com/coredns.init_configs: '[{}]' | ||
ad.datadoghq.com/coredns.instances: '[[{"prometheus_endpoint":"http://%%host%%:9153/metrics", "tags":["dns-pod:%%host%%"]}]]' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this one, too :)
coredns/tests/test_check.py
Outdated
from datadog_checks.utils.common import get_docker_hostname | ||
|
||
instance = { | ||
'prometheus_endpoint': 'http://localhost:9153/metrics', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's fix the tests as well to use prometheus_url
.
coredns/tests/test_check.py
Outdated
@pytest.fixture | ||
def dockerinstance(): | ||
return { | ||
'prometheus_endpoint': URL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And let's also fix the docker instance config to use prometheus_url
.
coredns/tests/test_check.py
Outdated
NAMESPACE + '.proxy_request_duration.seconds.sum', | ||
NAMESPACE + '.cache_size.count', | ||
] | ||
COUNT_METRICS = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should remove this empty list
bdbca94
to
ac8517f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks amazing! Thanks a lot!
greetings! just following up on this, is there anything else required or can we get this merged in? :) |
What does this PR do?
Adding an integration for CoreDNS -- CoreDNS emits metrics via Prometheus.
See sample events from plugins: here, here, and here.
Motivation
Kubernetes is moving from KubeDNS -> CoreDNS starting v1.11
Review checklist
no-changelog
label attachedAdditional Notes
First time making a contribution! Please let me know what I may have missed, I borrowed heavily from KubeDNS's integration.