forked from influxdata/telegraf
Add input plugin for DC/OS (influxdata#3519)
1 parent 27e9520, commit 86af1fa
Showing 8 changed files with 1,723 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,209 @@ | ||
# DC/OS Input Plugin | ||
|
||
This input plugin gathers metrics from a DC/OS cluster's [metrics component](https://docs.mesosphere.com/1.10/metrics/). | ||
|
||
**Series Cardinality Warning** | ||
|
||
Depending on the workload of your DC/OS cluster, this plugin can quickly | ||
create a high number of series which, if left unchecked, can cause high load on | ||
your database. | ||
|
||
- Use [measurement filtering](https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md#measurement-filtering) liberally to exclude unneeded metrics, along with the node, container, and app include/exclude options. | ||
- Write to a database with an appropriate [retention policy](https://docs.influxdata.com/influxdb/v1.3/concepts/glossary/#retention-policy-rp). | ||
- Limit the number of series allowed in your database using the `max-series-per-database` and `max-values-per-tag` settings. | ||
- Consider enabling the [TSI](https://docs.influxdata.com/influxdb/v1.3/about_the_project/releasenotes-changelog/#release-notes-8) engine. | ||
- Monitor your [series cardinality](https://docs.influxdata.com/influxdb/v1.3/troubleshooting/frequently-asked-questions/#how-can-i-query-for-series-cardinality). | ||
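
As a rough illustration of the first point, the sketch below narrows the plugin's output with Telegraf's generic per-plugin filters. `namepass` and `fieldpass` are the standard options described in the linked measurement filtering guide; the particular measurements and fields kept here are an arbitrary selection from the Metrics section of this README, chosen only for illustration and not as a recommendation.

```toml
[[inputs.dcos]]
  cluster_url = "https://dcos-master-1"
  service_account_id = "telegraf"
  service_account_private_key = "/etc/telegraf/telegraf-sa-key.pem"

  ## Illustrative choice: keep node and container measurements, drop dcos_app.
  namepass = ["dcos_node", "dcos_container"]

  ## Illustrative choice: keep only a handful of fields. The names are taken
  ## from the field lists in the Metrics section.
  fieldpass = [
    "cpu_total",
    "load_1min",
    "memory_free_bytes",
    "cpus_user_time",
    "mem_total_bytes",
  ]
```

Every field that is dropped this way is one fewer series per tag combination, so trimming fields and measurements early is usually cheaper than cleaning up the database afterwards.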
|
||
### Configuration: | ||
```toml | ||
[[inputs.dcos]] | ||
## The DC/OS cluster URL. | ||
cluster_url = "https://dcos-master-1" | ||
|
||
## The ID of the service account. | ||
service_account_id = "telegraf" | ||
## The private key file for the service account. | ||
service_account_private_key = "/etc/telegraf/telegraf-sa-key.pem" | ||
|
||
## Path containing login token. If set, will read on every gather. | ||
# token_file = "/home/dcos/.dcos/token" | ||
|
||
## In all filter options if both include and exclude are empty all items | ||
## will be collected. Arrays may contain glob patterns. | ||
## | ||
## Node IDs to collect metrics from. If a node is excluded, no metrics will | ||
## be collected for its containers or apps. | ||
# node_include = [] | ||
# node_exclude = [] | ||
## Container IDs to collect container metrics from. | ||
# container_include = [] | ||
# container_exclude = [] | ||
## Container IDs to collect app metrics from. | ||
# app_include = [] | ||
# app_exclude = [] | ||
|
||
## Maximum concurrent connections to the cluster. | ||
# max_connections = 10 | ||
## Maximum time to receive a response from cluster. | ||
# response_timeout = "20s" | ||
|
||
## Optional SSL Config | ||
# ssl_ca = "/etc/telegraf/ca.pem" | ||
# ssl_cert = "/etc/telegraf/cert.pem" | ||
# ssl_key = "/etc/telegraf/key.pem" | ||
## If true, skip chain & host verification | ||
# insecure_skip_verify = true | ||
|
||
## Recommended filtering to reduce series cardinality. | ||
# [inputs.dcos.tagdrop] | ||
# path = ["/var/lib/mesos/slave/slaves/*"] | ||
``` | ||
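
The include/exclude options accept glob patterns, so a few entries can cover many IDs. The following is a minimal sketch, assuming the usual Telegraf behaviour that an item is collected when it matches the include list (or the include list is empty) and does not match the exclude list. The node ID globs are hypothetical placeholders; the container ID prefix is borrowed from the Example Output section purely for illustration.

```toml
[[inputs.dcos]]
  cluster_url = "https://dcos-master-1"
  service_account_id = "telegraf"
  service_account_private_key = "/etc/telegraf/telegraf-sa-key.pem"

  ## Hypothetical node ID globs: collect only from nodes whose ID ends in
  ## "-S1" or "-S2". Containers and apps on all other nodes are skipped too.
  node_include = ["*-S1", "*-S2"]

  ## Skip container metrics for one container (ID prefix from the example
  ## output; replace with your own IDs).
  container_exclude = ["cbf19b77-*"]

  ## Skip app metrics entirely.
  app_exclude = ["*"]
```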
|
||
#### Enterprise Authentication | ||
|
||
When using Enterprise DC/OS, it is recommended to use a service account to | ||
authenticate with the cluster. | ||
|
||
The plugin requires the following permissions: | ||
``` | ||
dcos:adminrouter:ops:system-metrics full | ||
dcos:adminrouter:ops:mesos full | ||
``` | ||
|
||
Follow the directions to [create a service account and assign permissions](https://docs.mesosphere.com/1.10/security/service-auth/custom-service-auth/). | ||
|
||
Quick configuration using the Enterprise CLI: | ||
``` | ||
dcos security org service-accounts keypair telegraf-sa-key.pem telegraf-sa-cert.pem | ||
dcos security org service-accounts create -p telegraf-sa-cert.pem -d "Telegraf DC/OS input plugin" telegraf | ||
dcos security org users grant telegraf dcos:adminrouter:ops:system-metrics full | ||
dcos security org users grant telegraf dcos:adminrouter:ops:mesos full | ||
``` | ||
|
||
#### Open Source Authentication | ||
|
||
The Open Source DC/OS does not provide service accounts. Instead, you can use | ||
one of the following options: | ||
|
||
1. [Disable authentication](https://dcos.io/docs/1.10/security/managing-authentication/#authentication-opt-out) | ||
2. Use the `token_file` parameter to read an authentication token from a file. | ||
|
||
The `token_file` can then be populated by using the [dcos cli] to log in periodically. | ||
A CLI login is valid for at most XXX days, so you will need to ensure the CLI | ||
performs a new login before this time expires. | ||
``` | ||
dcos auth login --username foo --password bar | ||
dcos config show core.dcos_acs_token > ~/.dcos/token | ||
``` | ||
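
For reference, a token-based setup might look like the sketch below. It assumes that the service account options can simply be left out when `token_file` is set, and that the Telegraf process is allowed to read the file; the path is the one used in the sample configuration.

```toml
[[inputs.dcos]]
  cluster_url = "https://dcos-master-1"

  ## Token written by the dcos CLI; the file is re-read on every gather,
  ## so a refreshed token is picked up without restarting Telegraf.
  token_file = "/home/dcos/.dcos/token"
```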
|
||
Another option to create a `token_file` is to generate a token using the | ||
cluster secret. This allows you to set the expiration date manually or | ||
even create a never-expiring token. However, if the cluster secret or the | ||
token is compromised, it cannot be revoked and may require a full reinstall of | ||
the cluster. For more information on this technique, see | ||
[this blog post](https://medium.com/@richardgirges/authenticating-open-source-dc-os-with-third-party-services-125fa33a5add). | ||
|
||
### Metrics: | ||
|
||
Please consult the [Metrics Reference](https://docs.mesosphere.com/1.10/metrics/reference/) | ||
for details on field interpretation. | ||
|
||
- dcos_node | ||
  - tags: | ||
    - cluster | ||
    - hostname | ||
    - path (filesystem fields only) | ||
    - interface (network fields only) | ||
  - fields: | ||
    - system_uptime (float) | ||
    - cpu_cores (float) | ||
    - cpu_total (float) | ||
    - cpu_user (float) | ||
    - cpu_system (float) | ||
    - cpu_idle (float) | ||
    - cpu_wait (float) | ||
    - load_1min (float) | ||
    - load_5min (float) | ||
    - load_15min (float) | ||
    - filesystem_capacity_total_bytes (int) | ||
    - filesystem_capacity_used_bytes (int) | ||
    - filesystem_capacity_free_bytes (int) | ||
    - filesystem_inode_total (float) | ||
    - filesystem_inode_used (float) | ||
    - filesystem_inode_free (float) | ||
    - memory_total_bytes (int) | ||
    - memory_free_bytes (int) | ||
    - memory_buffers_bytes (int) | ||
    - memory_cached_bytes (int) | ||
    - swap_total_bytes (int) | ||
    - swap_free_bytes (int) | ||
    - swap_used_bytes (int) | ||
    - network_in_bytes (int) | ||
    - network_out_bytes (int) | ||
    - network_in_packets (float) | ||
    - network_out_packets (float) | ||
    - network_in_dropped (float) | ||
    - network_out_dropped (float) | ||
    - network_in_errors (float) | ||
    - network_out_errors (float) | ||
    - process_count (float) | ||
|
||
- dcos_container | ||
  - tags: | ||
    - cluster | ||
    - hostname | ||
    - container_id | ||
    - task_name | ||
  - fields: | ||
    - cpus_limit (float) | ||
    - cpus_system_time (float) | ||
    - cpus_throttled_time (float) | ||
    - cpus_user_time (float) | ||
    - disk_limit_bytes (int) | ||
    - disk_used_bytes (int) | ||
    - mem_limit_bytes (int) | ||
    - mem_total_bytes (int) | ||
    - net_rx_bytes (int) | ||
    - net_rx_dropped (float) | ||
    - net_rx_errors (float) | ||
    - net_rx_packets (float) | ||
    - net_tx_bytes (int) | ||
    - net_tx_dropped (float) | ||
    - net_tx_errors (float) | ||
    - net_tx_packets (float) | ||
|
||
- dcos_app | ||
  - tags: | ||
    - cluster | ||
    - hostname | ||
    - container_id | ||
    - task_name | ||
  - fields: | ||
    - fields are application-specific | ||
|
||
### Example Output: | ||
|
||
``` | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/boot filesystem_capacity_free_bytes=918188032i,filesystem_capacity_total_bytes=1063256064i,filesystem_capacity_used_bytes=145068032i,filesystem_inode_free=523958,filesystem_inode_total=524288,filesystem_inode_used=330 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=dummy0 network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=docker0 network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18 cpu_cores=2,cpu_idle=81.62,cpu_system=4.19,cpu_total=13.670000000000002,cpu_user=9.48,cpu_wait=0,load_15min=0.7,load_1min=0.22,load_5min=0.6,memory_buffers_bytes=970752i,memory_cached_bytes=1830473728i,memory_free_bytes=1178636288i,memory_total_bytes=3975073792i,process_count=198,swap_free_bytes=859828224i,swap_total_bytes=859828224i,swap_used_bytes=0i,system_uptime=18874 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=lo network_in_bytes=1090992450i,network_in_dropped=0,network_in_errors=0,network_in_packets=1546938,network_out_bytes=1090992450i,network_out_dropped=0,network_out_errors=0,network_out_packets=1546938 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/ filesystem_capacity_free_bytes=1668378624i,filesystem_capacity_total_bytes=6641680384i,filesystem_capacity_used_bytes=4973301760i,filesystem_inode_free=3107856,filesystem_inode_total=3248128,filesystem_inode_used=140272 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=minuteman network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=210i,network_out_dropped=0,network_out_errors=0,network_out_packets=3 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=eth0 network_in_bytes=539886216i,network_in_dropped=1,network_in_errors=0,network_in_packets=979808,network_out_bytes=112395836i,network_out_dropped=0,network_out_errors=0,network_out_packets=891239 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=spartan network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=210i,network_out_dropped=0,network_out_errors=0,network_out_packets=3 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/var/lib/docker/overlay filesystem_capacity_free_bytes=1668378624i,filesystem_capacity_total_bytes=6641680384i,filesystem_capacity_used_bytes=4973301760i,filesystem_inode_free=3107856,filesystem_inode_total=3248128,filesystem_inode_used=140272 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=vtep1024 network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,path=/var/lib/docker/plugins filesystem_capacity_free_bytes=1668378624i,filesystem_capacity_total_bytes=6641680384i,filesystem_capacity_used_bytes=4973301760i,filesystem_inode_free=3107856,filesystem_inode_total=3248128,filesystem_inode_used=140272 1511859222000000000 | ||
dcos_node,cluster=enterprise,hostname=192.168.122.18,interface=d-dcos network_in_bytes=0i,network_in_dropped=0,network_in_errors=0,network_in_packets=0,network_out_bytes=0i,network_out_dropped=0,network_out_errors=0,network_out_packets=0 1511859222000000000 | ||
dcos_app,cluster=enterprise,container_id=9a78d34a-3bbf-467e-81cf-a57737f154ee,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000 | ||
dcos_container,cluster=enterprise,container_id=cbf19b77-3b8d-4bcf-b81f-824b67279629,hostname=192.168.122.18 cpus_limit=0.3,cpus_system_time=307.31,cpus_throttled_time=102.029930607,cpus_user_time=268.57,disk_limit_bytes=268435456i,disk_used_bytes=30953472i,mem_limit_bytes=570425344i,mem_total_bytes=13316096i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000 | ||
dcos_app,cluster=enterprise,container_id=cbf19b77-3b8d-4bcf-b81f-824b67279629,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000 | ||
dcos_container,cluster=enterprise,container_id=5725e219-f66e-40a8-b3ab-519d85f4c4dc,hostname=192.168.122.18,task_name=hello-world cpus_limit=0.6,cpus_system_time=25.6,cpus_throttled_time=327.977109217,cpus_user_time=566.54,disk_limit_bytes=0i,disk_used_bytes=0i,mem_limit_bytes=1107296256i,mem_total_bytes=335941632i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000 | ||
dcos_app,cluster=enterprise,container_id=5725e219-f66e-40a8-b3ab-519d85f4c4dc,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000 | ||
dcos_app,cluster=enterprise,container_id=c76e1488-4fb7-4010-a4cf-25725f8173f9,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000 | ||
dcos_container,cluster=enterprise,container_id=cbe0b2f9-061f-44ac-8f15-4844229e8231,hostname=192.168.122.18,task_name=telegraf cpus_limit=0.2,cpus_system_time=8.109999999,cpus_throttled_time=93.183916045,cpus_user_time=17.97,disk_limit_bytes=0i,disk_used_bytes=0i,mem_limit_bytes=167772160i,mem_total_bytes=0i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000 | ||
dcos_container,cluster=enterprise,container_id=b64115de-3d2a-431d-a805-76e7c46453f1,hostname=192.168.122.18 cpus_limit=0.2,cpus_system_time=2.69,cpus_throttled_time=20.064861214,cpus_user_time=6.56,disk_limit_bytes=268435456i,disk_used_bytes=29360128i,mem_limit_bytes=297795584i,mem_total_bytes=13733888i,net_rx_bytes=0i,net_rx_dropped=0,net_rx_errors=0,net_rx_packets=0,net_tx_bytes=0i,net_tx_dropped=0,net_tx_errors=0,net_tx_packets=0 1511859222000000000 | ||
dcos_app,cluster=enterprise,container_id=b64115de-3d2a-431d-a805-76e7c46453f1,hostname=192.168.122.18 container_received_bytes_per_sec=0,container_throttled_bytes_per_sec=0 1511859222000000000 | ||
``` |