Add Telegraf Chart #269

jackzampolin · 2016-12-01T15:35:00Z

Signed CLA and fulfilled all the requirements. Any feedback is more than welcome!

k8s-ci-robot · 2016-12-01T15:35:03Z

Can a kubernetes member verify that this patch is reasonable to test? If so, please reply with "@k8s-bot ok to test" on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands will still work. Regular contributors should join the org to skip this step.

If you have questions or suggestions related to this bot's behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

viglesiasce · 2016-12-02T06:30:32Z

Helm install failed with:

helm install .
Error: render error in "telegraf/templates/NOTES.txt": template: telegraf/templates/NOTES.txt:1:61: executing "telegraf/templates/NOTES.txt" at <.Values.inputs.stats...>: can't evaluate field statsd in type interface {}

I don't think that is defined in values.yaml.

viglesiasce · 2016-12-05T19:43:15Z

stable/telegraf/templates/NOTES.txt

+
+To trail the logs for the Telegraf pod run the following:
+
+- kubectl logs -f --namespace {{ .Release.Namespace }} $(kubectl get pods --namespace {{ .Release.Namespace }} -l app={{ template "fullname" . }} -o jsonpath='{ .items[0].metadata.name }')


This command failed for me:

$ kubectl logs -f --namespace default $(kubectl get pods --namespace default -l app=honorary-peahe-telegraf -o jsonpath='{ .items[0].metadata.name }')
error executing jsonpath "{ .items[0].metadata.name }": array index out of bounds: index 0, length 0
error: expected 'logs POD_NAME [CONTAINER_NAME]'.
POD_NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.
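
The error output indicates that the label selector matched no pods, so the jsonpath index `[0]` was out of bounds and `kubectl logs` received an empty pod name. Below is a minimal sketch of a more defensive lookup, not the chart's actual NOTES.txt. The helper name `first_pod` is hypothetical; it relies only on `kubectl get pods -o name`, which prints one `pod/<name>` per line and nothing when the list is empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): print the first pod that
# matches a label selector, or fail with a readable message when the
# selector matches nothing.
first_pod() {
  namespace="$1"
  selector="$2"
  # "-o name" yields "pod/<name>" lines and no output for an empty list,
  # so there is no jsonpath index that can go out of bounds.
  pod="$(kubectl get pods --namespace "$namespace" -l "$selector" -o name 2>/dev/null | head -n 1)"
  if [ -z "$pod" ]; then
    echo "no pods match '$selector' in namespace '$namespace'" >&2
    return 1
  fi
  echo "$pod"
}

# Usage, with the release name from the failing command above:
# pod="$(first_pod default app=honorary-peahe-telegraf)" && kubectl logs -f --namespace default "$pod"
```

With this shape, a wrong label in the template shows up as "no pods match ..." rather than as a jsonpath error followed by a `kubectl logs` usage message.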

viglesiasce · 2016-12-05T19:43:24Z

stable/telegraf/templates/NOTES.txt

+
+- kubectl exec -i -t --namespace {{ .Release.Namespace }} $(kubectl get pods --namespace {{ .Release.Namespace }} -l app={{ template "fullname" . }} -o jsonpath='{.items[0].metadata.name}') /bin/sh
+
+To trail the logs for the Telegraf pod run the following:


Same typo as other chart

viglesiasce · 2016-12-05T19:47:12Z

stable/telegraf/values.yaml

+    outputs:
+      influxdb:
+        urls:
+          - "http://influxdb-influxdb.tick:8086"


Since this is a required value I would leave it unset and not deploy the daemonset or other deployment until it gets set explicitly by the user. Check out the Ghost chart for an example of that pattern.

jackzampolin · 2016-12-05

In here? I'm not seeing that.

prydonius · 2016-12-07

@jackzampolin take a look at https://github.com/kubernetes/charts/blob/master/stable/minecraft/templates/deployment.yaml#L1 for how the deployment is conditionally skipped, and https://github.com/kubernetes/charts/blob/master/stable/minecraft/templates/NOTES.txt#L1 for how this information is presented to the user.

jackzampolin · 2016-12-08

Fixed! Thanks @prydonius

viglesiasce · 2016-12-05T19:52:37Z

Should single and Daemonset both be enabled by default? When should you use one vs the other? Can the configs be unified?

viglesiasce · 2016-12-05T20:01:55Z

stable/telegraf/values.yaml

+      swap:
+      system:
+      kubernetes:
+        url: "http://$NODE_IP:10255"


This didn't work for me.

Is this looking for the heapster/cadvisor port? Where are you expecting these to be evaluated? You may need to do this swap with an init container as the pod starts up. Example in the Jenkins chart here.

jackzampolin · 2016-12-05

You can reference environment variables in the Telegraf configuration file. They get evaluated when Telegraf starts up. I haven't had an issue with this approach before. Are you seeing:

2016-12-05T20:56:40Z E! ERROR in input [inputs.kubernetes]: Errors encountered: [error making HTTP request to http://$NODE_IP:10255/stats/summary: dial tcp [::1]:10255: getsockopt: connection refused]

viglesiasce · 2016-12-05T20:05:00Z

The daemonset is using the pod name as the hostname for the daemonset when it should use the hostname of the node. Otherwise every pod restart causes a new "node" in telegraf/influxdb parlance which makes things confusing. From what I can tell telegraf is monitoring the node not the pod but I could be wrong.

jackzampolin · 2016-12-05T21:14:02Z

> Should single and Daemonset both be enabled by default? When should you use one vs the other? Can the configs be unified?

The Telegraf daemonset pulls host-level metrics such as CPU, memory, disk, and Docker. The Kubernetes Telegraf plugin also works by polling each kubelet, so it needs to run on every host.

Services or applications running on the cluster can be polled by, or push data to, the single Telegraf instance. By default, the single instance's configuration only polls an InfluxDB instance (for monitoring the database), uses the prometheus plugin to poll the Kubernetes API server, and exposes a statsd endpoint.

The configs can't really be unified, but we might be able to leave off the daemonset configuration. I would like to keep it so that adding support for other host-level plugins is easy.

> The daemonset is using the pod name as the hostname for the daemonset when it should use the hostname of the node. Otherwise every pod restart causes a new "node" in telegraf/influxdb parlance which makes things confusing. From what I can tell telegraf is monitoring the node not the pod but I could be wrong.

I think I fixed this last week.

jackzampolin · 2016-12-08T17:16:01Z

I've gone ahead and made all the requested changes. @viglesiasce @prydonius anything else you need from me here?

viglesiasce · 2016-12-09T19:00:06Z

@k8s-bot ok to test

k8s-ci-robot · 2016-12-09T19:01:20Z

Jenkins Charts e2e failed for commit 7194576. Full PR test history.

The magic incantation to run this job again is @k8s-bot e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

viglesiasce · 2016-12-09T19:03:02Z

@jackzampolin this no longer installs on the first pass. You need to leave (at least) the service as installing. Just leave off the deployment and daemonset.

viglesiasce · 2016-12-09T19:09:13Z

Still seeing the DS coming up with the pod IP name in Chronograf, it should be the node name.

viglesiasce · 2016-12-09T19:09:49Z

In Chronograf I'm still not seeing anything in the Kubernetes dashboard. How does that get populated?

jackzampolin · 2016-12-13T21:21:53Z

@viglesiasce:

> @jackzampolin this no longer installs on the first pass. You need to leave (at least) the service as installing. Just leave off the deployment and daemonset.

Done! By default the chart now deploys only the service, with nothing backing it, unless single.config.outputs.influxdb.urls[0] (for the single instance) or daemonset.config.outputs.influxdb.urls[0] (for the daemonset) is set to an InfluxDB server.

> Still seeing the DS coming up with the pod IP name in Chronograf, it should be the node name.

🤦 sorry about that! That is also related to the kubernetes dashboard not showing up. I've pushed the fix.

viglesiasce · 2016-12-14T23:45:34Z

Can you add a note to the NOTES.txt that indicates that Telegraf is not yet properly deployed and point them at how they can rectify it? Check out the Ghost chart for an example.

viglesiasce · 2016-12-19T23:09:51Z

This worked for me. In a follow-up PR, can you add the syntax for upgrading the InfluxDB values?

Here is what worked for me (since the value is a list):

helm upgrade romping-zebra stable/telegraf --set daemonset.config.outputs.influxdb.urls={http://quelling-quetzal-influxd:8086}
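
The braces in the command above are Helm's `--set` syntax for a list value (`{a,b,c}`). A shell such as bash leaves a single-element `{x}` alone but would brace-expand an unquoted `{a,b}`, so quoting the argument is the safer habit. The sketch below builds that argument from one or more URLs. The helper name `influxdb_urls_value` is hypothetical, and it assumes the URLs contain no commas, which `--set` would otherwise require escaping.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): build a Helm "--set" list
# value of the form key={url1,url2,...} from a key and one or more URLs.
influxdb_urls_value() {
  key="$1"
  shift
  urls=""
  for url in "$@"; do
    if [ -z "$urls" ]; then
      urls="$url"
    else
      urls="$urls,$url"
    fi
  done
  printf '%s={%s}\n' "$key" "$urls"
}

# Usage, with the release and URL from the command above. The double quotes
# keep the shell from brace-expanding the value:
# helm upgrade romping-zebra stable/telegraf \
#   --set "$(influxdb_urls_value daemonset.config.outputs.influxdb.urls http://quelling-quetzal-influxd:8086)"
```

The same helper would work for the single-instance key, single.config.outputs.influxdb.urls, mentioned earlier in the thread.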

Add telegraf plugin

b31fc32

k8s-ci-robot added the cncf-cla: yes label Dec 1, 2016

jackzampolin added 2 commits December 1, 2016 07:37

Add NOTES.txt

f091a62

Update README.md

99ecbbf

lachie83 added the awaiting review label Dec 2, 2016

viglesiasce added changes needed and removed awaiting review labels Dec 2, 2016

jackzampolin added 3 commits December 2, 2016 08:00

Fix NOTES.txt errors

7605e38

Fix hostname change on redeploy

67b4471

NodePort -> ClusterIP and remove values.yaml from README.md'

1b73ccf

viglesiasce suggested changes Dec 5, 2016

viglesiasce reviewed Dec 5, 2016

jackzampolin added 2 commits December 5, 2016 13:15

Fix NOTES.txt

bf7ecdb

Make deployment not work unless influxdb url is set

7194576

jackzampolin added 2 commits December 13, 2016 13:13

Fix kubenetes plugin and deploy only service by default

5699868

Add hostname for telegraf-ds

db8acdf

Fix service names

41ab2c3

viglesiasce added awaiting review and removed changes needed labels Dec 14, 2016

viglesiasce added changes needed and removed awaiting review labels Dec 15, 2016

Add note if .Values.daemonset.config.outputs.influxdb.url isn't set

cc7682d

viglesiasce added awaiting review and removed changes needed labels Dec 19, 2016

viglesiasce approved these changes Dec 19, 2016

viglesiasce added code reviewed, lgtm, UX reviewed and removed awaiting review labels Dec 19, 2016

prydonius merged commit bcbd8cd into helm:master Dec 19, 2016
