-
Notifications
You must be signed in to change notification settings - Fork 16.7k
Conversation
Can a kubernetes member verify that this patch is reasonable to test? If so, please reply with "@k8s-bot ok to test" on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands will still work. Regular contributors should join the org to skip this step.
If you have questions or suggestions related to this bot's behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
|
Helm install failed with:
I dont think that is defined in the values.yaml |
|
||
To trail the logs for the Telegraf pod run the following: | ||
|
||
- kubectl logs -f --namespace {{ .Release.Namespace }} $(kubectl get pods --namespace {{ .Release.Namespace }} -l app={{ template "fullname" . }} -o jsonpath='{ .items[0].metadata.name }') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This command failed for me:
$ kubectl logs -f --namespace default $(kubectl get pods --namespace default -l app=honorary-peahe-telegraf -o jsonpath='{ .items[0].metadata.name }')
error executing jsonpath "{ .items[0].metadata.name }": array index out of bounds: index 0, length 0
error: expected 'logs POD_NAME [CONTAINER_NAME]'.
POD_NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.
|
||
- kubectl exec -i -t --namespace {{ .Release.Namespace }} $(kubectl get pods --namespace {{ .Release.Namespace }} -l app={{ template "fullname" . }} -o jsonpath='{.items[0].metadata.name}') /bin/sh | ||
|
||
To trail the logs for the Telegraf pod run the following: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same typo as other chart
outputs: | ||
influxdb: | ||
urls: | ||
- "http://influxdb-influxdb.tick:8086" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a required value I would leave it unset and not deploy the daemonset or other deployment until it gets set explicitly by the user. Check out the Ghost chart for an example of that pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In here? I'm not seeing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackzampolin take a look at https://github.com/kubernetes/charts/blob/master/stable/minecraft/templates/deployment.yaml#L1 for how the deployment is conditionally skipped, and https://github.com/kubernetes/charts/blob/master/stable/minecraft/templates/NOTES.txt#L1 for how this information is presented to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed! Thanks @prydonius
Should single and Daemonset both be enabled by default? When should you use one vs the other? Can the configs be unified? |
swap: | ||
system: | ||
kubernetes: | ||
url: "http://$NODE_IP:10255" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This didn't work for me.
Is this looking for the heapster/cadvisor port? Where are you expecting these to be evaluated? You may need to do this swap with an init container as the pod starts up. Example in the Jenkins chart here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can reference ENV
in the telegraf configuration file. That would get evaluated when telegraf starts up. I haven't had an issue with this approach before. Are you seeing:
2016-12-05T20:56:40Z E! ERROR in input [inputs.kubernetes]: Errors encountered: [error making HTTP request to http://$NODE_IP:10255/stats/summary: dial tcp [::1]:10255: getsockopt: connection refused]```
The daemonset is using the pod name as the hostname for the daemonset when it should use the hostname of the node. Otherwise every pod restart causes a new "node" in telegraf/influxdb parlance which makes things confusing. From what I can tell telegraf is monitoring the node not the pod but I could be wrong. |
The telegraf daemonset will pull any host level metrics, cpu, mem, disk, docker. Also the For the services or applications running on the cluster they can be polled by or push data to the telegraf single instance. The configuration for the individual by default only polls an InfluxDB instance (for monitoring the database), uses the prometheus plugin to poll the Kubernetes API server, and exposes a The configs can't really be unified, however we might be able to leave off the daemonset configuration. I would like to leave it so that adding support for other host-level plugins is easy.
I think I fixed this last week. |
I've gone ahead and made all the requested changes. @viglesiasce @prydonius anything else you need from me here? |
@k8s-bot ok to test |
Jenkins Charts e2e failed for commit 7194576. Full PR test history. The magic incantation to run this job again is |
@jackzampolin this no longer installs on the first pass. You need to leave (at least) the service as installing. Just leave off the deployment and daemonset. |
Still seeing the DS coming up with the pod IP name in Chronograf, it should be the node name. |
In chronograf im still not seeing anything in the Kubernetes dashboard. How does that get populated? |
Done! Should just deploy the service by default with nothing backing it unless the
🤦 sorry about that! That is also related to the kubernetes dashboard not showing up. I've pushed the fix. |
Can you add a note to the NOTES.txt that indicates that Telegraf is not yet properly deployed and point them at how they can rectify it? Check out the Ghost chart for an example. |
This worked for me. In a follow up PR can you add the syntax for the upgrading the infludb values? Here is what worked for me (since the value is a list): helm upgrade romping-zebra stable/telegraf --set daemonset.config.outputs.influxdb.urls={http://quelling-quetzal-influxd:8086} |
Signed CLA and fulfilled all the requirements. Any feedback is more than welcome!