Kubernetes system data not working? #401

Closed
@fr0der1c

Description

Describe the bug: I upgraded my Python agent to 4.1.0, but I am still not seeing Kubernetes-related info in APM. When I click "view pod APM traces" in the Infra panel, I am redirected to a page telling me "No services were found".

Expected behaviour: Kubernetes-related information should appear in APM.

Environment (please complete the following information)

  • OS: Linux
  • Python version: 3.7.1
  • Framework and version: Flask
  • APM Server version: 6.6.0
  • Agent version: 4.1.0

Additional information

I read the PR that introduced the Kubernetes data (https://github.com/elastic/apm-agent-python/pull/352/files#diff-a16d47fe7d0f7cec237390038b9cd86bR52) and found the related code:

    def get_system_info(self):
        system_data = {
            "hostname": keyword_field(socket.gethostname()),
            "architecture": platform.machine(),
            "platform": platform.system().lower(),
        }
        system_data.update(cgroup.get_cgroup_container_metadata())
        pod_name = os.environ.get("KUBERNETES_POD_NAME") or system_data["hostname"]
        changed = False
        if "kubernetes" in system_data:
            k8s = system_data["kubernetes"]
            k8s["pod"]["name"] = pod_name
        else:
            k8s = {"pod": {"name": pod_name}}
        # get kubernetes metadata from environment
        if "KUBERNETES_NODE_NAME" in os.environ:
            k8s["node"] = {"name": os.environ["KUBERNETES_NODE_NAME"]}
            changed = True
        if "KUBERNETES_NAMESPACE" in os.environ:
            k8s["namespace"] = os.environ["KUBERNETES_NAMESPACE"]
            changed = True
        if "KUBERNETES_POD_UID" in os.environ:
            # this takes precedence over any value from /proc/self/cgroup
            k8s["pod"]["uid"] = os.environ["KUBERNETES_POD_UID"]
            changed = True
        if changed:
            system_data["kubernetes"] = k8s
        return system_data

According to the docstring of cgroup.get_cgroup_container_metadata, it returns a dict containing the keys "container" and "pod". So there should never be a "kubernetes" key in system_data at this point. Why check if "kubernetes" in system_data at all?

If I'm right, environment variables like KUBERNETES_NODE_NAME are not set by Kubernetes itself; they have to be set manually using fieldRef. So by default changed is always False, and "kubernetes" never becomes a key in system_data. (I guess we have to set these environment variables in the Kubernetes deployment YAML to get this information in APM, and the documentation is lagging a bit behind the code?)
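For reference, here is a minimal sketch of the env entries I believe would be needed. It builds the list as a Python structure that can be dumped into a manifest; the same entries can be written directly in the deployment YAML. The helper name downward_api_env is hypothetical, and the pairing of each variable with a Downward API field path is my assumption based on the variable names, not something the agent documentation confirms.

```python
import json

# Environment variables read by get_system_info(), paired with the
# Kubernetes Downward API field path assumed to supply each value.
DOWNWARD_API_FIELDS = {
    "KUBERNETES_NODE_NAME": "spec.nodeName",
    "KUBERNETES_POD_NAME": "metadata.name",
    "KUBERNETES_NAMESPACE": "metadata.namespace",
    "KUBERNETES_POD_UID": "metadata.uid",
}


def downward_api_env(fields=DOWNWARD_API_FIELDS):
    """Build the `env` list for a container spec (hypothetical helper)."""
    return [
        {"name": name, "valueFrom": {"fieldRef": {"fieldPath": path}}}
        for name, path in fields.items()
    ]


if __name__ == "__main__":
    # JSON is valid YAML, so this output can be pasted under
    # spec.template.spec.containers[].env in a deployment manifest.
    print(json.dumps(downward_api_env(), indent=2))
```

With these set, changed would become True and the "kubernetes" key would be written.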

Also, in theory the pod name is always available (since the hostname is always available) and could always be saved to the system information. But according to the code, if none of the other environment variables is set, pod_name is stored only in k8s, and k8s is then thrown away because changed is False.
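The control flow can be reproduced outside the agent with a simplified stand-in. This is only a sketch: build_system_info is a hypothetical function that copies the branching of get_system_info but takes the hostname, cgroup metadata and environment as explicit arguments, and the sample values are shortened placeholders.

```python
def build_system_info(hostname, cgroup_metadata, environ):
    """Simplified stand-in for get_system_info() with explicit inputs."""
    system_data = {"hostname": hostname}
    system_data.update(cgroup_metadata)
    pod_name = environ.get("KUBERNETES_POD_NAME") or system_data["hostname"]
    changed = False
    if "kubernetes" in system_data:
        k8s = system_data["kubernetes"]
        k8s["pod"]["name"] = pod_name
    else:
        k8s = {"pod": {"name": pod_name}}
    if "KUBERNETES_NODE_NAME" in environ:
        k8s["node"] = {"name": environ["KUBERNETES_NODE_NAME"]}
        changed = True
    if "KUBERNETES_NAMESPACE" in environ:
        k8s["namespace"] = environ["KUBERNETES_NAMESPACE"]
        changed = True
    if "KUBERNETES_POD_UID" in environ:
        k8s["pod"]["uid"] = environ["KUBERNETES_POD_UID"]
        changed = True
    # k8s (including the pod name) is only kept if an env var was found
    if changed:
        system_data["kubernetes"] = k8s
    return system_data


# cgroup metadata with "container" and "pod" keys, as the docstring describes
cgroup_metadata = {"container": {"id": "c7bc01b4"}, "pod": {"uid": "09b7c667"}}

# No Kubernetes env vars: the pod name is computed, then discarded
without_env = build_system_info("my-pod", dict(cgroup_metadata), {})

# One env var set: the "kubernetes" key appears, including the pod name
with_env = build_system_info(
    "my-pod", dict(cgroup_metadata), {"KUBERNETES_NAMESPACE": "default"}
)
```

In the first call the result has no "kubernetes" key at all. In the second it has one, but the pod UID from the cgroup data still sits under the separate top-level "pod" key.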

To show what get_system_info returns, I opened a Python shell in the pod:

>>> import elasticapm.base
>>> client = elasticapm.base.Client()
>>> print(client.get_system_info())
{'hostname': 'everyclass-api-server-74dc77cf45-2qdkw', 'architecture': 'x86_64', 'platform': 'linux', 'container': {'id': 'c7bc01b4eb320bf47fd3158ed2fe012fc1036469341810dd1a4d32ad08346e33'}, 'pod': {'uid': '09b7c667-2947-11e9-939a-0a587f8611e1'}}

I can see "docker.container.id" in Discover, but there is no pod ID in the APM indices. I guess this is why I get "No services were found" when I click "view pod APM traces" in the Infra panel. Is this caused by APM Server?
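If the pod data is expected under a "kubernetes" key (which the if "kubernetes" in system_data check suggests, though I have not confirmed it against APM Server), one possible shape for a fix is sketched below. nest_pod_metadata is a hypothetical helper, not part of the agent; it moves the top-level "pod" dict under "kubernetes" and always keeps the pod name.

```python
import copy


def nest_pod_metadata(system_data, pod_name=None):
    """Return a copy with top-level "pod" moved under "kubernetes".

    Hypothetical helper for illustration; not part of elasticapm.
    """
    result = copy.deepcopy(system_data)
    pod = result.pop("pod", None) or {}
    if not pod and not pod_name:
        return result
    k8s_pod = result.setdefault("kubernetes", {}).setdefault("pod", {})
    for key, value in pod.items():
        # values already under kubernetes.pod take precedence
        k8s_pod.setdefault(key, value)
    if pod_name:
        k8s_pod.setdefault("name", pod_name)
    return result


# The dict printed by client.get_system_info() in the pod
reported = {
    "hostname": "everyclass-api-server-74dc77cf45-2qdkw",
    "architecture": "x86_64",
    "platform": "linux",
    "container": {
        "id": "c7bc01b4eb320bf47fd3158ed2fe012fc1036469341810dd1a4d32ad08346e33"
    },
    "pod": {"uid": "09b7c667-2947-11e9-939a-0a587f8611e1"},
}

# Fall back to the hostname for the pod name, as get_system_info() does
fixed = nest_pod_metadata(reported, pod_name=reported["hostname"])
```

Here fixed carries both the pod UID and the pod name under fixed["kubernetes"]["pod"] without any environment variables being set, and reported itself is left unchanged.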
