Description
Describe the bug: I upgraded my Python agent to 4.1.0, but I still do not see Kubernetes-related info in APM. When I click "view pod APM traces" in the Infra panel, I am redirected to a page telling me "No services were found".
Expected behaviour: there should be Kubernetes-related information in APM.
Environment (please complete the following information)
- OS: Linux
- Python version: 3.7.1
- Framework and version: Flask
- APM Server version: 6.6.0
- Agent version: 4.1.0
Additional information
I read the PR that introduced the Kubernetes data (https://github.com/elastic/apm-agent-python/pull/352/files#diff-a16d47fe7d0f7cec237390038b9cd86bR52) and found the related code:
def get_system_info(self):
    system_data = {
        "hostname": keyword_field(socket.gethostname()),
        "architecture": platform.machine(),
        "platform": platform.system().lower(),
    }
    system_data.update(cgroup.get_cgroup_container_metadata())
    pod_name = os.environ.get("KUBERNETES_POD_NAME") or system_data["hostname"]
    changed = False
    if "kubernetes" in system_data:
        k8s = system_data["kubernetes"]
        k8s["pod"]["name"] = pod_name
    else:
        k8s = {"pod": {"name": pod_name}}
    # get kubernetes metadata from environment
    if "KUBERNETES_NODE_NAME" in os.environ:
        k8s["node"] = {"name": os.environ["KUBERNETES_NODE_NAME"]}
        changed = True
    if "KUBERNETES_NAMESPACE" in os.environ:
        k8s["namespace"] = os.environ["KUBERNETES_NAMESPACE"]
        changed = True
    if "KUBERNETES_POD_UID" in os.environ:
        # this takes precedence over any value from /proc/self/cgroup
        k8s["pod"]["uid"] = os.environ["KUBERNETES_POD_UID"]
        changed = True
    if changed:
        system_data["kubernetes"] = k8s
    return system_data
According to the documentation of cgroup.get_cgroup_container_metadata, it returns a dict containing "container" and "pod". So there should never be a "kubernetes" key in system_data at all. Why check if "kubernetes" in system_data?
If I'm right, environment variables like KUBERNETES_NODE_NAME are not set by Kubernetes itself; they have to be set manually using fieldRef. So changed is always False by default, and "kubernetes" is never a key in system_data. (I guess we have to set these environment variables in the Kubernetes deployment YAML to get this information in APM, but the documentation lags behind the code?)
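To check this from inside a pod, a small helper like the one below can list which of the variables read by get_system_info are actually present. This is only a sketch: report_k8s_env is a hypothetical name, not part of the agent, and the variable names are simply the ones that appear in the quoted code.

```python
import os

# Environment variables that the quoted get_system_info() looks up.
K8S_ENV_VARS = (
    "KUBERNETES_POD_NAME",
    "KUBERNETES_NODE_NAME",
    "KUBERNETES_NAMESPACE",
    "KUBERNETES_POD_UID",
)


def report_k8s_env(environ=os.environ):
    """Return a dict mapping each variable name to its value, or None if unset.

    Hypothetical helper for illustration; not part of elasticapm.
    """
    return {name: environ.get(name) for name in K8S_ENV_VARS}


if __name__ == "__main__":
    for name, value in report_k8s_env().items():
        print(name, "=", value)
```

If every value prints as None, none of the branches that set changed to True can run.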
Also, in theory you can always get the pod name (since you can always get the hostname) and save it in the system information. But according to the code, if none of the other environment variables are set, pod_name is only stored in k8s, and k8s is then thrown away because changed is False.
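The behaviour described above can be reproduced outside the agent with a stripped-down copy of the logic. This is a sketch under assumptions: build_system_info is a hypothetical stand-in that takes the hostname, the cgroup metadata, and the environment as arguments instead of reading them from the system, and the sample values are shortened placeholders shaped like the shell output in this report.

```python
def build_system_info(hostname, cgroup_metadata, environ):
    """Hypothetical mirror of the quoted get_system_info() with injected inputs."""
    system_data = {"hostname": hostname}
    system_data.update(cgroup_metadata)
    pod_name = environ.get("KUBERNETES_POD_NAME") or system_data["hostname"]
    changed = False
    if "kubernetes" in system_data:
        k8s = system_data["kubernetes"]
        k8s["pod"]["name"] = pod_name
    else:
        k8s = {"pod": {"name": pod_name}}
    if "KUBERNETES_NODE_NAME" in environ:
        k8s["node"] = {"name": environ["KUBERNETES_NODE_NAME"]}
        changed = True
    if "KUBERNETES_NAMESPACE" in environ:
        k8s["namespace"] = environ["KUBERNETES_NAMESPACE"]
        changed = True
    if "KUBERNETES_POD_UID" in environ:
        k8s["pod"]["uid"] = environ["KUBERNETES_POD_UID"]
        changed = True
    # k8s is only kept when at least one of the three variables above was set.
    if changed:
        system_data["kubernetes"] = k8s
    return system_data


# Placeholder values; only the shape ("container" and "pod" keys) matters here.
cgroup_metadata = {"container": {"id": "c7bc01b4"}, "pod": {"uid": "09b7c667"}}

without_env = build_system_info("api-server-pod", dict(cgroup_metadata), {})
with_env = build_system_info(
    "api-server-pod", dict(cgroup_metadata), {"KUBERNETES_NAMESPACE": "default"}
)

print("kubernetes" in without_env)  # False: the pod name is discarded
print(with_env["kubernetes"])
```

With an empty environment the result has no "kubernetes" key even though the pod name was computed; setting a single variable such as KUBERNETES_NAMESPACE is enough for the pod name to be kept.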
To show what this function returns, I opened a Python shell in the pod:
>>> import elasticapm.base
>>> client = elasticapm.base.Client()
>>> print(client.get_system_info())
{'hostname': 'everyclass-api-server-74dc77cf45-2qdkw', 'architecture': 'x86_64', 'platform': 'linux', 'container': {'id': 'c7bc01b4eb320bf47fd3158ed2fe012fc1036469341810dd1a4d32ad08346e33'}, 'pod': {'uid': '09b7c667-2947-11e9-939a-0a587f8611e1'}}
I can see "docker.container.id" in Discover, but there is no pod ID in the APM indices. I guess this is why I get "No services were found" when I click "view pod APM traces" in the Infra panel. Is this caused by APM Server?