Inconsistent pod ID returned by px.ip_to_pod_id function #885
Description
Describe the bug
The px.ip_to_pod_id function in Pixie occasionally returns an incorrect pod ID for a given remote address. When this happens, Pixie matches the IP address to an unrelated cron job running within the cluster instead of the long-running pod that should be matched. The problem occurs randomly and is not specific to a certain pod type. The affected pods have a typical lifecycle of a few hours. We suspect that this issue may be caused by a bug in the metadata server.
To Reproduce
Steps to reproduce the behavior:
- Deploy a few cron jobs that run every minute and each make an HTTP call to an arbitrary endpoint.
- Deploy a few pods that talk to each other over HTTP.
- Get the IP addresses of those pods from kubectl.
- Check the output of px.pod_id_to_pod_name(px.ip_to_pod_id(POD_IP_ADDRESS)) for each pod after a while.
Note - This behaviour is intermittent and hard to reproduce on demand.
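Since the mismatch only shows up occasionally, it may help to automate the comparison in the last two steps. The following is a minimal sketch, not part of Pixie: it assumes the pod list was saved with `kubectl get pods -A -o json`, and that the names Pixie returned for each IP were collected by hand into a dictionary. The helper names `pod_ips_from_kubectl` and `find_mismatches` are hypothetical, and the `namespace/name` format for Pixie pod names is an assumption.

```python
import json
from typing import Dict, List, Tuple


def pod_ips_from_kubectl(pods_json: str) -> Dict[str, List[str]]:
    """Map each pod IP to the 'namespace/name' of every pod reporting it.

    Expects the text produced by `kubectl get pods -A -o json`.
    """
    ip_to_pods: Dict[str, List[str]] = {}
    for item in json.loads(pods_json).get("items", []):
        ip = item.get("status", {}).get("podIP")
        if not ip:
            continue  # no IP assigned to this pod yet
        meta = item["metadata"]
        # Assumption: Pixie reports pod names as 'namespace/name'.
        name = f'{meta["namespace"]}/{meta["name"]}'
        ip_to_pods.setdefault(ip, []).append(name)
    return ip_to_pods


def find_mismatches(
    ip_to_pods: Dict[str, List[str]],
    pixie_results: Dict[str, str],
) -> List[Tuple[str, str, List[str]]]:
    """Return (ip, pixie_name, kubectl_names) where Pixie's name is not one kubectl lists for that IP."""
    mismatches = []
    for ip, pixie_name in pixie_results.items():
        expected = ip_to_pods.get(ip, [])
        if pixie_name not in expected:
            mismatches.append((ip, pixie_name, expected))
    return mismatches
```

Here `pixie_results` maps each POD_IP_ADDRESS to the value returned by px.pod_id_to_pod_name(px.ip_to_pod_id(POD_IP_ADDRESS)). An IP can map to several pods in the kubectl output (for example, pods on the host network share the node IP), which is why the helper keeps a list per IP.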
Expected behavior
px.ip_to_pod_id should return the correct pod ID.
App information (please complete the following information):
- Pixie version - 0.12.10
- K8s cluster version - v1.23.12
- Node Kernel version - 5.4.0-1091-azure
- Browser version
Additional context
Please refer to this Slack thread for additional context - https://pixie-community.slack.com/archives/C0405TUGY2C/p1676378960370879