Service Profile does not get detected on emissary ingress deployment #13589

Open
@ghostx31

What is the issue?

We are running Linkerd 2.14 on our AWS EKS cluster.
We want per-route metrics from ServiceProfiles for our Emissary API gateway, which is deployed directly behind our AWS ALB ingress, but the ServiceProfile does not show any per-route metrics.

How can it be reproduced?

Deploy the Emissary API gateway and expose it as a Service behind the ALB Ingress for the relevant hosts. Create a ServiceProfile for that Service and send requests to its endpoints.

Logs, error output, etc

According to the ServiceProfile docs:

The destination service for a request is computed by selecting the value of the first header to exist of, l5d-dst-override, :authority, and Host

I read this as: whichever of these headers Linkerd sees first in the request is used to select the ServiceProfile. Please correct me if I am wrong.

Since I can't attach a debug container to an Emissary pod, here is the capture from a podinfo pod that has the linkerd-debug sidecar attached:

Hypertext Transfer Protocol
    GET /healthz HTTP/1.1\r\n
        [Expert Info (Chat/Sequence): GET /healthz HTTP/1.1\r\n]
            [GET /healthz HTTP/1.1\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Request Method: GET
        Request URI: /healthz
        Request Version: HTTP/1.1
    host: api.client.com\r\n
    x-forwarded-proto: https\r\n
    x-forwarded-port: 443\r\n
    user-agent: curl/8.7.1\r\n
    accept: */*\r\n
    x-envoy-expected-rq-timeout-ms: 30000\r\n
    l5d-dst-override: podinfo-svc.podinfo.svc.cluster.local:9898\r\n
    x-envoy-original-path: /healthz\r\n
    l5d-dst-canonical: podinfo-svc.podinfo.svc.cluster.local:9898\r\n
    l5d-client-id: pg-gateway.pg-gateway.serviceaccount.identity.linkerd.cluster.local\r\n
    \r\n
    [Full request URI: http://api.client.com/healthz]
    [HTTP request 2/2]
    [Prev request in frame: 32]

It seems that because the Host header appears first, Linkerd uses it, and its value (api.client.com) does not match the name of the ServiceProfile used for podinfo.
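
The ServiceProfile itself is not included in this report. As a minimal sketch, assuming it is named after the podinfo Service's FQDN and lives in the podinfo namespace (the route name and path regex are illustrative, taken from the /healthz request above), it might look like this:

```yaml
# Hypothetical ServiceProfile for illustration; not the one actually deployed.
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  # A ServiceProfile is named after the FQDN of the Service it describes.
  name: podinfo-svc.podinfo.svc.cluster.local
  namespace: podinfo
spec:
  routes:
    - name: GET /healthz   # illustrative route name
      condition:
        method: GET
        pathRegex: /healthz
```

With a profile named this way, a request whose Host is api.client.com carries no name that corresponds to the profile, which would be consistent with the missing per-route metrics described here.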

However, if I use a Mapping with host_rewrite to rewrite the Host header for podinfo, the ServiceProfile is detected correctly:

Hypertext Transfer Protocol
    GET /healthz HTTP/1.1\r\n
        [Expert Info (Chat/Sequence): GET /healthz HTTP/1.1\r\n]
            [GET /healthz HTTP/1.1\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Request Method: GET
        Request URI: /healthz
        Request Version: HTTP/1.1
    host: podinfo-svc.podinfo.svc.cluster.local:9898\r\n
    x-forwarded-proto: https\r\n
    x-forwarded-port: 443\r\n
    user-agent: curl/8.7.1\r\n
    accept: */*\r\n
    x-envoy-expected-rq-timeout-ms: 30000\r\n
    x-idfy-gateway-id: pg-gateway\r\n
    l5d-dst-override: podinfo-svc.podinfo.svc.cluster.local:9898\r\n
    x-envoy-original-path: /healthz\r\n
    l5d-dst-canonical: podinfo-svc.podinfo.svc.cluster.local:9898\r\n
    l5d-client-id: pg-gateway.pg-gateway.serviceaccount.identity.linkerd.cluster.local\r\n
    \r\n
    [Full request URI: http://podinfo-svc.podinfo.svc.cluster.local:9898/healthz]
    [HTTP request 1/1]

So l5d-dst-override is apparently not being read, even though it is present in both requests.

Here is the Mapping, in case it is needed:

apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  annotations:
  labels:
    argocd.argoproj.io/instance: scorpius-pg-gateway-mapping
  name: podinfo-health-check
  namespace: pg-gateway
spec:
  ambassador_id:
    - pg-api-gateway
  bypass_auth: true
  host_rewrite: 'podinfo-svc.podinfo.svc.cluster.local:9898'
  hostname: '*'
  prefix: /healthz
  rewrite: /healthz
  service: 'podinfo-svc.podinfo.svc.cluster.local:9898'
  timeout_ms: 30000

This works for podinfo because I can use a Mapping, but I cannot do the same for the Emissary ingress pods: the ALB Ingress forwards traffic directly to Emissary, so there is no Mapping in which I could rewrite the Host header.

output of linkerd check -o short

linkerd-version
---------------
‼ cli is up-to-date
    unsupported version channel: stable-2.14.0
    see https://linkerd.io/2.14/checks/#l5d-version-cli for hints

control-plane-version
---------------------
‼ control plane is up-to-date
    unsupported version channel: stable-2.14.0
    see https://linkerd.io/2.14/checks/#l5d-version-control for hints

linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
    some proxies are not running the current version:
	* linkerd-destination-6f6cbbf6c9-wtvvq (stable-2.14.0)
	* linkerd-identity-66dfc67478-7xdxx (stable-2.14.0)
	* linkerd-proxy-injector-67d54d5c78-7xvm4 (stable-2.14.0)
    see https://linkerd.io/2.14/checks/#l5d-cp-proxy-version for hints

linkerd-viz
-----------
‼ viz extension proxies are up-to-date
    some proxies are not running the current version:
	* metrics-api-5c4f49c9cf-kjkcf (stable-2.14.0)
	* tap-64975d56bc-7fzp7 (stable-2.14.0)
	* tap-injector-6bd696c58b-ld48v (stable-2.14.0)
	* web-556b79cddd-4v494 (stable-2.14.0)
    see https://linkerd.io/2.14/checks/#l5d-viz-proxy-cp-version for hints

Status check results are √

Environment

  • Kubernetes version: 1.31
  • Env: EKS
  • Linkerd version: 2.14.0

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

None
