Skip to content

GRPCRoute with no matches results in invalid policy #13425

Open
@zaharidichev

Description

@zaharidichev

What is the issue?

Whenever we create a GRPCRoute that has no matches at all, the proxy sees that as invalid outbound policy.

How can it be reproduced?

Reproduction:

Create a GRPC server:

apiVersion: v1
kind: Service
metadata:
  name: server
spec:
  type: ClusterIP
  selector:
    app: server
  ports:
  - port: 8080
    protocol: TCP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server
  template:
    metadata:
      labels:
        app: server
      annotations:
        linkerd.io/inject: enabled
    spec:
      containers:
        - name: server
          image: buoyantio/bb:v0.0.5
          command: [ "sh", "-c"]
          args:
          - "/out/bb terminus --grpc-server-port 8080 --response-text hello --fire-and-forget"
          ports:
            - name: grpc-port
              containerPort: 8080

Create a GRPCRoute for this server:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: GRPCRoute
metadata:
  name: grpc
spec:
  parentRefs:
    - name: server
      group: core
      kind: Service
      port: 8080

Make sure the status of the Route is accepted and backend refs are resolved.

Now, create a simple client (non GRPC), just to be able to fascilitate a policy resolution for this target.

apiVersion: v1
kind: Pod
metadata:
  name: client
  annotations:
    linkerd.io/inject: enabled
    config.linkerd.io/proxy-log-level: debug
spec:
  containers:
  - name: client
    image: curlimages/curl
    command:
      - "sh"
      - "-c"
      - >
        while true; do
          sleep 3600;
        done

Ssh into the client and issue a simple curl request to the service on port 8080.

kubectl  exec -it client -c client -- sh

~ $ curl -v server:8080
* Host server:8080 was resolved.
* IPv6: (none)
* IPv4: 10.43.248.236
*   Trying 10.43.248.236:8080...
* Connected to server (10.43.248.236) port 8080
* using HTTP/1.x
> GET / HTTP/1.1
> Host: server:8080
> User-Agent: curl/8.11.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 500 Internal Server Error
< l5d-proxy-error: unexpected error
< connection: close
< l5d-proxy-connection: close
< content-length: 0
< date: Wed, 04 Dec 2024 07:55:17 GMT
<
* shutting down connection #0
~ $

If you tail the logs of the client proxy you will see that the policy that has been delivered is invalid:

kubectl logs pod/client  -c linkerd-proxy -f | grep ClientPolicy
[    51.027132s] DEBUG ThreadId(01) linkerd_app_outbound::policy::api: policy=ClientPolicy { parent: Default { name: "invalid" }, protocol: Detect { timeout: 10s, http1: Http1 { routes: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: None, headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: RouteParams { timeouts: Timeouts { response: None, idle: None, request: None }, retry: None, allow_l5d_request_headers: false } } }] }], failure_accrual: None }, http2: Http2 { routes: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: None, headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: RouteParams { timeouts: Timeouts { response: None, idle: None, request: None }, retry: None, allow_l5d_request_headers: false } } }] }], failure_accrual: None }, opaque: Opaque { routes: Some(Route { policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: () } }) } }, backends: [] }

You can also observe the following error:

[   111.949112s]  WARN ThreadId(01) linkerd_app_outbound::policy::api: Client policy misconfigured error=invalid gRPC route: invalid route match: missing RPC match

The reason why this happens is because the proxy tries to parse the RPC match but it fails because it is missing. Lets walk through the stack in the policy controller:

  1. It is missing because the policy controller contains the following logic for putting together the API response. This indicates that if our method match is None, then the RPC will be None as well (which is the source of the problem.)
    https://github.com/linkerd/linkerd2/blob/main/policy-controller/grpc/src/routes/grpc.rs#L23

  2. The reason why the method in this case is None can be seen in the following piece of code.This is the code that takes the deserialized K8s API representation and turns it into our internal type.

    let method = method.map(|value| match value {
    .

  3. The K8s representation in turn is deserialized using the following logic in the gateway API repo:

fn deserialize_method_match<'de, D: serde::Deserializer<'de>>(
    deserializer: D,
) -> Result<Option<GrpcMethodMatch>, D::Error> {
    <Option<GrpcMethodMatch> as serde::Deserialize>::deserialize(deserializer).map(|value| {
        match value.as_ref() {
            Some(rule) if rule.is_empty() => None,
            _ => value,
        }
    })
}

...
impl GrpcMethodMatch {
    fn is_empty(&self) -> bool {
        let (method, service) = match self {
            Self::Exact { method, service } => (method, service),
            Self::RegularExpression { method, service } => (method, service),
        };

        method.as_deref().map(str::is_empty).unwrap_or(true)
            && service.as_deref().map(str::is_empty).unwrap_or(true)
    }
}

It is clear at this point that if method is an empty string or service is an empty string then this is an empty GRPC match and None is returned, even if the rype of the match has been specified.

Now, if we look closely, we will see that the version of the GRPCRoute CRDs that we use, has a default for route matches that create a route match of type Method with nothing else specified: https://github.com/linkerd/linkerd2/blob/main/charts/linkerd-crds/templates/gateway.networking.k8s.io_grpcroutes.yaml#L254

This makes it so, whenever we create a GRPCRoute with no explicit matches, we end up interpreting that as an Invalid one.

Logs, error output, etc

already provided

output of linkerd check -o short

n/a

Environment

  • Linkerd version: edge-24.11.8
  • k3d version v5.6.3
  • k3s version v1.28.8-k3s1 (default)

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

None

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions