Description
What is the issue?
Whenever we create a GRPCRoute that has no matches at all, the proxy treats the resulting outbound policy as invalid.
How can it be reproduced?
Reproduction:
Create a GRPC server:
apiVersion: v1
kind: Service
metadata:
  name: server
spec:
  type: ClusterIP
  selector:
    app: server
  ports:
    - port: 8080
      protocol: TCP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server
  template:
    metadata:
      labels:
        app: server
      annotations:
        linkerd.io/inject: enabled
    spec:
      containers:
        - name: server
          image: buoyantio/bb:v0.0.5
          command: [ "sh", "-c"]
          args:
            - "/out/bb terminus --grpc-server-port 8080 --response-text hello --fire-and-forget"
          ports:
            - name: grpc-port
              containerPort: 8080
Create a GRPCRoute for this server:
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: GRPCRoute
metadata:
  name: grpc
spec:
  parentRefs:
    - name: server
      group: core
      kind: Service
      port: 8080
Make sure the status of the Route is accepted and backend refs are resolved.
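You can verify this with kubectl get grpcroute grpc -o yaml; the parent status should contain conditions along the lines of the following sketch (shape only, not verbatim controller output):
status:
  parents:
    - parentRef:
        group: core
        kind: Service
        name: server
        port: 8080
      conditions:
        - type: Accepted
          status: "True"
        - type: ResolvedRefs
          status: "True"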
Now, create a simple (non-gRPC) client, just to facilitate a policy resolution for this target.
apiVersion: v1
kind: Pod
metadata:
  name: client
  annotations:
    linkerd.io/inject: enabled
    config.linkerd.io/proxy-log-level: debug
spec:
  containers:
    - name: client
      image: curlimages/curl
      command:
        - "sh"
        - "-c"
        - >
          while true; do
            sleep 3600;
          done
Exec into the client and issue a simple curl request to the service on port 8080.
kubectl exec -it client -c client -- sh
~ $ curl -v server:8080
* Host server:8080 was resolved.
* IPv6: (none)
* IPv4: 10.43.248.236
* Trying 10.43.248.236:8080...
* Connected to server (10.43.248.236) port 8080
* using HTTP/1.x
> GET / HTTP/1.1
> Host: server:8080
> User-Agent: curl/8.11.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 500 Internal Server Error
< l5d-proxy-error: unexpected error
< connection: close
< l5d-proxy-connection: close
< content-length: 0
< date: Wed, 04 Dec 2024 07:55:17 GMT
<
* shutting down connection #0
~ $
If you tail the logs of the client proxy you will see that the policy that has been delivered is invalid:
kubectl logs pod/client -c linkerd-proxy -f | grep ClientPolicy
[ 51.027132s] DEBUG ThreadId(01) linkerd_app_outbound::policy::api: policy=ClientPolicy { parent: Default { name: "invalid" }, protocol: Detect { timeout: 10s, http1: Http1 { routes: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: None, headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: RouteParams { timeouts: Timeouts { response: None, idle: None, request: None }, retry: None, allow_l5d_request_headers: false } } }] }], failure_accrual: None }, http2: Http2 { routes: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: None, headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: RouteParams { timeouts: Timeouts { response: None, idle: None, request: None }, retry: None, allow_l5d_request_headers: false } } }] }], failure_accrual: None }, opaque: Opaque { routes: Some(Route { policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: () } }) } }, backends: [] }
You can also observe the following error:
[ 111.949112s] WARN ThreadId(01) linkerd_app_outbound::policy::api: Client policy misconfigured error=invalid gRPC route: invalid route match: missing RPC match
The reason why this happens is that the proxy tries to parse the RPC match but fails because it is missing. Let's walk through the stack in the policy controller:
- The RPC match is missing because the policy controller contains the following logic for putting together the API response: if our method match is None, then the RPC match will be None as well (which is the source of the problem): https://github.com/linkerd/linkerd2/blob/main/policy-controller/grpc/src/routes/grpc.rs#L23
- The reason why the method is None in this case can be seen in the code that takes the deserialized K8s API representation and turns it into our internal type.
- The K8s representation in turn is deserialized using the following logic in the gateway API repo:
fn deserialize_method_match<'de, D: serde::Deserializer<'de>>(
    deserializer: D,
) -> Result<Option<GrpcMethodMatch>, D::Error> {
    <Option<GrpcMethodMatch> as serde::Deserialize>::deserialize(deserializer).map(|value| {
        match value.as_ref() {
            Some(rule) if rule.is_empty() => None,
            _ => value,
        }
    })
}

...

impl GrpcMethodMatch {
    fn is_empty(&self) -> bool {
        let (method, service) = match self {
            Self::Exact { method, service } => (method, service),
            Self::RegularExpression { method, service } => (method, service),
        };

        method.as_deref().map(str::is_empty).unwrap_or(true)
            && service.as_deref().map(str::is_empty).unwrap_or(true)
    }
}
It is clear at this point that if both method and service are unset or empty strings, then this is considered an empty gRPC match and None is returned, even if the type of the match has been specified.
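In YAML terms, this means a match like the first one below is dropped (the method match deserializes to None), while the second is preserved; this is illustrative only, and the service name is hypothetical:
# Treated as empty: neither service nor method is set, so the match becomes None
method:
  type: Exact

# Preserved: service is non-empty
method:
  type: Exact
  service: example.Service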
Now, if we look closely, we will see that the version of the GRPCRoute CRDs that we use has a default for route matches that creates a method match with nothing else specified: https://github.com/linkerd/linkerd2/blob/main/charts/linkerd-crds/templates/gateway.networking.k8s.io_grpcroutes.yaml#L254
As a result, whenever we create a GRPCRoute with no explicit matches, we end up interpreting it as an invalid one.
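For reference, the default in the vendored CRD linked above looks roughly like this (paraphrased from the schema; see the link for the exact definition):
matches:
  default:
    - method:
        type: Exact
Combined with the deserialization logic shown earlier, such a defaulted match carries no service and no method, so it is treated as empty, the RPC match ends up missing, and the proxy rejects the policy.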
Logs, error output, etc
already provided
output of linkerd check -o short
n/a
Environment
- Linkerd version: edge-24.11.8
- k3d version: v5.6.3
- k3s version: v1.28.8-k3s1 (default)
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
None