HorizontalPodAutoscaler for envoy proxy #703

Closed
gitanuj opened this issue Nov 6, 2022 · 23 comments
Labels: area/api, area/infra-mgr, documentation, kind/enhancement

Comments

gitanuj commented Nov 6, 2022

Description:
Does Envoy Gateway have any design for scaling out Envoy proxy pods? Are there any best practices for handling an increased number of requests? I'm mostly looking at deploying this to clouds like AWS/GCP/Azure which will allocate application/network Load Balancers but then how can I make sure that the Envoy proxy pod is not the bottleneck in the gateway and can scale out when needed?

@danehans danehans added kind/enhancement New feature or request help wanted Extra attention is needed labels Nov 7, 2022
@danehans danehans added this to the Backlog milestone Nov 7, 2022
Contributor

danehans commented Nov 7, 2022

HPA is currently not supported by Envoy Gateway. It's a feature that would be beneficial to the project, so I've added it to the backlog. Let me know if you're interested in working on this and I can assign it to you.

> I'm mostly looking at deploying this to clouds like AWS/GCP/Azure which will allocate application/network Load Balancers...

Envoy Gateway currently creates an ELB when running in AWS. #648 intends to design an API for providing configurability to the Envoy service, PTAL, and comment as needed.

Author

gitanuj commented Nov 7, 2022

  1. Is there any design doc explaining which pods are involved in the data path for EG? I'm trying to understand if there are any other components other than EnvoyProxy which need to be scaled out as well.
  2. Before implementing HPA I wanted to understand if I can simply update the number of replicas here. Is there an API through which this can be exposed?

Contributor

danehans commented Nov 8, 2022

> Is there any design doc explaining which pods are involved in the data path for EG? I'm trying to understand if there are any other components other than EnvoyProxy which need to be scaled out as well.

Only the Envoy proxy deployment needs to be scaled. The design docs are the best place for this type of info, specifically the system design and config API design docs.

> Before implementing HPA I wanted to understand if I can simply update the number of replicas here. Is there an API through which this can be exposed?

You can update this field, but EG will revert it to the desired state. This field should be configurable using the EnvoyProxy config API. If you would like to make this field configurable, please create an issue that describes your use case and xref this issue.
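
The revert behavior described above can be pictured with a minimal sketch. It assumes a simplified controller loop; the `Deployment` type and `reconcile` function are hypothetical stand-ins for illustration, not Envoy Gateway code.

```go
package main

import "fmt"

// Deployment is a hypothetical stand-in for the relevant part of an
// apps/v1 Deployment.
type Deployment struct {
	Name     string
	Replicas int32
}

// reconcile resets a drifted replica count to the desired value and
// reports whether it had to change anything.
func reconcile(desired int32, actual *Deployment) bool {
	if actual.Replicas == desired {
		return false
	}
	actual.Replicas = desired
	return true
}

func main() {
	d := &Deployment{Name: "envoy", Replicas: 1}

	// Someone (or an external autoscaler) scales the Deployment by hand.
	d.Replicas = 3

	// The next reconcile pass restores the controller's desired state.
	changed := reconcile(1, d)
	fmt.Println(changed, d.Replicas) // true 1
}
```

This is why an externally created HPA and a controller that always writes the replica count would fight over the same field.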

Contributor

arkodg commented Apr 26, 2023

cc @qicz in case you're interested in adding this to EnvoyProxy resource :)

Member

qicz commented May 8, 2023

> cc @qicz in case you're interested in adding this to EnvoyProxy resource :)

I am sorry for the late reply.

I think we just need to provide some docs about HPA. The current Envoy deployment spec already lets us configure more fields, and #1398 adds support for affinity and similar settings.

// KubernetesDeploymentSpec defines the desired state of the Kubernetes deployment resource.
type KubernetesDeploymentSpec struct {
	// Replicas is the number of desired pods. Defaults to 1.
	//
	// +optional
	Replicas *int32 `json:"replicas,omitempty"`
	// Pod defines the desired annotations and securityContext of container.
	//
	// +optional
	Pod *KubernetesPodSpec `json:"pod,omitempty"`
	// Container defines the resources and securityContext of container.
	//
	// +optional
	Container *KubernetesContainerSpec `json:"container,omitempty"`
	// TODO: Expose config as use cases are better understood, e.g. labels.
}

// KubernetesPodSpec defines the desired state of the Kubernetes pod resource.
type KubernetesPodSpec struct {
	// Annotations are the annotations that should be appended to the pods.
	// By default, no pod annotations are appended.
	//
	// +optional
	Annotations map[string]string `json:"annotations,omitempty"`
	// SecurityContext holds pod-level security attributes and common container settings.
	// Optional: Defaults to empty. See type description for default values of each field.
	//
	// +optional
	SecurityContext *corev1.PodSecurityContext `json:"securityContext,omitempty"`
	// NodeSelector is a selector which must be true for the pod to fit on a node.
	// Selector which must match a node's labels for the pod to be scheduled on that node.
	// More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
	// +optional
	// +mapType=atomic
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
	// If specified, the pod's scheduling constraints.
	// +optional
	Affinity *corev1.Affinity `json:"affinity,omitempty"`
	// If specified, the pod's tolerations.
	// +optional
	Tolerations []corev1.Toleration `json:"tolerations,omitempty"`
}

// KubernetesContainerSpec defines the desired state of the Kubernetes container resource.
type KubernetesContainerSpec struct {
	// List of environment variables to set in the container.
	//
	// +optional
	Env []corev1.EnvVar `json:"env,omitempty"`
	// Resources required by this container.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	//
	// +optional
	Resources *corev1.ResourceRequirements `json:"resources,omitempty"`
	// SecurityContext defines the security options the container should be run with.
	// If set, the fields of SecurityContext override the equivalent fields of PodSecurityContext.
	// More info: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
	//
	// +optional
	SecurityContext *corev1.SecurityContext `json:"securityContext,omitempty"`
	// Image specifies the EnvoyProxy container image to be used, instead of the default image.
	//
	// +optional
	Image *string `json:"image,omitempty"`
}
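
To illustrate why `Replicas` is a pointer with `omitempty`, here is a small sketch. It assumes a trimmed-down copy of the struct (only the `Replicas` field) and a hypothetical `effectiveReplicas` helper that applies the documented default of 1; neither is taken from the Envoy Gateway code base.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DeploymentSpec mirrors only the Replicas field of KubernetesDeploymentSpec.
type DeploymentSpec struct {
	Replicas *int32 `json:"replicas,omitempty"`
}

// effectiveReplicas is a hypothetical helper: it returns the configured
// replica count, or the documented default of 1 when the field is unset.
func effectiveReplicas(s DeploymentSpec) int32 {
	if s.Replicas == nil {
		return 1
	}
	return *s.Replicas
}

func main() {
	three := int32(3)
	unset := DeploymentSpec{}
	set := DeploymentSpec{Replicas: &three}

	// A nil pointer is omitted from the serialized spec entirely.
	a, _ := json.Marshal(unset)
	b, _ := json.Marshal(set)
	fmt.Println(string(a), effectiveReplicas(unset)) // {} 1
	fmt.Println(string(b), effectiveReplicas(set))   // {"replicas":3} 3
}
```

The pointer lets the API distinguish "not set" from an explicit value, which matters once something other than the controller may own the replica count.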

Member

qicz commented May 8, 2023

example

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: envoy-demo
    app.kubernetes.io/name: envoy-demo
    app.kubernetes.io/part-of: envoy-demo-ns
  name: envoy-demo
  namespace: envoy-demo-ns
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: envoy-demo
      app.kubernetes.io/name: envoy-demo
      app.kubernetes.io/part-of: envoy-demo-ns
  template:
    metadata:
      labels:
        app.kubernetes.io/component: envoy-demo
        app.kubernetes.io/name: envoy-demo
        app.kubernetes.io/part-of: envoy-demo-ns
    spec:
      serviceAccountName: envoy-demo-sa
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: In
                    values:
                      - ""
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: envoy-demo
                    app.kubernetes.io/name: envoy-demo
                    app.kubernetes.io/part-of: envoy-demo-ns
                namespaces:
                  - envoy-demo-ns
                topologyKey: kubernetes.io/hostname
              weight: 100
      containers:
        - name: envoy-demo
          image: envoy-demo:latest
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              cpu: 10m
              memory: 64Mi
            limits:
              cpu: 1
              memory: 1024Mi
      tolerations:
      - effect: "NoSchedule"
        key: "node-role.kubernetes.io/master"
        operator: "Exists"

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: envoy-demo-ns
spec:
  maxReplicas: 3
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: envoy-demo
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
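
For readers unfamiliar with how an HPA like the one above picks a replica count, the following is a simplified sketch of the rule documented by Kubernetes: `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured bounds. The function name and parameters are illustrative, the 0.1 tolerance is the documented controller default, and the sketch ignores pod readiness, missing metrics, and stabilization windows.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies ceil(current * currentUtil / targetUtil) and then
// clamps the result to [minReplicas, maxReplicas]. Ratios within the
// tolerance of 1.0 leave the replica count unchanged.
func desiredReplicas(current int32, currentUtil, targetUtil float64, minReplicas, maxReplicas int32) int32 {
	const tolerance = 0.1 // default HPA controller tolerance
	ratio := currentUtil / targetUtil
	if math.Abs(ratio-1.0) <= tolerance {
		return current
	}
	want := int32(math.Ceil(float64(current) * ratio))
	if want < minReplicas {
		return minReplicas
	}
	if want > maxReplicas {
		return maxReplicas
	}
	return want
}

func main() {
	// Same bounds (1..3) and 80% CPU target as the HPA manifest above.
	fmt.Println(desiredReplicas(1, 200, 80, 1, 3)) // 3
	fmt.Println(desiredReplicas(2, 85, 80, 1, 3))  // 2 (within tolerance)
	fmt.Println(desiredReplicas(3, 20, 80, 1, 3))  // 1
}
```

Note that utilization is measured against the container's CPU request, so the `requests.cpu: 10m` value in the Deployment above would make 80% utilization very easy to exceed.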

Contributor

arkodg commented May 8, 2023

thanks for outlining this @qicz, looks like we just need to document this, and no API changes are needed

@arkodg arkodg changed the title Can I use HorizontalPodAutoscaler for envoy proxy? docs: HorizontalPodAutoscaler for envoy proxy May 8, 2023
@arkodg arkodg added documentation Improvements or additions to documentation good first issue Good for newcomers labels May 8, 2023
@qicz qicz self-assigned this May 9, 2023
@arkodg arkodg removed good first issue Good for newcomers help wanted Extra attention is needed labels May 9, 2023
github-actions bot commented Jun 8, 2023

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Jun 8, 2023
@arkodg arkodg added help wanted Extra attention is needed and removed stale help wanted Extra attention is needed labels Jun 15, 2023
@github-actions

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Jul 16, 2023
@arkodg arkodg changed the title docs: HorizontalPodAutoscaler for envoy proxy HorizontalPodAutoscaler for envoy proxy Aug 2, 2023
@arkodg arkodg removed the stale label Aug 2, 2023
@arkodg arkodg modified the milestones: Backlog, 0.6.0-rc1 Aug 2, 2023
@arkodg arkodg added area/api API-related issues area/infra-mgr Issues related to the provisioner used for provisioning the managed Envoy Proxy fleet. labels Aug 2, 2023
github-actions bot commented Sep 2, 2023

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Sep 2, 2023
@arkodg arkodg removed the stale label Oct 6, 2023
Contributor

arkodg commented Oct 14, 2023

@qicz are you planning on working on this one for v0.6.0-rc1 ?

@arkodg arkodg modified the milestones: 0.6.0-rc1, Backlog Oct 20, 2023
Contributor

ardikabs commented Oct 29, 2023

hi @arkodg @qicz

is there any update on this matter? I could help work on this. However, I need some advice on how we should support HPA for the Envoy Proxy deployment.

Just for context, I've applied a patch to Envoy Gateway at my organization to make it compatible with an externally enabled HPA. When the EnvoyProxy spec sets the number of replicas to 0, the patch makes EG disregard the replicas field in the Envoy Proxy Deployment. However, we think that this implicit solution might not align with the community's goals.
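
The patch itself is not shown in the thread, so the following is only a guess at its shape: a sketch in which a replica count of 0 acts as a sentinel meaning "do not manage replicas". The `deploymentReplicas` and `show` functions are hypothetical.

```go
package main

import "fmt"

// deploymentReplicas maps the replicas value from the EnvoyProxy spec to the
// value written into the generated Deployment. A nil result means the
// controller leaves spec.replicas unset, so an external HPA can own it.
// The zero-as-sentinel rule is an assumption based on the description above.
func deploymentReplicas(specReplicas *int32) *int32 {
	switch {
	case specReplicas == nil:
		one := int32(1) // documented default
		return &one
	case *specReplicas == 0:
		return nil // sentinel: do not manage replicas
	default:
		return specReplicas
	}
}

// show renders an optional replica count for printing.
func show(r *int32) string {
	if r == nil {
		return "unmanaged"
	}
	return fmt.Sprint(*r)
}

func main() {
	zero, four := int32(0), int32(4)
	fmt.Println(show(deploymentReplicas(nil)))   // 1
	fmt.Println(show(deploymentReplicas(&zero))) // unmanaged
	fmt.Println(show(deploymentReplicas(&four))) // 4
}
```

The drawback the comment hints at is visible here: 0 silently changes meaning from "scale to zero" to "hands off", which is easy to miss when reading an EnvoyProxy resource.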

@arkodg arkodg assigned ardikabs and unassigned qicz Oct 29, 2023
Contributor

arkodg commented Oct 29, 2023

hey @ardikabs, I've assigned this issue to you for now, thanks for volunteering!
I think disabling the replica field by default in the deployment is fine

the question I have is: should EG create the HPA resource, or should the end user create one and link it to the generated EnvoyProxy deployment?

Contributor

ardikabs commented Oct 30, 2023

IMO, it would be better if EG also managed the HPA config by introducing another field such as hpaSpec, similar to how Istio controls its gateway through the IstioOperator API. The presence of that field would then override the replicas field on the EnvoyProxy API for the Kubernetes provider. Thoughts @arkodg?
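
A minimal sketch of the proposed precedence rule, assuming a hypothetical `HpaSpec` type with illustrative field names (the thread does not define the final API): when the autoscaling block is present, the controller stops managing the replica count.

```go
package main

import "fmt"

// HpaSpec is a hypothetical autoscaling block; the field names are
// illustrative and not the final EnvoyProxy API.
type HpaSpec struct {
	MinReplicas int32
	MaxReplicas int32
}

// ProxyDeployment holds the two settings that compete for the replica count.
type ProxyDeployment struct {
	Replicas *int32
	Hpa      *HpaSpec
}

// replicasOwner reports who controls the replica count and, when the
// controller owns it, which value it should write into the Deployment.
func replicasOwner(p ProxyDeployment) (owner string, replicas int32) {
	if p.Hpa != nil {
		return "hpa", 0 // replicas is ignored when the HPA block is present
	}
	if p.Replicas == nil {
		return "controller", 1 // documented default
	}
	return "controller", *p.Replicas
}

func main() {
	two := int32(2)
	fmt.Println(replicasOwner(ProxyDeployment{Replicas: &two}))
	// controller 2
	fmt.Println(replicasOwner(ProxyDeployment{
		Replicas: &two,
		Hpa:      &HpaSpec{MinReplicas: 1, MaxReplicas: 3},
	}))
	// hpa 0
}
```

Compared with a magic replica value, an explicit field makes the hand-off to the autoscaler visible in the resource itself.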

Contributor

arkodg commented Oct 30, 2023

ptal @envoyproxy/gateway-maintainers, please share your thoughts on adding envoyHpa field into the EnvoyProxy resource

Contributor

zirain commented Nov 1, 2023

keep an eye on kubernetes-sigs/gateway-api#1355

gazal-k commented Nov 27, 2023

The part that confuses me is that it is possible to control the Envoy deployment using the EnvoyProxy object, like: https://github.com/envoyproxy/gateway/blob/main/examples/kubernetes/envoy-proxy-config.yaml. However, the Helm chart does not seem to do this, instead defining a minimal config:

config:
  envoyGateway:
    gateway:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
    provider:
      type: Kubernetes
    logging:
      level:
        default: info

and defining the Deployment object separately: https://github.com/envoyproxy/gateway/blob/main/charts/gateway-helm/templates/envoy-gateway-deployment.yaml

This makes me think, we can introduce opt-in autoscaling configuration similar to: https://github.com/kubernetes/ingress-nginx/blob/7f723c59855e82614582ff7b2efd1783b1afc2ee/charts/ingress-nginx/values.yaml#L367-L373

gazal-k added a commit to gazal-k/gateway that referenced this issue Nov 27, 2023
re envoyproxy#703

Signed-off-by: Gazal Gafoor <gazal.gafoor@rea-group.com>
Contributor

shawnh2 commented Nov 27, 2023

hi @gazal-k, it seems there is a small misunderstanding here.

the Deployment in the Helm chart is what we use to start the envoy-gateway controller; the EnvoyProxy Deployment is reconciled by this controller:

deployment := &appsv1.Deployment{

the values in this YAML are used by the controller:

envoyGateway:
  gateway:
    controllerName: gateway.envoyproxy.io/gatewayclass-controller

but https://github.com/kubernetes/ingress-nginx/blob/7f723c59855e82614582ff7b2efd1783b1afc2ee/charts/ingress-nginx/values.yaml#L367-L373 seems like a good example of how to define hpaSpec.

gazal-k commented Nov 27, 2023

Thanks for that. So, the Deployment object in the chart is the controller. The controller then creates another Deployment object for the envoy proxy fleet?

Contributor

shawnh2 commented Nov 27, 2023

> Thanks for that. So, the Deployment object in the chart is the controller. The controller then creates another Deployment object for the envoy proxy fleet?

correct

Contributor

arkodg commented Nov 27, 2023

hey @ardikabs are you planning on working on this one ?

@ardikabs
Contributor

Yes @arkodg, I am still working on it

Contributor

shawnh2 commented Dec 6, 2023

closed in favor of #2257
