Support configuring number of replicas of Envoy Proxy #713

Closed
gitanuj opened this issue Nov 8, 2022 · 7 comments · Fixed by #924
Labels
area/api API-related issues area/ir Issues related to Gateway's internal representation, e.g. data model. help wanted Extra attention is needed kind/enhancement New feature or request provider/kubernetes Issues related to the Kubernetes provider
Milestone

Comments

@gitanuj

gitanuj commented Nov 8, 2022

Currently, EG deploys only a single instance of Envoy Proxy. To scale to a high number of requests, it would be useful to be able to increase the number of Envoy Proxy replicas to distribute load. For example:

  • Deploy EG with one replica of Envoy Proxy
  • As the traffic increases, metrics report high CPU usage of Envoy Proxy pod
  • Increase the number of replicas of Envoy Proxy to distribute the load and reduce CPU usage per pod

This property can be exposed through Envoy Proxy Config API - #703

@gitanuj gitanuj added the kind/enhancement New feature or request label Nov 8, 2022
@danehans danehans added help wanted Extra attention is needed area/api API-related issues provider/kubernetes Issues related to the Kubernetes provider area/ir Issues related to Gateway's internal representation, e.g. data model. labels Nov 9, 2022
@danehans danehans added this to the Backlog milestone Nov 9, 2022
@danehans
Contributor

danehans commented Nov 9, 2022

@gitanuj thanks for creating the issue. Are you willing to work on a PR to fix this issue? If so, I would be happy to help provide guidance.

@gitanuj
Author

gitanuj commented Nov 9, 2022

@danehans It'll be great if you can give me some pointers to get started on this and I can work on it sometime next week. Would love for this to be included in the next release.

@danehans
Contributor

danehans commented Nov 9, 2022

Since the EnvoyProxy API is currently unimplemented, fixing this issue will require more work than what will be needed in the future.

Implementation Tasks

  • Review the config API dev guide for background on the Envoy Gateway data plane API, e.g. EnvoyProxy.
  • Update the EnvoyProxySpec API to define the schema for managing the Envoy deployment, e.g. the number of replicas. Avoid any Kube-isms, as we intend for Envoy Gateway to support managing Envoy in non-Kube environments. Use kubebuilder validation markers as needed. After your changes are complete, run make generate to regenerate the DeepCopy methods.
  • Update ProviderResources to include EnvoyProxy. The field should follow the pattern of existing namespaced name fields, e.g. EnvoyProxies watchable.Map[types.NamespacedName, *v1a1.EnvoyProxy]. Update the Envoy Gateway server command to close the EnvoyProxies channel upon shutdown.
  • Infra IR: Update the ProxyInfra methods to include EnvoyProxy. Note that the infra IR already supports EnvoyProxy.
  • Resource translator: Add EnvoyProxy to the Resources type. Update the Translate() method to include the EnvoyProxy in the infra IR (if one exists). Update the runner to subscribe to the EnvoyProxies channel.
  • Update the gatewayclass controller of the Kube provider to check if the accepted GatewayClass references an EnvoyProxy. If the reference is invalid, have the statusUpdater update the GatewayClass status with the InvalidParameters condition. If the reference is valid, get and store the EnvoyProxy in the resource map. Delete the EnvoyProxy from the resource map when the GatewayClass is deleted. Create a watch for EnvoyProxy and update the GatewayClass status and the resource map accordingly. You can reference the gateway controller for examples of watches. Note that #702 ([provider] refactoring kubernetes provider to single reconciler) is consolidating the controllers into a single reconciler.
  • Implement the EnvoyProxy API changes in the Kube infra manager.
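
As a rough illustration of how the tasks above fit together, here is a minimal Go sketch of a replica count flowing from an EnvoyProxy resource into the infra IR. Everything in it is an assumption made for illustration: the Replicas field name, the plain map standing in for watchable.Map, the simplified ProxyInfra struct, the Translate signature, and the example namespace are hypothetical and are not the actual Envoy Gateway types.

```go
package main

import "fmt"

// NamespacedName stands in for types.NamespacedName from k8s.io/apimachinery.
type NamespacedName struct {
	Namespace string
	Name      string
}

// EnvoyProxySpec is a hypothetical, provider-neutral spec (no Kube-isms).
type EnvoyProxySpec struct {
	// Replicas is the desired number of Envoy instances.
	// A nil value means "use the default".
	//
	// +kubebuilder:validation:Minimum=1
	// +optional
	Replicas *int32
}

// EnvoyProxy is a simplified stand-in for the EnvoyProxy API type.
type EnvoyProxy struct {
	Spec EnvoyProxySpec
}

// ProviderResources uses a plain map here; the issue suggests a
// watchable.Map keyed by namespaced name in the real code.
type ProviderResources struct {
	EnvoyProxies map[NamespacedName]*EnvoyProxy
}

// ProxyInfra is a simplified stand-in for the infra IR.
type ProxyInfra struct {
	Replicas int32
}

// defaultReplicas matches the current behavior of a single Envoy instance.
const defaultReplicas int32 = 1

// Translate copies the replica count from the referenced EnvoyProxy,
// if one exists, and otherwise keeps the default.
func Translate(res ProviderResources, ref NamespacedName) ProxyInfra {
	infra := ProxyInfra{Replicas: defaultReplicas}
	if ep, ok := res.EnvoyProxies[ref]; ok && ep != nil && ep.Spec.Replicas != nil {
		infra.Replicas = *ep.Spec.Replicas
	}
	return infra
}

func main() {
	replicas := int32(3)
	// The namespace and name below are example values only.
	ref := NamespacedName{Namespace: "envoy-gateway-system", Name: "envoyproxy-sample"}
	res := ProviderResources{
		EnvoyProxies: map[NamespacedName]*EnvoyProxy{
			ref: {Spec: EnvoyProxySpec{Replicas: &replicas}},
		},
	}
	fmt.Println(Translate(res, ref).Replicas)
	fmt.Println(Translate(res, NamespacedName{Name: "missing"}).Replicas)
}
```

Using a pointer for Replicas lets the translator tell "unset" apart from an explicit value, so a missing EnvoyProxy or an empty spec keeps today's single-instance behavior.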

Notes:

  • Tasks should be implemented as separate PRs.
  • Include tests for added functionality.
  • Make sure all tests pass before pushing a PR. See the dev guide for details.

@arkodg @skriss @AliceProxy @Xunzhuo feel free to provide any additional guidance.

@arkodg
Contributor

arkodg commented Nov 10, 2022

It would be good to design the API with future expansion in mind.
E.g. here are the resource customizations Istio supports: https://istio.io/latest/docs/reference/config/istio.operator.v1alpha1/#KubernetesResourcesSpec
The same KubernetesResourcesSpec can be reused in the EnvoyProxy API as well as the EnvoyGateway API.

@danehans
Contributor

@arkodg thanks for sharing the Istio reference. If we take the KubernetesResourcesSpec approach, EnvoyProxy will need a higher-level abstraction similar to provider in EnvoyGateway to wrap the Kube-isms. For example:

apiVersion: config.gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: envoyproxy-sample
spec:
  provider:
    type: Kubernetes
    kubernetes:
      replicasCount: 2
...

The KubernetesResourcesSpec approach provides a high level of configurability. However, it also carries a higher support burden and potential UX issues compared with a more fine-grained API surface.

@arkodg
Contributor

arkodg commented Nov 10, 2022

Yes @danehans, the above approach looks good.
Regarding the level of customization, IMHO EG should provide the subset of Kubernetes Deployment and Service customizations that users need to run EG in production, and it will have to be via such a config API because Envoy proxy is being deployed programmatically.

@danehans
Contributor

danehans commented Dec 1, 2022

it will have to be via such a config API because Envoy proxy is being deployed programmatically

Understood, but I don't necessarily agree that the KubernetesResourcesSpec approach needs to be taken. That approach allows users to configure every aspect of a resource, which can lead to a poor UX and instability. We can take a more surgical approach to exposing Envoy configuration.

3 participants