Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move injection settings from .Values. into ProxyConfig #1478

Closed
wants to merge 1 commit into from

Conversation

dgn
Copy link
Contributor

@dgn dgn commented Jun 23, 2020

API changes for istio/istio#24898

@istio-testing istio-testing added the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label Jun 23, 2020
@istio-policy-bot
Copy link

😊 Welcome @dgn! This is either your first contribution to the Istio api repo, or it's been
awhile since you've been here.

You can learn more about the Istio working groups, code of conduct, and contributing guidelines
by referring to Contributing to Istio.

Thanks for contributing!

Courtesy of your friendly welcome wagon.

@istio-testing
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Jun 23, 2020
@istio-testing istio-testing added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 23, 2020
@dgn
Copy link
Contributor Author

dgn commented Jun 23, 2020

/test all

@dgn
Copy link
Contributor Author

dgn commented Jun 23, 2020

/retest

Copy link
Member

@howardjohn howardjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall in favor of this type of move, thanks for working on this.

cc @costinm

string proxy_init_image = 33;
string proxy_init_image_pull_policy = 39;

repeated string include_outbound_ip_ranges = 34;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to nest this? Either one level (capture.include_outbound_ip_ranges) or two levels (capture.include.inbound_ip_ranges) or I suppose we could do more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have existing settings - let's keep consistent and minimize cosmetic change.

// multicluster scenarios).
repeated string pod_dns_search_namespaces = 31;

bool rewrite_app_http_probe = 32;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how WIP this is meant to be, but we should add some comments prior to merge

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is just in draft state

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the status of the rewrite_app_http_probe ? Was a feature flag - did we make it default ? Do we still have a use case to turn it off/on ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is default now. I wouldn't be against removing it as as option? @incfly any thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can keep it as an option, for safety - but not as API in ProxyConfig.

If we are sure it can never fail - we can also remove it as an option.

I think it is reasonable to have a number of env variables for infrequent/unlikely cases - just to be sure, for safety.
But if there is no good use case for normal user to interact with it - should not be in the API.


bool rewrite_app_http_probe = 32;

string proxy_init_image = 33;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we remove these entirely, and just have a single proxy image option

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - we're not planning separate image for init.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, will remove

@@ -302,6 +302,8 @@ message ProxyConfig {
// Port on which Envoy should listen for administrative commands.
int32 proxy_admin_port = 11;

int32 proxy_status_port = 36;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one already exists, option 26

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if sorting fields by number will affect generated protobuff ?, but I think that will help to avoid numbering confusion in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean ? Leave the original as it is, don't add the new one.

repeated string include_outbound_ip_ranges = 34;
repeated string exclude_outbound_ip_ranges = 35;

repeated string include_inbound_ports = 37;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the ProxyConfig docs are generated based on these API definitions, so it will be good to add a one-liner description about each field just to avoid blank line in description column. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, that's the plan

Copy link
Contributor

@costinm costinm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reminder: global.yaml and values.yaml are a big mess, we do not want to blindly move all the crap from values to API. We already added inconsistent/random settings with insufficient review in values.yaml, this is the API repository.

// The mode used to redirect inbound traffic to Envoy.
InboundInterceptionMode interception_mode = 18;

InboundInterceptionTechnique interception_technique = 30;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a setting we want here: it is not actionable. The proxy can't decide to use CNI - either it is installed and used, or not.

Besides, the 'technique' name is very strange - and we already have interception mode.

We could add an extra setting to the interception mode enum if we really find a use case that is
possible to implement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is acted upon in the injection template - it checks whether .Values.istio_cni.enabled == true, so obviously it is of interest to the control plane whether we're using CNI or not

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't mean it should be part of the API exposed to user.

ProxyConfig is an API used to define how a proxy behave - if the user sets it to CNI, the expectation is that somehow CNI will be used - which we obviously can't unless CNI is installed in the cluster. And if the CNI is installed - not clear we want to allow users to 'opt out' using ProxyConfig. There is an annotation we can continue to support.

I think it may be fine to have 'cni enabled' in MeshConfig - as a Istiod/injector setting. But probably the semantic wouldn't be if CNI is enabled - Istiod can just check if the CNI deployment is there without user having to specify. We may want to allow the admin to control if pods can opt out or not.

Either way - bulk adding APIs is problematic, we should review each API carefully - we don't add dozen unrelated
fields to Gateway or VirtualService, we shouldn't do it for ProxyConfig either. A RFC on how to expose the settings for CNI would be great, meanwhile please keep it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would have seen this as the default value for whether or not to use CNI during injection. You could override this per-proxy to use proxy_init for example. Of course it would only make sense if CNI is installed, I agree. That should be checked during validaton

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, again, all names are tbd, please don't assume I wanted to merge this as-is, just suggest better names

// This does not apply to gateway pods as they typically need a different
// set of DNS settings than the normal application pods (e.g., in
// multicluster scenarios).
repeated string pod_dns_search_namespaces = 31;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familiar with this - doesn't k8s already have this in the pod spec ? What is the purpose and use here ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment on how it is used in Istio - and we should confirm that

  • we can't use the setting from the pod ( injection should have access to that )
  • we want to use a random name instead of using the same name as k8s. In general we want to use the same syntax when possible - and minimize the 'istio-invented names'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point actually I was thinking this was a container setting which sort of makes sense. For a pod - they should just add this to their own spec..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@costinm this is a draft that just blindly moves settings from .Values.global. to ProxyConfig, nothing was invented here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant: values.yaml and global.yaml (which this PR 'blindly moves' ) invented a lot of inconsistent names - not matching what K8S is using or even our own Istio APIs.

We do not want 'blind move' of settings - what's the point ? We need to do what we failed to do when we let
everything in values/global - and consider the use cases, UX, consistency, etc - on each feature. It is an API.

// multicluster scenarios).
repeated string pod_dns_search_namespaces = 31;

bool rewrite_app_http_probe = 32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the status of the rewrite_app_http_probe ? Was a feature flag - did we make it default ? Do we still have a use case to turn it off/on ?


bool rewrite_app_http_probe = 32;

string proxy_init_image = 33;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 - we're not planning separate image for init.


repeated string include_inbound_ports = 37;
repeated string exclude_inbound_ports = 38;
repeated string include_outbound_ports = 40;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure we already have some of those - excludeOutboundPorts for sure.

Please keep consistent with the current API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those exist only as annotations, not in the API yet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, was looking at the wrong proto. For some reason there is another ProxyConfig with different content in operator, wrong autocomplete.

@howardjohn was right in the comment above - it may make sense to nest.

Please add the comments from the annotations.yaml - and include the equivalent annotation name.

Personally I'm not convinced that duplicating annotations in ProxyConfig is useful for the existing annotations.
Pain for the user to move - we should first focus on things that are not already supported. When we're done we
can come back for annotations. Or if we do want to move the annotations in ProxyConfig - we should maybe
do it consistently for all of them ( that we want to be exposed as API ).

Another aspect to keep in mind is that some of the annotations that are commonly used may also be promoted
to Sidecar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that duplicating the annotations is strange. However, we already had them duplicated, because the default values could be set by users in .Values.global, I just moved that over to here. I think if we want to allow users to set defaults, it should be in MeshConfig

@dgn
Copy link
Contributor Author

dgn commented Jun 23, 2020

Thanks for the comments @costinm @howardjohn, please note that this is an early draft, with the intention of moving us away from the spaghetti in .Values.global towards a well-defined API in MeshConfig/ProxyConfig. Please do suggest new names, structure etc. but please keep in mind that I didn't invent any of these settings and cannot simply remove them without replacement

@howardjohn
Copy link
Member

@dgn I am very happy to remove the spaghetti, but this is not removing it its just moving it. We can use this as an opportunity to make it a better API.

@costinm
Copy link
Contributor

costinm commented Jun 23, 2020

@dgn I am very happy to remove the spaghetti, but this is not removing it its just moving it. We can use this as an opportunity to make it a better API.

Well said !

I don't mind removing things from ProxyConfig that no longer make sense or belong to MeshConfig or are no longer used.

@dgn
Copy link
Contributor Author

dgn commented Jun 24, 2020

@dgn I am very happy to remove the spaghetti, but this is not removing it its just moving it. We can use this as an opportunity to make it a better API.

That's exactly what I meant by asking for suggestions for better names and structure 🙂

I understand the urge to re-design the API but tbh that should have happened when these fields were first added.

IMO, they are now API, and changing them should be treated as API changes. So I'd suggest us to carefully move settings, possibly with new name and structure, but without removing any at this point. If we then want to remove settings later on, let's do that in a follow-up PR. If you make it a condition to have a perfect API before formalizing it, I'm afraid we'll simply never formalize it. Please don't say we should 'use the opportunity' to change it now, because that to me sounds like "I don't want to look at this again tomorrow" - this will be an ongoing process.

So my proposal would be: move everything we have today into MeshConfig v1alpha, e.g. into an InjectionConfig block where adequate. Then, in a design doc, come up with a plan for MeshConfig v1alpha2 (or even v1beta1). That's how APIs should evolve in my opinion.

@costinm
Copy link
Contributor

costinm commented Jun 24, 2020

@dgn I am very happy to remove the spaghetti, but this is not removing it its just moving it. We can use this as an opportunity to make it a better API.

That's exactly what I meant by asking for suggestions for better names and structure slightly_smiling_face

I understand the urge to re-design the API but tbh that should have happened when these fields were first added.

No, the helm options were never considered an API - just install options for helm.

There was no API review.

IMO, they are now API, and changing them should be treated as API changes. So I'd suggest us to carefully move settings, possibly with new name and structure, but without removing any at this point. If we then want to remove settings later on, let's do that in a follow-up PR. If you make it a condition to have a perfect API before formalizing it, I'm afraid we'll simply never formalize it. Please don't say we should 'use the opportunity' to change it now, because that to me sounds like "I don't want to look at this again tomorrow" - this will be an ongoing process.

I don't think this is good for our users or for us. Values.yaml is a mess - I hope nobody disagree - and moving it to
API is not acceptable.

So my proposal would be: move everything we have today into MeshConfig v1alpha, e.g. into an InjectionConfig block where adequate. Then, in a design doc, come up with a plan for MeshConfig v1alpha2 (or even v1beta1). That's how APIs should evolve in my opinion.

We already have them in values.yaml - users are not blocked in any way.

The idea that 'later we'll clean up' was also proposed for global.yaml and values.yaml - and we ended up with the
mess we have. If we 'move the mess' - next time the argument will be 'it's in broad use, cost to create tooling to convert is too high'. And users will correctly complain that Istio APIs are a mess and too complicated.

@costinm
Copy link
Contributor

costinm commented Jun 24, 2020

To be clear: for proper inject config settings, please write a RFC. People working on CNI must also comment, and we need to be sure we want to support long term all the options.

A lot of the implementation was based on the limitations of the injector - which was pretty dumb. Now in istiod we have access
to PushContext - so we can make different decisions. For example a feature request was to allow the mesh owner to control
the ability of users to bypass interception. With CNI it will also be possible to enforce this.

It is a pretty critical aspect of Istio covering multiple WG - it's a big difference between experiments and hacks in the install
and a proper long term API.

@dgn
Copy link
Contributor Author

dgn commented Jun 24, 2020

The idea that 'later we'll clean up' was also proposed for global.yaml and values.yaml - and we ended up with the
mess we have. If we 'move the mess' - next time the argument will be 'it's in broad use, cost to create tooling to convert is too high'. And users will correctly complain that Istio APIs are a mess and too complicated.

I am trying to clean up the mess here Costin, and the only thing I'm looking for is constructive feedback to move things forward, and I'm afraid what you're suggesting is going to stall this effort forever, basically making your statement here a self-fulfilling prophecy.

@costinm
Copy link
Contributor

costinm commented Jun 24, 2020 via email

@costinm
Copy link
Contributor

costinm commented Jun 24, 2020 via email

@costinm
Copy link
Contributor

costinm commented Jun 24, 2020 via email

@istio-testing
Copy link
Collaborator

@dgn: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Jun 27, 2020
@costinm
Copy link
Contributor

costinm commented Jun 29, 2020

#1488 is also related to injection settings - please sync on the same RFC.

@dgn dgn closed this May 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. needs-rebase Indicates a PR needs to be rebased before being merged size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants