Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve fn authoring #4319

Closed
wants to merge 1 commit into from
Closed

improve fn authoring #4319

wants to merge 1 commit into from

Conversation

mengqiy
Copy link
Member

@mengqiy mengqiy commented Nov 30, 2021

To try it out, add the following in your go.mod:

replace sigs.k8s.io/kustomize/kyaml => github.com/mengqiy/kustomize/kyaml v0.16.0

godoc can be found at:

The problems with the existing go SDK in kyaml:

  • The surface in kyaml is way too big. It contains the things to build not only KRM functions but also KRM function orchestrators (e.g. kustomize and kpt).
    • "kyaml is pretty huge, so I'm finding it challenging to navigate" -- Brian Grant.
  • It exposes too much of the low-level implementation details. e.g. To use a ResourseList type, a use has to learn how to use RNode . Learning how to deal with RNode and YNode is not trivial.
  • No strong type support when dealing with k8s objects.
  • If I want to embed a KRM function in a Go program (e.g. HTTP server), there is no easy way to do it. I have to use kio package directly to dealing with (de)serialization. Learning kio can be another burden for a function author.

The goals are:

  • Make the common developer use cases much easier
  • Easier to getting started
  • Smaller surface that is focusing on KRM function developers
  • Hide the complexity of low-level implementation details, but still provide ways to access it for advanced users.
  • Provide strong type support when reading, generating or manipulating objects.

Things to highlight:

  • Introduction of KubeObject
    • It provides Get, Set, Remove and a bunch of other methods
    • Get and Set allow users to work with typed object (e.g. corev1.Pod)
  • ResourceList uses KubeObject
  • Added several helper functions for creating Result.
  • Added several package level examples to demonstrate how to use the new interface

Known limitation:

After a field round tripping through the typed object, we lose comments. Some tests are failing because we drop the license header comment. There are at least 2 ways to work around it:

  • Rely on the KRM function orchestrator to preserve the comments. (kpt relies on the id annotation to copy over the comments).
  • Make our fn framework preserve comments automatically. e.g. Record the comments before passing it to the ResourceListProcessor.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 30, 2021
@mengqiy mengqiy marked this pull request as draft November 30, 2021 18:00
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 30, 2021
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 30, 2021
@mengqiy mengqiy force-pushed the gosdk branch 2 times, most recently from 879ec29 to 1c5e196 Compare December 1, 2021 06:08
@mengqiy mengqiy marked this pull request as ready for review December 1, 2021 06:10
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 1, 2021
@mengqiy
Copy link
Member Author

mengqiy commented Dec 1, 2021

/cc @KnVerey @droot @natasha41575

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 3, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 3, 2021
@mengqiy
Copy link
Member Author

mengqiy commented Dec 3, 2021

I just rebased it to resolve a merge conflict

Copy link
Contributor

@droot droot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking much better @mengqiy

kyaml/fn/framework/example_filter_GVK_test.go Outdated Show resolved Hide resolved
kyaml/fn/framework/example_filter_GVK_test.go Outdated Show resolved Hide resolved
kyaml/fn/framework/example_filter_GVK_test.go Outdated Show resolved Hide resolved
kyaml/fn/framework/example_filter_GVK_test.go Show resolved Hide resolved
kyaml/fn/framework/example_generator_test.go Outdated Show resolved Hide resolved
kyaml/fn/framework/example_read_field_test.go Show resolved Hide resolved
kyaml/fn/framework/example_read_functionConfig_test.go Outdated Show resolved Hide resolved
kyaml/fn/framework/example_set_field_test.go Show resolved Hide resolved
kyaml/fn/framework/resourcelist.go Outdated Show resolved Hide resolved
@k8s-ci-robot
Copy link
Contributor

@mengqiy: This PR has multiple commits, and the default merge method is: merge.
You can request commits to be squashed using the label: tide/merge-method-squash

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mengqiy mengqiy changed the title PoC: improve fn authoring improve fn authoring Dec 7, 2021
@natasha41575
Copy link
Contributor

/hold

to make sure @KnVerey can take a look before it merges

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 8, 2021
@k8s-ci-robot k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 10, 2021
@mengqiy
Copy link
Member Author

mengqiy commented Jan 13, 2022

The original intent of doing it in a separate package is to provide KRM function developers an alpha package with a much smaller but versatile surface. After chatting with @KnVerey offline, Katrina pointed out that kyaml/fn/framework is still an alpha package. It can be confusing to users when we have 2 alpha package that tries to do the same thing. So I refactor the sdk package into the kyaml/yaml and kyaml/fn/framework packages.

The problems with the existing fn SDK and the goals of this PR are listed in the PR description.

@jeremyrickard
Copy link
Member

👋 I'm back from vacation and covid shenanigans in December. I'm going to try and rebase a few of our internal functions on this, I'll report back with an refactoring experience report and give a more thorough PR review.

kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/rnode.go Outdated Show resolved Hide resolved
kyaml/yaml/const.go Outdated Show resolved Hide resolved
@mengqiy
Copy link
Member Author

mengqiy commented Jan 21, 2022

@KnVerey @jeremyrickard Any feedback so far?

@yuwenma
Copy link
Contributor

yuwenma commented Jan 21, 2022

(Moved from previous resolved comments)
I have some concerns about changing ResourceList.Items from type []*RNode to []*KubeObject. This change will require all users who interact with resourcelist.Items to make code changes once updating the kyaml library. Considering the resourcelist.Items is previously the key attribute to call RNode functions, it could be a non-trivial backward compatibility breaking change.

Right now, framework lib is imported by 35 public repos. Some are owned by us so it is less a pain to fix, but repos like this https://github.com/kumorilabs/konvert/blob/917abd1eee5d6b0b234ca5efc0442377b8363187/internal/functions/functions.go#L56 would definitely be broken once updating its kyaml. What's more, the lib is imported by our customers' private repos.

If Pipe is not intuitive and we do need to improve that, do you think we can attempt a safer way to make the change? How about defining another attribute in ResourceList for []*KubeObject and mark the Items as DEPRECATED?

// AsMain reads the resourceList in yaml format from stdin, evaluates the
// function and write the updated resourceList in yaml to stdout. Errors if any
// will be printed to stderr.
func AsMain(p ResourceListProcessor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strikes me as a less powerful version of the command subpackage, which is intended to make it easy to generate entrypoints: https://pkg.go.dev/sigs.k8s.io/kustomize/kyaml@v0.13.1/fn/framework/command. If we need this as well (please comment as to why), I think it should move to that subpackage, which is where we've isolated assumptions that the IO is from the command line.

@@ -48,14 +54,159 @@ type ResourceList struct {
// kind: Example
// spec:
// foo: var
FunctionConfig *yaml.RNode `yaml:"functionConfig,omitempty" json:"functionConfig,omitempty"`
FunctionConfig *yaml.KubeObject `yaml:"functionConfig,omitempty" json:"functionConfig,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with this branch a bunch yesterday, and I will comment in more detail later. But I wanted to start by sharing the central challenge I encountered in attempting to use this branch for a non-trivial use case, which comes down to the changes to these fields. In addition to the compatibility concerns @yuwenma raised (we can making breaking changes since this is alpha, but we should still be somewhat conservative about it for existing consumers' sake), not having these be RNodes breaks compatibility with the existing suite of tools. Everything from Processors to Matchers/Selectors revolves around the kyaml.Filter interface, which has a simple "rnodes in, rnodes out" contract that makes it really powerful. I'm in favour of making simple use cases easier by layering on top of this, but I don't think we should move away from RNode and Filter being the primitive concepts in the foundational layers (e.g. in the type definition here).

Related to the above, I continue to be skeptical that it will be a net benefit to introduce KubeObject at the same level of abstraction as RNode on the basis that the existing method surface of RNode is too big. Having two types represent the same thing but work with different parts of the package could be more complexity overall, as it is something consumers need to learn to decide (and convert!) between to use our tools. For example, in a method below, you have to re-invent the signature for selector as selector func(obj *yaml.KubeObject) bool, which is then incompatible with the large suite of high-level selectors and low-level selector builders that the framework already has (see selector.go and matchers.go).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with merging KubeObject with RNode for now.

IMO when users uses kyaml/fn/framework package, they don't need to deal with io directly. IO should be handled by the framework package.
Also kio.Filter is doing similar things as ResourceListProcessor. This is another confusing thing for users. IMO the framework package should move to ResourceListProcessor (ResourceList in, ResourceList out) eventually, since ResourceList is in the contract of KRM function spec and it allows structured results to be passed.

@mengqiy
Copy link
Member Author

mengqiy commented Jan 25, 2022

Changing ResourceList.Items from []*RNode to []*KubeObject is breaking change. If we decide to go with this change, we will need to release it in the next minor release since kyaml is still pre-v1.0.0. It won't break existing users unless the users choose to upgrade kyaml. But the users will need to migrate eventually.
It depends on if it worths the efforts to require the users to migrate. @yuwenma @KnVerey If you don't believe so, then maybe we should keep using []*RNode. Please let me know what do you think. I'm happy to discuss it offline if you want.

EDIT: I didn't see your comment in #4319 (comment) earlier, I should refresh my page.

@mengqiy
Copy link
Member Author

mengqiy commented Jan 28, 2022

@KnVerey @yuwenma PTAL

Copy link
Contributor

@yuwenma yuwenma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments. Overall LGTM

@@ -84,6 +84,10 @@ func (i Result) String() string {
return fmt.Sprintf(formatString, list...)
}

func (i Result) Error() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add docstring for public functions? Here and the below functions.

return anno
}

// GetIndexAnnotation checks the index annotation first and then the legacy index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: GetIndexAnnotation gets the index from annotation "internal.config.kubernetes.io/index" or "config.kubernetes.io/index" (legacy).
readability rationale: function docstring should explain what the function is for, not how the code is implemented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ I agree. Why are function authors interested in these? We should explain here (including the "should not mutate" aspect) rather than sending to an external doc.

}

// toRNode converts the ResourceList to a RNode.
func (rl *ResourceList) toRNode() (*yaml.RNode, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq: Do we want to make this function public?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't think that would be useful.

@yuwenma yuwenma mentioned this pull request Jan 29, 2022
1 task
// AsMain reads the resourceList in yaml format from stdin, evaluates the
// function and write the updated resourceList in yaml to stdout. Errors if any
// will be printed to stderr.
func AsMain(p framework.ResourceListProcessor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended use case for this vs. command.Build above? It seems like they do the same thing, but Build has more batteries included (e.g. it wires up a CLI with debugging flags for you, helps you create a dockerfile for your function, and accepts deconstructed input as well as ResourceList on stdin). Most of the functions I've written myself have entrypoints that look like:

func main() {
	cmd := command.Build(provider.New(), command.StandaloneEnabled, false)
	cmd.Execute()
}

That strikes me as pretty easy, though the concepts of standalone mode and error printing might be unnecessary (we could always enable standalone mode, always print the error and eliminate the legacy env variable).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the concepts of standalone mode and error printing are unnecessary complexity for users, I'd suggest dropping them if they don't affect kustomize.

Quoted from godoc of Build method:

// By default, invoking the returned cobra.Command with arguments triggers "standalone" mode.
// In this mode:
// - The first argument must be the name of a file containing the FunctionConfig.
// - The remaining arguments must be filenames containing input resources for ResourceList.Items.
// - The argument "-", if present, will cause resources to be read from STDIN as well.
// The output will be a raw stream of resources (not wrapped in a List type).
// Example usage: cat input1.yaml | go run main.go config.yaml input2.yaml input3.yaml -

Reading from files is not the standard way to read resources, can we drop it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with dropping it as an option, but I think it should be always enabled. It doesn't interfere with the primary (fn specification) way of running the program, and it adds a new capability: you can use the same program directly. That is really powerful. It means you can build a generator plugin that is both distributed on its own for simple use case and pluggable into any Kustomization. I personally have a use case like that. I also find standalone mode very handy for testing any function.


// Run evaluates the function. input must be a resourceList in yaml format. An
// updated resourceList will be returned.
func Run(p ResourceListProcessor, input []byte) ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as func Execute(p ResourceListProcessor, rlSource *kio.ByteReadWriter) error { but isn't pluggable. I remember at last week's meeting you mentioned not liking the exposure of kio to framework users. The command package's entrypoint builder already provide one way to hide this, since it takes care of invoking framework.Execute itself. It is already possible to call framework.Execute(p, nil) to make it construct the read-writer for you. Alternatively, we could get rid of that, but add a constructor func like framework.NewReadWriter() (*kio.ByteReadWriter) that is initialized with stdin/out? So the invocation would become framework.Execute(p, framework.NewReadWriter()), and those needing to customize the RW could also use the framework.NewReadWriter() as a starting point instead of knowing about kio directly. As much as the kio RW has a lot of extra features to it that probably aren't useful to the average function author, it still does useful things like automatic handling of ResourceList, List and yaml stream formats. That approach would also prevent all the duplication below, plus the increasing of the surface area of the package, where end users would need to choose between Run and Execute.

}

// Chain chains a list of ResourceListProcessor as a single ResourceListProcessor.
func Chain(processors ...ResourceListProcessor) ResourceListProcessor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide an example of when this is helpful, ideally in the form of a test? I've used chaining either at a lower level (i.e. of Filter calls) when the logic in question is config-free, or at a higher-level (i.e. with VersionedAPIProcessor) when dealing with a series of embedded micro-APIs that need the relevant part of the config extracted into the input ResourceList. Chain implies you'd have a series of processors that all operate against the same config?

}

// ChainFunctions chains a list of ResourceListProcessorFunc as a single ResourceListProcessorFunc.
func ChainFunctions(fns ...ResourceListProcessorFunc) ResourceListProcessorFunc {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar question--not used in tests, and I'm also unsure why we'd need this in addition to Chain given that ResourceListProcessorFunc are ResourceListProcessor.

func GetIndexAnnotation(rn *yaml.RNode) int {
anno := rn.GetAnnotation(kioutil.IndexAnnotation)
if anno == "" {
anno = rn.GetAnnotation(kioutil.LegacyIndexAnnotation)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is net new, do we need to support the deprecated annotations?


// SortItems sorts the ResourceList.items by apiVersion, kind, namespace and name.
func (rl *ResourceList) SortItems() {
sort.Sort(sortRNodeObjects(rl.Items))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kio's filters package already includes FormatFilter, which also sorts fields. Should we wrap that instead?

idi := o[i].GetResourceIdentifier()
idj := o[j].GetResourceIdentifier()
idStrI := fmt.Sprintf("%s %s %s %s", idi.GetAPIVersion(), idi.GetKind(), idi.GetNamespace(), idi.GetName())
idStrJ := fmt.Sprintf("%s %s %s %s", idj.GetAPIVersion(), idj.GetKind(), idj.GetNamespace(), idj.GetName())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented a sort like this before, so I'm definitely in favour of the framework providing one!

  • IIRC there are edge cases (e.g. involving fields or GV parts being empty) where the sort doesn't come out the way you'd intuitively want, and I vaguely remember switching to a more field-aware strategy as a result. @jeremyrickard does this sound familiar?
  • Should the heart of this go in kio's filters collection? It already has FormatFilter, which is for sorting within objects--it would be neat to have this be a formal filter that exposes field sorting as an option for a one-stop solution to making your output stable. Framework could still have func (rl *ResourceList) SortItems() { to wrap it of course.

}

// toRNode converts the ResourceList to a RNode.
func (rl *ResourceList) toRNode() (*yaml.RNode, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't think that would be useful.

rl.SortItems()
var yml string
if err := func() error {
ko, err := rl.toRNode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of manually going through RNodes, couldn't we construct a kio.ByteWriter with a byte buffer? I think it does everything you're doing in toRNode automatically.

Copy link
Member Author

@mengqiy mengqiy Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kio.ByteWriter doesn't seem to do what I want. IIUC it only writes the ResourceList.Items to Writer. But what I want to do here is to write the whole ResourceList to Writer.

Copy link
Member Author

@mengqiy mengqiy Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... It seems I need to copy ResourceList.FunctionConfig to ByteWriter.FunctionConfig and convert ResourceList.Results (framework.Results type) to ByteWriter.Results (yaml.RNode type).
That's another reason I don't want to expose kio to function developers. But I'm OK using it as internal implementation details.
IMO ideally the framework package and subpackages should be ResourceList centric.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, using it as an implementation detail is what I meant :)

@@ -56,6 +63,150 @@ type ResourceList struct {
Results Results `yaml:"results,omitempty" json:"results,omitempty"`
}

// ParseResourceList parses a ResourceList from the input byte array.
func ParseResourceList(in []byte) (*ResourceList, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my comment below, I feel like this could at least be leveraging kio.ByteReader under the hood, and maybe should take one as input (if having a constructor for that) depending on the circumstances under which we see this being useful. Could you clarify about that? This seems reasonable to have in general, but in practice I'd expect the framework to be taking care of parsing behind one of the higher-level entrypoints.


// updateReplicas sets a field in resources selecting by GVK.
func updateReplicas(rl *framework.ResourceList) error {
if rl.FunctionConfig == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: what matters is not the function config missing, but rather it having a replicas field specifically. I would suggest making GetNestedInt handle the nil case so all function authors don't have to. And then returning the typed error (ideally with the required path in it!) at L26 instead.

return err
}
for i, obj := range rl.Items {
if obj.GetApiVersion() == "apps/v1" && obj.GetKind() == "Deployment" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the framework's matchers for this: framework.GVKMatcher("apps/v1/Deployment").Match(items[i])
(same for other similar places--we have lots of stock matchers written, and even ways to combine them easily).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this up. GVKMatcher().Match() works here.
But thinking more about it, why do we make it return a ResourceTemplateMatcher? The signature is func GVKMatcher(names ...string) ResourceTemplateMatcher and it gives me the impression that it's not what I want, since I don't need templating at all.
We (Google) have built many KRM functions. But we haven't use this ResourceTemplateMatcher at all. I wonder what is the use cases for that? And do we really need it? ResourceMatcher seems to be good enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used the template aspect either. It was something I preserved from the previous iteration of the framework because of the test coverage. If y'all don't use it either I'm open to 🔥 it.

// This example implements a function that reads the desired replicas from the
// functionConfig and updates the replicas field for all deployments.

func Example_filterGVK() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just so we're all on the same page about what's already possible, an API-oriented version of the same thing would look like the code below. Personally I prefer to write my functions this way, because it is only slightly more code and it is much clearer about the function's API, validation rules and processing steps.

func Example_filterGVK() {
	if err := command.AsMain(&ReplicaUpdater{}); err != nil {
		os.Exit(1)
	}
}

type ReplicaUpdater struct {
	Replicas *int `yaml:"replicas" json:"replicas"`
}

func (u *ReplicaUpdater) Validate() error {
	if u.Replicas == nil {
		return framework.ErrMissingFnConfig{RequiredPaths: []string{".Replicas"}}
	}
	return nil
}

func (u *ReplicaUpdater) Process(rl *framework.ResourceList) error {
	if err := framework.LoadFunctionConfig(rl.FunctionConfig, u); err != nil {
		return errors.Wrap(err)
	}
	for i := range rl.Items {
		if framework.GVKMatcher("apps/v1/Deployment").Match(rl.Items[i]) {
			rl.Items[i].SetOrDie(u.Replicas, "spec", "replicas")
		}
	}
	return nil
}

And since this processor doesn't do anything special outside the filtering logic, here's the same thing again using SimpleProcessor:

func Example_filterGVK() {
	updater := &ReplicaUpdater{}
	if err := command.AsMain(framework.SimpleProcessor{Config: updater, Filter: updater}); err != nil {
		os.Exit(1)
	}
}

type ReplicaUpdater struct {
	Replicas *int `yaml:"replicas" json:"replicas"`
}

func (u *ReplicaUpdater) Validate() error {
	if u.Replicas == nil {
		return framework.ErrMissingFnConfig{RequiredPaths: []string{".Replicas"}}
	}
	return nil
}

func (u *ReplicaUpdater) Filter(items []*yaml.RNode) ([]*yaml.RNode, error) {
	for i := range items {
		if framework.GVKMatcher("apps/v1/Deployment").Match(items[i]) {
			items[i].SetOrDie(u.Replicas, "spec", "replicas")
		}
	}
	return items, nil
}

Writing that out, it is a little weird to have to provide the updater to both fields... maybe a nice optimization could be that if a Config is provided, a Filter isn't, and Config is a filter, we use it automatically. WDYT about that enhancement?

return err
}
for i, obj := range rl.Items {
if obj.GetApiVersion() == "apps/v1" && (obj.GetKind() == "Deployment" || obj.GetKind() == "StatefulSet" || obj.GetKind() == "DaemonSet" || obj.GetKind() == "ReplicaSet") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about taking advantage of the framework's matchers for this.

Btw, have you seen TestTemplateProcessor_ContainerPatchTemplates_MultipleWorkloadKinds, which shows how to use TemplateProcessor for a very similar use case? That processor handles the matter of where the container path is for you, to make it easy to support all the workload types. It also makes it so you don't need to worry about the found/new case, since it uses SMP.

)

// Log prints to stderr.
func Log(in ...interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To your own point about kyaml having too many responsibilities, I'm not sure we need to provide our own logging mechanism as part of the framework. There's nothing kyaml or k8s resource specific about this.


// As converts a RNode to the desired typed object. ptr must be
// a pointer to a typed object.
func (rn *RNode) As(ptr interface{}) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this! Can you please convert LoadFunctionConfig, which does a superset of this, to use this internally, to make sure that works as expected?

Copy link
Contributor

@KnVerey KnVerey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still haven't made it through to the end of this. I know you said the previous iteration would be difficult to split, but having looked at most of this version, I'm hoping you might reconsider. I'm on board with the spirit of the changes, but it is really challenging to review something this big in both diff and scope. While it is good to have the whole picture for reference, having this PR split up into more manageable chunks would really help it move forward in a more reasonable way. Perhaps the RNode enhancements could be split into a series of initial PRs (it seems to me it needs more test coverage, which would further increase the size of even those changes on their own), and then the framework stuff that makes use of it could go in after? 🙏

ko = yaml.NewRNode(obj)
default:
var err error
ko, err = yaml.NewRNodeFrom(obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: move this switch statement inside NewRNodeFrom (and have it make a copy if it is already an RNode, maybe?)


// UpsertObjectToItems adds an object to ResourceList.items. The input object can
// be a RNode or any typed object (e.g. corev1.Pod).
func (rl *ResourceList) UpsertObjectToItems(obj interface{}, checkExistence func(obj, another *yaml.RNode) bool, replaceIfAlreadyExist bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of when checkExistence would be used? I'm not seeing one in the tests. What constitutes the same object from a k8s point of view isn't really customizable, so I'm surprised this would be needed.

It seems to me this can do a variety of different things depending on the existing list state and the boolean param. I'd suggest splitting it into dedicated function depending on the desired functionality. Perhaps we could make a list type to define these on, so that the invocation could be e.g. rl.Items.Append(thing)

type RNodeList []*yaml.RNode

func (n *RNodeList) Append(*yaml.RNode) error {
	// err if already exists
	return nil
}

func (n *RNodeList) Replace(*yaml.RNode) error {
	// err if does NOT already exist
	return nil
}

func (n *RNodeList) Absorb(*yaml.RNode) error {
	// replaces if exists, appends if it doesn't
	return nil
}

rl.Items = append(rl.Items, ko)
} else if replaceIfAlreadyExist {
rl.Items[idx] = ko
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't take my suggestion above to split this, I think the final case here should be an error. Otherwise if you give the new item, you do not say to replace, and it does already exists, this method will silently do nothing.

}
}

func ErrorConfigObjectResult(err error, rn *yaml.RNode) *Result {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To your point about wanting a more surface for kyaml packages, I'm not sure about these Error___ variants being necessary. Their single-line contents look simple enough to author.

// This affects both k8s built-in types (e.g. appsv1.Deployment) and any types
// that depends on built-in types (e.g. metav1.ObjectMeta, corev1.PodTemplate).
// To work around it, we rely on the json tags. We first convert mv to json
// and then unmarshal it to ptr.
Copy link
Member Author

@mengqiy mengqiy Feb 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To work around this problem, we can change the yaml decoding logic (https://github.com/kubernetes-sigs/kustomize/blob/master/kyaml/internal/forked/github.com/go-yaml/yaml/yaml.go#L558). We can fall back to use json tag if yaml tag is not available. It makes sense in practice.
@KnVerey WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quote from GoogleContainerTools/kpt-functions-catalog#739 (comment)

Use sigs.k8s.io/yaml for yaml formatting, to avoid a few bugs in sigs.k8s.io/kustomize/kyaml/yaml

sigs.k8s.io/yaml relies on json tag. And users consider it as a bug in sigs.k8s.io/kustomize/kyaml/yaml.

@mengqiy mengqiy mentioned this pull request Feb 9, 2022
@mengqiy
Copy link
Member Author

mengqiy commented Feb 9, 2022

The RNode part was split out in #4462

@mengqiy
Copy link
Member Author

mengqiy commented Mar 22, 2022

Unfortunately, I don't have enough time to work on this any more.
And we (Google) have decided to iterate on this more internally. @yuwenma is working in this area.

I'm going to close this PR for now.
If anyone want to pick up this PR in OSS, please feel free to do it.

@mengqiy mengqiy closed this Mar 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants