Inject cross-resource references from provider metadata#12
Conversation
d5bf1bf to
22f1d53
Compare
muvaf
left a comment
There was a problem hiding this comment.
As we discussed offline, the mental model of Upjet/Terrajet for provider authors is that some defaults are set and then users would override them to their liking - which could be through common configs or individual configs and then that'd be the final state of the configuration; there wouldn't be meaningful defaulting of information after that point.
In this case, that would translate into pre-populating the References field of the config.Resource in NewProvider function so that the information is available and can be overridden by configurations provided by the user. Right now, it seems that the reference inference is part of ExampleGenerator pipeline and it manipulates the user's config.Resource objects after they handed them over to Upjet. Even though it doesn't override entries user provided, it makes meaningful changes and does not allow users to customize whatever is produced. I think we should really try to keep to stick with the current architecture for this mechanism as well to keep it cohesive and easy to follow from provider author perspective (or start a discussion issue to change it).
As a general approach, I'd suggest starting small and with many assumptions to see how much coverage we're able to get and then discuss whether the remaining part is worth automating and where would be the best place to automate it, i.e. Upjet vs providers. For instance, we merged the other PR with value resolver that recursively scans other examples to find the static value, like location from resource_group that's TF-style referenced, but we don't know how prevalent this was and whether it's a case that we can generalize with a heuristic in providers; maybe just scanning for reference paths that end with .id would get us quite far that we don't need to resolve the static values in Upjet at all and get away with a few common configs. It's also hard to give feedback as reviewer to suggest a whole revamp for a PR that clearly has so much effort gone into and been there for a while.
Now, we're building on top of that mechanism to see whether a reference is cross-resource or static by trying to run it and marking it as cross-reference if it cannot be resolved to a static value. This is definitely more robust than having assumptions - definitely not suggesting to replace it with assumptions - but it's also more complex and we have the ability to programmatically extract statistics (like this or this) to see if it's worth adding that complexity. For example, if it takes us from 40% to 80%, then we'd allocate way more time to do it without having to take shortcuts but if it's from 90 to 95, then it may not be worth adding it. Not that we don't have some individual convictions around how frequent we encounter a case, but I've been surprised multiple times in that regard when I actually calculated because of the scale we're dealing with. Overall, I think taking progressive steps, if possible, helps to shorten the effort->value cycle and let us make better decisions/bets in other related discussions even though it has its own iteration cost.
There was a problem hiding this comment.
If I remember correctly, we have discussed to use TF resource name instead of the CRD type path in Reference struct to avoid having a separate mapper since everything we need to construct the type path is available; ModulePath from Provider object and Name, Kind, Version, Group all from the individual configuration objects. Has there been a blocker that would prevent us from going down that path?
There was a problem hiding this comment.
Done. Removed the TypeMapper interface as discussed.
There was a problem hiding this comment.
I think we can remove this now since we control all providers using Upjet, no need for backward compatibility or deprecation process.
There was a problem hiding this comment.
Yes, I also agree the new higher level API Reference.TerraformName is less error-prone and we can remove the Reference.Type field. I propose to do so in a further PR to decrease the impact of this PR on the existing providers as Type is part of the configuration interface employed in those providers.
There was a problem hiding this comment.
Resource struct represents this string with Name field. Would it make sense to either use Name, TerraformName to make it easier for folks to understand what's expected here?
There was a problem hiding this comment.
Done. Renamed as TerraformedName.
|
Hi @muvaf, |
|
@ulucinar |
|
Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
|
|
||
| // ReferenceResolver injects cross-resource references across the resources | ||
| // of this Provider. | ||
| type ReferenceResolver interface { |
There was a problem hiding this comment.
Do you feel strongly about the reference resolver name? When I read it, the first thing comes to my mind is that it works on the references to resolve them to the real value whereas what actual implementation does is to actually infer & build references. Maybe something like ReferenceBuilder or ReferenceExtractor or ReferenceConfigurator or ReferenceInjector?
There was a problem hiding this comment.
Done. Renamed config.ReferenceResolver as config.ReferenceInjector.
| if err := p.loadMetadata(metadata); err != nil { | ||
| panic(errors.Wrap(err, "cannot load provider metadata")) | ||
| } | ||
| for i, refResolver := range p.refResolvers { |
There was a problem hiding this comment.
I think there is a chance that the whole reference extraction could happen in a common resource configurator, which would let us not increase the number of concepts we are exposed to. But not for this PR, so opened #39 to track testing it.
| // which will be referenced. The supplied resource name is | ||
| // converted to a type name of the corresponding CRD using | ||
| // the configured TerraformTypeMapper. | ||
| TerraformName string |
There was a problem hiding this comment.
I think we had discussed this at one point and realized TerraformName is all we need. Is there a reason to keep the Type field? If it's about backward compatibility, we can take a pass in all official providers to change all to use TerraformName since they are the sole users of Upjet.
There was a problem hiding this comment.
Seems like this would require having access to map[string]*config.Resources in the CRD builder. Opened #40 to track this separately from this PR.
There was a problem hiding this comment.
Yes, this is something we had better pursue. I think the new TerraformName introduced with this PR is a higher level API (than requiring the referred kind name as a string with its full package) but we need some modifications in the builder as you mentioned. Let's do it with a future PR.
| // SkipReferencesTo configures attributes for which | ||
| // references will not be injected. Expected format | ||
| // for each element is <Terraform resource name>.<attribute name>. | ||
| SkipReferencesTo []string |
There was a problem hiding this comment.
Have we seen cases where we need to skip references? For similar operations, we manipulate the schema in a configurator, which could be an option here, i.e. to remove the reference from metadata object in a configurator.
There was a problem hiding this comment.
Yes, one common case from provider-azure is that the example manifests contain references to azurerm_resource_group.location where the dependent resource does not necessarily reside in the same location. Being able to do a provider-wide configuration helps here.
There was a problem hiding this comment.
How about using existing configurators?
|
|
||
| // refResolvers is an ordered list of `ReferenceResolver`s for | ||
| // injecting references across this Provider's resources. | ||
| refResolvers []ReferenceResolver |
There was a problem hiding this comment.
Do you have a different possible implementation in mind?
There was a problem hiding this comment.
I introduced this interface mainly for resolving circular imports but one idea could be to have something like what we have in provider-azure in terrajet as a configuration option so that other providers can configure their common references. We can discuss some ideas in #39.
| @@ -0,0 +1,155 @@ | |||
| /* | |||
There was a problem hiding this comment.
The whole reference package needs unit tests. It's really hard to track what's going on and who does what otherwise given that we have to build recursion. But I know the whole work has taken longer and longer, so I opened #41 to track this.
|
|
||
| func (rr *Resolver) resolveReferences(params map[string]interface{}, resolutionContext map[string]*PavedWithManifest) ([]string, error) { // nolint:gocyclo | ||
| var resolvedParams []string | ||
| for k, v := range params { |
There was a problem hiding this comment.
This is a long and recursive for loop with switches, so using longer names would be more helpful. See https://github.com/crossplane/crossplane/blob/master/CONTRIBUTING.md#use-descriptive-variable-names-sparingly
There was a problem hiding this comment.
Done. Renamed them as paramName and paramValue.
|
|
||
| // ExtractParamPath extracts the value of `sourceAttr` | ||
| // from `spec.forProvider` allowing nested parameters. | ||
| func ExtractParamPath(sourceAttr string) xpref.ExtractValueFn { |
There was a problem hiding this comment.
Could you add an example input here as comment?
….InjectReferences Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
- Refactor pipeline.transformFields Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
- Rename reference.Resolver as reference.Injector - Rename config.WithReferenceResolvers as config.WithReferenceInjectors Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
Description of your changes
Fixes #7
I have:
make reviewableto ensure this PR is ready for review.backport release-x.ylabels to auto-backport this PR if necessary.How has this code been tested
For a
azurerm_disk_encryption_set, here are the associated Terraform registry examples (together with its dependencies):Before this PR, we generate the following example manifest for the resource:
This PR produces the following without a manual reference configuration to vault keys:
and
keyVaultKeyIdis now a reference to the ID of a vault key:The cross-resource reference to the vault key is deduced from the provider metadata.